Uplink 2.0 Upgrade Plan¶
Goal: Modernize the Uplink infrastructure with minimal disruption to production services.
Approach: Incremental upgrades inspired by the Greenfrog project architecture.
Table of Contents¶
- Current State
- Target State
- Upgrade Strategy
- Plan Revision: Why Dockerize First
- Phase 1: Containerization with uv (COMBINED)
- Phase 2: Python & Django Upgrades
- Phase 2.5: Security Hardening
- Phase 3: Web Server & Reverse Proxy (nginx)
- Phase 4: Process Management (systemd)
- Phase 5: VM Migration (Ubuntu 24.04 LTS)
- Phase 6: Monitoring & Observability
- Phase 7: Health Checks & System Monitoring
- Rollback Plans
- Timeline & Dependencies
- Appendix A: CI/CD Pipeline Setup
- Appendix B: Environment Strategy
- Appendix C: Testing Strategy
- Appendix D: Deployment Procedures
- Appendix E: New Server Setup Log
Current State¶
Infrastructure¶
- Operating System: Ubuntu 18.04 LTS (outdated, EOL May 2023)
- Python Version: 3.9.x
- Django Version: 3.2.x LTS
- Dependency Management: Pipenv
- Process Management: CRON for scheduled tasks
- Deployment: Manual git pull + restart
- Containerization: None
Pain Points¶
- Manual deployment process prone to errors
- Outdated OS lacks security updates
- CRON lacks robust logging and error handling
- No containerization makes environment replication difficult
- Pipenv is slower than modern alternatives
Target State¶
Infrastructure (Greenfrog-Inspired)¶
- Operating System: Ubuntu 24.04 LTS
- Python Version: 3.12+ (latest stable)
- Django Version: 5.x (latest LTS)
- Dependency Management: uv (fast, modern Python package manager)
- Process Management: systemd services for web server, workers, scheduled tasks
- Deployment: Automated shell scripts + Docker
- Containerization: Docker with docker-compose
- CI/CD: Automated testing and deployment scripts
Benefits¶
- ✅ Faster deployments with automated scripts
- ✅ Better reliability with systemd instead of CRON
- ✅ Easier development with Docker containers
- ✅ Security updates from Ubuntu 24.04 LTS (supported until 2029)
- ✅ Performance improvements from Python 3.12+ and modern Django
- ✅ Simplified onboarding for new developers
Upgrade Strategy¶
Key Principles¶
- Incremental changes - One phase at a time
- Test extensively - Each phase tested in development before production
- Minimal disruption - Most changes can be done during normal hours
- Quick rollback - Each phase has a clear rollback procedure
- Weekend VM migration - Final infrastructure switch during low-traffic period
Testing Environment¶
Before any production deployment:
1. Test in local development environment
2. Test in staging/QA environment (if available)
3. Document any issues and solutions
4. Create rollback plan
5. Get team approval
Plan Revision: Why Dockerize First¶
Date: February 2, 2026
Decision: Combine Phase 1 (uv) and Phase 3 (Docker) into a single phase
Problem Identified¶
During Phase 1 implementation, we encountered critical issues:
- System-wide Python package installations interfering with virtual environments
- Package version conflicts between pipenv and uv environments
- Django detecting phantom model changes due to package version mismatches
- /usr/local/lib/python3.10/dist-packages/ pollution breaking venv isolation
Why Docker Solves This¶
- Clean Environment: Docker starts from a fresh base image, eliminating all system package pollution
- Dev/Production Parity: Test the exact same environment locally that runs in production
- No Virtual Environment Conflicts: Docker containers are already isolated
- Reproducible Builds: Lock file ensures identical dependencies every time
- Better Testing: Can fully test uv before production deployment
Revised Approach¶
Instead of:
- Phase 1: Migrate from pipenv to uv on the existing host
- Phase 3: Dockerize later as a separate phase
We now do:
- Phase 1: Dockerize immediately, with uv inside the container from day one
Benefits:
- Skip painful venv troubleshooting
- Test uv in a production-like environment from day one
- Start with current Python 3.9 + Django 3.2 (safe)
- Upgrade Python/Django later inside the container (controlled)
- Matches the Greenfrog architecture from the start
Phase 1: Containerization with uv (COMBINED)¶
Goal: Dockerize the application using uv for dependency management
Timeline: 2-3 weeks
Can be done: Anytime during normal hours
Status: 🔄 IN PROGRESS (as of Feb 2, 2026)
Why This Phase is First¶
Containerization provides:
- A clean slate for the uv migration (no system package conflicts)
- Identical dev/prod environments
- Easy rollback (just switch Docker tags)
- A foundation for all future upgrades
Steps¶
1.1 Prepare pyproject.toml¶
# Ensure pyproject.toml exists with all dependencies
# Should be based on existing Pipfile but with exact versions pinned
[project]
name = "uplink"
version = "2.0.0"
requires-python = ">=3.9"
dependencies = [
"django~=3.2", # start on the current major version; upgrade in Phase 2
"mysqlclient",
"requests",
"django-environ",
"django-bootstrap-v5",
"django-money",
"django-extensions",
"django-debug-toolbar",
"django-import-export",
"huey",
"redis",
"djangorestframework",
"markdown",
"ppf-datamatrix",
"channels~=3.0",
"channels-redis",
"django-admin-sortable2",
"django-filter",
"sentry-sdk",
"weasyprint==52.5",
"sync",
]
[project.optional-dependencies]
dev = [
"pre-commit",
"honcho",
"black",
]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
1.2 Create Dockerfile (Production-Ready)¶
Create Dockerfile in project root:
# Dockerfile
FROM python:3.9-slim
# Set environment variables
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1 \
UV_SYSTEM_PYTHON=1
# Install system dependencies
RUN apt-get update && apt-get install -y \
gcc \
default-libmysqlclient-dev \
pkg-config \
curl \
netcat-openbsd \
&& rm -rf /var/lib/apt/lists/*
# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
# Set working directory
WORKDIR /app
# Copy dependency files first (for layer caching)
COPY pyproject.toml ./
COPY .python-version* ./
# Install Python dependencies using uv
RUN uv pip install --system -e .
# Copy application code
COPY . .
# Collect static files (will be overridden by volume in dev)
RUN python manage.py collectstatic --noinput --clear || true
# Create directory for media files
RUN mkdir -p /app/media
# Expose port
EXPOSE 8000
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD python manage.py check --deploy || exit 1
# Default command (override in docker-compose)
CMD ["gunicorn", "uplink.wsgi:application", "--bind", "0.0.0.0:8000", "--workers", "3", "--timeout", "60"]
1.3 Create docker-compose.yml (Development)¶
Create docker-compose.yml:
version: '3.8'
services:
db:
image: mysql:8.0
container_name: uplink_db
environment:
MYSQL_DATABASE: ${DB_NAME:-uplink}
MYSQL_USER: ${DB_USER:-uplink}
MYSQL_PASSWORD: ${DB_PASSWORD:-uplinkpass}
MYSQL_ROOT_PASSWORD: ${DB_ROOT_PASSWORD:-rootpass}
volumes:
- mysql_data:/var/lib/mysql
ports:
- "3306:3306"
networks:
- uplink_network
healthcheck:
test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
interval: 10s
timeout: 5s
retries: 5
redis:
image: redis:7-alpine
container_name: uplink_redis
ports:
- "6379:6379"
networks:
- uplink_network
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 5
web:
build:
context: .
dockerfile: Dockerfile
container_name: uplink_web
command: python manage.py runserver 0.0.0.0:8000
volumes:
- .:/app
- static_volume:/app/static
- media_volume:/app/media
ports:
- "8000:8000"
env_file:
- .env
environment:
- DEBUG=True
- DATABASE_HOST=db
- REDIS_URL=redis://redis:6379/0
depends_on:
db:
condition: service_healthy
redis:
condition: service_healthy
networks:
- uplink_network
restart: unless-stopped
daphne:
build:
context: .
dockerfile: Dockerfile
container_name: uplink_daphne
command: daphne -b 0.0.0.0 -p 9000 uplink.asgi:application
volumes:
- .:/app
ports:
- "9000:9000"
env_file:
- .env
environment:
- DEBUG=True
- DATABASE_HOST=db
- REDIS_URL=redis://redis:6379/0
depends_on:
db:
condition: service_healthy
redis:
condition: service_healthy
networks:
- uplink_network
restart: unless-stopped
huey:
build:
context: .
dockerfile: Dockerfile
container_name: uplink_huey
command: python manage.py run_huey
volumes:
- .:/app
env_file:
- .env
environment:
- DEBUG=True
- DATABASE_HOST=db
- REDIS_URL=redis://redis:6379/0
depends_on:
db:
condition: service_healthy
redis:
condition: service_healthy
networks:
- uplink_network
restart: unless-stopped
volumes:
mysql_data:
static_volume:
media_volume:
networks:
uplink_network:
driver: bridge
1.4 Create docker-compose.prod.yml (Production Override)¶
Create docker-compose.prod.yml:
version: '3.8'
services:
web:
command: gunicorn uplink.wsgi:application --bind 0.0.0.0:8000 --workers 4 --timeout 60 --access-logfile - --error-logfile -
environment:
- DEBUG=False
volumes:
# Remove source code volume mount in production
- static_volume:/app/static
- media_volume:/app/media
restart: always
daphne:
command: daphne -b 0.0.0.0 -p 9000 uplink.asgi:application --access-log - --proxy-headers
environment:
- DEBUG=False
volumes:
# Remove source code volume mount in production
- static_volume:/app/static
- media_volume:/app/media
restart: always
huey:
environment:
- DEBUG=False
volumes:
# Remove source code volume mount in production
- media_volume:/app/media
restart: always
1.5 Create .dockerignore¶
Create .dockerignore:
# Git
.git
.gitignore
# Python
__pycache__
*.pyc
*.pyo
*.pyd
.Python
*.so
*.egg
*.egg-info
dist
build
# Virtual environments
.venv
venv/
ENV/
env/
.virtualenv
# Environment files
.env.local
.env.*.local
# Database
db.sqlite3
*.db
# Logs
*.log
logs/
# OS files
.DS_Store
Thumbs.db
# IDE
.vscode/
.idea/
*.swp
*.swo
# Node modules (if any)
node_modules/
# Test coverage
htmlcov/
.coverage
.coverage.*
.pytest_cache/
# Docker
Dockerfile*
docker-compose*.yml
.dockerignore
# Deployment scripts
deploy.sh
1.6 Create docker-entrypoint.sh¶
Create docker-entrypoint.sh:
#!/bin/bash
set -e
echo "Waiting for database..."
while ! nc -z "$DATABASE_HOST" 3306; do
sleep 0.1
done
echo "Database is ready!"
echo "Running migrations..."
python manage.py migrate --noinput
echo "Collecting static files..."
python manage.py collectstatic --noinput --clear
# Execute the main command
exec "$@"
Make it executable:
chmod +x docker-entrypoint.sh
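The nc wait loop above assumes netcat is present in the image; as a dependency-free alternative (a sketch — `wait_for_db.py` is a hypothetical helper, not an existing file), the same check can be done with only Python's standard library:

```python
# wait_for_db.py — hypothetical helper; stdlib-only alternative to the nc loop
import socket
import time

def wait_for(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            # Not accepting connections yet; back off briefly and retry
            time.sleep(0.5)
    return False
```

The entrypoint would then run this helper (via a small `__main__` wrapper) in place of the nc loop, removing the need to install netcat in the image.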
1.7 Update .env.example¶
Create .env.example for documentation:
# Django Core
DEBUG=False
SECRET_KEY=change-me-to-a-random-50-character-string
ALLOWED_HOSTS=localhost,127.0.0.1,uplink.sensational.systems
# Database (for Docker)
DB_NAME=uplink
DB_USER=uplink
DB_PASSWORD=secure-password-here
DB_ROOT_PASSWORD=secure-root-password-here
DATABASE_HOST=db
DATABASE_PORT=3306
# Redis
REDIS_URL=redis://redis:6379/0
# Sentry
SENTRY_DSN=https://your-sentry-dsn-here
# Email (Mailgun)
MAILGUN_URL=https://api.mailgun.net/v3/your-domain
MAILGUN_API_KEY=your-mailgun-api-key
MAILGUN_REPLY_TO_EMAIL=noreply@yourdomain.com
# External APIs
FEDEX_PROD_CLIENT_ID=
FEDEX_PROD_CLIENT_SECRET=
FEDEX_PROD_ACCOUNT_NO=
PRESTASHOP_BASE_URL=
PRESTASHOP_API_KEY=
1.8 Create Virtual Environment with uv (Local Development)¶
# Create virtual environment
uv venv
# Activate virtual environment
source .venv/bin/activate
# Install dependencies
uv pip install -e ".[dev]"
1.9 Update Documentation¶
- Update README with uv installation instructions
- Update DEPLOYMENT.md with new setup commands
- Create .python-version file for version tracking
1.10 Test Thoroughly¶
# Run migrations
python manage.py migrate
# Run tests
python manage.py test
# Start development server
python manage.py runserver
# Verify all functionality works
1.11 Update .gitignore¶
# Ensure uv artifacts are ignored
.venv/
Rollback Plan¶
- Keep Pipfile and Pipfile.lock in repository
- If issues arise, run pipenv install to revert
- Document any incompatibilities found
Completion Checklist¶
- [ ] All dependencies install successfully with uv
- [ ] Application runs without errors
- [ ] All tests pass
- [ ] Development team can use uv successfully
- [ ] Documentation updated
Phase 2: Python & Django Upgrades¶
Goal: Upgrade to Python 3.12.3 and Django 5.2.12 LTS
Timeline: 3-4 weeks
Can be done: Anytime, but requires extensive testing
Current State (March 2026):
- Local dev: Python 3.10.12, Django 3.2.25
- Production: Python 3.12.3, Django 3.2.x
- Docker: Python 3.12.3, Django 3.2.25 (after Phase 1 updates)
Target State:
- All environments: Python 3.12.3, Django 5.2.12 LTS
Note: Python 3.12.3 is the version already running on production, making it the ideal standardization target; newer Python releases can be evaluated once all environments are aligned.
Phase 2A: Python 3.12.3 Alignment (All Environments)¶
Goal: Standardize all environments on Python 3.12.3 (already on production)
2A.1 Update Configuration Files (DONE)¶
Already updated:
- ✅ .python-version → 3.12.3
- ✅ pyproject.toml → requires-python = ">=3.12"
- ✅ Dockerfile → FROM python:3.12-slim (uses latest 3.12.x, e.g., 3.12.3+)
2A.2 Install Python 3.12.3 on Local Development Machine¶
# On Ubuntu/Debian
sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.12 python3.12-venv python3.12-dev
# Verify installation
python3.12 --version # Should show: Python 3.12.x (deadsnakes installs the latest patch release)
# Set as default (optional)
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.12 1
2A.3 Recreate Virtual Environment with Python 3.12.3¶
# Remove old virtual environment
rm -rf .venv
# Create new venv with Python 3.12.3
uv venv --python 3.12.3
# Activate
source .venv/bin/activate
# Install current dependencies (still Django 3.2)
uv pip install -e ".[dev]"
2A.4 Test Python 3.12.3 with Django 3.2¶
# Run migrations
python manage.py migrate
# Check for deprecation warnings
python -Wd manage.py check
# Run full test suite
python manage.py test
# Test development server
python manage.py runserver
# Test Docker build
docker compose -f docker-compose.yml -f docker-compose.dev.yml build
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d
docker compose -f docker-compose.yml -f docker-compose.dev.yml exec web python manage.py check
2A.5 Fix Python 3.12.3 Compatibility Issues¶
Common issues upgrading from 3.9/3.10 to 3.12:
1. Removed distutils module
# Old (Python ≤3.11)
from distutils.version import StrictVersion
# New (Python 3.12+)
from packaging.version import Version
# Add to pyproject.toml if not present
dependencies = [
# ... existing deps
"packaging",
]
2. Updated deprecation warnings
# Check for warnings
python -Wd manage.py test 2>&1 | grep DeprecationWarning
# Common patterns to fix:
# - datetime.utcnow() → datetime.now(timezone.utc)
# - asyncore/asynchat removed → use asyncio
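The datetime.utcnow() replacement mentioned above is a one-line change and is worth verifying directly; the new form returns a timezone-aware value, which plays well with timezone-aware Django code:

```python
from datetime import datetime, timezone

# Deprecated in Python 3.12 (emits DeprecationWarning, and returns a naive datetime):
# naive = datetime.utcnow()

# Replacement: an aware UTC timestamp
aware = datetime.now(timezone.utc)
assert aware.tzinfo is timezone.utc  # aware, unlike utcnow()'s result
```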
3. Type hinting improvements
# Old (deprecated style; still works in 3.12 but is discouraged)
from typing import List, Dict
def process(items: List[str]) -> Dict[str, int]:
pass
# New (preferred in 3.12)
def process(items: list[str]) -> dict[str, int]:
pass
Phase 2B: Django 4.2 LTS Upgrade (Bridge Version)¶
2B.1 Update Django Version¶
# Install Django 4.2 LTS in development
uv pip install "django~=4.2"
# Verify the installed version
python -m django --version # Should show: 4.2.x
2B.2 Review Django 4.x Release Notes¶
Required reading:
- Django 3.2 to 4.0 upgrade notes
- Django 4.1 release notes
- Django 4.2 release notes
2B.3 Common Upgrade Tasks¶
# Update settings.py
ALLOWED_HOSTS = ['localhost', '127.0.0.1', 'your-domain.com']
# Update CSRF_TRUSTED_ORIGINS for Django 4.0+
CSRF_TRUSTED_ORIGINS = ['https://your-domain.com']
# Check for removed features
# - django.conf.urls.url() → django.urls.re_path()
# - Update any deprecated imports
2B.4 Update URL Patterns¶
# Old (Django 3.2)
from django.conf.urls import url
urlpatterns = [
url(r'^articles/(?P<year>[0-9]{4})/$', views.year_archive),
]
# New (Django 4.2)
from django.urls import re_path
urlpatterns = [
re_path(r'^articles/(?P<year>[0-9]{4})/$', views.year_archive),
]
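The capture group behaves identically under re_path(); the pattern itself can be sanity-checked with the standard re module before touching the URLconf:

```python
import re

# Same pattern used in the re_path() example above
pattern = re.compile(r'^articles/(?P<year>[0-9]{4})/$')

match = pattern.match('articles/2026/')
assert match is not None and match.group('year') == '2026'
assert pattern.match('articles/26/') is None  # four digits required
```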
2B.5 Test Django 4.2¶
# Run migrations
python manage.py migrate
# Check for migration issues
python manage.py makemigrations --check
# Run tests
python manage.py test
# Test in development
python manage.py runserver
2B.6 Third-Party Library Compatibility & Dependency Conflicts¶
Critical Consideration: Django version upgrades often require coordinated updates of third-party packages to avoid version conflicts.
Python Version Requirements¶
| Django Version | Python Support | Notes |
|---|---|---|
| Django 3.2 LTS | 3.6, 3.7, 3.8, 3.9, 3.10 | Current version |
| Django 4.2 LTS | 3.8, 3.9, 3.10, 3.11, 3.12 | Target intermediate |
| Django 5.0+ | 3.10, 3.11, 3.12 | Final target |
Key Takeaway: Django 4.2 is the bridge version that supports both Python 3.9 (current) and Python 3.12 (target).
Common Third-Party Package Compatibility Issues¶
Based on Django documentation and the Uplink project's dependencies, here are the critical packages to monitor:
1. Database Drivers
| Package | Django 3.2 | Django 4.2 | Django 5.0+ | Notes |
|---|---|---|---|---|
| mysqlclient | ≥1.4.0 | ≥1.4.3 | ≥1.4.3 | Required update |
| psycopg2 | ≥2.8 | ≥2.8 | ≥2.8 | If using PostgreSQL |
| psycopg (v3) | Not supported | ≥3.1.8 | ≥3.1.8 | Django 4.2+ recommended |
Current Action: Uplink uses MySQL. Ensure mysqlclient is ≥1.4.3 before Django 4.2 upgrade.
2. Django REST Framework
| Package | Django 3.2 | Django 4.2 | Django 5.0+ | Notes |
|---|---|---|---|---|
| djangorestframework | ≥3.12 | ≥3.14 | ≥3.14 | Required update |
Known Issue: DRF < 3.14 has compatibility issues with Django 4.2's new features.
3. Channels & Async Support
| Package | Django 3.2 | Django 4.2 | Django 5.0+ | Notes |
|---|---|---|---|---|
| channels | ~=3.0 | ~=4.0 | ~=4.0 | Breaking changes |
| asgiref | ≥3.2.10 | ≥3.6.0 | ≥3.7.0 | Required update |
Breaking Change: Channels 4.0 requires significant code changes for consumer patterns.
4. Password Hashing
| Package | Django 3.2 | Django 4.2 | Django 5.0+ | Notes |
|---|---|---|---|---|
| argon2-cffi | ≥19.1.0 | ≥19.2.0 | ≥19.2.0 | Required update |
5. Other Critical Dependencies
| Package | Django 3.2 | Django 4.2 | Django 5.0+ | Uplink Status |
|---|---|---|---|---|
| Pillow | ≥6.2.0 | ≥6.2.1 | ≥6.2.1 | Used for ImageField |
| sqlparse | ≥0.2.2 | ≥0.3.1 | ≥0.3.1 | SQL formatting |
| redis | ≥3.0.0 | ≥3.4.0 | ≥3.4.0 | Huey backend |
| jinja2 | ≥2.9.2 | ≥2.11.0 | ≥2.11.0 | If using Jinja templates |
Deprecated Packages in Django 5.0+¶
The following packages/features are removed or deprecated in Django 4.2/5.0:
- pytz timezone support - Removed in Django 5.0
  - Action Required: Migrate to zoneinfo (Python 3.9+) or backports.zoneinfo
  - Impact: All timezone-aware datetime handling
- python-memcached - Deprecated in Django 4.2
  - Action Required: Switch to pymemcache or pylibmc
  - Impact: Only if using the MemcachedCache backend
- Legacy hash algorithms (SHA1PasswordHasher, UnsaltedSHA1PasswordHasher, UnsaltedMD5PasswordHasher) - Removed in Django 5.0
  - Action Required: Use PBKDF2 or Argon2
Will uv Resolve These Conflicts?¶
Yes, but with caveats:
What uv WILL do:
- ✅ Fast dependency resolution - uv resolves dependencies 10-100x faster than pip
- ✅ Conflict detection - Immediately identifies incompatible version requirements
- ✅ Lock file integrity - uv.lock ensures reproducible builds
- ✅ Automatic constraint solving - Finds compatible versions across all packages
What uv CANNOT do:
- ❌ Fix breaking API changes - If Channels 4.0 changes consumer syntax, you must update code manually
- ❌ Resolve impossible constraints - If Package A requires Django <4.0 and Package B requires Django >=5.0, no tool can resolve this
- ❌ Auto-migrate deprecated features - Code using pytz must be manually updated to zoneinfo
How uv Helps with Upgrades:
- Pre-upgrade audit: resolve the upgraded dependency set in a throwaway environment so conflicts surface before the main environment changes
- Incremental testing: upgrade one major package at a time and re-resolve, so any breakage is attributable to a single change
- Lock file strategy: keep a lock file per upgrade branch for reproducible builds and quick rollback
Pre-Upgrade Dependency Audit¶
Before upgrading to Django 4.2, run this audit:
# 1. Check current versions
uv pip list | grep -E "django|mysqlclient|channels|asgiref|argon2|rest_framework"
# 2. Check for deprecated packages
python manage.py check --deploy
# 3. Test Django 4.2 compatibility in isolated environment
uv venv --python 3.10 .venv-test
source .venv-test/bin/activate
uv pip install django~=4.2
uv pip install -e ".[dev]"
# Review any conflicts reported by uv
# 4. Check for pytz usage (must remove before Django 5.0)
grep -r "import pytz" --include="*.py" .
grep -r "from pytz" --include="*.py" .
# 5. Check for deprecated password hashers
grep -r "SHA1PasswordHasher\|UnsaltedMD5PasswordHasher" --include="*.py" .
Recommended Upgrade Path (UPDATED for 3.12.3 → Django 5.2.12)¶
Phase 2A: Python 3.12.3 Alignment
- ✅ Production already on Python 3.12.3
- ✅ Low risk: Python 3.12 + Django 3.2 is fully compatible
- ✅ Standardizes all environments
- ✅ Easy rollback
Phase 2B: Python 3.12.3 + Django 4.2 LTS (Bridge Version)
- ⚠️ Moderate risk for conflicts
- Update these packages before upgrading Django:
1. mysqlclient to ≥1.4.3
2. djangorestframework to ≥3.14
3. asgiref to ≥3.6.0
4. channels to ~=4.0 (test thoroughly, breaking changes)
5. sqlparse to ≥0.3.1
6. redis to ≥3.4.0
- Use uv pip compile to generate compatible lock file
- Must be stable in production for 2+ weeks before Phase 2C
Phase 2C: Python 3.12.3 + Django 5.2.12 LTS (Final Target)
- ⚠️ Critical: Remove pytz before this phase
- ⚠️ Critical: Update password hashers
- ⚠️ Test Channels 4.x thoroughly with Django 5.2
- Update packages:
1. asgiref to ≥3.7.0
2. channels - verify Django 5.2 compatibility
3. All Django plugins must support Django 5.2
- Django 5.2 is LTS (supported until April 2028)
- Use uv's lock file for reproducible builds
Action Items¶
Before Phase 2B:
- [ ] Audit current package versions against Django 4.2 requirements
- [ ] Identify packages that need updates
- [ ] Test Channels 4.0 upgrade in isolation (major breaking changes)
- [ ] Create uv lock file for Django 4.2 environment
- [ ] Document any custom code that uses deprecated features
Before Phase 2C:
- [ ] Replace all pytz usage with zoneinfo
- [ ] Update password hashers configuration
- [ ] Verify all packages support Python 3.12 + Django 5.2
- [ ] Create uv lock file for Django 5.2.12 + Python 3.12.3
- [ ] Test Channels 4.x with Django 5.2 in development
Continuous Monitoring:
- [ ] Check Django release notes for third-party compatibility updates
- [ ] Monitor package changelogs on PyPI
- [ ] Test lock file resolution weekly during upgrade phases
Phase 2C: Django 5.2.12 LTS Upgrade (Final Target)¶
Goal: Upgrade to Django 5.2.12 LTS with Python 3.12.3
Prerequisites: Phase 2B stable in production for 2+ weeks
Timeline: 2-3 weeks
Django 5.2 LTS Support: Until April 2028 (3 years from its April 2025 release)
2C.1 Pre-Upgrade Checklist¶
Critical Removals (Django 5.0+):
# 1. Check for pytz usage (REMOVED in Django 5.0)
grep -r "import pytz" --include="*.py" .
grep -r "from pytz" --include="*.py" .
# 2. Check for legacy password hashers (REMOVED in Django 5.0)
grep -r "SHA1PasswordHasher\|UnsaltedMD5PasswordHasher" --include="*.py" .
# 3. Check for old URL patterns (should be done in Phase 2B)
grep -r "django.conf.urls.url" --include="*.py" .
Replace pytz with zoneinfo:
# Old (Django ≤4.2)
import pytz
from django.utils import timezone
tz = pytz.timezone('America/New_York')
dt = timezone.now().astimezone(tz)
# New (Django 5.0+)
from zoneinfo import ZoneInfo
from django.utils import timezone
tz = ZoneInfo('America/New_York')
dt = timezone.now().astimezone(tz)
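A quick runnable check that the zoneinfo replacement preserves behaviour (assumes system tzdata is available, as on standard Linux images):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

utc_now = datetime.now(timezone.utc)
ny_now = utc_now.astimezone(ZoneInfo("America/New_York"))

# Converting between zones changes the UTC offset, not the instant in time
assert ny_now == utc_now
assert ny_now.utcoffset() != utc_now.utcoffset()
```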
2C.2 Update Django to 5.2.12¶
# Install Django 5.2.12 in development
uv pip install "django~=5.2"
# Check what version was installed
python -m django --version # Should show: 5.2.x
# Verify all dependencies resolve
uv pip install -e ".[dev]"
2C.3 Review Django 5.x Release Notes¶
Required reading:
- Django 4.2 to 5.0 upgrade guide
- Django 5.0 to 5.1 upgrade guide
- Django 5.1 to 5.2 upgrade guide
- Django 5.2 release notes
2C.4 Major Changes in Django 5.0+¶
1. Database Backend Requirements
# settings.py - Update if using custom database settings
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
'OPTIONS': {
'charset': 'utf8mb4',
'use_unicode': True,
'init_command': "SET sql_mode='STRICT_TRANS_TABLES'",
# Django 5.0+ recommended settings
'isolation_level': 'read committed',
}
}
}
2. Model Field Changes
# Django 5.0+ uses db_default for database-level defaults
from django.db import models
from django.db.models.functions import Now
class MyModel(models.Model):
    # Old (Django ≤4.2):
    #     created = models.DateTimeField(auto_now_add=True)
    # New (Django 5.0+) - explicit database-level default:
    created = models.DateTimeField(db_default=Now())
3. ASGI/Channels Changes
# Check asgi.py for Django 5.x compatibility
# Django 5.0+ ASGI application
import os
from django.core.asgi import get_asgi_application
from channels.routing import ProtocolTypeRouter, URLRouter
from channels.auth import AuthMiddlewareStack
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'uplink.settings')
django_asgi_app = get_asgi_application()
# Ensure correct import paths
from devices import routing
application = ProtocolTypeRouter({
"http": django_asgi_app,
"websocket": AuthMiddlewareStack(
URLRouter(routing.websocket_urlpatterns)
),
})
4. Admin Site Changes
# Django 5.0+ admin improvements
from django.contrib import admin

class MyModelAdmin(admin.ModelAdmin):
    # list_display_links=None removes the change-list edit links
    list_display_links = None
    # New in Django 5.0: show_facets enables filter facet counts
    show_facets = admin.ShowFacets.ALWAYS
2C.5 Update Third-Party Packages for Django 5.2¶
# Check current versions
uv pip list | grep -E "django-|channels|drf"
# Expected updates needed:
# - djangorestframework >= 3.14 (already done in Phase 2B)
# - channels >= 4.0 (already done in Phase 2B)
# - django-bootstrap-v5 - check compatibility
# - django-money - check compatibility
# - django-extensions - check compatibility
# - django-debug-toolbar >= 4.0
# - django-import-export >= 3.0
# - django-admin-sortable2 - check compatibility
# - django-filter >= 22.1
# - django-storages >= 1.14
# Test installation
uv pip install -e ".[dev]"
2C.6 Run Django System Checks¶
# Check for issues
python manage.py check
# Check for deployment issues
python manage.py check --deploy
# Check for migration issues
python manage.py makemigrations --check --dry-run
# Look for deprecation warnings
python -Wd manage.py check
2C.7 Test Migrations¶
# Create a database backup first!
# Then test migrations
# Show migration plan
python manage.py showmigrations
# Run migrations
python manage.py migrate
# Check for any issues
python manage.py migrate --plan
2C.8 Update settings.py for Django 5.2¶
# In settings.py - Django 5.2 specific settings
# Forms rendering (Django 5.0+)
FORM_RENDERER = 'django.forms.renderers.DjangoTemplates'
# Security settings (enhanced in Django 5.x)
if not DEBUG:
SECURE_CONTENT_TYPE_NOSNIFF = True
SECURE_BROWSER_XSS_FILTER = True
X_FRAME_OPTIONS = 'DENY'
# Django 5.0+ HTTPS settings
SECURE_SSL_REDIRECT = True
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
# Django 5.0+ HSTS settings
SECURE_HSTS_SECONDS = 31536000 # 1 year
SECURE_HSTS_INCLUDE_SUBDOMAINS = True
SECURE_HSTS_PRELOAD = True
# Persistent database connections (predates Django 5.0, but recommended)
CONN_MAX_AGE = 600 # 10 minutes
# Logging improvements (Django 5.0+)
LOGGING = {
'version': 1,
'disable_existing_loggers': False,
'formatters': {
'verbose': {
'format': '{levelname} {asctime} {module} {message}',
'style': '{',
},
},
# ... rest of logging config
}
2C.9 Test Thoroughly¶
# Run full test suite
python manage.py test
# Test specific apps
python manage.py test catalogue
python manage.py test orders
python manage.py test stock
# Test in Docker
docker compose -f docker-compose.yml -f docker-compose.dev.yml build
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d
docker compose -f docker-compose.yml -f docker-compose.dev.yml exec web python manage.py migrate
docker compose -f docker-compose.yml -f docker-compose.dev.yml exec web python manage.py test
# Test admin interface
python manage.py runserver
# Navigate to /admin and test all models
# Test API endpoints
# Test WebSocket connections (devices app)
# Test background tasks (Huey)
2C.10 Update Documentation¶
# Update README.md with new Django version
# Update requirements or dependencies documentation
# Update deployment guides
# Update API documentation if needed
Rollback Plan (Phase 2)¶
Git Branches:
- main - Current stable (Django 3.2.25, Python 3.9/3.10/3.12.3 mixed)
- upgrade-python-3.12 - Phase 2A (Python 3.12.3, Django 3.2.25)
- upgrade-django-4.2 - Phase 2B (Python 3.12.3, Django 4.2.x)
- upgrade-django-5.2 - Phase 2C (Python 3.12.3, Django 5.2.12)
Rollback Procedure:
From Phase 2C (Django 5.2) back to Phase 2B (Django 4.2):
# 1. Stop all services
docker compose -f docker-compose.yml -f docker-compose.prod.yml down
# 2. Checkout previous branch
git checkout upgrade-django-4.2
# 3. Restore database backup
mysql -u uplink -p uplink < backup-before-django-52.sql
# 4. Rebuild containers
docker compose -f docker-compose.yml -f docker-compose.prod.yml build
# 5. Start services
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
# 6. Verify
docker compose -f docker-compose.yml -f docker-compose.prod.yml exec web python -m django --version
From Phase 2B (Django 4.2) back to Phase 2A (Django 3.2):
# Same procedure as above, but:
git checkout upgrade-python-3.12
mysql -u uplink -p uplink < backup-before-django-42.sql
Database Backup Schedule:
- Before each phase upgrade, create a timestamped backup
- Keep backups for 30 days minimum
- Test the restore procedure before upgrading
Completion Checklist (Phase 2)¶
Phase 2A (Python 3.12.3):
- [ ] All tests pass on Python 3.12.3 locally
- [ ] Docker builds successfully with Python 3.12.3
- [ ] Application runs without errors in development
- [ ] No Python 3.12 deprecation warnings
- [ ] Production deployment successful
- [ ] Stable in production for 1+ weeks
Phase 2B (Django 4.2 LTS):
- [ ] All third-party packages updated to Django 4.2 compatible versions
- [ ] All tests pass on Django 4.2 LTS locally
- [ ] Migrations run successfully
- [ ] No Django 4.2 deprecation warnings
- [ ] Admin interface fully functional
- [ ] API endpoints working correctly
- [ ] WebSockets/Channels working correctly
- [ ] Background tasks (Huey) working correctly
- [ ] Docker builds and runs successfully
- [ ] Production deployment successful
- [ ] Stable in production for 2+ weeks before Phase 2C
Phase 2C (Django 5.2.12 LTS):
- [ ] All pytz usage replaced with zoneinfo
- [ ] All deprecated features from Django 4.2 updated
- [ ] All third-party packages support Django 5.2
- [ ] All tests pass on Django 5.2.12 locally
- [ ] Migrations run successfully
- [ ] No Django 5.2 deprecation warnings
- [ ] Admin interface fully functional with Django 5.2 features
- [ ] API endpoints working correctly
- [ ] WebSockets/Channels working with Django 5.2
- [ ] Background tasks (Huey) working correctly
- [ ] Static files serving correctly
- [ ] Media files uploading/serving correctly
- [ ] Docker builds and runs successfully
- [ ] Production deployment successful
- [ ] Performance testing shows no regressions
- [ ] Stable in production for 2+ weeks before next phase
Phase 2.5: Security Hardening¶
Goal: Implement production-grade security settings and practices
Timeline: 1 week
Can be done: Anytime during normal hours
Prerequisites: Phase 2C complete (Django 5.x)
Why This Phase is Critical¶
Before containerization and production deployment, we must secure the application against common vulnerabilities. Django provides extensive security features, but they must be properly configured.
2.5.1 Move SECRET_KEY to Environment Variables¶
Current Issue: SECRET_KEY is hardcoded in settings.py
# In settings.py - BEFORE (INSECURE)
SECRET_KEY = "django-insecure-rg_u_!f9&!x2m_pe-%o$ih_x-seh@q#0j)wam3!j(quitkb$6="
# In settings.py - AFTER (SECURE)
SECRET_KEY = env("SECRET_KEY")
# Fail fast if SECRET_KEY is not set
if not SECRET_KEY:
raise ValueError("SECRET_KEY environment variable must be set")
# Generate a new SECRET_KEY
python -c 'from django.core.management.utils import get_random_secret_key; print(get_random_secret_key())'
# Add to .env file
SECRET_KEY=your-new-generated-secret-key-here
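If Django is not importable on the machine where the key is generated, the standard library's secrets module can produce an equivalent 50-character key (a sketch that mirrors, but is not identical to, Django's get_random_secret_key):

```python
import secrets
import string

def make_secret_key(length: int = 50) -> str:
    """Generate a random key from the same character classes Django uses."""
    chars = string.ascii_lowercase + string.digits + "!@#$%^&*(-_=+)"
    return "".join(secrets.choice(chars) for _ in range(length))

key = make_secret_key()
assert len(key) == 50
```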
2.5.2 Configure HTTPS/SSL Security Settings¶
# In settings.py - Add production security settings
# Only enforce HTTPS in production
if not DEBUG:
# Redirect all HTTP traffic to HTTPS
SECURE_SSL_REDIRECT = True
# Use secure cookies
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
# HSTS (HTTP Strict Transport Security)
# Start with 1 hour, gradually increase to 1 year
SECURE_HSTS_SECONDS = 3600 # 1 hour initially
SECURE_HSTS_INCLUDE_SUBDOMAINS = True
SECURE_HSTS_PRELOAD = True
# Content security
SECURE_CONTENT_TYPE_NOSNIFF = True
# Note: SECURE_BROWSER_XSS_FILTER was removed in Django 4.0; do not set it
X_FRAME_OPTIONS = 'DENY'
# Proxy headers (for nginx)
SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')
2.5.3 Update ALLOWED_HOSTS and CSRF Configuration¶
# In settings.py
# Strict ALLOWED_HOSTS in production
if DEBUG:
ALLOWED_HOSTS = ['*'] # Development only
else:
ALLOWED_HOSTS = [
'uplink.sensational.systems',
]
# CSRF_TRUSTED_ORIGINS already correct
CSRF_TRUSTED_ORIGINS = [
'https://uplink.sensational.systems',
]
# Additional CSRF settings for production
if not DEBUG:
CSRF_COOKIE_HTTPONLY = True
CSRF_COOKIE_SAMESITE = 'Strict'
SESSION_COOKIE_HTTPONLY = True
SESSION_COOKIE_SAMESITE = 'Strict'
2.5.4 Database Connection Security¶
# In settings.py - Add to DATABASES configuration
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
'NAME': env('DATABASE_NAME'),
'USER': env('DATABASE_USER'),
'PASSWORD': env('DATABASE_PASS'),
'HOST': env('DATABASE_HOST'),
'PORT': env('DATABASE_PORT'),
'OPTIONS': {
'init_command': "SET sql_mode='STRICT_TRANS_TABLES'",
'charset': 'utf8mb4',
# Timeouts to avoid hung connections in production
'connect_timeout': 10,
'read_timeout': 30,
'write_timeout': 30,
},
'CONN_MAX_AGE': 600 if not DEBUG else 0, # Persistent connections
}
}
2.5.5 Configure Content Security Policy (CSP)¶
# In settings.py
# django-csp provides the middleware below (install with: uv pip install django-csp).
# Note: django-csp 4.x replaces the CSP_* settings with a single
# CONTENT_SECURITY_POLICY dict, so pin django-csp<4 or adapt these settings.
MIDDLEWARE = [
'django.middleware.security.SecurityMiddleware',
'csp.middleware.CSPMiddleware', # Add CSP middleware
# ... other middleware
]
# Content Security Policy
if not DEBUG:
CSP_DEFAULT_SRC = ("'self'",)
CSP_SCRIPT_SRC = ("'self'", "'unsafe-inline'") # Adjust as needed
CSP_STYLE_SRC = ("'self'", "'unsafe-inline'") # Adjust as needed
CSP_IMG_SRC = ("'self'", 'data:', 'https:')
CSP_FONT_SRC = ("'self'", 'data:')
CSP_CONNECT_SRC = ("'self'",)
CSP_FRAME_ANCESTORS = ("'none'",)
2.5.6 Password Validation Enhancement¶
# In settings.py - Strengthen password validators
AUTH_PASSWORD_VALIDATORS = [
{
'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
'OPTIONS': {
'min_length': 12, # Increased from default 8
}
},
{
'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
},
]
# Use Argon2 for password hashing (most secure)
PASSWORD_HASHERS = [
'django.contrib.auth.hashers.Argon2PasswordHasher',
'django.contrib.auth.hashers.PBKDF2PasswordHasher',
'django.contrib.auth.hashers.PBKDF2SHA1PasswordHasher',
'django.contrib.auth.hashers.BCryptSHA256PasswordHasher',
]
2.5.7 API Security (Django REST Framework)¶
# In settings.py - Secure DRF settings
REST_FRAMEWORK = {
'DEFAULT_AUTHENTICATION_CLASSES': [
'rest_framework.authentication.TokenAuthentication',
'rest_framework.authentication.SessionAuthentication',
],
'DEFAULT_PERMISSION_CLASSES': [
'rest_framework.permissions.IsAuthenticated',
],
'DEFAULT_THROTTLE_CLASSES': [
'rest_framework.throttling.AnonRateThrottle',
'rest_framework.throttling.UserRateThrottle',
],
'DEFAULT_THROTTLE_RATES': {
'anon': '100/hour',
'user': '1000/hour',
},
# Disable the browsable API in production
'DEFAULT_RENDERER_CLASSES': [
'rest_framework.renderers.JSONRenderer',
] if not DEBUG else [
'rest_framework.renderers.JSONRenderer',
'rest_framework.renderers.BrowsableAPIRenderer',
],
}
2.5.8 Logging Configuration¶
# In settings.py - Enhanced logging
LOGGING = {
'version': 1,
'disable_existing_loggers': False,
'formatters': {
'verbose': {
'format': '{levelname} {asctime} {module} {process:d} {thread:d} {message}',
'style': '{',
},
'simple': {
'format': '{levelname} {message}',
'style': '{',
},
},
'filters': {
'require_debug_false': {
'()': 'django.utils.log.RequireDebugFalse',
},
'require_debug_true': {
'()': 'django.utils.log.RequireDebugTrue',
},
},
'handlers': {
'console': {
'level': 'INFO',
'class': 'logging.StreamHandler',
'formatter': 'simple',
},
'file': {
'level': 'WARNING',
'class': 'logging.handlers.RotatingFileHandler',
'filename': '/var/log/uplink/django.log',
'maxBytes': 1024 * 1024 * 10, # 10MB
'backupCount': 5,
'formatter': 'verbose',
},
'security_file': {
'level': 'WARNING',
'class': 'logging.handlers.RotatingFileHandler',
'filename': '/var/log/uplink/security.log',
'maxBytes': 1024 * 1024 * 10, # 10MB
'backupCount': 5,
'formatter': 'verbose',
},
},
'loggers': {
'django': {
'handlers': ['console', 'file'],
'level': 'INFO',
'propagate': False,
},
'django.security': {
'handlers': ['security_file'],
'level': 'WARNING',
'propagate': False,
},
'uplink': {
'handlers': ['console', 'file'],
'level': 'DEBUG' if DEBUG else 'INFO',
'propagate': False,
},
},
}
2.5.9 Security Checklist & Testing¶
# Run Django security check
python manage.py check --deploy
# Expected output: System check identified no issues (0 silenced).
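To make this check gate a CI pipeline, the summary line can be parsed and turned into a pass/fail result. A sketch, assuming Django's usual `System check identified ...` summary format:

```python
import re

def deploy_check_passed(output: str) -> bool:
    """Return True when `manage.py check --deploy` reports zero issues."""
    match = re.search(r"System check identified (no issues|(\d+) issues?)", output)
    if not match:
        return False  # unexpected output: treat as failure
    return match.group(1) == "no issues"
```

Wired into CI, the deploy step would run the command via subprocess and fail the build when this returns False.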
2.5.10 Environment-Specific Settings Template¶
Create .env.example for documentation:
# .env.example - Template for environment variables
# Django Core
DEBUG=False
SECRET_KEY=change-me-to-a-random-50-character-string
# Database
DATABASE_HOST=localhost
DATABASE_PORT=3306
DATABASE_NAME=uplink
DATABASE_USER=uplink
DATABASE_PASS=secure-password-here
# Sentry (Error Tracking)
SENTRY_DSN=https://your-sentry-dsn-here
# Redis (Huey & Channels)
REDIS_URL=redis://localhost:6379/0
# Email (Mailgun)
MAILGUN_URL=https://api.mailgun.net/v3/your-domain
MAILGUN_API_KEY=your-mailgun-api-key
MAILGUN_REPLY_TO_EMAIL=noreply@yourdomain.com
# External APIs
FEDEX_PROD_CLIENT_ID=
FEDEX_PROD_CLIENT_SECRET=
FEDEX_PROD_ACCOUNT_NO=
PRESTASHOP_BASE_URL=
PRESTASHOP_API_KEY=
Rollback Plan¶
- Security settings are mostly additive
- Can disable the HTTPS redirect if issues arise: SECURE_SSL_REDIRECT = False
- Keep old settings.py in git history
- Test thoroughly in development before production
Completion Checklist¶
- [ ] python manage.py check --deploy passes with no warnings
- [ ] SECRET_KEY moved to environment variable
- [ ] HTTPS redirect working (test with curl)
- [ ] Secure cookies enabled in production
- [ ] CSP headers present (check with browser dev tools)
- [ ] Logging to file working
- [ ] No security vulnerabilities in dependency scan
- [ ] API rate limiting functional
- [ ] All tests still pass
Phase 3: Containerization (Docker)¶
Goal: Dockerize the application for easier deployment and development
Timeline: 2-3 weeks
Can be done: Anytime during normal hours (development setup)
3.1 Create Dockerfile¶
# Dockerfile
FROM python:3.12-slim
# Set environment variables
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1
# Install system dependencies
RUN apt-get update && apt-get install -y \
gcc \
default-libmysqlclient-dev \
pkg-config \
&& rm -rf /var/lib/apt/lists/*
# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
# Set working directory
WORKDIR /app
# Copy dependency files
COPY pyproject.toml ./
# Install Python dependencies
RUN uv pip install --system -e ".[dev]"
# Copy application code
COPY . .
# Collect static files (settings must be importable at build time; if they
# fail fast on a missing SECRET_KEY, pass a dummy value via a build ARG)
RUN python manage.py collectstatic --noinput
# Expose port
EXPOSE 8000
# Run migrations and start server
CMD ["sh", "-c", "python manage.py migrate && python manage.py runserver 0.0.0.0:8000"]
3.2 Create docker-compose.yml¶
version: '3.8'
services:
db:
image: mysql:8.0
environment:
MYSQL_DATABASE: uplink
MYSQL_USER: uplink
MYSQL_PASSWORD: ${DB_PASSWORD}
MYSQL_ROOT_PASSWORD: ${DB_ROOT_PASSWORD}
volumes:
- mysql_data:/var/lib/mysql
ports:
- "3306:3306"
redis:
image: redis:7-alpine
ports:
- "6379:6379"
web:
build: .
command: python manage.py runserver 0.0.0.0:8000
volumes:
- .:/app
- static_volume:/app/static
- media_volume:/app/media
ports:
- "8000:8000"
env_file:
- .env
depends_on:
- db
- redis
huey:
build: .
command: python manage.py run_huey
volumes:
- .:/app
env_file:
- .env
depends_on:
- db
- redis
volumes:
mysql_data:
static_volume:
media_volume:
3.3 Create .dockerignore¶
.git
.gitignore
__pycache__
*.pyc
*.pyo
*.pyd
.Python
.venv
venv/
ENV/
.env.local
db.sqlite3
*.log
.DS_Store
node_modules/
3.4 Development Workflow¶
# Build containers
docker-compose build
# Start services
docker-compose up -d
# Run migrations
docker-compose exec web python manage.py migrate
# Create superuser
docker-compose exec web python manage.py createsuperuser
# View logs
docker-compose logs -f web
# Stop services
docker-compose down
3.5 Production Dockerfile¶
# Dockerfile.prod
FROM python:3.12-slim AS builder
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1
RUN apt-get update && apt-get install -y \
gcc \
default-libmysqlclient-dev \
pkg-config \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
COPY pyproject.toml ./
RUN uv pip install --system -e .
# Final stage
FROM python:3.12-slim
RUN apt-get update && apt-get install -y \
default-libmysqlclient-dev \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Copy dependencies from builder
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
# Copy application
COPY . .
# Create non-root user
RUN useradd -m -u 1000 uplink && chown -R uplink:uplink /app
USER uplink
RUN python manage.py collectstatic --noinput
EXPOSE 8000
CMD ["gunicorn", "uplink.wsgi:application", "--bind", "0.0.0.0:8000", "--workers", "4"]
Rollback Plan¶
- Docker is development-only initially
- Production still uses traditional deployment
- Easy to remove Docker files if issues arise
Completion Checklist¶
- [ ] Docker containers build successfully
- [ ] Application runs in Docker container
- [ ] Database connections work in Docker
- [ ] All developers can use Docker for development
- [ ] Documentation updated with Docker instructions
Phase 3.5: Web Server & Reverse Proxy (nginx)¶
Goal: Set up nginx as a reverse proxy for gunicorn and handle static files, SSL, and WebSockets
Timeline: 1 week
Can be done: During normal hours (development), weekend for production cutover
Prerequisites: Phase 3 (Docker) complete
Why nginx is Essential¶
Gunicorn is a Python WSGI server, not a web server. For production, you need nginx to:
- Serve static files efficiently (CSS, JS, images)
- Handle SSL/TLS termination (HTTPS)
- Act as reverse proxy to gunicorn
- Buffer slow clients (prevent tying up gunicorn workers)
- Upgrade WebSocket connections for Channels/Daphne
- Apply rate limiting and basic DDoS protection
- Load balance across multiple gunicorn instances
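One consequence of terminating TLS at nginx: Django sees plain HTTP on the socket and must recover the original scheme from the `X-Forwarded-Proto` header nginx sets (this is exactly what `SECURE_PROXY_SSL_HEADER` from Phase 2.5 configures). A minimal sketch of that decision logic, with the trusted-proxy flag as our assumption:

```python
def request_scheme(headers: dict, from_trusted_proxy: bool) -> str:
    """Determine the original scheme behind a TLS-terminating proxy.

    Mirrors Django's SECURE_PROXY_SSL_HEADER behaviour: trust
    X-Forwarded-Proto only when the request came through our proxy.
    """
    if from_trusted_proxy and headers.get("X-Forwarded-Proto") == "https":
        return "https"
    return "http"
```

The trusted-proxy condition matters: honouring the header from arbitrary clients would let anyone spoof an HTTPS request.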
3.5.1 Install nginx¶
# On Ubuntu 24.04
sudo apt update
sudo apt install nginx
# Start and enable nginx
sudo systemctl start nginx
sudo systemctl enable nginx
# Check status
sudo systemctl status nginx
3.5.2 Configure nginx for Uplink¶
Create /etc/nginx/sites-available/uplink:
# Upstream for gunicorn (WSGI)
upstream uplink_wsgi {
server unix:/opt/uplink/gunicorn.sock fail_timeout=0;
}
# Upstream for daphne (ASGI - WebSockets)
upstream uplink_asgi {
server unix:/opt/uplink/daphne.sock fail_timeout=0;
}
# Redirect HTTP to HTTPS
server {
listen 80;
listen [::]:80;
server_name uplink.sensational.systems;
# Let's Encrypt challenge
location /.well-known/acme-challenge/ {
root /var/www/certbot;
}
# Redirect everything else to HTTPS
location / {
return 301 https://$host$request_uri;
}
}
# HTTPS server
server {
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name uplink.sensational.systems;
# SSL certificates (Let's Encrypt)
ssl_certificate /etc/letsencrypt/live/uplink.sensational.systems/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/uplink.sensational.systems/privkey.pem;
ssl_trusted_certificate /etc/letsencrypt/live/uplink.sensational.systems/chain.pem;
# SSL configuration (Mozilla Intermediate)
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers off;
ssl_session_timeout 1d;
ssl_session_cache shared:SSL:10m;
ssl_session_tickets off;
# OCSP stapling
ssl_stapling on;
ssl_stapling_verify on;
resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 5s;
# Security headers
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload" always;
add_header X-Frame-Options "DENY" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
# Max upload size (adjust as needed)
client_max_body_size 100M;
# Logging
access_log /var/log/nginx/uplink_access.log;
error_log /var/log/nginx/uplink_error.log;
# Static files
location /static/ {
alias /opt/uplink/static/;
expires 30d;
add_header Cache-Control "public, immutable";
}
# Media files
location /media/ {
alias /opt/uplink/media/;
expires 7d;
add_header Cache-Control "public";
}
# Favicon
location = /favicon.ico {
alias /opt/uplink/static/favicon.ico;
access_log off;
log_not_found off;
}
# Robots.txt
location = /robots.txt {
alias /opt/uplink/static/robots.txt;
access_log off;
log_not_found off;
}
# WebSocket connections (for Django Channels)
location /ws/ {
proxy_pass http://uplink_asgi;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_redirect off;
proxy_buffering off;
}
# All other requests to gunicorn
location / {
proxy_pass http://uplink_wsgi;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_redirect off;
# Timeouts
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
}
}
3.5.3 Enable nginx Configuration¶
# Create symbolic link to enable site
sudo ln -s /etc/nginx/sites-available/uplink /etc/nginx/sites-enabled/
# Remove default site
sudo rm /etc/nginx/sites-enabled/default
# Test nginx configuration
sudo nginx -t
# Reload nginx
sudo systemctl reload nginx
3.5.4 Configure SSL with Let's Encrypt¶
# Install certbot
sudo apt install certbot python3-certbot-nginx
# Obtain SSL certificate
sudo certbot --nginx -d uplink.sensational.systems
# Certbot will automatically:
# 1. Obtain certificates
# 2. Update nginx configuration
# 3. Set up auto-renewal
# Test auto-renewal
sudo certbot renew --dry-run
# Auto-renewal is handled by systemd timer
sudo systemctl status certbot.timer
3.5.5 Update Gunicorn Configuration¶
Create /etc/systemd/system/uplink-web.service (update from Phase 4):
[Unit]
Description=Uplink Gunicorn WSGI Server
After=network.target mysql.service redis.service
Requires=mysql.service redis.service
[Service]
Type=notify
User=uplink
Group=www-data
WorkingDirectory=/opt/uplink
Environment="PATH=/opt/uplink/.venv/bin"
# Gunicorn using Unix socket
ExecStart=/opt/uplink/.venv/bin/gunicorn uplink.wsgi:application \
--bind unix:/opt/uplink/gunicorn.sock \
--workers 4 \
--worker-class sync \
--timeout 120 \
--access-logfile /var/log/uplink/gunicorn_access.log \
--error-logfile /var/log/uplink/gunicorn_error.log \
--log-level info
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
3.5.6 Configure Daphne for WebSockets (Channels)¶
Create /etc/systemd/system/uplink-daphne.service:
[Unit]
Description=Uplink Daphne ASGI Server (WebSockets)
After=network.target redis.service
Requires=redis.service
[Service]
Type=simple
User=uplink
Group=www-data
WorkingDirectory=/opt/uplink
Environment="PATH=/opt/uplink/.venv/bin"
# Daphne using Unix socket
ExecStart=/opt/uplink/.venv/bin/daphne \
-u /opt/uplink/daphne.sock \
--access-log /var/log/uplink/daphne_access.log \
uplink.asgi:application
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
3.5.7 Set Up ASGI Configuration¶
Create/update uplink/asgi.py:
"""
ASGI config for uplink project.
Exposes the ASGI callable as a module-level variable named ``application``.
"""
import os
from django.core.asgi import get_asgi_application
from channels.routing import ProtocolTypeRouter, URLRouter
from channels.auth import AuthMiddlewareStack
import devices.routing # Your WebSocket routing
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'uplink.settings')
django_asgi_app = get_asgi_application()
application = ProtocolTypeRouter({
"http": django_asgi_app,
"websocket": AuthMiddlewareStack(
URLRouter(
devices.routing.websocket_urlpatterns
)
),
})
3.5.8 File Permissions¶
# Create log directory
sudo mkdir -p /var/log/uplink
sudo chown uplink:www-data /var/log/uplink
# Set permissions for socket files
sudo mkdir -p /opt/uplink
sudo chown uplink:www-data /opt/uplink
sudo chmod 755 /opt/uplink
# Ensure nginx can access static/media files
sudo chmod -R 755 /opt/uplink/static
sudo chmod -R 755 /opt/uplink/media
3.5.9 Firewall Configuration¶
# Install ufw (Uncomplicated Firewall)
sudo apt install ufw
# Allow SSH (important - don't lock yourself out!)
sudo ufw allow 22/tcp
# Allow HTTP and HTTPS
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
# Enable firewall
sudo ufw enable
# Check status
sudo ufw status verbose
3.5.10 Testing nginx Setup¶
# Start services
sudo systemctl start uplink-web
sudo systemctl start uplink-daphne
sudo systemctl reload nginx
# Test HTTP to HTTPS redirect
curl -I http://uplink.sensational.systems
# Should return 301 redirect to https://
# Test HTTPS
curl -I https://uplink.sensational.systems
# Should return 200 OK
# Test static files
curl -I https://uplink.sensational.systems/static/
# Check SSL configuration
sudo openssl s_client -connect uplink.sensational.systems:443 -servername uplink.sensational.systems
# Test WebSocket (if applicable)
# Use browser dev tools or wscat
npm install -g wscat
wscat -c wss://uplink.sensational.systems/ws/devices/
Rollback Plan¶
- Keep old configuration files
- Can switch back to direct gunicorn access
- nginx configuration can be disabled: sudo rm /etc/nginx/sites-enabled/uplink
- Revert firewall rules if needed
Completion Checklist¶
- [ ] nginx installed and running
- [ ] HTTP automatically redirects to HTTPS
- [ ] SSL certificate valid and auto-renewing
- [ ] Static files served by nginx
- [ ] Application accessible via nginx
- [ ] WebSocket connections working (if using Channels)
- [ ] Security headers present (check with securityheaders.com)
- [ ] Firewall configured correctly
- [ ] Logs accessible and readable
- [ ] Performance equal or better than before
Phase 4: Process Management (systemd)¶
Goal: Replace CRON with systemd for better reliability and logging
Timeline: 1-2 weeks
Can be done: Anytime, but should be tested thoroughly first
4.1 Identify Current CRON Jobs¶
# List current cron jobs
crontab -l
# Document each job:
# - What it does
# - When it runs
# - Dependencies
# - Expected output
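While documenting each job, it helps to pre-compute its systemd `OnCalendar=` equivalent. A small helper covering the common "daily at HH:MM" case (anything fancier needs manual translation):

```python
def cron_to_oncalendar(cron: str) -> str:
    """Convert a simple 5-field cron schedule to a systemd OnCalendar value.

    Handles only fixed minute/hour with wildcard day fields, which covers
    typical 'run daily at HH:MM' jobs; anything else raises.
    """
    minute, hour, dom, month, dow = cron.split()
    if (dom, month, dow) != ("*", "*", "*"):
        raise ValueError("only daily schedules are handled in this sketch")
    return f"*-*-* {int(hour):02d}:{int(minute):02d}:00"
```

For example, a `0 2 * * *` crontab entry becomes `OnCalendar=*-*-* 02:00:00` in the timer unit.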
4.2 Create systemd Service Files¶
Web Application Service¶
# /etc/systemd/system/uplink-web.service
[Unit]
Description=Uplink Web Application
After=network.target mysql.service redis.service
Requires=mysql.service redis.service
[Service]
Type=notify
User=uplink
Group=uplink
WorkingDirectory=/opt/uplink
Environment="PATH=/opt/uplink/.venv/bin"
ExecStart=/opt/uplink/.venv/bin/gunicorn uplink.wsgi:application \
--bind 0.0.0.0:8000 \
--workers 4 \
--timeout 120 \
--access-logfile /var/log/uplink/access.log \
--error-logfile /var/log/uplink/error.log
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
Huey Worker Service¶
# /etc/systemd/system/uplink-huey.service
[Unit]
Description=Uplink Huey Task Queue Worker
After=network.target mysql.service redis.service
Requires=mysql.service redis.service
[Service]
Type=simple
User=uplink
Group=uplink
WorkingDirectory=/opt/uplink
Environment="PATH=/opt/uplink/.venv/bin"
ExecStart=/opt/uplink/.venv/bin/python manage.py run_huey
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
Scheduled Task Service (Timer-based)¶
# /etc/systemd/system/uplink-cleanup.service
[Unit]
Description=Uplink Cleanup Task
[Service]
Type=oneshot
User=uplink
Group=uplink
WorkingDirectory=/opt/uplink
Environment="PATH=/opt/uplink/.venv/bin"
ExecStart=/opt/uplink/.venv/bin/python manage.py cleanup_old_data
# /etc/systemd/system/uplink-cleanup.timer
[Unit]
Description=Run Uplink cleanup daily
[Timer]
# Run once daily at 02:00 (a second OnCalendar= line would add extra triggers)
OnCalendar=*-*-* 02:00:00
Persistent=true
[Install]
WantedBy=timers.target
4.3 Create Deployment Scripts¶
deploy.sh¶
#!/bin/bash
# deploy.sh - Automated deployment script
set -e # Exit on error
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
echo -e "${GREEN}Starting Uplink deployment...${NC}"
# Navigate to application directory
cd /opt/uplink
# Backup database
echo -e "${YELLOW}Creating database backup...${NC}"
./scripts/backup-db.sh
# Pull latest code
echo -e "${YELLOW}Pulling latest code from git...${NC}"
git fetch origin
git pull origin main
# Activate virtual environment
source .venv/bin/activate
# Install/update dependencies
echo -e "${YELLOW}Installing dependencies...${NC}"
uv pip install -e .
# Run migrations
echo -e "${YELLOW}Running database migrations...${NC}"
python manage.py migrate --noinput
# Collect static files
echo -e "${YELLOW}Collecting static files...${NC}"
python manage.py collectstatic --noinput
# Restart services
echo -e "${YELLOW}Restarting services...${NC}"
sudo systemctl restart uplink-web
sudo systemctl restart uplink-huey
# Check service status
echo -e "${YELLOW}Checking service status...${NC}"
sudo systemctl status uplink-web --no-pager
sudo systemctl status uplink-huey --no-pager
echo -e "${GREEN}Deployment complete!${NC}"
backup-db.sh¶
#!/bin/bash
# backup-db.sh - Database backup script
set -e
BACKUP_DIR="/opt/uplink/backups"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="$BACKUP_DIR/uplink_$TIMESTAMP.sql"
# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"
# Dump database (put the password in ~/.my.cnf or use --defaults-extra-file
# so the script can run unattended; a bare `-p` would prompt interactively)
mysqldump -u uplink uplink > "$BACKUP_FILE"
# Compress backup
gzip "$BACKUP_FILE"
# Keep only last 30 days of backups
find "$BACKUP_DIR" -name "uplink_*.sql.gz" -mtime +30 -delete
echo "Backup created: ${BACKUP_FILE}.gz"
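The retention step (`find ... -mtime +30 -delete`) can also be expressed in Python if the backup tooling later moves into a management command; a sketch mirroring the same 30-day policy:

```python
import time
from pathlib import Path

def prune_backups(backup_dir: str, keep_days: int = 30) -> list:
    """Delete gzipped dumps older than keep_days; return the removed names.

    Python equivalent of `find "$BACKUP_DIR" -name "uplink_*.sql.gz"
    -mtime +30 -delete` in backup-db.sh.
    """
    cutoff = time.time() - keep_days * 86400
    removed = []
    for path in Path(backup_dir).glob("uplink_*.sql.gz"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```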
rollback.sh¶
#!/bin/bash
# rollback.sh - Rollback to previous version
set -e
echo "Starting rollback..."
cd /opt/uplink
# Get previous commit
PREVIOUS_COMMIT=$(git rev-parse HEAD~1)
echo "Rolling back to commit: $PREVIOUS_COMMIT"
# Checkout previous commit
git checkout "$PREVIOUS_COMMIT"
# Activate virtual environment
source .venv/bin/activate
# Install dependencies
uv pip install -e .
# Run migrations (in case of database changes)
python manage.py migrate --noinput
# Restart services
sudo systemctl restart uplink-web
sudo systemctl restart uplink-huey
echo "Rollback complete!"
4.4 systemd Management Commands¶
# Enable services to start on boot
sudo systemctl enable uplink-web
sudo systemctl enable uplink-huey
sudo systemctl enable uplink-cleanup.timer
# Start services
sudo systemctl start uplink-web
sudo systemctl start uplink-huey
sudo systemctl start uplink-cleanup.timer
# Stop services
sudo systemctl stop uplink-web
sudo systemctl stop uplink-huey
# Restart services
sudo systemctl restart uplink-web
sudo systemctl restart uplink-huey
# Check status
sudo systemctl status uplink-web
sudo systemctl status uplink-huey
# View logs
sudo journalctl -u uplink-web -f
sudo journalctl -u uplink-huey -f
sudo journalctl -u uplink-cleanup -f
# List all timers
sudo systemctl list-timers
Rollback Plan¶
- Keep CRON jobs active initially
- Run systemd services in parallel with CRON
- Monitor for issues for 1 week
- Disable CRON jobs once systemd proven stable
Completion Checklist¶
- [ ] All systemd services start successfully
- [ ] Logs are accessible and readable
- [ ] Scheduled tasks run on time
- [ ] No missed tasks or duplicates
- [ ] Easy to restart/reload services
- [ ] Boot persistence works correctly
Phase 5: VM Migration (Ubuntu 24.04 LTS)¶
Goal: Migrate to new Ubuntu 24.04 LTS VM
Timeline: 1 day (weekend deployment)
Must be done: Weekend or low-traffic period
5.1 Pre-Migration Preparation¶
Two Weeks Before¶
- [ ] Provision new Ubuntu 24.04 LTS VM
- [ ] Document current server configuration
- [ ] Create detailed migration checklist
- [ ] Test backup and restore procedures
- [ ] Prepare rollback plan
One Week Before¶
- [ ] Set up new VM with Docker
- [ ] Install all required system packages
- [ ] Configure networking and firewall
- [ ] Set up SSL certificates
- [ ] Test deployment scripts on new VM
- [ ] Perform full dry-run migration
One Day Before¶
- [ ] Final database backup
- [ ] Freeze code deployments
- [ ] Notify users of maintenance window
- [ ] Prepare monitoring dashboard
- [ ] Brief team on rollback procedures
5.2 Migration Day Checklist¶
Phase 1: Preparation (30 minutes)¶
# On OLD server
# 1. Create final database backup
sudo ./scripts/backup-db.sh
# 2. Put application in maintenance mode
sudo systemctl stop uplink-web
sudo systemctl stop uplink-huey
# 3. Export final database dump
mysqldump -u root -p uplink > /tmp/final_migration.sql
# 4. Sync media files
rsync -avz /opt/uplink/media/ user@new-server:/opt/uplink/media/
Phase 2: New Server Setup (1 hour)¶
# On NEW server (Ubuntu 24.04)
# 1. Clone repository
cd /opt
git clone https://github.com/SensationalSystems/uplink.git
cd uplink
# 2. Set up environment
uv venv
source .venv/bin/activate
uv pip install -e .
# 3. Configure environment variables
cp .env.example .env
# Edit .env with production values
# 4. Import database
mysql -u root -p uplink < /tmp/final_migration.sql
# 5. Run migrations
python manage.py migrate
# 6. Collect static files
python manage.py collectstatic --noinput
# 7. Set up systemd services
sudo cp scripts/systemd/*.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable uplink-web uplink-huey
Phase 3: Testing (30 minutes)¶
# Start services
sudo systemctl start uplink-web
sudo systemctl start uplink-huey
# Check status
sudo systemctl status uplink-web
sudo systemctl status uplink-huey
# Test application
curl http://localhost:8000/
curl http://localhost:8000/api/
# Test database connectivity
python manage.py shell
>>> from contacts.models import Contact
>>> Contact.objects.count()
# Test background tasks
python manage.py shell
>>> from huey.contrib.djhuey import HUEY
>>> HUEY.pending()
# Check logs
sudo journalctl -u uplink-web -n 100
sudo journalctl -u uplink-huey -n 100
Phase 4: DNS/Load Balancer Update (15 minutes)¶
- Update DNS records to point to new server
- Or update load balancer configuration
- Wait for DNS propagation (if applicable)
- Monitor traffic shifting to new server
Phase 5: Monitoring (2 hours)¶
- Monitor application logs
- Monitor system resources (CPU, memory, disk)
- Monitor database performance
- Monitor error rates
- Monitor user reports
- Verify all functionality works
5.3 Post-Migration Tasks¶
Immediate (Same Day)¶
- [ ] Verify all critical functionality
- [ ] Check scheduled tasks are running
- [ ] Monitor error logs
- [ ] Test user workflows
- [ ] Update documentation with new server details
Next Week¶
- [ ] Monitor performance metrics
- [ ] Collect user feedback
- [ ] Optimize any slow queries
- [ ] Fine-tune systemd service configurations
- [ ] Update monitoring/alerting
One Month Later¶
- [ ] Decommission old server
- [ ] Remove old DNS entries
- [ ] Archive old server backups
- [ ] Document lessons learned
5.4 Rollback Plan¶
If Issues Within First Hour¶
# 1. Update DNS/load balancer to point back to old server
# 2. Old server should still be running in maintenance mode
# 3. Remove maintenance mode
sudo systemctl start uplink-web
sudo systemctl start uplink-huey
# 4. Verify old server is working
# 5. Investigate issues on new server
If Issues After DNS Propagation¶
# 1. Restore database from backup
mysql -u root -p uplink < /opt/uplink/backups/latest.sql
# 2. Roll back code if needed
git checkout <previous-stable-commit>
# 3. Restart services
sudo systemctl restart uplink-web uplink-huey
# 4. Monitor and debug
Completion Checklist¶
- [ ] All services running on new VM
- [ ] Database migrated successfully
- [ ] All functionality working
- [ ] No data loss
- [ ] Performance equal or better than old server
- [ ] No critical bugs reported within 48 hours
Phase 6: Monitoring & Observability¶
Goal: Implement comprehensive monitoring, logging, and alerting for production
Timeline: 2 weeks
Can be done: After Phase 5 (VM Migration) is stable
Prerequisites: Phase 5 complete, application running in production for 1-2 weeks
Why Monitoring is Critical¶
Without proper monitoring, you're flying blind. You need to know:
- Is the application up and responding?
- Are there errors happening?
- Is performance degrading?
- Are resources (CPU, memory, disk) running low?
- What caused that outage last night?
6.1 Application Performance Monitoring (APM)¶
6.1.1 Enhance Sentry Configuration¶
Already using Sentry, but let's optimize it:
# In settings.py - Enhanced Sentry config
import logging
import sentry_sdk
from sentry_sdk.integrations.django import DjangoIntegration
from sentry_sdk.integrations.redis import RedisIntegration
from sentry_sdk.integrations.logging import LoggingIntegration
if not DEBUG and env("SENTRY_DSN"):
sentry_sdk.init(
dsn=env("SENTRY_DSN"),
integrations=[
DjangoIntegration(),
RedisIntegration(),
LoggingIntegration(
level=logging.INFO,
event_level=logging.ERROR
),
],
# Performance monitoring
traces_sample_rate=0.1, # 10% of transactions
profiles_sample_rate=0.1, # 10% profiling
# Send personal data for better debugging
send_default_pii=True,
# Environment tracking
environment=env("ENVIRONMENT", default="production"),
# Release tracking
release=env("GIT_COMMIT", default="unknown"),
# Filter out sensitive data
before_send=lambda event, hint: event if not is_sensitive(event) else None,
)
def is_sensitive(event):
    """Filter out events with sensitive data."""
    # Implement your filtering logic
    return False
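The `is_sensitive` stub above drops whole events. An alternative is to scrub sensitive values and keep the rest of the event for debugging; a sketch where the key list is our assumption, not anything mandated by Sentry:

```python
SENSITIVE_KEYS = {"password", "secret", "token", "authorization", "api_key"}

def scrub_event(event: dict) -> dict:
    """Recursively mask values whose keys look sensitive before sending."""
    def scrub(obj):
        if isinstance(obj, dict):
            return {
                k: "[redacted]" if k.lower() in SENSITIVE_KEYS else scrub(v)
                for k, v in obj.items()
            }
        if isinstance(obj, list):
            return [scrub(item) for item in obj]
        return obj
    return scrub(event)
```

Hooked up as `before_send=lambda event, hint: scrub_event(event)`, this keeps the stack trace while masking credentials in headers and form data.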
6.1.2 Health Check Endpoint¶
# In settings.py
INSTALLED_APPS = [
# ... other apps
'health_check',
'health_check.db',
'health_check.cache',
'health_check.storage',
]
# In urls.py
from django.urls import path, include
urlpatterns = [
# ... other URLs
path('health/', include('health_check.urls')),
]
Test: curl https://uplink.sensational.systems/health/
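Conceptually, django-health-check runs one probe per registered backend (database, cache, storage) and collapses the results into a single HTTP status. A framework-free sketch of that aggregation:

```python
def overall_health(checks: dict) -> tuple:
    """Collapse named probe results into an HTTP status and a report body.

    checks maps probe name -> bool (True = healthy). Any failure yields 500,
    so load balancers and uptime monitors can key off the status code alone.
    """
    report = {name: ("ok" if ok else "unavailable") for name, ok in checks.items()}
    status = 200 if all(checks.values()) else 500
    return status, report
```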
6.2 Server Monitoring with Prometheus & Grafana¶
6.2.1 Install and Configure Prometheus Stack¶
Option A: Docker Compose (Recommended)
Create docker-compose.monitoring.yml:
version: '3.8'
services:
prometheus:
image: prom/prometheus:latest
container_name: prometheus
volumes:
- ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml
- ./monitoring/alert_rules.yml:/etc/prometheus/alert_rules.yml
- prometheus_data:/prometheus
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
ports:
- '9090:9090'
restart: unless-stopped
grafana:
image: grafana/grafana:latest
container_name: grafana
volumes:
- grafana_data:/var/lib/grafana
environment:
- GF_SECURITY_ADMIN_PASSWORD=change_this_password
ports:
- '3000:3000'
restart: unless-stopped
depends_on:
- prometheus
node-exporter:
image: prom/node-exporter:latest
container_name: node-exporter
command:
- '--path.rootfs=/host'
volumes:
- '/:/host:ro,rslave'
ports:
- '9100:9100'
restart: unless-stopped
alertmanager:
image: prom/alertmanager:latest
container_name: alertmanager
volumes:
- ./monitoring/alertmanager.yml:/etc/alertmanager/alertmanager.yml
- alertmanager_data:/alertmanager
command:
- '--config.file=/etc/alertmanager/alertmanager.yml'
ports:
- '9093:9093'
restart: unless-stopped
volumes:
prometheus_data:
grafana_data:
alertmanager_data:
Create monitoring/prometheus.yml:
global:
scrape_interval: 15s
evaluation_interval: 15s
rule_files:
- 'alert_rules.yml'
alerting:
alertmanagers:
- static_configs:
- targets: ['alertmanager:9093']
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node'
static_configs:
- targets: ['node-exporter:9100']
- job_name: 'uplink'
static_configs:
# Note: host.docker.internal resolves on Docker Desktop; on Linux, add
# extra_hosts: ["host.docker.internal:host-gateway"] to the prometheus service in docker-compose.
- targets: ['host.docker.internal:8000']
Create monitoring/alertmanager.yml:
global:
resolve_timeout: 5m
route:
group_by: ['alertname']
group_wait: 10s
group_interval: 10s
repeat_interval: 1h
receiver: 'email'
receivers:
- name: 'email'
email_configs:
- to: 'ops@sensational.systems'
from: 'alertmanager@sensational.systems'
smarthost: 'smtp.mailgun.org:587'
auth_username: 'postmaster@yourdomain.com'
auth_password: '${MAILGUN_PASSWORD}'  # substitute at deploy time; Alertmanager does not expand env vars itself
headers:
Subject: '🚨 [Uplink] {{ .GroupLabels.alertname }}'
Create monitoring/alert_rules.yml:
groups:
- name: uplink_alerts
interval: 30s
rules:
- alert: UplinkDown
expr: up{job="uplink"} == 0
for: 2m
labels:
severity: critical
annotations:
summary: "Uplink application is down"
- alert: HighMemoryUsage
expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) > 0.9
for: 5m
labels:
severity: warning
annotations:
summary: "High memory usage detected"
- alert: DiskSpaceLow
expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) < 0.1
for: 5m
labels:
severity: warning
annotations:
summary: "Disk space running low"
6.3 Uptime Monitoring¶
Use a third-party service (choose one):
- UptimeRobot (free tier: 50 monitors, 5-min checks)
- Better Uptime (modern, developer-friendly)
- Pingdom (paid, comprehensive)
Configure monitors for:
- Main website: https://uplink.sensational.systems
- Health check: https://uplink.sensational.systems/health/
- API: https://uplink.sensational.systems/api/
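Until one of these external monitors is configured, a minimal cron-driven probe can approximate the same checks from another host. A sketch; the threshold and the `classify` helper are illustrative, not part of any monitoring service's API:

```python
import time
import urllib.error
import urllib.request

SLOW_THRESHOLD_S = 2.0  # illustrative SLO for marking a response "degraded"

def probe(url, timeout=10):
    """Fetch a URL, returning (status_code_or_None, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, time.monotonic() - start
    except urllib.error.HTTPError as exc:
        # HTTP errors still carry a status code worth recording
        return exc.code, time.monotonic() - start
    except Exception:
        # DNS failure, timeout, connection refused, etc.
        return None, time.monotonic() - start

def classify(status, elapsed):
    """Map a probe result onto the up/degraded/down states a monitor would report."""
    if status is None or status >= 500:
        return 'down'
    if status >= 400 or elapsed > SLOW_THRESHOLD_S:
        return 'degraded'
    return 'up'
```

Run it every minute against `/health/` and alert on two consecutive non-`up` results, mirroring the UptimeRobot threshold in the matrix below.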
6.4 Centralized Logging¶
6.4.1 Simple Approach: Papertrail or Logtail¶
# Install remote_syslog2 for Papertrail
wget https://github.com/papertrail/remote_syslog2/releases/download/v0.21/remote_syslog_linux_amd64.tar.gz
tar xzf remote_syslog_linux_amd64.tar.gz
sudo cp remote_syslog/remote_syslog /usr/local/bin/
Create /etc/log_files.yml:
files:
- /var/log/uplink/*.log
- /var/log/nginx/*.log
destination:
host: logs.papertrailapp.com
port: YOUR_PORT
protocol: tls
Create /etc/systemd/system/remote_syslog.service:
[Unit]
Description=Remote Syslog
After=network.target
[Service]
ExecStart=/usr/local/bin/remote_syslog -c /etc/log_files.yml
Restart=always
[Install]
WantedBy=multi-user.target
6.5 Grafana Dashboards¶
- Access Grafana at http://your-server:3000
- Add Prometheus data source: http://prometheus:9090
- Import dashboards:
  - Node Exporter Full: ID 1860
- Create custom dashboard for Uplink-specific metrics
Completion Checklist¶
- [ ] Prometheus collecting metrics
- [ ] Grafana dashboards accessible
- [ ] Health check endpoint responding
- [ ] External uptime monitoring configured
- [ ] Alerts configured and tested
- [ ] Logs centralized and searchable
- [ ] Team trained on monitoring tools
- [ ] Documentation for common alerts
- [ ] On-call procedures established
Phase 7: Health Checks & System Monitoring¶
Goal: Implement comprehensive automated health checks to detect issues before users do
When: After Phase 3.5 (nginx) - Required for production readiness
Duration: 1 week
7.1 Application Health Check Endpoints¶
Create uplink/health/ app for health monitoring:
# uplink/health/views.py
from django.http import JsonResponse
from django.db import connection
from django.core.cache import cache
from django.conf import settings
import redis
import requests
from datetime import datetime, timezone
def health_check(request):
"""
Comprehensive health check endpoint
Returns 200 if all systems OK, 503 if any critical system down
"""
checks = {
'status': 'healthy',
'timestamp': datetime.now(timezone.utc).isoformat(),
'checks': {}
}
all_healthy = True
# 1. Database Check
try:
with connection.cursor() as cursor:
cursor.execute("SELECT 1")
checks['checks']['database'] = {
'status': 'healthy',
'latency_ms': 0 # Add timing if needed
}
except Exception as e:
checks['checks']['database'] = {
'status': 'unhealthy',
'error': str(e)
}
all_healthy = False
# 2. Redis/Cache Check
try:
cache_key = 'health_check_test'
cache.set(cache_key, 'ok', 10)
result = cache.get(cache_key)
if result == 'ok':
checks['checks']['redis'] = {'status': 'healthy'}
else:
raise Exception("Cache read/write failed")
except Exception as e:
checks['checks']['redis'] = {
'status': 'unhealthy',
'error': str(e)
}
all_healthy = False
# 3. Huey Task Queue Check
try:
from huey.contrib.djhuey import HUEY
pending = HUEY.pending_count()
checks['checks']['huey'] = {
'status': 'healthy',
'pending_tasks': pending
}
if pending > 1000: # Too many pending tasks
checks['checks']['huey']['warning'] = 'High pending task count'
except Exception as e:
checks['checks']['huey'] = {
'status': 'unhealthy',
'error': str(e)
}
all_healthy = False
# 4. Channels/WebSocket Check
try:
# Assumes CHANNEL_LAYERS hosts are configured as redis:// URLs (not (host, port) tuples)
r = redis.Redis.from_url(settings.CHANNEL_LAYERS['default']['CONFIG']['hosts'][0])
r.ping()
checks['checks']['channels'] = {'status': 'healthy'}
except Exception as e:
checks['checks']['channels'] = {
'status': 'unhealthy',
'error': str(e)
}
all_healthy = False
# 5. External API Health (Optional - may slow down health check)
# Uncomment if you want to check external dependencies
# try:
# response = requests.get('https://api.fedex.com/health', timeout=2)
# checks['checks']['fedex_api'] = {
# 'status': 'healthy' if response.status_code == 200 else 'degraded'
# }
# except Exception as e:
# checks['checks']['fedex_api'] = {
# 'status': 'degraded',
# 'error': str(e)
# }
# Set overall status
if not all_healthy:
checks['status'] = 'unhealthy'
return JsonResponse(checks, status=503)
return JsonResponse(checks, status=200)
def liveness(request):
"""
Simple liveness probe - is the application running?
Used by Docker/Kubernetes health checks
"""
return JsonResponse({'status': 'alive'}, status=200)
def readiness(request):
"""
Readiness probe - is the application ready to serve traffic?
Checks critical dependencies only (faster than full health check)
"""
try:
# Quick database check
with connection.cursor() as cursor:
cursor.execute("SELECT 1")
# Quick cache check
cache.set('readiness_check', 'ok', 5)
return JsonResponse({'status': 'ready'}, status=200)
except Exception as e:
return JsonResponse({
'status': 'not_ready',
'error': str(e)
}, status=503)
Add URLs:
# uplink/urls.py
from uplink.health import views as health_views
urlpatterns = [
# ... existing patterns ...
# Health check endpoints
path('health/', health_views.health_check, name='health'),
path('health/live/', health_views.liveness, name='liveness'),
path('health/ready/', health_views.readiness, name='readiness'),
]
7.2 System Resource Monitoring Script¶
Create scripts/system_health_check.sh:
#!/bin/bash
# System health check script
# Run via cron every 5 minutes
LOG_FILE="/var/log/uplink/system_health.log"
ALERT_EMAIL="ops@sensational.systems"
CRITICAL=0
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}
alert() {
log "ALERT: $1"
CRITICAL=1
}
# 1. Check disk space
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt 85 ]; then
alert "Disk usage at ${DISK_USAGE}%"
fi
# 2. Check memory usage
MEM_USAGE=$(free | grep Mem | awk '{print int($3/$2 * 100)}')
if [ "$MEM_USAGE" -gt 90 ]; then
alert "Memory usage at ${MEM_USAGE}%"
fi
# 3. Check if Daphne is running
if ! systemctl is-active --quiet uplink-web; then
alert "Daphne (uplink-web) is not running"
fi
# 4. Check if Huey is running
if ! systemctl is-active --quiet uplink-huey; then
alert "Huey task queue is not running"
fi
# 5. Check nginx
if ! systemctl is-active --quiet nginx; then
alert "nginx is not running"
fi
# 6. Check MySQL
if ! systemctl is-active --quiet mysql; then
alert "MySQL is not running"
fi
# 7. Check Redis
if ! systemctl is-active --quiet redis; then
alert "Redis is not running"
fi
# 8. Check application health endpoint
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8000/health/)
if [ "$HTTP_CODE" != "200" ]; then
alert "Health check endpoint returned $HTTP_CODE"
fi
# 9. Check SSL certificate expiry
CERT_EXPIRY=$(echo | openssl s_client -servername uplink.sensational.systems \
-connect uplink.sensational.systems:443 2>/dev/null | \
openssl x509 -noout -enddate | \
awk -F= '{print $2}' | xargs -I{} date -d {} +%s)
NOW=$(date +%s)
DAYS_LEFT=$(( (CERT_EXPIRY - NOW) / 86400 ))
if [ "$DAYS_LEFT" -lt 14 ]; then
alert "SSL certificate expires in $DAYS_LEFT days"
fi
# 10. Check log file sizes
LOG_SIZE=$(du -sm /var/log/uplink | cut -f1)
if [ "$LOG_SIZE" -gt 1000 ]; then # > 1GB
alert "Log directory size is ${LOG_SIZE}MB"
fi
# Send alert email if critical issues found
if [ "$CRITICAL" -eq 1 ]; then
log "Critical issues detected - sending alert"
mail -s "⚠️ Uplink System Health Alert" "$ALERT_EMAIL" < "$LOG_FILE"
else
log "All systems healthy"
fi
Make the script executable (chmod +x scripts/system_health_check.sh), then add these entries to the crontab (crontab -e):
# System health checks every 5 minutes
*/5 * * * * /opt/uplink/scripts/system_health_check.sh
# Daily log rotation check
0 2 * * * find /var/log/uplink -name "*.log" -mtime +30 -delete
# Weekly disk usage report
0 9 * * 1 df -h | mail -s "Weekly Disk Usage Report" ops@sensational.systems
7.3 Database Health Monitoring¶
Create scripts/db_health_check.py:
#!/usr/bin/env python
"""
Database health check script
Monitors connection pool, slow queries, table sizes, replication lag
"""
import os
import sys
import django
# Setup Django
sys.path.insert(0, '/opt/uplink')
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'uplink.settings')
django.setup()
from django.db import connection
from django.core.mail import send_mail
import logging
logger = logging.getLogger(__name__)
def check_slow_queries():
"""Check for slow queries in the last hour"""
# Requires MySQL slow query logging to a table: slow_query_log=ON, log_output=TABLE
with connection.cursor() as cursor:
cursor.execute("""
SELECT
query_time,
lock_time,
rows_examined,
sql_text
FROM mysql.slow_log
WHERE start_time > DATE_SUB(NOW(), INTERVAL 1 HOUR)
AND query_time > 5
ORDER BY query_time DESC
LIMIT 10
""")
slow_queries = cursor.fetchall()
if slow_queries:
logger.warning(f"Found {len(slow_queries)} slow queries")
return False
return True
def check_table_sizes():
"""Check for tables growing unexpectedly large"""
with connection.cursor() as cursor:
cursor.execute("""
SELECT
table_name,
ROUND(((data_length + index_length) / 1024 / 1024), 2) AS size_mb
FROM information_schema.tables
WHERE table_schema = DATABASE()
ORDER BY (data_length + index_length) DESC
LIMIT 10
""")
tables = cursor.fetchall()
for table_name, size_mb in tables:
if size_mb > 1000: # > 1GB
logger.warning(f"Table {table_name} is {size_mb}MB")
def check_connection_pool():
"""Check database connection pool status"""
with connection.cursor() as cursor:
cursor.execute("SHOW STATUS LIKE 'Threads_connected'")
result = cursor.fetchone()
connections = int(result[1])
cursor.execute("SHOW VARIABLES LIKE 'max_connections'")
result = cursor.fetchone()
max_connections = int(result[1])
usage_pct = (connections / max_connections) * 100
if usage_pct > 80:
logger.warning(f"Database connection pool at {usage_pct:.1f}%")
return False
logger.info(f"Database connections: {connections}/{max_connections} ({usage_pct:.1f}%)")
return True
def check_deadlocks():
"""Check for recent deadlocks"""
with connection.cursor() as cursor:
cursor.execute("SHOW ENGINE INNODB STATUS")
status = cursor.fetchone()[2]
if "LATEST DETECTED DEADLOCK" in status:
logger.warning("Recent deadlocks detected in InnoDB status")
return False
return True
if __name__ == '__main__':
all_healthy = True
all_healthy &= check_connection_pool()
all_healthy &= check_slow_queries()
all_healthy &= check_deadlocks()
check_table_sizes() # Informational only
if not all_healthy:
send_mail(
subject='Database Health Alert',
message='Database health check failed. Check logs for details.',
from_email='noreply@sensational.systems',
recipient_list=['ops@sensational.systems']
)
sys.exit(1)
sys.exit(0)
Add to crontab:
# Database health check every 15 minutes
*/15 * * * * /opt/uplink/.venv/bin/python /opt/uplink/scripts/db_health_check.py
7.4 Application-Level Smoke Tests¶
Create scripts/smoke_test.py:
#!/usr/bin/env python
"""
Smoke tests for critical application functionality
Run after every deployment
"""
import requests
import sys
BASE_URL = "https://uplink.sensational.systems"
TESTS_PASSED = 0
TESTS_FAILED = 0
def test(name, url, expected_status=200, check_content=None):
"""Run a single test"""
global TESTS_PASSED, TESTS_FAILED
try:
response = requests.get(url, timeout=10)
# Check status code
if response.status_code != expected_status:
print(f"❌ {name}: Expected {expected_status}, got {response.status_code}")
TESTS_FAILED += 1
return False
# Check content if provided
if check_content and check_content not in response.text:
print(f"❌ {name}: Expected content not found")
TESTS_FAILED += 1
return False
print(f"✅ {name}")
TESTS_PASSED += 1
return True
except Exception as e:
print(f"❌ {name}: {str(e)}")
TESTS_FAILED += 1
return False
if __name__ == '__main__':
print("Running smoke tests...\n")
# Critical endpoint tests
test("Health Check", f"{BASE_URL}/health/")
test("Homepage", f"{BASE_URL}/", check_content="Uplink")
test("API Root", f"{BASE_URL}/api/")
test("Admin Login", f"{BASE_URL}/admin/login/")
# Authentication test
test("Unauthenticated API Access", f"{BASE_URL}/api/orders/", expected_status=401)
# Static files test
test("Static Files", f"{BASE_URL}/static/admin/css/base.css")
print(f"\n{'='*50}")
print(f"Tests Passed: {TESTS_PASSED}")
print(f"Tests Failed: {TESTS_FAILED}")
print(f"{'='*50}\n")
sys.exit(0 if TESTS_FAILED == 0 else 1)
Run after each deployment:
#!/bin/bash
# scripts/deploy.sh
# ... existing deployment steps ...
echo "Running smoke tests..."
python scripts/smoke_test.py
if [ $? -ne 0 ]; then
echo "⚠️ Smoke tests failed! Consider rollback."
exit 1
fi
echo "✅ Deployment successful and smoke tests passed"
7.5 Monitoring Dashboard Checklist¶
Create a simple monitoring dashboard page:
# uplink/health/templates/health/dashboard.html
<!DOCTYPE html>
<html>
<head>
<title>Uplink System Status</title>
<meta http-equiv="refresh" content="30">
<style>
body { font-family: Arial, sans-serif; margin: 40px; }
.status { display: inline-block; width: 20px; height: 20px; border-radius: 50%; }
.healthy { background-color: #4CAF50; }
.unhealthy { background-color: #f44336; }
.degraded { background-color: #ff9800; }
table { border-collapse: collapse; width: 100%; margin-top: 20px; }
th, td { border: 1px solid #ddd; padding: 12px; text-align: left; }
th { background-color: #4CAF50; color: white; }
</style>
</head>
<body>
<h1>Uplink System Status</h1>
<p>Last updated: {{ timestamp }}</p>
<table>
<tr>
<th>Component</th>
<th>Status</th>
<th>Details</th>
</tr>
{% for component, status in checks.items %}
<tr>
<td>{{ component }}</td>
<td>
<span class="status {{ status.status }}"></span>
{{ status.status }}
</td>
<td>{{ status.details|default:"" }}</td>
</tr>
{% endfor %}
</table>
</body>
</html>
7.6 Automated Health Check Matrix¶
| Check Type | Frequency | Tool | Alert Threshold |
|---|---|---|---|
| Application Health | 1 min | UptimeRobot | 2 consecutive failures |
| System Resources | 5 min | Custom script | Disk >85%, Memory >90% |
| Database Health | 15 min | Custom script | Slow queries, connection pool >80% |
| SSL Certificate | Daily | Custom script | <14 days until expiry |
| Log File Size | Daily | Custom script | >1GB total |
| Service Status | 5 min | systemd | Any service down |
| API Endpoints | 5 min | Prometheus | Response time >2s |
| WebSocket | 5 min | Custom check | Connection failures |
| Background Tasks | 15 min | Huey monitoring | Queue >1000 pending |
| External APIs | 30 min | Custom check | Availability <95% |
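The matrix above can also live in code so the cron driver and the documentation cannot drift apart. A sketch under the assumption that one scheduler loop dispatches all custom checks; the names and intervals mirror a subset of the table:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Check:
    name: str
    interval_min: int  # how often the check should run, per the matrix

# Subset of the health check matrix, encoded once.
CHECKS = [
    Check('application_health', 1),
    Check('system_resources', 5),
    Check('database_health', 15),
    Check('external_apis', 30),
]

def due_checks(minute, checks=CHECKS):
    """Return the names of checks due at a given minute of the hour."""
    return [c.name for c in checks if minute % c.interval_min == 0]
```

A driver invoked every minute simply runs `due_checks(now.minute)` and executes each named check, so changing a frequency means editing one table.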
7.7 Alert Escalation Policy¶
┌─────────────────────────────────────────────────┐
│ INCIDENT SEVERITY │
├─────────────────────────────────────────────────┤
│ P1 (Critical): Complete outage │
│ → Immediate alert to on-call │
│ → SMS + Email + Phone call │
│ → Response time: 15 minutes │
│ │
│ P2 (High): Degraded performance │
│ → Email + Slack notification │
│ → Response time: 1 hour │
│ │
│ P3 (Medium): Warning threshold │
│ → Email notification │
│ → Response time: 4 hours │
│ │
│ P4 (Low): Informational │
│ → Log only │
│ → Review during business hours │
└─────────────────────────────────────────────────┘
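To keep notification routing consistent with this policy, the severity table can be encoded once and reused by whatever script sends alerts. A sketch; the channel names are placeholders for the actual integrations:

```python
# Escalation policy from the diagram above; response_min is the target response time in minutes.
ESCALATION = {
    'P1': {'channels': ('sms', 'email', 'phone'), 'response_min': 15},
    'P2': {'channels': ('email', 'slack'), 'response_min': 60},
    'P3': {'channels': ('email',), 'response_min': 240},
    'P4': {'channels': ('log',), 'response_min': None},  # reviewed during business hours
}

def route_alert(severity):
    """Return (channels, response_min) for a severity, defaulting to P4 handling."""
    policy = ESCALATION.get(severity, ESCALATION['P4'])
    return policy['channels'], policy['response_min']
```

Unknown severities deliberately fall back to P4 (log only) rather than paging anyone by accident.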
7.8 Post-Deployment Health Check Procedure¶
After every deployment, follow this checklist:
# 1. Check all services are running
sudo systemctl status uplink-web uplink-huey nginx mysql redis
# 2. Check application health endpoint
curl -f http://localhost:8000/health/ || echo "Health check failed!"
# 3. Run smoke tests
python scripts/smoke_test.py
# 4. Check recent logs for errors
tail -n 100 /var/log/uplink/error.log | grep ERROR
# 5. Check Sentry for new errors
# Visit Sentry dashboard
# 6. Monitor response times
# Check Grafana dashboard for 5 minutes
# 7. Verify external integrations
# Test FedEx API, PrestaShop sync, etc.
# 8. Check background task queue
# Visit Huey dashboard or check pending count
Completion Checklist¶
- [ ] Health check endpoints implemented (/health/, /health/live/, /health/ready/)
- [ ] System health check script running via cron
- [ ] Database health monitoring script configured
- [ ] Smoke tests created and integrated into deployment
- [ ] Monitoring dashboard accessible
- [ ] Alert escalation policy documented
- [ ] On-call rotation established
- [ ] Runbook created for common incidents
- [ ] Post-deployment checklist automated
- [ ] Team trained on health monitoring tools
- [ ] All health checks passing in production
Common Issues & Troubleshooting¶
Health check returns 503:
# Check which component is failing
curl http://localhost:8000/health/ | jq
# Check specific service
sudo systemctl status uplink-web
journalctl -u uplink-web -n 50
High memory usage:
# Find memory-hungry processes
ps aux --sort=-%mem | head -n 10
# Check for memory leaks in Python
# Enable memory profiling temporarily
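For the temporary memory profiling mentioned above, the stdlib tracemalloc module is enough to find the top allocation sites without adding dependencies. A minimal sketch; the list comprehension stands in for whatever code path is suspected of leaking:

```python
import tracemalloc

tracemalloc.start()

# ... exercise the suspect code path ...
data = [str(i) * 10 for i in range(1000)]

# Snapshot current allocations and show the biggest sources by line
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:5]:
    print(stat)

tracemalloc.stop()
```

In Django this can be wrapped in a temporary middleware guarded by a settings flag, so it never runs in normal production traffic.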
Disk space issues:
# Find large files
du -ah /var/log | sort -rh | head -n 20
# Clean old logs
find /var/log/uplink -name "*.log" -mtime +30 -delete
# Rotate logs manually
logrotate -f /etc/logrotate.d/uplink
Database connection pool exhausted:
# Check active connections
mysql -e "SHOW PROCESSLIST;"
# Kill long-running queries
mysql -e "KILL <process_id>;"
# Restart application to reset pool
sudo systemctl restart uplink-web
Rollback Plans¶
General Rollback Principles¶
- Database First - Always restore database before code
- Git Tags - Tag stable versions for easy rollback
- Backups - Automated backups before every deployment
- Monitoring - Watch metrics during and after changes
- Communication - Alert team immediately if rollback needed
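The "backups before every deployment" principle is easier to audit when each backup filename embeds the release tag and a UTC timestamp, so a dump can be matched to the exact code version it predates. A sketch; the naming scheme is a suggestion, not an existing convention in the repo:

```python
from datetime import datetime, timezone

def backup_filename(db='uplink', release_tag='untagged', now=None):
    """Deterministic backup name: <db>_<tag>_<UTC timestamp>.sql"""
    now = now or datetime.now(timezone.utc)
    return f"{db}_{release_tag}_{now:%Y%m%d_%H%M%S}.sql"
```

The deploy script would pass the current git tag (e.g. `git describe --tags`) as `release_tag` before running mysqldump.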
Phase-Specific Rollback Procedures¶
Phase 1 (uv) Rollback¶
# Revert to Pipenv
pipenv install
pipenv shell
# Or keep both until confident
uv pip install -e . # Development
pipenv install # Fallback
Phase 2 (Python/Django) Rollback¶
# Checkout previous tagged version
git checkout v1.0-python3.9-django3.2
# Restore database from backup
mysql -u root -p uplink < backup_before_upgrade.sql
# Reinstall old dependencies
uv pip install -e .
# Restart services
sudo systemctl restart uplink-web uplink-huey
Phase 3 (Docker) Rollback¶
# Docker is optional initially - just don't use it
# Remove docker-compose.yml if causing issues
# Continue with traditional deployment
Phase 4 (systemd) Rollback¶
# Re-enable CRON jobs
crontab -e
# Stop systemd services
sudo systemctl stop uplink-web uplink-huey
sudo systemctl disable uplink-web uplink-huey
# Monitor CRON jobs
tail -f /var/log/cron
Phase 5 (VM) Rollback¶
# Update DNS to point back to old server
# Old server should still be running
# Verify functionality
# Fix issues on new server before retrying
Timeline & Dependencies¶
Proposed Timeline¶
Month 1: Foundation
├── Week 1-2: Phase 1 (uv)
└── Week 3-4: Testing & Documentation
Month 2: Upgrades
├── Week 1-2: Phase 2A (Python 3.10)
└── Week 3-4: Phase 2B (Django 4.2)
Month 3: Modernization
├── Week 1-2: Phase 3 (Docker)
├── Week 3: Phase 4 (systemd)
└── Week 4: Testing & Documentation
Month 4: Migration
├── Week 1-2: New VM Setup & Testing
├── Week 3: Dry-run migration
└── Week 4: Production Migration (Weekend)
Dependencies Between Phases¶
┌──────────────────────────────────────────────────────────────┐
│ UPGRADE FLOW DIAGRAM │
└──────────────────────────────────────────────────────────────┘
Phase 1: uv (Dependency Management) 🟢 Low Risk
│
├──> Phase 2A: Python 3.10 🟢 Low Risk
│
├──> Phase 2B: Django 4.2 🟡 Medium Risk
│
├──> Phase 2.5: Security Hardening 🟡 Medium Risk
│
├──> Phase 2C: Python 3.12 + Django 5.x 🟡 Medium Risk
│
├──> Phase 3: Docker 🔵 Coordination Required
│
├──> Phase 3.5: nginx + Daphne 🔵 Coordination Required
│
├──> Phase 4: systemd 🔵 Coordination Required
│
├──> Phase 5: VM Migration 🔴 High Risk - Weekend
│
└──> Phase 6: Monitoring 🟢 Low Risk
┌──────────────────────────────────────────────────────────────┐
│ CRITICAL PATH: Each phase must be stable before proceeding │
│ to the next. No skipping allowed! │
└──────────────────────────────────────────────────────────────┘
Legend:
- 🟢 Green: Low risk, can do anytime
- 🟡 Yellow: Medium risk, thorough testing needed
- 🔵 Blue: Requires coordination
- 🔴 Red: High risk, weekend deployment
Critical Path¶
- uv must be working before Python upgrades
- Python 3.10 must be stable before Django 4.2
- Django 4.2 must be stable before Python 3.12/Django 5.x
- systemd should be stable before VM migration
- Everything should be tested in Docker before VM migration
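These sequencing rules can be enforced mechanically rather than by memory, for example as a small dependency gate in the deployment checklist. A sketch; the phase names are shorthand for the phases above:

```python
# Each phase lists the phases that must be stable before it can start.
PHASE_DEPS = {
    'uv': [],
    'python3.10': ['uv'],
    'django4.2': ['python3.10'],
    'python3.12_django5': ['django4.2'],
    'systemd': ['python3.12_django5'],
    'vm_migration': ['systemd'],
}

def can_start(phase, completed):
    """True once every prerequisite phase is in the completed set."""
    return all(dep in completed for dep in PHASE_DEPS[phase])
```

A pre-flight script can call `can_start` with the set of phases marked done and refuse to proceed otherwise, making "no skipping allowed" executable.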
Risk Mitigation¶
High-Risk Items¶
- Database Migrations - Test extensively, keep backups
- Third-Party Compatibility - Check all packages support new versions
- VM Migration - Dry-run multiple times
- Weekend Deployment - Have full team available for rollback
Testing Strategy¶
- Unit tests must pass at each phase
- Integration tests for critical workflows
- Performance testing before and after
- Load testing on new VM before migration
- User acceptance testing for major changes
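"Performance testing before and after" is only meaningful with an agreed metric; a small percentile helper makes before/after latency comparisons reproducible. This sketch uses the nearest-rank method, an assumption of this example rather than an established project standard:

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list, pct in (0, 100]."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]
```

Recording p50/p95 response times from the same load test before and after each phase gives a concrete pass/fail signal instead of an impression.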
Communication Plan¶
- Weekly updates to stakeholders
- Slack channel for migration updates
- Email notification before weekend migration
- Status page during migration
- Post-migration report
Appendix¶
Useful Commands Reference¶
# uv
uv venv # Create virtual environment
uv pip install -e ".[dev]" # Install with dev dependencies
uv pip list # List installed packages
uv pip sync requirements.txt # Sync environment to a compiled requirements file (uv pip sync does not take pyproject.toml directly)
# Docker
docker-compose up -d # Start services
docker-compose down # Stop services
docker-compose logs -f web # Follow web logs
docker-compose exec web bash # Shell into container
docker-compose exec web python manage.py migrate
# systemd
sudo systemctl status uplink-web
sudo systemctl restart uplink-web
sudo journalctl -u uplink-web -f
sudo systemctl list-timers
# Deployment
./scripts/deploy.sh # Deploy
./scripts/backup-db.sh # Backup database
./scripts/rollback.sh # Rollback
# Database
mysqldump -u uplink -p uplink > backup.sql
mysql -u uplink -p uplink < backup.sql
# Git
git tag v2.0-stable # Tag stable version
git checkout v2.0-stable # Rollback to tag
git log --oneline -n 10 # Recent commits
Resources¶
- uv documentation
- Django Upgrade Guide
- Docker Best Practices
- systemd Documentation
- Ubuntu 24.04 LTS Release Notes
Post-Upgrade Enhancements¶
Goal: Add new libraries and features once the core infrastructure is stable
Timeline: Ongoing after Phase 5 completion
Prerequisites: All phases (1-5) successfully deployed and stable in production for at least 2-4 weeks
Overview¶
Once Uplink 2.0 is running smoothly on the new infrastructure (Python 3.12, Django 5.x, Ubuntu 24.04 LTS), we can enhance the application with modern libraries and improved tooling. These enhancements should be added incrementally to avoid destabilizing the newly upgraded system.
Recommended Enhancement Timeline¶
Post-Launch Month 1: Stabilization
├── Week 1-2: Monitor production, fix any issues
├── Week 3: Performance optimization
└── Week 4: Begin planning enhancements
Post-Launch Month 2: Documentation
├── Week 1-2: MkDocs setup and migration
└── Week 3-4: Content creation and refinement
Post-Launch Month 3+: Feature Enhancements
├── django-formset integration (ongoing)
└── Additional improvements as needed
Enhancement 1: MkDocs for Documentation¶
Priority: High - Improves team collaboration and onboarding
Timeline: 2-3 weeks
When to implement: After Phase 5 completion + 2-4 weeks of stable production
Why MkDocs?¶
- ✅ Better organization - Structured, searchable documentation
- ✅ Version control - Documentation lives with code in git
- ✅ Easy to maintain - Simple Markdown syntax
- ✅ Professional appearance - Clean, modern UI
- ✅ Search functionality - Built-in search across all docs
- ✅ CI/CD integration - Auto-deploy docs on commit
Implementation Steps¶
1.1 Install MkDocs and Theme¶
# Add to pyproject.toml
[project.optional-dependencies]
docs = [
"mkdocs>=1.5.0",
"mkdocs-material>=9.0.0", # Material theme (recommended)
"mkdocs-git-revision-date-localized-plugin",
"mkdocs-minify-plugin",
"pymdown-extensions",
]
# Install
uv pip install -e ".[docs]"
1.2 Initialize MkDocs Structure¶
# Create MkDocs project
mkdocs new .
# This creates:
# - mkdocs.yml (configuration)
# - docs/ (documentation source)
1.3 Configure mkdocs.yml¶
site_name: Uplink Documentation
site_url: https://docs.uplink.yourdomain.com
repo_url: https://github.com/SensationalSystems/uplink
repo_name: SensationalSystems/uplink
theme:
name: material
palette:
# Light mode
- scheme: default
primary: indigo
accent: indigo
toggle:
icon: material/brightness-7
name: Switch to dark mode
# Dark mode
- scheme: slate
primary: indigo
accent: indigo
toggle:
icon: material/brightness-4
name: Switch to light mode
features:
- navigation.tabs
- navigation.sections
- navigation.top
- search.suggest
- search.highlight
- content.code.copy
plugins:
- search
- git-revision-date-localized:
enable_creation_date: true
- minify:
minify_html: true
markdown_extensions:
- pymdownx.highlight:
anchor_linenums: true
- pymdownx.superfences
- pymdownx.tabbed:
alternate_style: true
- admonition
- pymdownx.details
- pymdownx.snippets
- attr_list
- md_in_html
nav:
- Home: index.md
- Getting Started:
- Installation: getting-started/installation.md
- Configuration: getting-started/configuration.md
- Development Setup: getting-started/development.md
- User Guide:
- Orders: user-guide/orders.md
- Catalogue: user-guide/catalogue.md
- Contacts: user-guide/contacts.md
- Services: user-guide/services.md
- API Reference:
- Overview: api/overview.md
- Orders API: api/orders.md
- Catalogue API: api/catalogue.md
- Authentication: api/authentication.md
- Deployment:
- Overview: deployment/overview.md
- Production Setup: deployment/production.md
- Monitoring: deployment/monitoring.md
- Development:
- Contributing: development/contributing.md
- Testing: development/testing.md
- Code Style: development/code-style.md
- Architecture:
- Overview: architecture/overview.md
- Database Schema: architecture/database.md
- Background Tasks: architecture/background-tasks.md
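Because broken nav entries only surface at build time, a small helper that walks the nav structure above and lists every referenced page makes a cheap pre-commit sanity check. The walker mirrors the mkdocs nav shape (a list of `{title: path-or-sublist}` mappings); the surrounding check script is hypothetical:

```python
def nav_paths(nav):
    """Flatten a mkdocs-style nav into the list of doc paths it references."""
    paths = []
    for item in nav:
        for value in item.values():
            if isinstance(value, str):
                paths.append(value)
            else:
                paths.extend(nav_paths(value))  # nested section
    return paths

# Usage sketch: compare against files on disk, e.g.
# missing = [p for p in nav_paths(nav) if not (Path('docs') / p).exists()]
```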
1.4 Migrate Existing Documentation¶
# Move existing docs to MkDocs structure
mkdir -p docs/deployment
mkdir -p docs/api
mkdir -p docs/development
mkdir -p docs/getting-started
# Migrate current docs (keep originals for now)
cp docs/DEPLOYMENT.md docs/deployment/overview.md
cp docs/API.md docs/api/overview.md
cp docs/TESTING.md docs/development/testing.md
cp docs/Services.md docs/architecture/background-tasks.md
# Create index.md
cat > docs/index.md << 'EOF'
# Uplink Documentation
Welcome to the Uplink documentation!
## What is Uplink?
Uplink is a comprehensive order management and fulfillment system...
## Quick Links
- [Installation Guide](getting-started/installation.md)
- [API Reference](api/overview.md)
- [Deployment Guide](deployment/overview.md)
## Getting Help
- Check the [FAQ](faq.md)
- Report issues on [GitHub](https://github.com/SensationalSystems/uplink/issues)
EOF
1.5 Local Development¶
# Serve docs locally
mkdocs serve
# Open http://127.0.0.1:8000 in browser
# Live reload on file changes
1.6 Build and Deploy¶
# Build static site
mkdocs build
# Output in site/ directory
# Deploy to hosting (GitHub Pages, Netlify, etc.)
# GitHub Pages deployment (automated)
mkdocs gh-deploy
1.7 CI/CD Integration¶
Create .github/workflows/docs.yml:
name: Deploy Documentation
on:
push:
branches:
- main
paths:
- 'docs/**'
- 'mkdocs.yml'
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: '3.12'
- name: Install dependencies
run: |
pip install mkdocs-material
pip install mkdocs-git-revision-date-localized-plugin
pip install mkdocs-minify-plugin
- name: Deploy docs
run: mkdocs gh-deploy --force
Completion Checklist¶
- [ ] MkDocs installed and configured
- [ ] Existing documentation migrated
- [ ] Documentation builds without errors
- [ ] Local development server works
- [ ] Auto-deployment configured (GitHub Pages/Netlify)
- [ ] Team can easily update documentation
- [ ] Search functionality working
Enhancement 2: django-formset for Modern Forms¶
Priority: Medium - Improves UX and developer experience
Timeline: 3-4 weeks (ongoing integration)
When to implement: After MkDocs setup + 2 weeks of stable documentation
Why django-formset?¶
- ✅ Modern UI - Clean, interactive form handling
- ✅ Better UX - Dynamic add/remove of formset items
- ✅ Client-side validation - Instant feedback for users
- ✅ Alpine.js integration - Lightweight JavaScript framework
- ✅ Accessibility - WCAG compliant
- ✅ Django integration - Works seamlessly with Django forms
Implementation Steps¶
2.1 Install django-formset¶
# Add to pyproject.toml
dependencies = [
"django~=5.0",
# ... other dependencies
"django-formset>=1.5",
]
# Install
uv pip install django-formset
2.2 Configure Django Settings¶
# In settings.py
INSTALLED_APPS = [
# ... existing apps
'formset',
]
# Optional: Configure static files
STATICFILES_DIRS = [
# ... existing directories
]
2.3 Load Static Assets¶
<!-- In base template -->
{% load formsetify %}
<!DOCTYPE html>
<html>
<head>
<!-- ... existing head content -->
{% render_formset_assets %}
</head>
<body>
<!-- ... content -->
</body>
</html>
2.4 Convert Existing Forms (Incremental Approach)¶
Start with high-traffic or complex forms:
Example: Service Item Form (High Complexity)
# Before (standard Django formset)
from django.forms import modelformset_factory
from services.models import ServiceItem
ServiceItemFormSet = modelformset_factory(
ServiceItem,
fields=['name', 'price', 'duration'],
extra=1,
can_delete=True
)
# After (with django-formset)
from django.forms import ModelForm
from formset.collection import FormCollection
from formset.renderers.bootstrap import FormRenderer
from services.models import ServiceItem
class ServiceItemForm(ModelForm):
class Meta:
model = ServiceItem
fields = ['name', 'price', 'duration']
default_renderer = FormRenderer(
field_css_classes='mb-3',
form_css_classes='row'
)
class ServiceItemCollection(FormCollection):
min_siblings = 0
max_siblings = 10
extra_siblings = 1
item = ServiceItemForm()
legend = "Service Items"
add_label = "Add Service Item"
default_renderer = FormRenderer()
Template Changes:
<!-- Before -->
<form method="post">
{% csrf_token %}
{{ formset.management_form }}
{% for form in formset %}
{{ form.as_p }}
{% endfor %}
<button type="submit">Save</button>
</form>
<!-- After -->
{% load formsetify %}
<django-formset endpoint="{{ request.path }}" csrf-token="{{ csrf_token }}">
{% render_form form_collection %}
<button type="button" df-click="submit">Save</button>
</django-formset>
2.5 Incremental Migration Plan¶
Phase 1: Pilot (Week 1-2) - choose 1-2 complex forms to convert:
- Service Item formset (already has auto-linking complexity)
- Product Instance formset
Phase 2: Expand (Week 3-4) - convert high-traffic forms:
- Order creation forms
- Contact management forms
Phase 3: Complete (Ongoing):
- Migrate remaining forms as time permits
- Document patterns for team
Phase 4: Cleanup (After all migrations):
- Remove old formset JavaScript
- Standardize on django-formset patterns
2.6 Testing Strategy¶
# Test formset validation
from django.test import TestCase
from services.forms import ServiceItemCollection  # assuming the collection from section 2.4 lives in services/forms.py
class ServiceItemCollectionTest(TestCase):
def test_valid_formset(self):
data = {
'item-0-name': 'Service A',
'item-0-price': '100.00',
'item-0-duration': '30',
}
collection = ServiceItemCollection(data=data)
self.assertTrue(collection.is_valid())
def test_min_siblings_validation(self):
# Test minimum items requirement
collection = ServiceItemCollection(data={})
self.assertFalse(collection.is_valid())
Benefits for Uplink¶
For Service Items:
- Dynamic add/remove without page reload
- Better validation feedback
- Cleaner UI for auto-linking product instances
For Orders:
- Easier bulk item entry
- Better error handling
- Improved mobile experience
For Product Catalogue:
- Dynamic variant management
- Better image upload UX
Completion Checklist¶
- [ ] django-formset installed and configured
- [ ] Pilot forms converted successfully
- [ ] Client-side validation working
- [ ] No regression in existing functionality
- [ ] User feedback positive
- [ ] Documentation updated with examples
- [ ] Team trained on new patterns
Enhancement 3: Additional Improvements (Future Considerations)¶
These can be considered after Enhancements 1 and 2 are stable:
3.1 django-htmx (Optional)¶
- Purpose: Partial page updates without full JavaScript framework
- When: If formset + Alpine.js isn't sufficient for interactivity needs
- Benefit: Faster page interactions, better UX
3.2 django-silk Performance Optimization¶
- Status: Already installed but currently disabled due to performance overhead
- Purpose: Profiling and request inspection
- Issue: Slows down the application significantly when enabled
- Solutions to explore:
- Configure sampling rate (only profile X% of requests)
- Enable only in staging environment
- Use alternative profiling: django-debug-toolbar or py-spy
- Consider django-query-inspector for SQL-only profiling
- When: After Phase 6 monitoring is stable
- Benefit: Identify slow queries and bottlenecks without production impact
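A minimal sketch of the sampling approach, assuming silk is toggled via environment variables. `SILK_ENABLED` and `SILK_SAMPLE_PERCENT` are our own names, not standard settings; `SILKY_INTERCEPT_PERCENT` is a real django-silk setting. The two lists are stand-ins for the project's existing settings:

```python
import os

# Stand-ins for the existing settings lists, illustrative only:
INSTALLED_APPS = ["django.contrib.admin"]
MIDDLEWARE = ["django.middleware.common.CommonMiddleware"]

# Opt-in switch: silk stays off unless SILK_ENABLED=true in the environment.
SILK_ENABLED = os.environ.get("SILK_ENABLED", "false").lower() == "true"

if SILK_ENABLED:
    INSTALLED_APPS += ["silk"]
    MIDDLEWARE.insert(0, "silk.middleware.SilkyMiddleware")
    # django-silk setting: profile only this percentage of requests.
    SILKY_INTERCEPT_PERCENT = int(os.environ.get("SILK_SAMPLE_PERCENT", "25"))
```

This keeps production overhead near zero by default while allowing a low-percentage profiling window to be opened without a code change.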
3.3 django-health-check¶
- Purpose: Automated health monitoring endpoints
- When: After systemd is stable
- Benefit: Better monitoring and alerting
3.4 Sentry Integration Enhancement¶
- Purpose: Improve the configuration of the existing Sentry integration
- When: Anytime after Phase 5
- Benefit: Better error tracking and performance monitoring
Post-Launch Checklist¶
Before Adding Any Enhancement:
- [ ] Core infrastructure stable for 2+ weeks
- [ ] No critical bugs in production
- [ ] All Phase 1-5 success criteria met
- [ ] Team capacity available
- [ ] Enhancement documented in plan
- [ ] Rollback strategy defined
After Adding Each Enhancement:
- [ ] Thoroughly tested in development
- [ ] Staged rollout (feature flag if applicable)
- [ ] Team trained on new features
- [ ] Documentation updated
- [ ] Monitoring for issues
- [ ] User feedback collected
Summary: Complete Uplink 2.0 Journey¶
The Path Forward¶
Current State (Ubuntu 18.04, Python 3.9, Django 3.2, Pipenv, CRON)
↓
Phase 1: Containerization with uv (Month 1)
↓
Phase 2A: Python 3.10 (Month 2, Week 1-2)
↓
Phase 2B: Django 4.2 + Library Updates (Month 2, Week 3-4)
↓
Phase 2C: Python 3.12 + Django 5.0 (Month 3, Week 1-2)
↓
Phase 3: nginx Reverse Proxy (Month 3, Week 3-4)
↓
Phase 4: systemd (Month 3, Week 3-4)
↓
Phase 5: Ubuntu 24.04 VM Migration (Month 4, Weekend)
↓
Stabilization Period (2-4 weeks)
↓
Enhancement 1: MkDocs (Month 5-6)
↓
Enhancement 2: django-formset (Month 6-7)
↓
Future Enhancements (Ongoing)
↓
Target State (Modern, Fast, Reliable, Well-Documented)
Appendix C: Testing Strategy¶
Goal: Establish comprehensive testing practices for reliability and confidence
Timeline: Ongoing, integrated into each phase
Current State: Minimal unit testing, mostly manual QA
Why Testing is Critical¶
Without tests, every upgrade and deployment is a gamble. Testing provides:
- Confidence to make changes without breaking things
- Documentation of how the system should behave
- Regression prevention: catch bugs before they reach production
- Faster debugging: isolate problems quickly
- Safe refactoring: improve code without fear
C.1 Testing Pyramid for Uplink¶
/\
/ \ E2E Tests (5%)
/____\ - Critical user journeys
/ \ - Full system integration
/ \
/__________\ Integration Tests (15%)
/ \ - API endpoints
/ \ - Database operations
/ \ - External service mocking
/ \
/____________________\ Unit Tests (80%)
- Models
- Utils
- Business logic
C.2 Test Types for Uplink¶
C.2.1 Unit Tests (Highest Priority)¶
What to test:
Models (catalogue/tests/test_models.py)
from django.core.exceptions import ValidationError
from django.test import TestCase
from catalogue.models import Product, ProductInstance
class ProductModelTest(TestCase):
def setUp(self):
self.product = Product.objects.create(
sku='TEST-001',
name='Test Product',
price=99.99
)
def test_product_creation(self):
"""Test product is created correctly"""
self.assertEqual(self.product.sku, 'TEST-001')
self.assertEqual(self.product.name, 'Test Product')
def test_product_str_representation(self):
"""Test string representation"""
self.assertEqual(str(self.product), 'TEST-001 - Test Product')
def test_product_price_validation(self):
"""Test price cannot be negative"""
# Model validation only runs via full_clean(); objects.create() would
# save without raising unless the model overrides save() to validate.
product = Product(
sku='TEST-002',
name='Invalid Product',
price=-10.00
)
with self.assertRaises(ValidationError):
product.full_clean()
def test_product_instance_linking(self):
"""Test product instances link correctly"""
instance = ProductInstance.objects.create(
product=self.product,
serial_number='SN123'
)
self.assertEqual(instance.product, self.product)
self.assertIn(instance, self.product.instances.all())
Utils/Business Logic (orders/tests/test_utils.py)
from django.test import TestCase
from orders.utils import calculate_shipping_cost, validate_address
from orders.models import Order
class ShippingCalculationTest(TestCase):
def test_domestic_shipping_cost(self):
"""Test UK domestic shipping calculation"""
cost = calculate_shipping_cost(
weight_kg=2.5,
country='GB',
service='standard'
)
self.assertEqual(cost, 5.99)
def test_international_shipping_cost(self):
"""Test international shipping calculation"""
cost = calculate_shipping_cost(
weight_kg=2.5,
country='US',
service='standard'
)
self.assertEqual(cost, 15.99)
def test_express_shipping_multiplier(self):
"""Test express shipping costs more"""
standard = calculate_shipping_cost(2.5, 'GB', 'standard')
express = calculate_shipping_cost(2.5, 'GB', 'express')
self.assertGreater(express, standard)
class AddressValidationTest(TestCase):
def test_valid_uk_address(self):
"""Test UK address validation"""
address = {
'line1': '123 Main St',
'city': 'London',
'postcode': 'SW1A 1AA',
'country': 'GB'
}
self.assertTrue(validate_address(address))
def test_invalid_postcode(self):
"""Test invalid postcode rejection"""
address = {
'line1': '123 Main St',
'city': 'London',
'postcode': 'INVALID',
'country': 'GB'
}
self.assertFalse(validate_address(address))
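The helpers these tests exercise are not shown in this plan. As a reference point, here is a hypothetical stdlib-only sketch consistent with the assertions above; the rates, the express multiplier, and the postcode pattern are all assumptions, and weight-based tiers are omitted:

```python
import re

# Hypothetical implementations consistent with the tests above. The real
# orders/utils.py may differ: rates and the postcode regex are assumptions.
RATES = {("GB", "standard"): 5.99, ("US", "standard"): 15.99}
EXPRESS_MULTIPLIER = 1.5
UK_POSTCODE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$")

def calculate_shipping_cost(weight_kg, country, service):
    # Look up the standard rate for the destination (defaulting to the
    # international rate), then apply the express surcharge if requested.
    base = RATES.get((country, "standard"), 15.99)
    if service == "express":
        base = round(base * EXPRESS_MULTIPLIER, 2)
    return base

def validate_address(address):
    # All four fields must be present and non-empty.
    required = ("line1", "city", "postcode", "country")
    if not all(address.get(key) for key in required):
        return False
    # UK postcodes get a format check; other countries pass through.
    if address["country"] == "GB":
        return bool(UK_POSTCODE.match(address["postcode"].upper()))
    return True
```

Writing the sketch next to the tests makes the contract explicit: the tests pin behaviour (GB standard is cheaper than US, express costs more), not implementation details.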
Form Validation (contacts/tests/test_forms.py)
from django.test import TestCase
from contacts.forms import ContactForm
class ContactFormTest(TestCase):
def test_valid_contact_form(self):
"""Test form accepts valid data"""
form = ContactForm(data={
'name': 'John Doe',
'email': 'john@example.com',
'phone': '+44 1234 567890'
})
self.assertTrue(form.is_valid())
def test_invalid_email(self):
"""Test form rejects invalid email"""
form = ContactForm(data={
'name': 'John Doe',
'email': 'invalid-email',
'phone': '+44 1234 567890'
})
self.assertFalse(form.is_valid())
self.assertIn('email', form.errors)
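When several inputs exercise the same rule, unittest's `subTest` reports every failing value instead of stopping at the first, which keeps table-driven validation tests cheap to extend. A self-contained sketch (`looks_like_email` is a toy stand-in for Django's EmailField validation, not project code):

```python
import re
import unittest

# Toy stand-in for Django's EmailField validation, illustrative only.
def looks_like_email(value):
    return bool(re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", value or ""))

class EmailRuleTest(unittest.TestCase):
    def test_rejects_invalid_emails(self):
        # subTest reports every failing value instead of stopping at the first.
        for value in ["invalid-email", "a@", "@b.com", "a@b", ""]:
            with self.subTest(email=value):
                self.assertFalse(looks_like_email(value))

    def test_accepts_valid_email(self):
        self.assertTrue(looks_like_email("john@example.com"))
```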
C.2.2 Integration Tests¶
API Endpoints (orders/tests/test_api.py)
from rest_framework.test import APITestCase
from rest_framework import status
from django.contrib.auth.models import User
from orders.models import Order
class OrderAPITest(APITestCase):
def setUp(self):
self.user = User.objects.create_user(
username='testuser',
password='testpass123'
)
self.client.force_authenticate(user=self.user)
def test_create_order(self):
"""Test order creation via API"""
data = {
'customer': 'John Doe',
'items': [
{'sku': 'TEST-001', 'quantity': 2}
]
}
response = self.client.post('/api/orders/', data, format='json')
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
self.assertEqual(Order.objects.count(), 1)
def test_list_orders(self):
"""Test order listing"""
Order.objects.create(customer='Test Customer')
response = self.client.get('/api/orders/')
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(len(response.data), 1)
def test_unauthorized_access(self):
"""Test API requires authentication"""
self.client.force_authenticate(user=None)
response = self.client.get('/api/orders/')
self.assertEqual(response.status_code, status.HTTP_401_UNAUTHORIZED)
Database Operations (catalogue/tests/test_queries.py)
from django.test import TestCase
from catalogue.models import Product, Stock
from django.db.models import Q
class StockQueryTest(TestCase):
def setUp(self):
# Create test data
self.product1 = Product.objects.create(sku='P1', name='Product 1')
self.product2 = Product.objects.create(sku='P2', name='Product 2')
Stock.objects.create(product=self.product1, location='UK', quantity=10)
Stock.objects.create(product=self.product1, location='NL', quantity=5)
Stock.objects.create(product=self.product2, location='UK', quantity=0)
def test_available_stock_query(self):
"""Test query for products with stock"""
available = Product.objects.filter(
stock__quantity__gt=0
).distinct()
self.assertEqual(available.count(), 2)
def test_out_of_stock_query(self):
"""Test query for out-of-stock products"""
out_of_stock = Product.objects.filter(
Q(stock__quantity=0) | Q(stock__isnull=True)
).distinct()
self.assertIn(self.product2, out_of_stock)
Background Tasks (orders/tests/test_tasks.py)
from django.test import TestCase
from unittest.mock import patch, MagicMock
from orders.tasks import send_order_confirmation_email
from orders.models import Order
class TaskTest(TestCase):
@patch('orders.tasks.send_email')
def test_order_confirmation_email(self, mock_send_email):
"""Test order confirmation email task"""
order = Order.objects.create(
customer_email='test@example.com',
total=99.99
)
send_order_confirmation_email(order.id)
mock_send_email.assert_called_once()
call_args = mock_send_email.call_args
self.assertIn('test@example.com', call_args[0])
C.2.3 End-to-End Tests¶
Critical User Journeys (e2e/tests/test_order_flow.py)
from django.test import LiveServerTestCase
from selenium import webdriver
from selenium.webdriver.common.by import By
class OrderFlowE2ETest(LiveServerTestCase):
@classmethod
def setUpClass(cls):
super().setUpClass()
cls.selenium = webdriver.Chrome()
cls.selenium.implicitly_wait(10)
@classmethod
def tearDownClass(cls):
cls.selenium.quit()
super().tearDownClass()
def test_complete_order_flow(self):
"""Test user can browse products and place order"""
# Navigate to product listing
self.selenium.get(f'{self.live_server_url}/products/')
# Find and click a product
product = self.selenium.find_element(By.CLASS_NAME, 'product-card')
product.click()
# Add to cart
add_to_cart = self.selenium.find_element(By.ID, 'add-to-cart')
add_to_cart.click()
# Go to checkout
checkout_btn = self.selenium.find_element(By.ID, 'checkout')
checkout_btn.click()
# Fill in shipping details
self.selenium.find_element(By.NAME, 'name').send_keys('Test User')
self.selenium.find_element(By.NAME, 'email').send_keys('test@example.com')
# Submit order
submit = self.selenium.find_element(By.ID, 'submit-order')
submit.click()
# Verify success
success_msg = self.selenium.find_element(By.CLASS_NAME, 'success')
self.assertIn('Order placed successfully', success_msg.text)
C.3 Testing External Services (Mocking)¶
FedEx API (orders/tests/test_fedex_integration.py)
from django.test import TestCase
from unittest.mock import patch, Mock
from orders.services import FedExService, ShipmentError
class FedExServiceTest(TestCase):
@patch('orders.services.requests.post')
def test_create_shipment(self, mock_post):
"""Test FedEx shipment creation with mocked API"""
# Mock successful response
mock_response = Mock()
mock_response.status_code = 200
mock_response.json.return_value = {
'tracking_number': 'FEDEX123456',
'label_url': 'https://example.com/label.pdf'
}
mock_post.return_value = mock_response
service = FedExService()
result = service.create_shipment(
weight=2.5,
destination='UK'
)
self.assertEqual(result['tracking_number'], 'FEDEX123456')
self.assertIsNotNone(result['label_url'])
@patch('orders.services.requests.post')
def test_create_shipment_api_error(self, mock_post):
"""Test handling of FedEx API errors"""
mock_response = Mock()
mock_response.status_code = 400
mock_response.json.return_value = {'error': 'Invalid address'}
mock_post.return_value = mock_response
service = FedExService()
with self.assertRaises(ShipmentError):
service.create_shipment(weight=2.5, destination='INVALID')
C.4 Test Organization¶
uplink/
├── catalogue/
│ ├── tests/
│ │ ├── __init__.py
│ │ ├── test_models.py
│ │ ├── test_views.py
│ │ ├── test_api.py
│ │ └── test_utils.py
├── orders/
│ ├── tests/
│ │ ├── __init__.py
│ │ ├── test_models.py
│ │ ├── test_tasks.py
│ │ ├── test_fedex.py
│ │ └── test_order_flow.py
├── contacts/
│ ├── tests/
│ │ ├── __init__.py
│ │ ├── test_forms.py
│ │ └── test_models.py
└── e2e/
└── tests/
└── test_critical_flows.py
C.5 Test Configuration¶
conftest.py (pytest fixtures)
import pytest
from django.contrib.auth.models import User
from catalogue.models import Product
@pytest.fixture
def api_client():
"""Authenticated API client"""
from rest_framework.test import APIClient
client = APIClient()
user = User.objects.create_user('testuser', password='testpass')
client.force_authenticate(user=user)
return client
@pytest.fixture
def sample_product():
"""Sample product for tests"""
return Product.objects.create(
sku='TEST-001',
name='Test Product',
price=99.99
)
pytest.ini
[pytest]
DJANGO_SETTINGS_MODULE = uplink.settings
python_files = test_*.py
python_classes = Test*
python_functions = test_*
markers =
    unit: fast, isolated unit tests (no database, no network)
    integration: tests that hit the database or mocked external services
addopts =
    --reuse-db
    --nomigrations
    --cov=catalogue
    --cov=orders
    --cov=contacts
    --cov-report=html
    --cov-report=term-missing
C.6 Running Tests¶
# Run all tests
python manage.py test
# Run specific app tests
python manage.py test catalogue
# Run with pytest (faster, better output)
pytest
# Run with coverage
pytest --cov=. --cov-report=html
# Run only unit tests
pytest -m unit
# Run only integration tests
pytest -m integration
# Run specific test
pytest catalogue/tests/test_models.py::ProductModelTest::test_product_creation
C.7 CI/CD Integration¶
Tests should run automatically on every commit (see Appendix A):
# .github/workflows/test.yml
- name: Run tests
run: |
pytest --cov=. --cov-report=xml
- name: Upload coverage
uses: codecov/codecov-action@v3
C.8 Test Coverage Goals¶
Phase-wise Coverage Targets:
| Phase | Coverage Target | Priority Areas |
|---|---|---|
| Phase 1 (uv) | 20% | Critical business logic |
| Phase 2A | 40% | Models, utils |
| Phase 2B | 60% | + API endpoints |
| Phase 2C | 75% | + Forms, views |
| Phase 6 | 85%+ | All critical paths |
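These targets can be enforced in CI rather than tracked by hand. A sketch that reads the Cobertura-style `coverage.xml` produced by pytest-cov's `--cov-report=xml`, whose root `<coverage>` element carries a `line-rate` attribute between 0.0 and 1.0:

```python
import xml.etree.ElementTree as ET

# Sketch of a CI coverage gate for the phase targets above. Assumes the
# Cobertura-style coverage.xml that pytest-cov's --cov-report=xml emits.
def coverage_meets_target(xml_path, target_percent):
    root = ET.parse(xml_path).getroot()
    line_rate = float(root.attrib["line-rate"])
    return line_rate * 100 >= target_percent
```

In practice pytest-cov can enforce a fixed floor directly with `--cov-fail-under=NN`; the script form is useful when the target changes per phase or per branch.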
C.9 What to Test First (Priority Order)¶
1. Critical Business Logic (Highest Priority)
   - Order creation and processing
   - Stock management
   - Shipping calculations
   - Payment processing
2. Models & Database
   - Product creation
   - Customer management
   - Order lifecycle
3. API Endpoints
   - Authentication
   - CRUD operations
   - Error handling
4. External Integrations
   - FedEx API (mocked)
   - PrestaShop sync (mocked)
   - Payment gateways (mocked)
5. User-Facing Features
   - Form validation
   - Search functionality
   - Filtering
6. Background Tasks
   - Email sending
   - Data synchronization
   - Scheduled jobs
C.10 Testing Tools & Libraries¶
# Add to pyproject.toml
[project.optional-dependencies]
test = [
"pytest>=7.4.0",
"pytest-django>=4.5.0",
"pytest-cov>=4.1.0",
"pytest-mock>=3.11.0",
"factory-boy>=3.3.0", # Test data factories
"faker>=19.0.0", # Fake data generation
"responses>=0.23.0", # Mock HTTP requests
"freezegun>=1.2.0", # Mock datetime
"django-test-plus>=2.2.0", # Enhanced Django testing
]
dev = [
# ... existing dev dependencies
"coverage[toml]>=7.0",
"pytest-xdist>=3.3.0", # Parallel test execution
]
C.11 Xero¶
Integrate Uplink with Xero, connecting orders and invoices to their corresponding Xero records.
C.12 Implementation Timeline¶
Weeks 1-2 (During Phase 1):
- [ ] Set up pytest configuration
- [ ] Write first 10 unit tests (critical models)
- [ ] Set up coverage reporting
Weeks 3-4 (During Phase 2A):
- [ ] Add 20 more unit tests (utils, business logic)
- [ ] Achieve 20% coverage
- [ ] Integrate tests into CI/CD
Month 2 (During Phase 2B):
- [ ] Add API integration tests
- [ ] Add database query tests
- [ ] Achieve 40% coverage
Month 3 (During Phases 2C-3):
- [ ] Add form validation tests
- [ ] Add view tests
- [ ] Achieve 60% coverage
Months 4-5 (During Phases 4-5):
- [ ] Add E2E tests for critical flows
- [ ] Add external service mocking
- [ ] Achieve 75% coverage
Month 6+ (Ongoing):
- [ ] Maintain and improve coverage
- [ ] Add regression tests for bugs
- [ ] Target 85%+ coverage
Checklist¶
- [ ] pytest configured and running
- [ ] At least 75% code coverage
- [ ] All critical business logic tested
- [ ] All API endpoints tested
- [ ] Tests run in CI/CD on every PR
- [ ] Tests complete in <5 minutes
- [ ] Team trained on writing tests
- [ ] Test documentation available
- [ ] No deployment without passing tests
Appendix D: Deployment Procedures¶
D.1 Overview¶
This appendix provides rigorous, step-by-step deployment procedures for both development and production environments using Docker.
Deployment Principles:
- 🔒 Never deploy directly to production - Always test in dev first
- 📝 Document everything - Keep deployment logs
- 🔄 Always have a rollback plan - Know how to revert quickly
- ⏱️ Time deployments appropriately - Production deploys during low-traffic periods
- ✅ Verify after deployment - Run health checks and smoke tests
D.2 Development Environment Deployment¶
Purpose: Test Docker setup, uv dependencies, and application changes locally before production
Prerequisites:
- [ ] Docker and Docker Compose installed
- [ ] .env file configured for development
- [ ] Git repository up to date on feature branch
D.2.1 Initial Docker Setup (One-Time)¶
# 1. Navigate to project directory
cd /home/hannah/CODING/uplink
# 2. Create .env file from template
cp .env.example .env
# 3. Edit .env for local development
nano .env
# Set:
# DEBUG=True
# DATABASE_HOST=db
# REDIS_URL=redis://redis:6379/0
# 4. Build Docker images
docker-compose build
# 5. Start database and redis first (for initialization)
docker-compose up -d db redis
# 6. Wait for services to be healthy
docker-compose ps
# 7. Run initial migrations
docker-compose run --rm web python manage.py migrate
# 8. Create superuser
docker-compose run --rm web python manage.py createsuperuser
# 9. Start all services
docker-compose up -d
# 10. Verify all services are running
docker-compose ps
D.2.2 Regular Development Deployment¶
When to use: After pulling new code, changing dependencies, or switching branches
# 1. Stop running containers
docker-compose down
# 2. Pull latest code
git pull origin <branch-name>
# 3. Rebuild images if dependencies changed
docker-compose build
# 4. Start services
docker-compose up -d
# 5. Run migrations
docker-compose exec web python manage.py migrate
# 6. Collect static files (if changed)
docker-compose exec web python manage.py collectstatic --noinput
# 7. Check logs for errors
docker-compose logs -f web
# 8. Test application
# Visit http://localhost:8000
# Run smoke tests
docker-compose exec web python manage.py check
D.2.3 Development Smoke Tests¶
Run these after every deployment:
# 1. Django system check
docker-compose exec web python manage.py check --deploy
# 2. Database connectivity
docker-compose exec web python manage.py shell -c "from django.db import connection; connection.ensure_connection()"
# 3. Redis connectivity
docker-compose exec redis redis-cli ping
# 4. Run critical tests
docker-compose exec web python manage.py test catalogue.tests.test_critical
# 5. Check all services are healthy
docker-compose ps
Expected output:
NAME IMAGE STATUS PORTS
uplink_web uplink:latest Up (healthy) 0.0.0.0:8000->8000/tcp
uplink_db mysql:8.0 Up (healthy) 0.0.0.0:3306->3306/tcp
uplink_redis redis:7-alpine Up (healthy) 0.0.0.0:6379->6379/tcp
uplink_huey uplink:latest Up
uplink_daphne uplink:latest Up 0.0.0.0:9000->9000/tcp
D.3 Production Environment Deployment¶
Purpose: Deploy tested changes to production with minimal downtime
Prerequisites:
- [ ] Changes fully tested in development
- [ ] All tests passing
- [ ] Database backup completed
- [ ] Rollback plan documented
- [ ] Deployment window scheduled (if needed)
- [ ] Team notified of deployment
D.3.1 Pre-Deployment Checklist¶
24 Hours Before:
- [ ] Merge feature branch to main
- [ ] Tag release in git: git tag v2.0.X
- [ ] Review all changes since last deployment
- [ ] Identify any risky changes or database migrations
- [ ] Schedule deployment window (if major changes)
- [ ] Notify users of potential downtime (if applicable)
1 Hour Before:
- [ ] Backup production database
- [ ] Backup media files
- [ ] Check server disk space
- [ ] Verify no other deployments in progress
- [ ] Have rollback commands ready
D.3.2 Production Deployment Steps¶
Step 1: Pre-Deployment Backup
# SSH to production server
ssh user@uplink.sensational.systems
# Navigate to project directory
cd /opt/uplink
# Create backup directory
mkdir -p ~/backups/uplink-$(date +%Y%m%d-%H%M%S)
BACKUP_DIR=~/backups/uplink-$(date +%Y%m%d-%H%M%S)
# Backup database
docker-compose exec -T db mysqldump -u root -p$DB_ROOT_PASSWORD uplink > $BACKUP_DIR/database.sql
# Backup media files
tar -czf $BACKUP_DIR/media.tar.gz media/
# Backup current .env
cp .env $BACKUP_DIR/.env.backup
# Note current git commit for rollback
git rev-parse HEAD > $BACKUP_DIR/git_commit.txt
echo "Backup completed in $BACKUP_DIR"
Step 2: Pull Latest Code
# Fetch latest changes
git fetch origin
# Checkout specific tag or main branch
git checkout v2.0.X
# OR
git pull origin main
# Verify correct version
git log -1 --oneline
Step 3: Update Environment Variables (if needed)
# Compare .env with .env.example for new variables
diff .env .env.example
# Add any new required environment variables
nano .env
Step 4: Build New Docker Images
# Pull base images
docker-compose pull
# Build new application image
docker-compose -f docker-compose.yml -f docker-compose.prod.yml build
# Verify images built successfully
docker images | grep uplink
Step 5: Database Migrations (Dry Run First)
# Check for pending migrations
docker-compose exec web python manage.py showmigrations | grep "\\[ \\]"
# Dry run migrations (check for issues)
docker-compose exec web python manage.py migrate --plan
# If migrations look good, proceed
# If migrations look risky, review with team first
Step 6: Deploy with Minimal Downtime
# Restart services with new code
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d
# Wait for containers to be healthy
watch -n 1 'docker-compose ps'
# Run migrations
docker-compose exec web python manage.py migrate --noinput
# Collect static files
docker-compose exec web python manage.py collectstatic --noinput --clear
# Restart all services to pick up changes
docker-compose -f docker-compose.yml -f docker-compose.prod.yml restart
Step 7: Post-Deployment Verification
# 1. Check all containers are running
docker-compose ps
# 2. Check application logs for errors
docker-compose logs --tail=100 web
# 3. Run Django system check
docker-compose exec web python manage.py check --deploy
# 4. Test critical endpoints
curl -I https://uplink.sensational.systems/
curl -I https://uplink.sensational.systems/admin/
curl -I https://uplink.sensational.systems/api/
# 5. Check database connectivity
docker-compose exec web python manage.py shell -c "from catalogue.models import Product; print(Product.objects.count())"
# 6. Verify background workers
docker-compose logs --tail=50 huey
# 7. Check WebSocket connections
docker-compose logs --tail=50 daphne
D.3.3 Production Smoke Test Script¶
Create smoke_test.sh:
#!/bin/bash
set -e
echo "=== Production Smoke Tests ==="
# Test homepage
echo "Testing homepage..."
STATUS=$(curl -s -o /dev/null -w "%{http_code}" https://uplink.sensational.systems/)
if [ $STATUS -ne 200 ]; then
echo "❌ Homepage failed: HTTP $STATUS"
exit 1
fi
echo "✓ Homepage OK"
# Test admin
echo "Testing admin..."
STATUS=$(curl -s -L -o /dev/null -w "%{http_code}" https://uplink.sensational.systems/admin/)  # -L: /admin/ redirects to the login page
if [ $STATUS -ne 200 ]; then
echo "❌ Admin failed: HTTP $STATUS"
exit 1
fi
echo "✓ Admin OK"
# Test API
echo "Testing API..."
STATUS=$(curl -s -o /dev/null -w "%{http_code}" https://uplink.sensational.systems/api/)
if [ $STATUS -ne 200 ]; then
echo "❌ API failed: HTTP $STATUS"
exit 1
fi
echo "✓ API OK"
# Check database
echo "Testing database..."
docker-compose exec -T web python manage.py shell -c "from django.db import connection; connection.ensure_connection()" > /dev/null
echo "✓ Database OK"
# Check Redis
echo "Testing Redis..."
docker-compose exec -T redis redis-cli ping > /dev/null
echo "✓ Redis OK"
echo ""
echo "=== All Smoke Tests Passed ✓ ==="
D.3.4 Production Rollback Procedure¶
# Find backup directory
ls -lt ~/backups/ | head -5
BACKUP_DIR=~/backups/uplink-YYYYMMDD-HHMMSS
# Stop current containers
docker-compose down
# Restore previous git commit
PREV_COMMIT=$(cat $BACKUP_DIR/git_commit.txt)
git checkout $PREV_COMMIT
# Restore .env if changed
cp $BACKUP_DIR/.env.backup .env
# Restore database (if migrations were run)
cat $BACKUP_DIR/database.sql | docker-compose exec -T db mysql -u root -p$DB_ROOT_PASSWORD uplink
# Rebuild and restart containers
docker-compose -f docker-compose.yml -f docker-compose.prod.yml build
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d
# Verify rollback
./smoke_test.sh
D.4 Deployment Helper Script¶
Create deploy-prod.sh:
#!/bin/bash
set -e
echo "=== Uplink Production Deployment ==="
# Check if running on production server
if [ "$(hostname)" != "uplink.sensational.systems" ]; then
echo "❌ Error: This script must run on production server"
exit 1
fi
# Prompt for confirmation
read -p "Deploy to PRODUCTION? (yes/no): " CONFIRM
if [ "$CONFIRM" != "yes" ]; then
echo "Deployment cancelled"
exit 0
fi
# Prompt for git tag
read -p "Enter git tag to deploy (e.g., v2.0.1): " GIT_TAG
echo ""
echo "Starting deployment of $GIT_TAG..."
# Create backup
echo "1. Creating backup..."
BACKUP_DIR=~/backups/uplink-$(date +%Y%m%d-%H%M%S)
mkdir -p $BACKUP_DIR
docker-compose exec -T db mysqldump -u root -p$DB_ROOT_PASSWORD uplink > $BACKUP_DIR/database.sql
tar -czf $BACKUP_DIR/media.tar.gz media/
cp .env $BACKUP_DIR/.env.backup
git rev-parse HEAD > $BACKUP_DIR/git_commit.txt
echo "✓ Backup saved to: $BACKUP_DIR"
# Pull code
echo "2. Pulling code..."
git fetch origin
git checkout $GIT_TAG
echo "✓ Checked out $GIT_TAG"
# Build images
echo "3. Building Docker images..."
docker-compose -f docker-compose.yml -f docker-compose.prod.yml build
echo "✓ Images built"
# Deploy
echo "4. Deploying containers..."
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d
echo "✓ Containers started"
# Migrations
echo "5. Running migrations..."
docker-compose exec web python manage.py migrate --noinput
echo "✓ Migrations complete"
# Static files
echo "6. Collecting static files..."
docker-compose exec web python manage.py collectstatic --noinput --clear
echo "✓ Static files collected"
# Health check
echo "7. Running health checks..."
sleep 10
docker-compose ps
docker-compose exec web python manage.py check --deploy
echo "✓ Health checks passed"
# Smoke tests
echo "8. Running smoke tests..."
./smoke_test.sh
echo "✓ Smoke tests passed"
echo ""
echo "=== Deployment Complete ✓ ==="
echo "Backup location: $BACKUP_DIR"
echo "Git tag deployed: $GIT_TAG"
echo "Deployment time: $(date)"
Make it executable:
chmod +x deploy-prod.sh
D.5 Deployment Schedule¶
Development:
- Frequency: As needed (multiple times per day)
- Process: Automated via git pull + docker-compose
- Testing: Immediate smoke tests
Production:
- Frequency: Weekly or bi-weekly
- Timing: Tuesday/Wednesday 2-4 PM (low traffic)
- Process: Full deployment procedure with logging
- Testing: Full smoke test suite + 24-hour monitoring
Emergency Production:
- Frequency: As needed for critical fixes
- Timing: Immediately (with team notification)
- Process: Expedited deployment + immediate rollback plan
- Testing: Critical path smoke tests only
D.6 Deployment Metrics to Track¶
- Deployment frequency
- Deployment duration
- Success rate
- Rollback frequency
- Time to rollback
- Downtime during deployment
- Issues found post-deployment
- Mean time to recovery (MTTR)
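A low-effort way to start collecting these metrics is a JSON-lines log appended at the end of each deployment. The path and field names below are our own convention, not an existing Uplink file:

```python
import datetime
import json

# Sketch: append one JSON line per deployment so the metrics above
# (frequency, duration, success rate, rollback rate) can be computed later.
def record_deployment(path, tag, duration_s, success, rolled_back=False):
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "tag": tag,
        "duration_s": duration_s,
        "success": success,
        "rolled_back": rolled_back,
    }
    with open(path, "a") as handle:
        handle.write(json.dumps(entry) + "\n")
```

deploy-prod.sh could invoke this as its final step; success rate, rollback frequency, and MTTR then fall out of a short script over the log file instead of manual bookkeeping.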