DEV Community

cycy

From Broken to Bulletproof: Fixing a Django Docker Deployment with ECR, SSH, and Missing Migrations

A complete guide to debugging and resolving common Django deployment issues in a containerized environment

TL;DR

I spent six hours debugging a broken Django backend deployment. The issues included missing migrations, SSH authentication problems, a database authentication mismatch, missing dependencies, and outdated ECR images. This post documents the complete solution, with all the commands used.


🚨 The Disaster Scenario

Picture this: You're tasked with fixing a "broken backend deployment" and the error logs look like hieroglyphics:

backend-1 | django.db.migrations.exceptions.NodeNotFoundError: 
          | Migration accounts.0002_initial dependencies reference 
          | nonexistent parent node ('company', '0001_initial')

Sound familiar? Here's how I solved this multi-layered deployment nightmare.

πŸ” Problem Discovery Phase

The Initial Error

docker-compose logs backend
# Output: "Dependency on app with no migrations: accounts"

Root Cause Analysis:

  • Django couldn't start because migration files were missing
  • The accounts app had models but no migration files
  • This is a common issue when migrations aren't committed to version control

πŸ—‚οΈ Issue #1: Missing Django Migrations

The Investigation

# Check what migrations exist
ls accounts/migrations/
# Output: Only __init__.py (empty directory)

# Check the models
cat accounts/models.py
# Output: Complex CustomUser model with relationships
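
The manual check above (an app with a models.py but an empty migrations directory) can be automated before deploying. This is a minimal sketch, assuming each app is a direct subdirectory of the project root; the function name apps_missing_migrations is hypothetical, and an app whose models.py defines no models would show up as a false positive.

```python
from pathlib import Path


def apps_missing_migrations(project_root):
    """Return names of app directories that have models.py but no migration files."""
    missing = []
    for models_file in sorted(Path(project_root).glob("*/models.py")):
        app_dir = models_file.parent
        migrations_dir = app_dir / "migrations"
        # A real migration is any .py file in migrations/ other than __init__.py.
        has_migrations = migrations_dir.is_dir() and any(
            path.name != "__init__.py" for path in migrations_dir.glob("*.py")
        )
        if not has_migrations:
            missing.append(app_dir.name)
    return missing
```

Calling apps_missing_migrations(".") from the project root in a CI step and failing the build on a non-empty result would catch this class of problem before an image is ever built.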

The Solution: Proper Migration Generation

❌ Wrong Way (Don't do this):

# Generating migrations on server (bad practice)
docker-compose exec backend python manage.py makemigrations

✅ Right Way:

  1. Generate migrations locally or in development
  2. Commit to version control
  3. Deploy via proper CI/CD pipeline

The Migration Files Created:

# accounts/migrations/0001_initial.py
# Generated by Django 4.2 on 2025-08-29 12:05

import uuid

from django.db import migrations, models


class Migration(migrations.Migration):
    initial = True
    dependencies = []

    operations = [
        migrations.CreateModel(
            name='CustomUser',
            fields=[
                ('password', models.CharField(max_length=128)),
                ('id', models.UUIDField(default=uuid.uuid4, primary_key=True)),
                ('email', models.EmailField(unique=True)),
                # ... more fields
            ],
        ),
        # ... more models
    ]

Pro Tip: Migration Dependencies

# Always check migration dependencies
python manage.py showmigrations
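
The NodeNotFoundError from the start of this post means a migration lists a parent that does not exist on disk. The sketch below looks for that situation without importing Django, by reading each migration's dependencies list with the standard ast module. It assumes app labels match directory names under the project root, skips non-literal entries such as swappable dependencies, and ignores parents that live outside the project (for example Django's own auth app). Both function names are hypothetical.

```python
import ast
from pathlib import Path


def literal_dependencies(migration_file):
    """Extract literal (app_label, migration_name) pairs from a migration file."""
    tree = ast.parse(Path(migration_file).read_text())
    deps = []
    for node in ast.walk(tree):
        is_dependencies = isinstance(node, ast.Assign) and any(
            isinstance(target, ast.Name) and target.id == "dependencies"
            for target in node.targets
        )
        if not is_dependencies or not isinstance(node.value, (ast.List, ast.Tuple)):
            continue
        for element in node.value.elts:
            try:
                value = ast.literal_eval(element)
            except (ValueError, TypeError):
                # Not a plain literal (e.g. a function call), so skip it.
                continue
            if isinstance(value, tuple) and len(value) == 2:
                deps.append(value)
    return deps


def dangling_dependencies(project_root):
    """Return (migration path, dependency) pairs whose parent file is missing."""
    root = Path(project_root)
    problems = []
    for migration in sorted(root.glob("*/migrations/[0-9]*.py")):
        for app_label, name in literal_dependencies(migration):
            app_dir = root / app_label
            # Only judge apps that live in this project.
            if app_dir.is_dir() and not (app_dir / "migrations" / f"{name}.py").exists():
                problems.append((migration.relative_to(root).as_posix(), (app_label, name)))
    return problems
```

For the error shown earlier, this would report accounts/migrations/0002_initial.py together with ('company', '0001_initial') as long as the company app directory exists but its 0001_initial.py was never committed.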

πŸ” Issue #2: SSH Authentication Nightmare

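This was the most confusing part. The repository's remote was configured for HTTPS, but GitHub removed password authentication for Git operations over HTTPS in August 2021, so every pull stopped at the username prompt. HTTPS still works with a personal access token; here I switched the remote to SSH instead.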

The SSH Errors

git pull origin develop
# Error: Username for 'https://github.com': 
# Error: remote: Support for password authentication was removed

ssh -T git@github.com
# Error: Host key verification failed

The SSH Solution Step-by-Step

Step 1: Check existing SSH keys

ls -la ~/.ssh/
# Found: id_rsa, id_rsa.pub (keys already existed)

Step 2: Add GitHub to known hosts

# Compare the scanned key with GitHub's published SSH fingerprints before trusting it
ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts

Step 3: Test SSH connection

ssh -T git@github.com
# Success: "You've successfully authenticated, but GitHub does not provide shell access."

Step 4: Switch repository from HTTPS to SSH

# Check current remote
git remote -v
# Output: origin https://github.com/user/repo.git

# Switch to SSH
git remote set-url origin git@github.com:user/repo.git

# Verify change
git remote -v
# Output: origin git@github.com:user/repo.git
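
If several repositories or servers need the same HTTPS-to-SSH switch, the URL rewrite itself is easy to script. This is a small sketch of the mapping used above; the helper name https_to_ssh is hypothetical, and it only handles plain https://host/owner/repo remotes.

```python
from urllib.parse import urlparse


def https_to_ssh(remote_url):
    """Convert https://host/owner/repo(.git) into git@host:owner/repo.git."""
    parsed = urlparse(remote_url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"not an HTTPS remote: {remote_url!r}")
    path = parsed.path.strip("/")
    if not path.endswith(".git"):
        path += ".git"
    return f"git@{parsed.hostname}:{path}"
```

The returned string is what you would pass to git remote set-url origin.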

SSH Debugging Commands

# Test SSH connection with verbose output
ssh -vT git@github.com

# Check SSH key fingerprint
ssh-keygen -lf ~/.ssh/id_rsa.pub
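
The fingerprint printed by ssh-keygen -lf is the unpadded base64 of a SHA-256 hash over the decoded key blob. The sketch below reproduces that calculation so a key line (from a .pub file or from ssh-keyscan output) can be compared with a fingerprint published by the host. The function name sha256_fingerprint is hypothetical, and the code relies on the fact that base64-encoded SSH key blobs start with "AAAA".

```python
import base64
import hashlib


def sha256_fingerprint(key_line):
    """Compute the OpenSSH-style SHA256 fingerprint for a public key line.

    Accepts "<type> <blob> [comment]" (a .pub file) as well as
    "<host> <type> <blob>" (ssh-keyscan / known_hosts format).
    """
    # The key blob is the first field that looks like base64 key data.
    blob_b64 = next(field for field in key_line.split() if field.startswith("AAAA"))
    digest = hashlib.sha256(base64.b64decode(blob_b64)).digest()
    # OpenSSH prints the digest as base64 without the trailing "=" padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")
```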

🐳 Issue #3: The ECR Image Time Warp

This was the trickiest issue to understand. The image stored in ECR (Amazon Elastic Container Registry) was outdated and didn't include our fixes.

The Problem Visualization

Timeline:
August 21: ECR image built (missing migrations, missing cryptography)
    ↓
Today: Code fixed in GitHub (migrations added, cryptography added)
    ↓  
Problem: ECR still serving August 21 image!
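
The timeline boils down to one comparison: is the newest commit more recent than the image in the registry? Here is a rough sketch of that check, assuming both timestamps are available as ISO 8601 strings with a UTC offset (for example from git log -1 --format=%cI and the imagePushedAt field of aws ecr describe-images, whose exact output format depends on your CLI configuration). The function name image_is_stale is hypothetical.

```python
from datetime import datetime


def image_is_stale(image_pushed_at, last_commit_at):
    """Return True when the latest commit is newer than the pushed image.

    Both arguments are ISO 8601 strings that include a UTC offset.
    """
    pushed = datetime.fromisoformat(image_pushed_at)
    committed = datetime.fromisoformat(last_commit_at)
    return committed > pushed
```

With made-up values, image_is_stale("2025-08-21T10:00:00+00:00", "2025-08-29T12:05:00+00:00") is True, which is exactly the situation described above.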

Understanding Docker Image Immutability

# docker-compose.yml was pulling the old image
services:
  backend:
    image: 123456.dkr.ecr.region.amazonaws.com/app:latest
    # This "latest" tag was still pointing to the old, broken image!

The ECR Fix Strategy

Step 1: Temporary local build (proof of concept)

# docker-compose.yml
services:
  backend:
    # image: 123456.dkr.ecr.region.amazonaws.com/app:latest
    build:
      context: ./backend-directory

docker-compose down
docker-compose up --build -d
# Result: ✅ Works perfectly! (proves fixes are correct)

Step 2: Update ECR with fixed image

# Login to ECR
aws ecr get-login-password --region ca-central-1 | \
docker login --username AWS --password-stdin \
123456.dkr.ecr.ca-central-1.amazonaws.com

# Tag local working image
docker tag backend-backend:latest \
123456.dkr.ecr.ca-central-1.amazonaws.com/app:latest

# Push updated image
docker push 123456.dkr.ecr.ca-central-1.amazonaws.com/app:latest

Step 3: Switch back to ECR

# docker-compose.yml
services:
  backend:
    image: 123456.dkr.ecr.ca-central-1.amazonaws.com/app:latest
    # build:
    #   context: ./backend-directory

docker-compose down
# Pull explicitly so a host with a cached :latest picks up the new image
docker-compose pull backend
docker-compose up -d
# Result: ✅ Now using updated ECR image!
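
One way to avoid a stale latest tag in the future is to tag every build with the commit it was built from. This is an alternative to what I did above, not part of the fix itself, and the helper name image_reference is hypothetical.

```python
import re


def image_reference(registry, repository, commit_sha, length=12):
    """Build an image reference tagged with a shortened Git commit SHA."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_sha):
        raise ValueError(f"not a Git commit SHA: {commit_sha!r}")
    return f"{registry}/{repository}:{commit_sha[:length]}"
```

The result can be used with docker tag and docker push in place of the :latest reference, and the same value goes into docker-compose.yml, so the file always says which build is deployed.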

🧪 Issue #4: Missing Dependencies

The Cryptography Package Error

# Error in logs:
django.core.exceptions.ImproperlyConfigured: 
'cryptography' package is required for sha256_password or caching_sha2_password

The Database Authentication Problem

  • MySQL 8.0 and later use caching_sha2_password as the default authentication plugin
  • The pure-Python MySQL driver (the error text matches PyMySQL's) needs the cryptography package for this auth method
  • Our requirements.txt didn't include cryptography

The Solutions

Option 1: Add cryptography dependency (Recommended)

# requirements.txt
cryptography>=3.4.8
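
A missing package like this only shows up when the first database connection is attempted. A quick preflight check can surface it earlier, for example in CI or at container start. This is a minimal sketch using only the standard library; the function name missing_packages is hypothetical, and it expects top-level import names.

```python
import importlib.util


def missing_packages(required):
    """Return the top-level module names in `required` that cannot be imported."""
    return [name for name in required if importlib.util.find_spec(name) is None]
```

Running missing_packages(["cryptography"]) inside the image and failing when the list is non-empty would have flagged this problem before the backend ever tried to log in to MySQL.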

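Option 2: Temporary workaround (while waiting for deployment). The default-authentication-plugin option is deprecated as of MySQL 8.0.27 and removed in MySQL 8.4, so treat this as a stopgap only.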

# docker-compose.yml
services:
  db:
    image: mysql:8.1
    command: --default-authentication-plugin=mysql_native_password

🎯 The Complete Fix Implementation

Final docker-compose.yml Structure

version: "3.8"
services:
  db:
    image: mysql:8.1
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: ${DATABASE_PASS:-mypassword}
      MYSQL_DATABASE: ${DATABASE_NAME:-mydb}
    command: --default-authentication-plugin=mysql_native_password
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      interval: 10s
      timeout: 5s
      retries: 5

  backend:
    image: 123456.dkr.ecr.region.amazonaws.com/app:latest
    restart: always
    ports:
      - "8000:8000"
    depends_on:
      db:
        condition: service_healthy
    environment:
      - DATABASE_HOST=db
      - DATABASE_PORT=3306
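
In this file, depends_on with condition: service_healthy makes Compose wait for the database. When the backend is started some other way, a similar wait can live in the entrypoint. The sketch below only checks that the TCP port accepts connections, which is weaker than the mysqladmin ping healthcheck, and the function name wait_for_port is hypothetical.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the environment above, the call would be wait_for_port("db", 3306) before running migrate and starting the server.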

Verification Commands

# Check containers are running
docker-compose ps

# Test backend endpoint
curl -s http://localhost:8000/admin/login/ | head -3

# Check logs for errors
docker-compose logs --tail=20 backend
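
The curl check can also be scripted so a deploy job fails loudly when the backend does not answer. This is a small sketch with the standard library only; the function name http_status is hypothetical, and the URL is the same admin login page used above.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None
```

A deploy script could treat anything other than 200 from http_status("http://localhost:8000/admin/login/") as a failed rollout.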

🚀 Results & Lessons Learned

Before vs After

# Before:
❌ Backend crashes on startup
❌ Missing migration files
❌ SSH authentication failing
❌ ECR serving outdated images
❌ Database connectivity issues

# After:
✅ Backend running smoothly
✅ All migrations applied
✅ SSH authentication working
✅ ECR updated with fixed image
✅ Database connections stable

Key Takeaways

  1. Migration Management: Always commit migrations to version control
  2. SSH Setup: Understand the difference between HTTPS and SSH authentication
  3. Container Immutability: ECR images don't auto-update; you must push new versions
  4. Dependency Management: Keep requirements.txt updated with all necessary packages
  5. Debugging Strategy: Work systematically through each layer (database → application → container → registry)

πŸ› οΈ Essential Commands Reference

Docker Operations

# Basic container management
docker-compose down
docker-compose up -d
docker-compose ps
docker-compose logs service-name

# ECR operations
aws ecr get-login-password --region region | docker login --username AWS --password-stdin registry-url
docker tag local-image:tag registry-url/repo:tag
docker push registry-url/repo:tag

Git & SSH

# SSH debugging
ssh -T git@github.com
ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts

# Repository management
git remote set-url origin git@github.com:user/repo.git
git pull origin branch --no-rebase

Django Management

# Migration operations
# makemigrations belongs in development; commit its output rather than running it on the server
docker-compose exec backend python manage.py makemigrations
docker-compose exec backend python manage.py migrate
docker-compose exec backend python manage.py showmigrations

💡 Prevention Strategies

For Development Teams

  1. Always commit migrations immediately after generating them
  2. Use consistent authentication methods (SSH preferred)
  3. Maintain updated requirements.txt files
  4. Implement proper CI/CD pipelines that auto-build ECR images
  5. Document deployment procedures for team members

For DevOps

  1. Set up automated ECR builds on code changes
  2. Implement health checks in docker-compose
  3. Use proper environment variable management
  4. Monitor container logs in production
  5. Maintain rollback procedures

πŸ† Conclusion

What started as a simple "backend is broken" ticket turned into a comprehensive debugging session covering:

  • Django migrations and database relationships
  • SSH authentication and Git workflows
  • Docker containerization and ECR image management
  • MySQL authentication methods
  • CI/CD pipeline troubleshooting

The key lesson? Modern web development involves many interconnected systems. Understanding how each piece works, and more importantly how it fails, is crucial for effective debugging.

Time invested: 6 hours

Issues resolved: 5 major problems

Knowledge gained: Invaluable

Backend status: 🚀 Production ready!


Have you faced similar deployment nightmares? Share your war stories in the comments below! πŸ‘‡
