Houston Wong

Posted on Jul 7

Fixing Django Squashed Migration in Multi-App Issue #36168

#django

Introduction:

This article explains a potential fix for Django back-migration failures caused by squashed migrations in multi-app projects.
The issue was originally reported in Django ticket #36168: https://code.djangoproject.com/ticket/36168
A reproducible example is provided here: https://github.com/vanschelven/squashwithrename

In the reproducible example project, there are two Django apps: squashme and triggerfailingcode. Both apps use squashed migrations. The issue appears when triggerfailingcode attempts to back-migrate to 0001_initial, triggering this error:

FieldDoesNotExist: squashme.Foo has no field named 'name'

Notably, if you remove either one of the squashed migration files, back-migrating to 0001_initial works without error. That suggests a bug in how Django handles multiple apps with squashed migrations.

_create_project_state Debug:

Traceback (most recent call last):
  File "/Users/houston/Desktop/Contribution/squashwithrename/manage.py", line 22, in <module>
    main()
  File "/Users/houston/Desktop/Contribution/squashwithrename/manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/Users/houston/Desktop/Contribution/django/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/Users/houston/Desktop/Contribution/django/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/houston/Desktop/Contribution/django/django/core/management/base.py", line 416, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/houston/Desktop/Contribution/django/django/core/management/base.py", line 460, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/houston/Desktop/Contribution/django/django/core/management/base.py", line 107, in wrapper
    res = handle_func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/houston/Desktop/Contribution/django/django/core/management/commands/migrate.py", line 302, in handle
    pre_migrate_state = executor._create_project_state(with_applied_migrations=True)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/houston/Desktop/Contribution/django/django/db/migrations/executor.py", line 120, in _create_project_state
    migration.mutate_state(state, preserve=False)
  File "/Users/houston/Desktop/Contribution/django/django/db/migrations/migration.py", line 91, in mutate_state
    operation.state_forwards(self.app_label, new_state)
  File "/Users/houston/Desktop/Contribution/django/django/db/migrations/operations/fields.py", line 294, in state_forwards
    state.rename_field(
  File "/Users/houston/Desktop/Contribution/django/django/db/migrations/state.py", line 313, in rename_field
    raise FieldDoesNotExist(
django.core.exceptions.FieldDoesNotExist: squashme.foo has no field named 'name'

We start with _create_project_state, where the failure occurs. This function simulates what the database schema would look like if all migrations were applied, but without actually touching the database.

It is actually helpful that we have a working case (removing the squashed migration file in triggerfailingcode), because it gives us a reference point: a "known-good" scenario that we can compare against to see where things go wrong. In _create_project_state, the state is built from the full plan:

full_plan = self.migration_plan(
    self.loader.graph.leaf_nodes(), clean_start=True
)

full plan Debug:

Here’s the full migration plan when both squashed files are present (this is the failing case):

******[DEBUG] full plan: [('squashme', '0001_initial'), ('squashme', '0002_rename_name_foo_rename_squashed_0003_foo_another_field'), ('squashme', '0002_rename_name_foo_rename'), ('squashme', '0003_foo_another_field'), ('triggerfailingcode', '0001_squashed_0002_baz_baz'), ('triggerfailingcode', '0001_initial'), ('triggerfailingcode', '0002_baz_baz')]

Now compare that to the plan when only one squashed file is present (this is the working case):

******[DEBUG] full plan: [('squashme', '0001_initial'), ('squashme', '0002_rename_name_foo_rename_squashed_0003_foo_another_field'), ('triggerfailingcode', '0001_initial'), ('triggerfailingcode', '0002_baz_baz')]

As you can see in the failing case, after applying 0002_rename_name_foo_rename_squashed_0003_foo_another_field, Django still tries to apply the individual migrations 0002_rename_name_foo_rename and 0003_foo_another_field.
That’s a problem — because 0002_rename_name_foo_rename tries to rename the name field, but the squashed migration already did that. So the migration system tries to rename a field that no longer exists — and that’s when it breaks.
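The double rename can be reproduced with a toy state outside Django. This is only a minimal sketch: the dictionary-based state, the rename_field helper, and the stand-in exception class are hypothetical, and the field names (name renamed to rename) are inferred from the migration file name rather than taken from the project's code.

```python
class FieldDoesNotExist(Exception):
    """Stand-in for django.core.exceptions.FieldDoesNotExist."""


def rename_field(state, model, old_name, new_name):
    """Rename a field in the toy state, failing if the old name is gone."""
    fields = state[model]
    if old_name not in fields:
        raise FieldDoesNotExist(f"{model} has no field named '{old_name}'")
    fields.remove(old_name)
    fields.add(new_name)


# Toy project state after squashme.0001_initial: model -> set of field names.
state = {"squashme.foo": {"id", "name"}}

# The squashed migration performs the rename first...
rename_field(state, "squashme.foo", "name", "rename")

# ...and then the original 0002 migration attempts the same rename again.
try:
    rename_field(state, "squashme.foo", "name", "rename")
    error = None
except FieldDoesNotExist as exc:
    error = str(exc)

print(error)
```

The second call fails for the same reason as in the traceback: by the time the unsquashed migration runs, the field it wants to rename has already been renamed.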

So we can conclude that the problem lies in how the full migration plan is built:

The full plan is built in two parts:
self.loader.graph.leaf_nodes() returns all migrations that have no other migration in the same app depending on them. These are treated as the latest migrations for each app.
self.migration_plan(…) does the rest to complete the full plan.
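The "no dependents in the same app" rule for leaf nodes can be sketched with a plain dictionary. This is a simplified model, not Django's graph class: the dependents mapping is hypothetical, and the cross-app edge from squashme to triggerfailingcode is an assumption made for illustration.

```python
def leaf_nodes(dependents):
    """Return nodes that have no dependent migration in the same app.

    `dependents` maps (app_label, name) -> set of nodes that depend on it.
    """
    leaves = []
    for node, children in dependents.items():
        # A dependent in a *different* app does not stop a node being a leaf.
        if all(child[0] != node[0] for child in children):
            leaves.append(node)
    return sorted(leaves)


SQUASHED = "0002_rename_name_foo_rename_squashed_0003_foo_another_field"

# Hypothetical dependency edges for the working case.
graph = {
    ("squashme", "0001_initial"): {("squashme", SQUASHED)},
    ("squashme", SQUASHED): {("triggerfailingcode", "0001_initial")},
    ("triggerfailingcode", "0001_initial"): {
        ("triggerfailingcode", "0002_baz_baz")
    },
    ("triggerfailingcode", "0002_baz_baz"): set(),
}

leaves = leaf_nodes(graph)
print(leaves)
```

Each app contributes its own latest migration, even when another app depends on it.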

Ideally, we would fix the issue inside leaf_nodes() or migration_plan() without breaking any other logic; that would be the cleanest solution. However, after some investigation, neither turned out to be a good place to apply the fix.

migration_plan Debug:

# If the target is missing, it's likely a replaced migration.
# Reload the graph without replacements.
elif (
    self.loader.replace_migrations
    and target not in self.loader.graph.node_map
):
    self.loader.replace_migrations = False
    # Debug output: graph nodes for the two apps before and after the rebuild.
    print(f"******[DEBUG] REBULD node Before: {[i for i in self.loader.graph.nodes if i[0] in ['triggerfailingcode', 'squashme']]}")
    self.loader.build_graph()
    print(f"******[DEBUG] REBULD node After: {[i for i in self.loader.graph.nodes if i[0] in ['triggerfailingcode', 'squashme']]}")
    return self.migration_plan(targets, clean_start=clean_start)

The problem is in migration_plan. If the target node is not found in self.loader.graph.node_map, which usually means it is a replaced migration, Django sets replace_migrations to False and calls self.loader.build_graph(), which rebuilds the entire migration graph.

In this case, the target is: (('triggerfailingcode', '0001_initial'), True). But this migration has been replaced by 0001_squashed_0002_baz_baz.
Since it's not in the graph nodes, Django triggers a rebuild to try to resolve it.

Before the rebuild:

******[DEBUG] REBULD node Before: [('squashme', '0001_initial'), ('squashme', '0002_rename_name_foo_rename_squashed_0003_foo_another_field'), ('triggerfailingcode', '0001_squashed_0002_baz_baz')]

After the rebuild:

******[DEBUG] REBULD node After: [('squashme', '0002_rename_name_foo_rename'), ('squashme', '0001_initial'), ('squashme', '0002_rename_name_foo_rename_squashed_0003_foo_another_field'), ('squashme', '0003_foo_another_field'), ('triggerfailingcode', '0001_initial'), ('triggerfailingcode', '0002_baz_baz'), ('triggerfailingcode', '0001_squashed_0002_baz_baz')]

0002_rename_name_foo_rename and 0003_foo_another_field are added — but they shouldn't be, because their logic is already included in the squashed migration 0002_rename_name_foo_rename_squashed_0003_foo_another_field.

On the other hand, triggerfailingcode's 0001_initial and 0002_baz_baz are also included, and in this case that is correct, because Django needs both in the plan in order to unapply 0002_baz_baz when back-migrating to 0001_initial.
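The before/after difference can be modelled with a few lines of Python. This is a deliberately simplified sketch: build_nodes and the on_disk mapping are hypothetical, and the real loader also looks at which migrations are recorded as applied before deciding whether a squashed migration may stand in for the ones it replaces.

```python
def build_nodes(on_disk, replace_migrations):
    """Toy graph build: `on_disk` maps each migration key to the keys it replaces."""
    nodes = set(on_disk)
    if replace_migrations:
        # With replacements enabled, replaced migrations are dropped for every app.
        for replaced in on_disk.values():
            nodes -= set(replaced)
    return sorted(nodes)


S, T = "squashme", "triggerfailingcode"
S_SQUASHED = (S, "0002_rename_name_foo_rename_squashed_0003_foo_another_field")
T_SQUASHED = (T, "0001_squashed_0002_baz_baz")

on_disk = {
    (S, "0001_initial"): [],
    (S, "0002_rename_name_foo_rename"): [],
    (S, "0003_foo_another_field"): [],
    S_SQUASHED: [(S, "0002_rename_name_foo_rename"), (S, "0003_foo_another_field")],
    (T, "0001_initial"): [],
    (T, "0002_baz_baz"): [],
    T_SQUASHED: [(T, "0001_initial"), (T, "0002_baz_baz")],
}

before = build_nodes(on_disk, replace_migrations=True)
after = build_nodes(on_disk, replace_migrations=False)
print(len(before), len(after))
```

Because replace_migrations is a single switch for the whole loader, turning it off for the sake of triggerfailingcode also brings back the replaced squashme migrations.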

To sum up the issue:

When Django performs a back-migration and encounters a replaced migration, migration_plan rebuilds the graph for all apps, not just the one involved. As a result, in this case, the squashme app gets reloaded, and both the squashed and unsquashed migrations end up in the graph, so the same operations are applied twice and the state build fails.

The ideal behavior would be:

Only rebuild the graph nodes for the specific app_label that needs it, and leave the other apps' graph state untouched.

This issue doesn’t occur when there's only one app using squashed migrations, because rebuilding just that app’s graph helps Django properly determine what to unapply.
But in a multi-app setup, rebuilding unrelated apps can cause serious issues — like reintroducing migrations that were already replaced.

Passing app_label Down Is Too Aggressive

If we wanted to fix this properly by limiting the rebuild to just the affected app, we’d need to pass the app_label all the way down through the migration system — from the back-migration command, through migration_plan, and ultimately into leaf_nodes() and build_graph().

That would require significant changes across Django’s internals. It’s not practical, and it would introduce complexity and risk of side effects elsewhere.

So while rebuilding only the affected app’s graph is the ideal behavior in theory, it’s not an ideal solution in practice — at least not without a major redesign of Django’s migration internals.

Also, we can’t simply “freeze” the old nodes and prevent Django from rebuilding the graph — because in many cases, the rebuild is necessary.

For example, look at the state before the graph is rebuilt: it only includes 0001_squashed_0002_baz_baz for the triggerfailingcode app. At that point, the graph has no knowledge of 0001_initial or 0002_baz_baz, which are required for Django to correctly unapply the unapplied migrations during a backward migration.

So, without rebuilding the graph, Django would not even know what needs to be unapplied — it would just crash.

That means rebuilding the graph is a correct step, but it becomes unsafe when it rebuilds the nodes of all apps, especially apps that don't need to be touched.

Safer Approach

This approach is safer and only requires changes in two functions: _create_project_state and _migrate_all_backwards.
The idea is simple: _create_project_state doesn't need the full migration dependency tree — it just needs enough information to simulate the correct state.
For example, if I'm backward migrating triggerfailingcode to 0001_squashed_0002_baz_baz, then _create_project_state should not care about the individual migrations that this squashed file replaces (0001_initial, 0002_baz_baz).

if (migration.app_label, migration.name) in replaced_migration:
    print(f"[DEBUG] _create_project_state skip: {migration.app_label}.{migration.name}")
    continue

Why? Because the squashed migration already includes all the logic required to build the correct state. There's no need to reintroduce or apply the original migrations it replaces.
We don’t change the full migration plan. Instead, during _create_project_state, we simply skip calling mutate_state for any migration that has been replaced.
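To see why skipping is enough, here is a minimal simulation of the idea outside Django. Everything in it is an assumption made for illustration: the helper names, the tuple-based operations, and the field changes, which are inferred from the migration file names. The replaced set is derived from a hypothetical replacements mapping of squashed migration to the keys it replaces.

```python
def replaced_keys(replacements):
    """Collect every (app_label, name) key that a squashed migration replaces."""
    return {key for replaces in replacements.values() for key in replaces}


def apply_operation(state, operation):
    """Apply a toy ("add" | "rename") operation to the model's field set."""
    kind, model, *args = operation
    fields = state.setdefault(model, set())
    if kind == "add":
        fields.add(args[0])
    elif kind == "rename":
        old, new = args
        if old not in fields:
            raise LookupError(f"{model} has no field named '{old}'")
        fields.remove(old)
        fields.add(new)


def create_project_state(full_plan, operations, replaced):
    """Build the toy state, skipping migrations listed in `replaced`."""
    state = {}
    for key in full_plan:
        if key in replaced:
            continue  # the squashed migration already covers this one
        for operation in operations.get(key, []):
            apply_operation(state, operation)
    return state


S, T = "squashme", "triggerfailingcode"
S_SQ = (S, "0002_rename_name_foo_rename_squashed_0003_foo_another_field")
T_SQ = (T, "0001_squashed_0002_baz_baz")
S2, S3 = (S, "0002_rename_name_foo_rename"), (S, "0003_foo_another_field")
T1, T2 = (T, "0001_initial"), (T, "0002_baz_baz")

full_plan = [(S, "0001_initial"), S_SQ, S2, S3, T_SQ, T1, T2]
replacements = {S_SQ: [S2, S3], T_SQ: [T1, T2]}
operations = {
    (S, "0001_initial"): [("add", "squashme.foo", "name")],
    S_SQ: [
        ("rename", "squashme.foo", "name", "rename"),
        ("add", "squashme.foo", "another_field"),
    ],
    S2: [("rename", "squashme.foo", "name", "rename")],
    S3: [("add", "squashme.foo", "another_field")],
    T_SQ: [("add", "triggerfailingcode.baz", "baz")],
    T2: [("add", "triggerfailingcode.baz", "baz")],
}

# Without skipping, the second rename fails, as in the traceback.
try:
    create_project_state(full_plan, operations, replaced=set())
    failed_without_skip = False
except LookupError:
    failed_without_skip = True

# With skipping, the state builds cleanly from the same full plan.
state = create_project_state(full_plan, operations, replaced_keys(replacements))
print(failed_without_skip, state)
```

The plan itself is untouched in both runs; only the set of migrations allowed to mutate the state differs.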

We keep the full plan, and we must, because we still need migrations like 0001_initial or 0002_baz_baz in the plan so they can be unapplied during _migrate_all_backwards, which is the second function we need to modify.

Even without passing app_label explicitly through the entire back-migration flow, the app label is already available. We can read it directly from the plan in _migrate_all_backwards, because that plan contains the migrations Django is about to unapply, and their app label must match the one passed to the management command, like:

python manage.py migrate triggerfailingcode 0001_initial
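Collecting the app labels from the backward plan might look like the sketch below. The Migration namedtuple and the sample plan are hypothetical stand-ins; the only assumption carried over from Django is that the plan is a list of (migration, backwards) pairs.

```python
from collections import namedtuple

# Stand-in for a Django migration object; only the attributes used here.
Migration = namedtuple("Migration", ["app_label", "name"])

# Hypothetical plan for `migrate triggerfailingcode 0001_initial`:
# a list of (migration, backwards) pairs.
plan = [(Migration("triggerfailingcode", "0002_baz_baz"), True)]

# App labels of everything that is about to be unapplied.
unapply_migrations = {
    migration.app_label for migration, backwards in plan if backwards
}
print(unapply_migrations)
```

A set keeps the membership check cheap and also covers plans that happen to touch more than one app.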

We can’t simply reuse the logic from _create_project_state by skipping all replaced migrations — doing so would break things.

Why? For the target app (triggerfailingcode), we still need migrations like 0001_initial and 0002_baz_baz to invoke mutate_state(). Without them, Django can’t properly simulate the project state, nor can it determine how to unapply 0002_baz_baz later in the process.

What we need instead is a more precise condition for _migrate_all_backwards:

If a migration is replaced and its app label is not the one being back-migrated, skip it.

if (
    (migration.app_label, migration.name) in replaced_migration
    and migration.app_label not in unapply_migrations
):
    print(f"[DEBUG] _migrate_all_backwards SKIPPED {migration.app_label}.{migration.name}")
    continue

This adjustment ensures replaced migrations are ignored only when they belong to an app that is not being back-migrated. As a result, this logic fixes the issue and passes the regression tests without breaking existing behavior.
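The effect of the narrower condition can be checked against the failing full plan from earlier. This is a sketch of the predicate only, with a hypothetical should_skip helper and hand-written sets; it does not reproduce the rest of _migrate_all_backwards.

```python
def should_skip(key, replaced, unapply_app_labels):
    """Skip a replaced migration only if its app is not being back-migrated."""
    app_label, _name = key
    return key in replaced and app_label not in unapply_app_labels


S, T = "squashme", "triggerfailingcode"
S_SQ = (S, "0002_rename_name_foo_rename_squashed_0003_foo_another_field")
T_SQ = (T, "0001_squashed_0002_baz_baz")
S2, S3 = (S, "0002_rename_name_foo_rename"), (S, "0003_foo_another_field")
T1, T2 = (T, "0001_initial"), (T, "0002_baz_baz")

full_plan = [(S, "0001_initial"), S_SQ, S2, S3, T_SQ, T1, T2]
replaced = {S2, S3, T1, T2}
unapply_app_labels = {T}  # from `migrate triggerfailingcode 0001_initial`

kept = [k for k in full_plan if not should_skip(k, replaced, unapply_app_labels)]
skipped = [k for k in full_plan if should_skip(k, replaced, unapply_app_labels)]
print(skipped)
```

Only the replaced squashme migrations drop out, while the replaced triggerfailingcode migrations stay available for unapplying.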

To help identify and debug the problem, I created a small project here:
https://github.com/houston0222/django-debug-36168
This project clearly shows where things break during back-migration and was key to understanding the issue.