Hisyam Johan

Fixing a GitLab CE Upgrade Error: FinalizeHkFixNonExistingTimelogUsers and undefined method 'id' for nil:NilClass

When upgrading my GitLab CE instance to 18.5.1, the upgrade repeatedly failed during database migrations with this Ruby error:

undefined method `id' for nil:NilClass
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/background_migration/fix_non_existing_timelog_users.rb:12:in `perform'

The failure happened inside a batched background migration finalize step called FinalizeHkFixNonExistingTimelogUsers. This post documents how I diagnosed and fixed it, without rolling back or losing data.


Environment and Symptoms

The environment:

  • Ubuntu 24.04
  • GitLab CE 18.5.1 (/opt/gitlab/embedded/service/gitlab-rails)
  • PostgreSQL 16.10 as the main DB (gitlabhq_production)

Every time I ran:

sudo gitlab-rake db:migrate

I saw:

main: == 20250916232115 FinalizeHkFixNonExistingTimelogUsers: migrating
...
undefined method `id' for nil:NilClass

The stack trace pointed to Gitlab::BackgroundMigration::FixNonExistingTimelogUsers, executed from the finalize migration 20250916232115_finalize_hk_fix_non_existing_timelog_users.rb.
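
The number at the front of that filename is the version Rails tracks for the migration, and it is the value the later steps work with. As a minimal sketch (the helper name `migration_version` is my own, not part of GitLab), it can be pulled out of a filename like this:

```shell
# Hypothetical helper: print the leading 14-digit timestamp of a
# Rails-style migration filename, or fail if there is none.
migration_version() {
  file=$(basename "$1")
  version=${file%%_*}            # everything before the first underscore
  case "$version" in
    ''|*[!0-9]*)
      echo "not a timestamped migration file: $file" >&2
      return 1
      ;;
  esac
  if [ "${#version}" -ne 14 ]; then
    echo "expected a 14-digit timestamp in: $file" >&2
    return 1
  fi
  printf '%s\n' "$version"
}

# Usage:
#   migration_version 20250916232115_finalize_hk_fix_non_existing_timelog_users.rb
```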

gitlab-ctl reconfigure also failed, because an Omnibus reconfigure runs the database migrations as one of its steps.


Step 1: Inspect background migrations

GitLab provides a Rake task to list batched background migrations and their status. I started there:

sudo gitlab-rake gitlab:background_migrations:status \
  NAME=FixNonExistingTimelogUsers

This printed a long list of migrations with states like finished, finalized, and a few active. The row for FixNonExistingTimelogUsers had status = 5 in the database (see Step 3), which I took to mean finalized in GitLab’s internal enum.
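
Because the listing is long, it can help to narrow it to the one job class. The sketch below is format-agnostic: it makes no assumption about the column layout of the status output and only keeps lines that mention the name. The function name `filter_status` is my own.

```shell
# Hypothetical helper: keep only the lines of a status listing (read
# from stdin) that mention the given job class, case-insensitively.
# Exits non-zero when nothing matches.
filter_status() {
  grep -F -i -- "$1"
}

# Usage on a GitLab host:
#   sudo gitlab-rake gitlab:background_migrations:status \
#     | filter_status FixNonExistingTimelogUsers
```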

However, the finalize migration was still trying to run a job for that background migration, and that job crashed when it encountered a timelog whose associated record was nil (hence .id on nil). This behavior matches a known GitLab bug where this background migration fails in some edge cases.

At this point I knew:

  • The background migration itself was already marked as finalized.
  • The failing piece was the post‑migration finalize step (FinalizeHkFixNonExistingTimelogUsers).

Step 2: Try the official helper (and why it didn’t work)

GitLab’s docs recommend using helper Rake tasks to manage batched migrations, such as marking jobs as succeeded or requeuing them.

I tried:

sudo gitlab-rake gitlab:background_migrations:mark_all_jobs_as_succeeded \
  NAME=FixNonExistingTimelogUsers

But this failed with:

Don't know how to build task 'gitlab:background_migrations:mark_all_jobs_as_succeeded'

On this version/packaging of GitLab CE, that helper task simply doesn’t exist, so I couldn’t use the documented way to skip the broken migration.
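
Before calling a helper task, it is cheap to check whether it is registered at all. This is a sketch that assumes gitlab-rake forwards the standard `-T` flag to rake, and that the listing uses rake's usual `rake name[args]  # description` line format; `task_listed` is a name I made up.

```shell
# Hypothetical helper: succeed if the exact task name appears in a
# `rake -T` listing read from stdin. Bracketed argument lists such as
# "[database]" are stripped before comparing.
task_listed() {
  awk -v t="$1" '
    $1 == "rake" {
      name = $2
      sub(/\[.*/, "", name)
      if (name == t) found = 1
    }
    END { exit !found }
  '
}

# Usage on a GitLab host (add -A to include tasks without descriptions):
#   sudo gitlab-rake -T gitlab:background_migrations \
#     | task_listed gitlab:background_migrations:status && echo "task exists"
```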


Step 3: Inspect batched_background_migrations directly

Since the helper didn’t exist, the next step was to look directly at the batched_background_migrations table in PostgreSQL.

Connect to the DB:

sudo gitlab-psql -d gitlabhq_production

Then:

SELECT id, job_class_name, status
FROM batched_background_migrations
WHERE job_class_name = 'FixNonExistingTimelogUsers';

The result:

 id  |       job_class_name       | status
-----+----------------------------+--------
 548 | FixNonExistingTimelogUsers |      5
(1 row)

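I read status = 5 as finalized, which suggested the background migration was already considered done. That mapping deserves a second look, though: GitLab’s upgrade documentation finds incomplete batched migrations with status NOT IN (3, 6), treating 3 as finished and 6 as finalized, and the state machine in lib/gitlab/database/background_migration/batched_migration.rb lists 5 as finalizing. If that holds on your version, a row sitting at 5 would explain why the finalize step still tried to run a job.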
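
Since the numeric values are an internal detail, a small lookup makes them easier to read. The table below reflects my understanding of the state machine in lib/gitlab/database/background_migration/batched_migration.rb, not anything printed by the commands above. Note that it lists 5 as finalizing and 6 as finalized, so confirm the values against the source shipped with your installed version before relying on them. The function name `status_name` is my own.

```shell
# Hypothetical lookup: translate a batched_background_migrations.status
# integer into a state name. ASSUMPTION: values as I understand GitLab's
# BatchedMigration state machine; verify against your installed version.
status_name() {
  case "$1" in
    0) echo paused ;;
    1) echo active ;;
    3) echo finished ;;
    4) echo failed ;;
    5) echo finalizing ;;
    6) echo finalized ;;
    *) echo "unknown($1)" ;;
  esac
}

# Usage:
#   status_name 5
```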

Initially I tried to set status = 'finished', but that failed because status is a numeric (smallint) column, not a string. There was no need to force this value anyway, since the migration had already moved past the normal processing states.


Step 4: Identify the real culprit – the finalize migration

db:migrate kept trying to run this migration:

== 20250916232115 FinalizeHkFixNonExistingTimelogUsers: migrating

The traceback showed it calling ensure_batched_background_migration_is_finished, which in turn tried to run another batch job for FixNonExistingTimelogUsers, even though its status was already finalized. That job raised undefined method 'id' for nil:NilClass, and the whole migration aborted.

So the root cause was:

  • A post‑migration (FinalizeHkFixNonExistingTimelogUsers) that was incorrectly trying to execute a finalized background migration and hitting a data edge case.

GitLab has documented similar situations where a buggy batched migration requires a follow‑up fix or manual intervention.


Step 5: Mark the finalize migration as applied in schema_migrations

On my GitLab version, schema_migrations only has a version column (no dirty flag), so the simplest fix was to tell Rails, “this finalize migration is already applied.”

Again in psql:

sudo gitlab-psql -d gitlabhq_production

Check if the version exists:

SELECT version
FROM schema_migrations
WHERE version = '20250916232115';

If the query returns no rows, insert it:

INSERT INTO schema_migrations (version)
VALUES ('20250916232115')
ON CONFLICT (version) DO NOTHING;

Then exit:

\q

This explicitly marks 20250916232115 FinalizeHkFixNonExistingTimelogUsers as already applied, so db:migrate will skip it entirely.
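
If you would rather script this than type SQL into an interactive session, the statement can be generated with a guard against typos in the version. This is only a sketch: `schema_migration_insert_sql` is a hypothetical helper, and it produces the same INSERT used above after checking that the version is exactly 14 digits.

```shell
# Hypothetical helper: print the INSERT statement for a migration
# version, refusing anything that is not exactly 14 digits so that a
# typo or stray quote never reaches the database.
schema_migration_insert_sql() {
  version=$1
  case "$version" in
    ''|*[!0-9]*)
      echo "version must contain digits only" >&2
      return 1
      ;;
  esac
  if [ "${#version}" -ne 14 ]; then
    echo "expected a 14-digit version" >&2
    return 1
  fi
  printf "INSERT INTO schema_migrations (version) VALUES ('%s') ON CONFLICT (version) DO NOTHING;\n" "$version"
}

# Usage on a GitLab host:
#   sudo gitlab-psql -d gitlabhq_production \
#     -c "$(schema_migration_insert_sql 20250916232115)"
```

Printing the statement instead of executing it keeps the helper side-effect free, so you can review the SQL before handing it to gitlab-psql.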


Step 6: Rerun migrations and reconfigure

After inserting that row, I reran:

sudo gitlab-rake db:migrate
sudo gitlab-ctl reconfigure

This time:

  • db:migrate did not print == 20250916232115 FinalizeHkFixNonExistingTimelogUsers: migrating.
  • The undefined method 'id' for nil:NilClass error disappeared.
  • gitlab-ctl reconfigure completed successfully, and GitLab came up cleanly on 18.5.1.
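
To double-check the outcome, the standard Rails task db:migrate:status lists each migration as up or down. The sketch below assumes the usual row layout of that task (status, migration ID, name) and reads the listing from stdin; `migration_state` is a name I made up.

```shell
# Hypothetical helper: print the state ("up" or "down") recorded for a
# migration version in `db:migrate:status` output read from stdin.
# Exits non-zero if the version is not listed at all.
migration_state() {
  awk -v v="$1" '
    $2 == v { print $1; found = 1 }
    END { exit !found }
  '
}

# Usage on a GitLab host:
#   sudo gitlab-rake db:migrate:status | migration_state 20250916232115
```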

Lessons learned

A few takeaways from this troubleshooting:

  • Batched background migrations can block upgrades. When a finalize post‑migration insists on finishing a batched job that has a bug, you can be stuck even if the underlying migration is already marked as finalized.
  • GitLab’s helper Rake tasks differ by version. On some installations, gitlab:background_migrations:mark_all_jobs_as_succeeded and similar helpers may not exist, and you must operate directly at the DB layer.
  • schema_migrations is the source of truth for Rails migrations. If a finalize migration is purely orchestration around a batched job whose data work is already complete, marking that migration version as applied is an effective way to unblock the upgrade. It also bypasses the check the finalize step exists to enforce, so confirm the job’s state first, ideally against an upstream issue or MR that acknowledges the bug.

If you hit the same FinalizeHkFixNonExistingTimelogUsers error on GitLab CE 18.x and see FixNonExistingTimelogUsers already finalized in batched_background_migrations, inserting the migration version into schema_migrations as shown above should let you move forward while you watch for the official upstream fix.