CodeCraft Diary

Originally published at codecraftdiary.com

Flaky Tests in Laravel: Why Your CI Randomly Fails

Your test suite passes locally.
CI fails.

You rerun the pipeline.
Now everything is green.

You change absolutely nothing.
An hour later, another random failure appears.

If this sounds familiar, you are probably dealing with flaky tests.

Flaky tests are tests that sometimes pass and sometimes fail without any meaningful code changes. They are one of the most frustrating problems in modern software development because they slowly destroy trust in your test suite.

And once developers stop trusting tests, they start ignoring failures, rerunning pipelines blindly, and eventually shipping bugs to production.

After dealing with flaky tests in multiple Laravel projects, I noticed something important:

Most flaky tests are not caused by PHPUnit itself.

They are usually caused by hidden shared state, timing assumptions, asynchronous behavior, or infrastructure leaking between tests.

In this article, I’ll show the most common causes of flaky tests in Laravel and how to fix them properly.

Previous article in Testing category: https://codecraftdiary.com/2026/05/09/how-mutation-testing-exposes-the-truth-php-2026-edition/


What Makes a Test “Flaky”?

A flaky test has three characteristics:

  • It fails inconsistently
  • The failure is difficult to reproduce
  • Rerunning the test often “fixes” it

This is different from a normal failing test.

A normal failing test indicates a deterministic bug.

A flaky test creates uncertainty.

And uncertainty is dangerous in CI pipelines because developers eventually stop taking failures seriously.


1. Time-Dependent Tests

One of the most common sources of flaky tests is time.

Laravel makes working with time easy through Carbon, but tests built on time-based logic can quickly become unstable.

Consider this example:

public function test_subscription_expires_after_24_hours(): void
{
    $subscription = Subscription::factory()->create([
        'expires_at' => now()->addDay(),
    ]);

    sleep(1);

    $this->assertFalse($subscription->isExpired());
}

This test may pass most of the time.

But depending on:

  • CI speed
  • server load
  • execution timing
  • timezone handling

it can eventually fail unpredictably.

The fix is simple:

Use fixed time.

Carbon::setTestNow('2026-05-28 10:00:00');

$subscription = Subscription::factory()->create([
    'expires_at' => now()->addDay(),
]);

$this->assertFalse($subscription->isExpired());

And always clean up afterwards, for example in tearDown(). Calling setTestNow() with no argument restores the real clock:

Carbon::setTestNow();

Without cleanup, fake time can leak into other tests and create even more randomness.
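The same idea can be sketched with the time helpers on Laravel's base test case. This is a minimal example, not a definitive implementation: it reuses the Subscription model, its factory, and isExpired() from above, and it assumes that isExpired() compares expires_at against now(). The App\Models namespace and the class name are hypothetical.

```php
<?php

namespace Tests\Feature;

use App\Models\Subscription; // assumed location of the model from the article
use Illuminate\Foundation\Testing\RefreshDatabase;
use Illuminate\Support\Carbon;
use Tests\TestCase;

class SubscriptionExpiryTest extends TestCase
{
    use RefreshDatabase;

    public function test_subscription_expires_after_24_hours(): void
    {
        // Pin "now" so the test never reads the real clock.
        $this->travelTo(Carbon::parse('2026-05-28 10:00:00'));

        $subscription = Subscription::factory()->create([
            'expires_at' => now()->addDay(),
        ]);

        // 23 hours later the subscription should still be active.
        $this->travel(23)->hours();
        $this->assertFalse($subscription->isExpired());

        // 2 more hours puts the clock past expires_at
        // (assumes isExpired() checks expires_at against now()).
        $this->travel(2)->hours();
        $this->assertTrue($subscription->isExpired());
    }
}
```

Moving the fake clock replaces sleep(), so the test covers both sides of the expiry boundary without waiting. If the rest of a test needs the real clock again, travelBack() restores it.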


2. Shared Database State

Another massive source of flaky tests is database leakage between tests.

I still see projects where tests depend on records created by previous tests.

Example:

public function test_user_can_create_post(): void
{
    $this->post('/posts', [
        'title' => 'Example',
    ]);

    $this->assertDatabaseCount('posts', 1);
}

At first, this looks harmless. However, once another test inserts posts into the database, the count may suddenly become 2, 5, or even 12.

The fix is proper database isolation.

In Laravel, this usually means:

use RefreshDatabase;

or:

use DatabaseTransactions;

depending on your architecture.

I already wrote an entire article comparing these approaches because using the wrong one can create hidden instability.

The important part is this:

Tests should never depend on leftovers from previous tests.

Ever.
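As a rough sketch, this is how the earlier post test might look once it is isolated. It assumes the /posts route and posts table from the example above; the class name is hypothetical.

```php
<?php

namespace Tests\Feature;

use Illuminate\Foundation\Testing\RefreshDatabase;
use Tests\TestCase;

class CreatePostTest extends TestCase
{
    // Gives every test in this class a clean database to start from.
    use RefreshDatabase;

    public function test_user_can_create_post(): void
    {
        $this->post('/posts', [
            'title' => 'Example',
        ]);

        // The count is safe only because the table starts empty.
        $this->assertDatabaseCount('posts', 1);

        // Asserting on the specific row says more than a bare count.
        $this->assertDatabaseHas('posts', [
            'title' => 'Example',
        ]);
    }
}
```

Asserting on the row the test created, rather than only on a global count, keeps the failure message readable if another record ever ends up in the table.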


3. Random Factories

Factories are great.

Randomness is not.

This test looks innocent:

$user = User::factory()->create();

$this->assertEquals('admin', $user->role);

But if the factory generates random roles, this test becomes unstable immediately.

I’ve seen this problem especially in large Laravel projects where factories evolved over years and slowly accumulated randomness everywhere.

Instead, explicitly define required state:

$user = User::factory()->create([
    'role' => 'admin',
]);

Deterministic data creates deterministic tests.
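One way to keep this readable across many tests is a named factory state. The sketch below is an assumption about how a UserFactory could look, not the factory of any particular project: the column list is hypothetical, and 'member' as the default role is invented for illustration.

```php
<?php

namespace Database\Factories;

use Illuminate\Database\Eloquent\Factories\Factory;

class UserFactory extends Factory
{
    public function definition(): array
    {
        return [
            // Random values are fine for fields no assertion depends on.
            'name' => $this->faker->name(),
            'email' => $this->faker->unique()->safeEmail(),
            // Hypothetical fixed default: never randomize a field
            // that changes application behavior.
            'role' => 'member',
        ];
    }

    // Named state, so tests can ask for exactly what they need.
    public function admin(): static
    {
        return $this->state(fn (array $attributes) => [
            'role' => 'admin',
        ]);
    }
}
```

A test would then call User::factory()->admin()->create(), which states its intent explicitly and does not depend on whatever the default definition happens to generate.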


4. Queue and Async Problems

Queues are one of the biggest sources of flaky behavior, especially when developers partially fake queues while still allowing some jobs to execute asynchronously.

Example:

Queue::fake();

dispatch(new SendInvoiceJob($invoice));

$this->assertDatabaseHas('invoices', [
    'status' => 'sent',
]);

This fails because Queue::fake() intercepts the job, so it never actually runs.

Or worse:
without the fake, it runs only sometimes, depending on the queue driver configured for the environment.

Another common issue is testing behavior immediately after dispatching async jobs.

Example:

dispatch(new SyncProductsJob());

$this->assertDatabaseCount('products', 500);

The assertion may execute before the worker finishes.

Locally it passes.

In CI it randomly fails.

A better approach is either:

  • testing the dispatch itself,
  • or running jobs synchronously during tests.

Example:

Bus::fake();

dispatch(new SyncProductsJob());

Bus::assertDispatched(SyncProductsJob::class);

Or:

config()->set('queue.default', 'sync');

in the test environment. Laravel's default phpunit.xml already does this through QUEUE_CONNECTION=sync.
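The two approaches can be sketched as separate tests, so that neither mixes a faked queue with an assertion on the job's side effects. This assumes that SendInvoiceJob uses Laravel's Dispatchable trait, that an Invoice model with a factory exists, and that 'pending' is a valid starting status. The namespaces and the 'pending' value are hypothetical.

```php
<?php

namespace Tests\Feature;

use App\Jobs\SendInvoiceJob; // assumed location of the job from the article
use App\Models\Invoice;      // hypothetical model with a factory
use Illuminate\Foundation\Testing\RefreshDatabase;
use Illuminate\Support\Facades\Queue;
use Tests\TestCase;

class SendInvoiceTest extends TestCase
{
    use RefreshDatabase;

    public function test_job_is_pushed_to_the_queue(): void
    {
        Queue::fake();

        $invoice = Invoice::factory()->create();

        // In a real test this dispatch would come from the code under test.
        SendInvoiceJob::dispatch($invoice);

        // Only the dispatch is asserted; the job itself never runs here.
        Queue::assertPushed(SendInvoiceJob::class);
    }

    public function test_job_marks_invoice_as_sent(): void
    {
        $invoice = Invoice::factory()->create(['status' => 'pending']);

        // dispatchSync() runs the job immediately in the current process,
        // so there is no worker to wait for.
        SendInvoiceJob::dispatchSync($invoice);

        $this->assertDatabaseHas('invoices', [
            'id' => $invoice->id,
            'status' => 'sent',
        ]);
    }
}
```

Splitting the tests this way means each assertion has exactly one reason to fail: either the job was not dispatched, or the job's own logic is wrong.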


5. Parallel Testing Issues

Parallel testing speeds up CI dramatically.

But it also exposes hidden shared state.

I’ve seen failures caused by:

  • shared Redis keys
  • shared files
  • cached config
  • temporary directories
  • static variables
  • singleton state

Example:

Storage::disk('local')->put('report.pdf', 'content');

If multiple tests write the same file simultaneously, random failures appear.

The fix is isolation.

Example:

Storage::fake('local');

or unique filenames:

$file = Str::uuid() . '.pdf';

Parallel testing does not create flaky tests.

It reveals problems that already existed.
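Both fixes can be combined in one test. The following is a minimal sketch in which the test class name and the reports/ path are hypothetical.

```php
<?php

namespace Tests\Feature;

use Illuminate\Support\Facades\Storage;
use Illuminate\Support\Str;
use Tests\TestCase;

class ReportStorageTest extends TestCase
{
    public function test_report_is_written_to_an_isolated_disk(): void
    {
        // Swap the 'local' disk for a throwaway test disk.
        Storage::fake('local');

        // A unique name avoids collisions even if isolation is misconfigured.
        $path = 'reports/' . Str::uuid() . '.pdf';

        Storage::disk('local')->put($path, 'content');

        // Assertion helper available on the faked disk.
        Storage::disk('local')->assertExists($path);
    }
}
```

The test only ever touches a disk it created itself, so it does not matter which other tests are writing files at the same moment.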


6. External APIs

Real HTTP calls inside tests are dangerous.

Sometimes the API is slow.

Sometimes rate limits trigger.

Sometimes sandbox environments fail.

And suddenly your test suite becomes unreliable for reasons completely outside your application.

This is why external APIs should usually be mocked or faked.

Laravel provides excellent HTTP faking:

Http::fake([
    '*' => Http::response([
        'success' => true,
    ], 200),
]);

Now your tests become:

  • faster
  • deterministic
  • independent from network stability

I covered this topic in more detail in my API mocking article because external integrations are one of the easiest ways to accidentally create unstable tests.
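A slightly stricter variant fakes one specific host and fails the test if any other request slips through. In this sketch the payments.example.com host, the /charges endpoint, and the amount field are invented for illustration. In a real suite the request would be made by application code rather than directly in the test.

```php
<?php

namespace Tests\Feature;

use Illuminate\Http\Client\Request;
use Illuminate\Support\Facades\Http;
use Tests\TestCase;

class PaymentGatewayTest extends TestCase
{
    public function test_charge_request_is_sent(): void
    {
        // Any request without a matching fake now throws instead of
        // silently reaching the network.
        Http::preventStrayRequests();

        Http::fake([
            // Hypothetical host, used only for this example.
            'payments.example.com/*' => Http::response(['success' => true], 200),
        ]);

        $response = Http::post('https://payments.example.com/charges', [
            'amount' => 1000,
        ]);

        $this->assertTrue($response->json('success'));

        // Verify the outgoing request, not just the canned response.
        Http::assertSent(function (Request $request) {
            return $request->url() === 'https://payments.example.com/charges'
                && $request['amount'] === 1000;
        });
    }
}
```

Faking a specific URL pattern instead of '*' makes a forgotten integration visible: an unexpected call fails loudly rather than receiving a generic success response.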


7. Tests That Depend on Execution Order

This one is extremely dangerous.

A test passes only because another test ran before it.

Example:

public function test_admin_exists(): void
{
    $this->assertDatabaseHas('users', [
        'email' => 'admin@example.com',
    ]);
}

This silently depends on another test creating the admin user first.

Run tests individually and this suddenly fails.

A good test should work:

  • independently
  • repeatedly
  • in any order

If execution order matters, the suite is fragile.
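One possible fix is for the test to arrange its own data explicitly. The sketch below assumes the admin account is meant to come from a seeder. AdminUserSeeder is a hypothetical class name, not something from the original project.

```php
<?php

namespace Tests\Feature;

use Database\Seeders\AdminUserSeeder; // hypothetical seeder that creates the admin
use Illuminate\Foundation\Testing\RefreshDatabase;
use Tests\TestCase;

class AdminUserTest extends TestCase
{
    use RefreshDatabase;

    public function test_admin_exists_after_seeding(): void
    {
        // The test sets up its own precondition instead of relying
        // on whatever an earlier test left behind.
        $this->seed(AdminUserSeeder::class);

        $this->assertDatabaseHas('users', [
            'email' => 'admin@example.com',
        ]);
    }
}
```

Written this way, the test gives the same result whether it runs alone, first, or last.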


Why Flaky Tests Become Expensive

The biggest problem with flaky tests is not technical.

It is psychological.

Once developers stop trusting CI:

  • failures get ignored
  • reruns become normal
  • real bugs get missed
  • confidence disappears

I’ve seen teams where developers reran pipelines three or four times automatically because “CI is always flaky anyway.”

That is dangerous.

Because eventually a real regression hides inside the noise.


Final Thoughts

Flaky tests are rarely random.

There is almost always an underlying engineering problem:

  • shared state
  • uncontrolled time
  • async behavior
  • non-isolated infrastructure
  • hidden dependencies

The solution is not “rerun CI.”

The solution is making tests deterministic.

A reliable test suite should produce the same result every time:

  • locally
  • in CI
  • on every machine
  • under every execution order

Once your tests become deterministic, your entire development workflow becomes faster, safer, and dramatically less frustrating.
