Hafiz

Posted on Jun 15 • Originally published at hafiz.dev

Multi-Tenancy + Queues: The Three Bugs Every Laravel SaaS Hits in Its First Year

#laravel #multitenancy #queues #saas

Your multi-tenant Laravel app works perfectly in development. Every test passes. You push to production, onboard your first ten customers, and everything looks fine. Then one morning a customer emails you a screenshot of someone else's dashboard data.

That's not a hypothetical. It's what happens when multi-tenancy and queues collide without the right safeguards. The combination is tricky because queue workers are long-running daemons. An HTTP request boots the application fresh every time, but a queue worker boots once and then processes hundreds of jobs sequentially in the same process. Any tenant state left over from one job bleeds into the next.
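To make the bleed concrete, here's a framework-free sketch of a worker loop. Nothing in it is Laravel or tenancy-package code: TenantContext and runWorker() are hypothetical stand-ins, and the only point is to show what happens when a job whose payload carries no tenant ID runs right after a job for Tenant B.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a tenancy package's "current tenant" state.
final class TenantContext
{
    public static ?string $current = null;
}

/**
 * Simulates a long-running worker: one process, many jobs.
 * Each job is a pair of [tenant ID from the payload or null, callable].
 */
function runWorker(array $jobs, bool $resetBetweenJobs): array
{
    $seen = [];

    foreach ($jobs as [$tenantId, $job]) {
        if ($resetBetweenJobs) {
            TenantContext::$current = null; // start every job from a clean slate
        }
        if ($tenantId !== null) {
            TenantContext::$current = $tenantId; // restore context from the payload
        }
        $seen[] = $job();
    }

    return $seen;
}

// A "job" that just reports which tenant it thinks it is running for.
$report = fn (): ?string => TenantContext::$current;

$jobs = [
    ['tenant-b', $report],
    [null, $report], // this payload carries no tenant ID
];

TenantContext::$current = null;
$leaky = runWorker($jobs, resetBetweenJobs: false);

TenantContext::$current = null;
$clean = runWorker($jobs, resetBetweenJobs: true);
```

Without the reset, the second job silently runs as tenant-b. With it, the second job sees no tenant at all, which is at least something you can detect.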

I've seen three specific bugs come up again and again in multi-tenant Laravel SaaS apps. They're all silent. They don't throw exceptions in local development. They don't fail your test suite. And they can leak data between tenants in production. Here's what they are and how to fix each one.

Bug 1: Tenant Context Evaporates in Queued Jobs

This is the classic one, and it's the scariest because it fails silently.

You dispatch a job from a controller while Tenant A is active. The job gets serialized to the queue. A worker picks it up seconds later. But the worker process has no idea which tenant dispatched that job. It's running in the central (landlord) context, or worse, it's still holding stale context from the previous job it processed for Tenant B.

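Here's the flow that causes the data leak (the original post on hafiz.dev has an interactive diagram): a controller dispatches the job while Tenant A is active, the job is serialized onto the queue, a worker that last processed a job for Tenant B picks it up, and the job runs its queries against Tenant B's data.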

The problem gets worse if you're using SerializesModels. Laravel's model serialization stores the model's ID and class name, then rehydrates the model when the job runs. Rehydration goes through newQueryForRestoration(), which runs on whatever database connection is active at that moment. In current Laravel versions that query skips global scopes, so a BelongsToTenant scope won't protect you here. If no tenant is active yet in a database-per-tenant setup, the lookup hits the central database and finds nothing. ModelNotFoundException. With stale context from a previous job, it can instead load a row with the same ID from another tenant's database. Either way the job misbehaves, and the real cause is buried under a misleading error message (or no error at all).

The worst part? This never shows up in tests. Most test suites use Queue::fake(), which never runs the job at all, or the sync driver, which runs it inline in the same process while the dispatching tenant's context is still active. The bug only appears with a real queue worker.

And it's hard to spot in production too. If your models don't use tenant-scoped global scopes, you won't even get exceptions. The job will happily query the central database, find nothing (or find the wrong tenant's data), and complete "successfully." You'll only discover the problem when a customer reports seeing data that isn't theirs. Or worse, when they don't report it because they didn't notice.

Here's how to detect it before a customer does. Add a sanity check at the start of every tenant job's handle() method during your early production phase:

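public function handle(): void
{
    // tenant() is stancl/tenancy's helper; it returns null when no tenant is initialized.
    // Assumes the job stores the dispatching tenant's ID in $this->tenantId
    // and uses the InteractsWithQueue trait, which provides fail().
    if (! tenant()) {
        $exception = new \RuntimeException(
            "Job " . static::class . " ran without tenant context. "
            . "Expected tenant: {$this->tenantId}"
        );

        report($exception);       // surfaces in your error tracker
        $this->fail($exception);  // marks the job as failed instead of "successful"
        return;
    }

    // Proceed with actual job logic...
}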

This gives you an early warning in your error tracker instead of a silent data leak. Remove it once you've confirmed your bootstrapper is working correctly.
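If you'd rather not paste that check into every handle(), the same guard fits the shape of Laravel job middleware: a class with a handle($job, $next) method that a job returns from its middleware() method. Here's a minimal sketch, written without framework dependencies so it runs on its own. EnsureTenantContext and the resolver closure are hypothetical names, and in a real app the closure would wrap your package's "current tenant" lookup.

```php
<?php
declare(strict_types=1);

// Hypothetical guard in the shape of Laravel job middleware.
final class EnsureTenantContext
{
    /** @param Closure(): mixed $currentTenant returns the active tenant, or null */
    public function __construct(private Closure $currentTenant) {}

    public function handle(object $job, Closure $next): void
    {
        if (($this->currentTenant)() === null) {
            // Throwing here stops the job before any query runs.
            throw new RuntimeException(
                'Job ' . $job::class . ' ran without tenant context.'
            );
        }

        $next($job);
    }
}

// Standalone demo: a plain object plays the role of the job.
$job = new stdClass();
$ran = false;
$message = null;

try {
    (new EnsureTenantContext(fn () => null))
        ->handle($job, function () use (&$ran): void { $ran = true; });
} catch (RuntimeException $e) {
    $message = $e->getMessage();
}
```

Because the middleware throws instead of calling fail(), the job goes through your normal retry and failure rules. Whether that's what you want depends on how noisy your early production phase is.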

The Fix

Both major tenancy packages handle this, but you have to explicitly enable it.

With stancl/tenancy (v3.10):

Enable the QueueTenancyBootstrapper in your config/tenancy.php:

'bootstrappers' => [
    Stancl\Tenancy\Bootstrappers\DatabaseTenancyBootstrapper::class,
    Stancl\Tenancy\Bootstrappers\CacheTenancyBootstrapper::class,
    Stancl\Tenancy\Bootstrappers\QueueTenancyBootstrapper::class, // Add this
],

This serializes the current tenant's ID into the job payload and restores tenant context before the job runs. But there's a catch. The bootstrapper fires after SerializesModels tries to rehydrate your models. So don't pass tenant-scoped Eloquent models directly into job constructors. Pass the ID instead:

// Bad: model rehydration runs before tenant context is restored
class ProcessInvoice implements ShouldQueue
{
    public function __construct(public Invoice $invoice) {}
}

// Good: pass the ID, query inside handle()
class ProcessInvoice implements ShouldQueue
{
    public function __construct(public int $invoiceId) {}

    public function handle(): void
    {
        // Tenant context is active by this point
        $invoice = Invoice::findOrFail($this->invoiceId);
        // Process it...
    }
}

With spatie/laravel-multitenancy (v4.1):

Set queues_are_tenant_aware_by_default to true in config/multitenancy.php. Or implement the TenantAware marker interface on individual jobs:

use Spatie\Multitenancy\Jobs\TenantAware;

class ProcessInvoice implements ShouldQueue, TenantAware
{
    // This job will automatically run in the correct tenant context
}

Jobs that should explicitly run in the central context can implement NotTenantAware instead. The same SerializesModels warning applies here: pass IDs, not models.

Bug 2: Cache Key Collisions Across Tenants

This bug is quieter than the first one. No exceptions, no failed jobs. Just wrong numbers on a dashboard that nobody notices for weeks.

Imagine you cache a dashboard stat:

Cache::put('monthly_revenue', $total, now()->addHour());

That key, monthly_revenue, is the same string for every tenant. If Tenant A's queue job processes a report and caches the result, Tenant B reads the same cache key and sees Tenant A's revenue figure.

The obvious fix is to prefix every cache key with the tenant ID manually. But that's error-prone because you'll forget it in at least one place. The better fix is to let the tenancy package handle prefixing globally.
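For reference, here's roughly what the manual approach looks like, and why one forgotten call is enough to leak. This is a plain-PHP sketch: the array stands in for a shared cache store, and tenantCacheKey() is a hypothetical helper, not something either package ships.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: builds a cache key that is unique per tenant.
function tenantCacheKey(string|int $tenantId, string $key): string
{
    return "tenant:{$tenantId}:{$key}";
}

$cache = []; // stand-in for a cache store shared by all tenants

// Unscoped key: Tenant A's job writes it, Tenant B's request reads it.
$cache['monthly_revenue'] = 1200;
$leaked = $cache['monthly_revenue'];

// Scoped key: Tenant B looks up its own key and gets a cache miss instead.
$cache[tenantCacheKey('a', 'monthly_revenue')] = 1200;
$scoped = $cache[tenantCacheKey('b', 'monthly_revenue')] ?? null;
```

The helper works, but only at the call sites where someone remembered to use it.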

The Fix

With stancl/tenancy:

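The CacheTenancyBootstrapper (shown in the Bug 1 config above) automatically scopes every cache call to the current tenant. In v3 it does this by tagging each call with the tenant's identifier, so it needs a cache store that supports tags, such as Redis (the file and database stores don't). Every Cache::get() and Cache::put() call is transparently scoped. You don't change your application code at all.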

But watch for these edge cases that the bootstrapper doesn't catch automatically:

// These need manual tenant scoping:

// 1. Redis locks
Cache::lock("report_generation_{$tenant->id}", 30);

// 2. Rate limiting keys
RateLimiter::attempt("api_{$tenant->id}_{$user->id}", 60, fn() => true);

// 3. Laravel Scout search indexes
// Use tenant-prefixed index names or a filterable tenant_id attribute

// 4. spatie/laravel-permission cache
// Set a tenant-specific cache key in config/permission.php

With spatie/laravel-multitenancy:

Add the PrefixCacheTask to your switch tenant tasks:

// config/multitenancy.php
'switch_tenant_tasks' => [
    \Spatie\Multitenancy\Tasks\SwitchTenantDatabaseTask::class,
    \Spatie\Multitenancy\Tasks\PrefixCacheTask::class, // Add this
],

This prefixes cache keys for memory-based stores like Redis and APC. The same edge cases apply: locks, rate limiters, and third-party package caches all need manual attention.

One more thing. If you're running Laravel Octane in a multi-tenant setup, you have an additional risk. Octane reuses the same application instance across requests. A stale cache prefix from a previous request's tenant can bleed into the next request if the bootstrapper doesn't reset properly between requests.

And cache isn't the only place where unprefixed keys cause cross-tenant leaks. Session cookies on wildcard domains (*.yourapp.com) can let a session from one tenant's subdomain work on another tenant's subdomain. Validation rules like Rule::unique('users', 'email') check the full table unless you scope them with a where clause. Rate limiting keys based on IP addresses mean tenants behind the same corporate proxy share rate counters. These aren't queue-specific bugs, but they compound when a queued job reads from a session or applies a validation rule without tenant scoping.

Bug 3: Failed Job Retries Run in the Wrong Tenant

Your tenancy bootstrapper serializes the tenant ID into the job payload. Jobs dispatch and process correctly. You think you're covered. Then a job fails.

This one is particularly nasty because it takes time to appear. Bugs 1 and 2 can show up on day one if you're paying attention. But Bug 3 only triggers when a job actually fails, and then only when someone retries it. Most SaaS apps don't have a robust retry workflow in their first few months. They're focused on getting things working, not recovering from failures. So this bug sits dormant, waiting for the first time an external API times out or a database connection hiccups.

Laravel stores failed jobs in the failed_jobs table. When you run php artisan queue:retry 5 three days later, the framework pulls the serialized payload, reconstructs the job, and dispatches it again. But here's the problem: the retry mechanism doesn't fire your tenancy bootstrapper the same way the original dispatch did.

With spatie/laravel-multitenancy, this was a confirmed bug in earlier versions. The tenant ID was embedded in the payload, but the retry path didn't restore tenant context before the job started processing. The job would run in whatever tenant context the worker happened to have at that moment, which could be the central database or a completely different tenant.

The Fix

The fix depends on your package version.

With stancl/tenancy v3.10: The QueueTenancyBootstrapper handles retries correctly in recent versions. The tenant ID is stored at the top level of the job payload and restored on retry. Verify this by inspecting a failed job's payload:

$failed = DB::table('failed_jobs')->find(5);
$payload = json_decode($failed->payload, true);

// You should see a tenant_id key at the top level
dd($payload['tenant_id'] ?? null); // Should not be null

If tenant_id is missing from the payload, your bootstrapper isn't configured correctly. Go back to Bug 1 and ensure QueueTenancyBootstrapper is in your bootstrappers array.
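If you want to check more than one row, the lookup is easy to pull into a small function. This is a sketch that works on the raw JSON from the payload column, so it runs without a database. tenantIdFromPayload() is a hypothetical helper, and the list of keys is an assumption: tenant_id is the key used above, and you'd pass whatever key your package actually writes.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: pull the tenant ID out of a raw failed_jobs payload.
function tenantIdFromPayload(string $rawPayload, array $keys = ['tenant_id']): string|int|null
{
    $payload = json_decode($rawPayload, true);

    if (! is_array($payload)) {
        return null; // not valid JSON, or not a JSON object
    }

    foreach ($keys as $key) {
        $value = $payload[$key] ?? null;
        if (is_string($value) || is_int($value)) {
            return $value;
        }
    }

    return null; // no tenant ID at the top level
}

$withTenant = tenantIdFromPayload('{"uuid":"a1","tenant_id":42}');
$without    = tenantIdFromPayload('{"uuid":"b2"}');
$garbage    = tenantIdFromPayload('not json');
```

Run it over every row in failed_jobs and treat any null for a tenant job as a configuration problem, not a one-off.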

With spatie/laravel-multitenancy v4.1: The MakeQueueTenantAwareAction in v4 handles this correctly. But if you're on an older version, you can listen for the retry event manually:

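// In a service provider
use Illuminate\Queue\Events\JobRetryRequested;
use Illuminate\Support\Facades\Event;

Event::listen(JobRetryRequested::class, function (JobRetryRequested $event) {
    $payload = $event->payload();

    // As far as I know, spatie/laravel-multitenancy stores the ID under "tenantId".
    // Check a real row in failed_jobs to confirm the key your version uses.
    $tenantId = $payload['tenantId'] ?? $payload['tenant_id'] ?? null;

    if ($tenantId) {
        // Tenant is your app's tenant model (Spatie\Multitenancy\Models\Tenant by default)
        $tenant = Tenant::find($tenantId);
        $tenant?->makeCurrent();
    }
});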

Testing retries properly:

This is the part most teams skip. You can't test retries with Queue::fake(). You need an integration test that uses a real queue driver, dispatches a job that deliberately fails, then retries it and verifies the tenant context:

it('retries failed jobs in the correct tenant context', function () {
    // Use a real driver: sync and Queue::fake() never exercise the retry path
    config(['queue.default' => 'database']);

    $tenant = Tenant::factory()->create();
    $tenant->makeCurrent();

    // Dispatch a job that will fail on first attempt
    // (FailOnceJob is a test helper you write yourself; it throws on its first run only)
    dispatch(new FailOnceJob($tenant->id));

    // Process the queue (job fails)
    Artisan::call('queue:work', ['--once' => true]);

    // Retry the failed job (the id argument accepts a list of IDs, or "all")
    Artisan::call('queue:retry', ['id' => ['all']]);

    // Process again (should succeed in correct tenant context)
    Artisan::call('queue:work', ['--once' => true]);

    // Assert the job ran in the correct tenant
    expect(DB::connection('tenant')->table('job_results')->count())->toBe(1);
});

The Audit Checklist

If you're running a multi-tenant Laravel application, here's a quick audit you can run right now:

Tenant context in jobs: Dispatch a job from a tenant context with the database or redis driver (not sync). Check whether the job runs against the correct tenant's data. If you're passing Eloquent models to job constructors, switch to passing IDs.

Cache isolation: Open tinker as Tenant A, run Cache::put('test', 'tenant-a'). Switch to Tenant B, run Cache::get('test'). If you get tenant-a back, your cache isn't scoped. Enable the cache bootstrapper or prefix task.

Failed job retries: Deliberately fail a tenant job, wait a minute, run queue:retry all. Check which tenant's database the retried job hit. If it's wrong, check your package version and bootstrapper configuration.

Queue topology: If one tenant dispatches a massive CSV export job, does it block jobs for every other tenant? Consider dedicating separate queues for heavy operations so one tenant's workload doesn't starve the rest.

Worker restart cadence: Queue workers hold state in memory. If you deploy a tenancy config change but don't restart workers, the old configuration stays active. Always run php artisan queue:restart after deploying changes to your tenancy bootstrappers or cache prefix configuration. In production, use a process manager like Supervisor that can gracefully restart workers on deploy.

FAQ

Do I need stancl/tenancy or spatie/laravel-multitenancy for this?

Not strictly. You can build tenant-aware queues manually by storing the tenant ID in every job and restoring context in handle(). But the packages automate the serialization, context restoration, and cache scoping. If you're already using one of them for your multi-tenancy implementation, enabling queue support is a few lines of config.
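Here's a minimal sketch of that manual approach, in plain PHP so it runs standalone. CurrentTenant, TenantJob, and ReportJob are hypothetical names. In a real app the base class would also implement ShouldQueue, and switching tenants would mean swapping the database connection, not just setting an ID.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for whatever holds "the current tenant" in your app.
final class CurrentTenant
{
    public static ?int $id = null;
}

abstract class TenantJob
{
    // The tenant ID travels with the job, so it survives serialization.
    public function __construct(public readonly int $tenantId) {}

    // The worker calls handle(); subclasses implement run() instead.
    final public function handle(): mixed
    {
        CurrentTenant::$id = $this->tenantId; // restore context from the job itself

        try {
            return $this->run();
        } finally {
            CurrentTenant::$id = null; // clear it so nothing leaks into the next job
        }
    }

    abstract protected function run(): mixed;
}

final class ReportJob extends TenantJob
{
    protected function run(): mixed
    {
        return 'report for tenant ' . CurrentTenant::$id;
    }
}

CurrentTenant::$id = 2; // stale context left behind by an earlier job
$result = (new ReportJob(tenantId: 1))->handle();
$after  = CurrentTenant::$id;
```

The finally block is the part people forget. It runs even when run() throws, which is exactly when stale context is most likely to be left behind.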

Can I use the sync queue driver to avoid these bugs?

Technically yes, because the sync driver processes jobs inline in the same request, so tenant context is always present. But the sync driver blocks the request until the job finishes, which defeats the purpose of using queues. These bugs only surface with async drivers (database, Redis, SQS), and that's exactly what you'll use in production.

Does Laravel Horizon help with tenant-scoped queues?

Horizon gives you visibility into what's happening on your queues, but it doesn't handle tenant context itself. You still need the tenancy package's queue bootstrapper for context restoration. That said, Horizon's queue balancing and monitoring are useful for spotting when one tenant's jobs dominate the queue.

What about database-per-tenant setups? Is the jobs table per tenant or central?

Keep the jobs and failed_jobs tables in the central (landlord) database. If you put them in tenant databases, the queue worker won't know which tenant database to connect to before it even picks up a job. Both stancl/tenancy and spatie/laravel-multitenancy recommend central job storage with tenant context serialized into the payload.

Should I run separate queue workers per tenant?

For most apps, no. A shared worker pool with tenant context in the payload works fine. Running separate workers per tenant only makes sense if you have strict resource isolation requirements or tenants with wildly different job volumes. The complexity of managing dozens of worker processes usually isn't worth it until you're well past your first year.

Wrapping Up

Multi-tenancy and queues are both solved problems individually. The bugs live at the intersection, in the gap between "tenant context exists during the HTTP request" and "tenant context needs to exist in a background worker too." All three bugs share the same root cause: queue workers are long-lived processes that don't get fresh context the way web requests do.

The good news is that both stancl/tenancy (v3.10) and spatie/laravel-multitenancy (v4.1) have solid solutions for all three. But you have to enable them, test them with a real queue driver, and audit your cache keys. If you're building a multi-tenant SaaS on Laravel and want help getting the queue layer right, let's talk.
