Every Laravel application starts on a single server. PHP, Nginx, MySQL, Valkey, queue workers, cron jobs — everything running on one machine, sharing the same CPU, memory, and disk. This is the correct starting point. It is simple, cheap, and sufficient for more applications than most developers realize.
But there comes a point where the single server cannot keep up. Maybe your database is consuming so much memory that PHP-FPM workers are getting killed. Maybe a heavy queue job is spiking CPU and slowing down web requests. Maybe you need redundancy because downtime is no longer acceptable.
This is when you split your stack across multiple servers. The decision is not "should I use multiple servers?" — it is "when should I, and which component should I separate first?" Getting this wrong means spending money on infrastructure that does not solve your actual bottleneck.
This guide walks through the progression from single server to multi-server architecture, explains how to identify which component to split first, and shows how Deploynix's server types make the transition straightforward.
The Single-Server Baseline
A single well-configured server handles more traffic than most developers expect. A $24/month DigitalOcean Droplet (4 vCPU, 8GB RAM) running Nginx, PHP-FPM, MySQL, and Valkey can comfortably serve a Laravel application handling 500-1,000 requests per minute with a typical CRUD workload.
The advantage of this architecture is simplicity. All inter-process communication happens locally with zero network latency. Database connections are over a Unix socket, not TCP. Cache reads hit local memory. Deployments update one server. Backups capture everything in one place.
When to stay single-server:
- Your application handles under 1,000 RPM consistently
- CPU and memory usage stay below 60% during peak hours
- You do not require high availability (some downtime during deployments or maintenance is acceptable)
- Your database fits comfortably in memory alongside PHP and the web server
Deploynix server type: App Server — this is the all-in-one configuration that includes Nginx, PHP, your chosen database, and cache on a single machine.
Identifying the Bottleneck
Before splitting anything, you need to know what is actually constraining your system. The four most common bottlenecks for Laravel applications, in order of frequency:
1. Database Contention
The database is the most common bottleneck because it is the most resource-intensive component. MySQL and PostgreSQL want as much memory as possible for their buffer pools and caches. PHP-FPM workers also want memory. On a single server, they compete.
Symptoms:
- High `iowait` CPU percentage (the database waiting on disk)
- Queries appearing in the slow query log that were previously fast
- PHP-FPM workers timing out on database operations
- Memory usage consistently above 80%, with the database consuming the majority
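The `iowait` symptom can be read directly from `/proc/stat` on Linux. A minimal sketch — the field positions follow the proc(5) layout, and the "double-digit" threshold mentioned in the comment is a rule of thumb, not a hard limit:

```python
# Rough iowait check: parse the aggregate "cpu" line from /proc/stat (Linux).
# Jiffy fields after the "cpu" label: user, nice, system, idle, iowait, irq, softirq, ...
def iowait_percent(stat_line: str) -> float:
    fields = [int(x) for x in stat_line.split()[1:]]
    return 100.0 * fields[4] / sum(fields)

# Usage on a live server:
#   with open("/proc/stat") as f:
#       print(f"iowait since boot: {iowait_percent(f.readline()):.1f}%")
# A sustained double-digit iowait share usually means the database is disk-bound.
```

Tools like `iostat` and `vmstat` report the same counter continuously, which is more useful for spotting spikes than the since-boot average.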
2. Worker Competition
Queue workers run PHP processes that compete with web-serving PHP-FPM processes for CPU and memory. A heavy background job (generating PDFs, processing images, syncing data from external APIs) can monopolize resources and slow down web responses.
Symptoms:
- Web response times spike when queue volume is high
- CPU usage correlates with queue processing activity
- Users report slowness during batch operations
3. Memory Exhaustion
When your web server, database, cache, and workers all share the same physical memory, the operating system's OOM killer may start terminating processes. Which process gets killed is unpredictable — it might be a PHP-FPM worker, or it might be your database server.
Symptoms:
- Processes randomly dying and restarting
- `dmesg` shows OOM killer activity
- Application errors about lost database connections
- Swap usage above zero (any swap means you have exceeded physical RAM)
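The swap symptom is easy to check programmatically. A minimal sketch that parses `/proc/meminfo` (Linux-specific; field names per proc(5)):

```python
# Compute swap in use from /proc/meminfo contents; values are reported in kB.
# Any nonzero result means the kernel has pushed memory to disk at some point.
def swap_used_kb(meminfo: str) -> int:
    values = {}
    for line in meminfo.splitlines():
        key, _, rest = line.partition(":")
        if key in ("SwapTotal", "SwapFree"):
            values[key] = int(rest.split()[0])
    return values["SwapTotal"] - values["SwapFree"]

# Usage on a live server:
#   with open("/proc/meminfo") as f:
#       print(f"swap used: {swap_used_kb(f.read())} kB")
# Pair this with `dmesg -T | grep -i oom` to confirm OOM killer activity.
```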
4. Scalability Limits
Sometimes the bottleneck is not a single resource but the inability to scale horizontally. If traffic spikes require more web workers than a single server can support, you need multiple web servers behind a load balancer.
The Split Order: What to Separate First
Step 1: Separate the Database
The database is almost always the first component to split. It benefits the most from dedicated resources, and isolating it provides the clearest performance improvement.
What changes:
- Provision a dedicated Database server through Deploynix (supports MySQL, MariaDB, and PostgreSQL)
- Update your Laravel `.env` to point `DB_HOST` to the database server's private IP
- Add a firewall rule on the database server, via Deploynix's firewall management, to allow connections from your web server's IP address
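In practice the `.env` change is only a few lines. A sketch — the IP and credentials below are placeholders, not real values:

```ini
# .env — point Laravel at the dedicated database server
DB_CONNECTION=mysql
# Use the database server's PRIVATE IP, not its public one
DB_HOST=10.0.0.5
DB_PORT=3306
DB_DATABASE=app
DB_USERNAME=app
DB_PASSWORD=change-me
```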
What you gain:
- The database gets all available memory for its buffer pool, dramatically improving query performance for workloads that exceed what shared memory allowed
- PHP-FPM workers no longer compete with the database for memory
- You can size each server independently — a smaller web server and a larger database server
- Database backups no longer impact web server performance
What you lose:
- Network latency between web and database. On the same provider and datacenter, this is typically 0.5-1ms per query. If your application makes 20 queries per request, that is 10-20ms of added latency — noticeable but usually acceptable.
- Simplicity. You now have two servers to manage, monitor, and back up.
Cost analysis: A $24/month web server plus a $24/month database server ($48 total) often outperforms a single $48/month server because each component gets dedicated resources optimized for its workload.
Deploynix server type: Database Server — provisioned with the database engine of your choice and optimized configuration. Use the firewall management interface to restrict database port access to only your web server's IP address.
Step 2: Separate the Cache
If your application relies heavily on caching (and it should), separating Valkey onto its own server ensures cache operations do not compete with PHP processing.
What changes:
- Provision a dedicated Cache server through Deploynix
- Update `REDIS_HOST` (Valkey speaks the Redis protocol) in your `.env`
- Cache operations now go over the network instead of a local socket
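The corresponding `.env` sketch — the IP is a placeholder, and note that `CACHE_STORE` is the Laravel 11+ key name (older versions use `CACHE_DRIVER`):

```ini
# .env — point cache traffic at the dedicated Valkey server
# (Valkey is a drop-in Redis replacement, so the REDIS_* keys are unchanged)
REDIS_HOST=10.0.0.6
REDIS_PORT=6379
CACHE_STORE=redis
```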
What you gain:
- Dedicated memory for caching means a larger cache that is never evicted due to memory pressure from other processes
- Cache persistence across web server deployments and restarts
- In a multi-web-server setup (coming in Step 4), all servers share the same cache
When to do this: When your cache eviction rate is high (meaning Valkey is removing keys to make room), or when you are about to move to multiple web servers (which requires a shared, external cache).
Deploynix server type: Cache Server — optimized for Valkey with appropriate memory allocation and network configuration.
Step 3: Separate the Workers
Queue workers are the easiest component to isolate because they have no inbound traffic. They simply pull jobs from the queue and process them.
What changes:
- Provision a dedicated Worker server through Deploynix
- Deploy your application code to the worker server (it needs the same codebase to execute jobs)
- Workers connect to the same database and cache servers as your web server
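Deploynix configures Supervisor for you; for reference, the underlying worker definition looks roughly like this. The paths, user, and process count are assumptions for illustration:

```ini
; /etc/supervisor/conf.d/laravel-worker.conf — illustrative sketch
[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/app/current/artisan queue:work redis --sleep=3 --tries=3 --max-time=3600
numprocs=4
autostart=true
autorestart=true
user=deploy
redirect_stderr=true
stdout_logfile=/var/log/laravel-worker.log
; give long-running jobs time to finish on shutdown
stopwaitsecs=3600
```

Scaling workers is then just raising `numprocs` or adding another Worker server with the same configuration.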
What you gain:
- Heavy background jobs no longer impact web response times
- You can scale workers independently based on queue volume
- You can provision worker servers with different specs (more CPU for compute-heavy jobs, more memory for data processing jobs)
When to do this: When queue processing is visibly impacting web performance, or when you need to process more jobs concurrently than your web server's CPU allows.
Deploynix server type: Worker Server — configured with PHP and Supervisor for running queue workers, no Nginx or public web access needed.
Step 4: Add More Web Servers and a Load Balancer
This is the step that enables horizontal scaling. Instead of one web server handling all traffic, you distribute requests across multiple identical web servers.
What changes:
- Provision additional Web servers through Deploynix
- Deploy your application to all web servers (Deploynix deploys to all servers in a site configuration)
- Provision a Load Balancer server through Deploynix
Load balancing methods available in Deploynix:
- Round Robin: Distributes requests evenly across servers. Simple and effective when servers have equal specs.
- Least Connections: Sends each request to the server with the fewest active connections. Better when request processing times vary.
- IP Hash: Routes requests from the same IP to the same server. Useful when you need session affinity, though Laravel's centralized session storage (database or Valkey) makes this less necessary.
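Deploynix provisions the load balancer for you, but the three methods map onto standard Nginx upstream directives. An illustrative sketch, assuming an Nginx-based balancer — the private IPs are placeholders:

```nginx
# Illustrative load-balancer config; private IPs are placeholders.
upstream laravel_web {
    least_conn;            # omit for round robin; use `ip_hash;` for session affinity
    server 10.0.0.11;
    server 10.0.0.12;
    server 10.0.0.13;
}

server {
    listen 80;
    location / {
        proxy_pass http://laravel_web;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```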
What you gain:
- Horizontal scalability — add more web servers as traffic grows
- Redundancy — if one web server fails, the load balancer routes traffic to the remaining servers
- Zero-downtime deployments become even smoother with rolling deploys across servers
Prerequisites you must address first:
- Sessions must be stored in a shared store (database or Valkey, not file)
- Cache must be external (separate Valkey server, not local file cache)
- File uploads must go to external storage (S3, DigitalOcean Spaces) or a shared filesystem
- Any local state must be eliminated
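In `.env` terms, the prerequisites above look roughly like this (key names assume Laravel 11+; older versions use `CACHE_DRIVER` instead of `CACHE_STORE`):

```ini
# .env — eliminate local state before going multi-web-server
SESSION_DRIVER=redis
CACHE_STORE=redis
QUEUE_CONNECTION=redis
FILESYSTEM_DISK=s3
```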
Deploynix server types: Web Server (multiple) + Load Balancer — web servers handle PHP processing while the load balancer distributes traffic.
The Full Architecture
At full scale, a multi-server Laravel architecture on Deploynix looks like this:
```
[Users] → [Load Balancer]
              ├── [Web Server 1] ──┐
              ├── [Web Server 2] ──┤
              └── [Web Server 3] ──┤
                                   ├── [Database Server (Primary)]
                                   ├── [Cache Server (Valkey)]
                                   └── [Worker Server(s)]
```
Each component is independently sized and scalable. You can add web servers during a traffic spike, upgrade the database server without touching anything else, or add worker servers during a heavy processing period.
Cost Analysis: When Multi-Server Makes Financial Sense
Multi-server architecture costs more in raw infrastructure but often saves money when you account for the full picture.
Single large server approach:
- 1x server with 8 vCPU, 32GB RAM: ~$96/month on Hetzner, ~$192/month on DigitalOcean
Multi-server approach (equivalent total resources):
- 2x Web servers (2 vCPU, 4GB each): ~$14/month on Hetzner
- 1x Database server (4 vCPU, 16GB): ~$22/month on Hetzner
- 1x Cache server (2 vCPU, 4GB): ~$7/month on Hetzner
- 1x Worker server (2 vCPU, 4GB): ~$7/month on Hetzner
- Total: ~$50/month on Hetzner
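The comparison is simple enough to sanity-check; the prices are the illustrative Hetzner figures from the text:

```python
# Sanity-check the example monthly costs (illustrative Hetzner prices from the text).
single_large = 96  # 1x 8 vCPU / 32 GB server

multi = {
    "web x2 (2 vCPU / 4 GB each)": 14,
    "database (4 vCPU / 16 GB)": 22,
    "cache (2 vCPU / 4 GB)": 7,
    "worker (2 vCPU / 4 GB)": 7,
}
multi_total = sum(multi.values())

print(f"multi-server total: ${multi_total}/month")
print(f"saved vs one large server: ${single_large - multi_total}/month")
```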
The multi-server setup costs less than the single large server because each component is sized for its actual needs. The database gets the most memory (for its buffer pool), while web servers and workers need less.
More importantly, multi-server architecture gives you redundancy. With a single server, any hardware failure means total downtime. With multiple web servers behind a load balancer, a single server failure reduces capacity but does not cause an outage.
Common Mistakes When Splitting
Splitting Too Early
Do not split your stack at the first sign of performance trouble. Optimize first:
- Add database indexes
- Implement caching for expensive queries
- Fix N+1 queries
- Enable PHP OPcache
- Configure MySQL's buffer pool properly
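The buffer pool item is a config-file change. An illustrative MySQL fragment — the sizes are assumptions, and on a shared single server the buffer pool must leave room for PHP-FPM:

```ini
# /etc/mysql/conf.d/tuning.cnf — illustrative sizes, not a recommendation
[mysqld]
# ~60-70% of RAM on a DEDICATED database server; far less on a shared box
innodb_buffer_pool_size = 4G
# Log anything slower than 1 second so you can find queries worth indexing
slow_query_log = 1
long_query_time = 1
```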
A well-optimized single server handles far more traffic than a poorly optimized multi-server setup.
Splitting the Wrong Component
If your bottleneck is CPU-intensive queue jobs, separating the database will not help. Identify the actual constraint before deciding what to split. Use monitoring data, not guesses.
Ignoring Network Latency
Every component you split introduces network latency. A database query that took 1ms locally takes 2ms over the network. If your application makes 50 queries per request, that is an extra 50ms. Optimize query counts before splitting, and always use eager loading to minimize round trips.
Forgetting About Deployment Complexity
More servers means more deployment targets. Deploynix handles this by deploying to all servers in a site configuration simultaneously, but you still need to think about deployment order. Database migrations should complete before new application code goes live on web servers.
Conclusion
The progression from single server to multi-server architecture is not a switch you flip — it is a series of deliberate decisions driven by observed bottlenecks and growth requirements. Start with a single App server, monitor its performance, and split components only when monitoring data shows a clear bottleneck.
The typical progression is: separate the database first (it always benefits from dedicated resources), then the cache (especially before adding multiple web servers), then workers (when background processing impacts web performance), and finally add more web servers behind a load balancer (when you need horizontal scaling or redundancy).
Deploynix's server types — App, Web, Database, Cache, Worker, Meilisearch, and Load Balancer — map directly to this architecture. Each server type is provisioned with optimized defaults for its role, and firewall rules between servers can be configured through the firewall management interface. The transition from single server to multi-server does not require starting over or re-learning your deployment workflow.
Build what you need today, monitor what you have, and scale when the data tells you to. That is the only scaling strategy that avoids both premature optimization and unpleasant surprises.