Anmol Baranwal
7 steps to building scalable Backend from scratch

Creating a scalable backend from scratch can be very hard.

With the right approach, you can build something ready to grow as your app does.

Today, we will cover seven practical steps to help you scale your backend, and how Encore (a backend framework) helps with some of them.

Let's jump in.


🎯 What does scaling the backend mean?

Before we start with the points, it's important to understand the overall concept.

Scaling a backend is necessary to make sure your app performs well as traffic, users or data grow. For instance, think of a social media app that handles millions of daily photo uploads or an e-commerce website managing sales with thousands of orders per second (such as Amazon).

There are two main approaches to that:

✅ vertical scaling (adding more resources to a single server)
✅ horizontal scaling (distributing the load across multiple servers)

 

You can watch this video if you want to know the difference between horizontal scaling and vertical scaling.

My suggestion? If you are just starting out, don't overthink it.

I believe it's something to worry about once you're sure you'll need better infrastructure. Until then, you can rely on the usual setup; many apps don't need that kind of overengineering.

That said, let's learn more about it.


1. Build the right foundation with the correct architecture.

Think of your backend like the blueprint of a building. A bad foundation will lead to issues as you add more weight (or users). Let's look at the two main types of architecture you can use.

🎯 What is Monolithic Architecture?

Monolithic Architecture means everything is built as one big app. It’s easier to start with but can get messy as you grow. All parts of your app depend on each other, so if one part breaks, the whole app can be affected.

Many popular codebases started this way, such as early versions of Instagram. This architecture works for smaller apps, but it can get messy as you scale.

To be honest, a monolith can still be designed in a modular way that addresses most of these disadvantages.

monolithic architecture

 

🎯 What is Microservices Architecture?

Microservices Architecture breaks down the application into smaller, independent services that communicate over APIs. This approach is more scalable and flexible because you can update or scale each service independently without affecting the whole system.

Netflix, Amazon and countless others use this, where different parts of the app (like user authentication, payments, recommendations) are handled by separate services.

This can get tedious and complex to operate, so it's mostly recommended for larger teams and companies.

microservices architecture

You should also read 5 Common microservices pain points and how to deal with them.

If you're interested in a comparison guide, check this blog about Microservices vs Monolith by Atlassian.

 

Encore is an open source backend framework and it supports microservices architecture. It's designed to give teams the tools they need for building microservices, from day one.

It also provides a visual tool known as Flow that gives you an always up-to-date view of your entire system, helping you reason about your microservices architecture and identify which services depend on each other and how they work together.

✅ Track down bottlenecks before they grow into big problems.
✅ Instantly reveal the nature and scale of each service's dependencies.
✅ Auto-update in real time to reflect your architecture as you make code changes.

encore flow visual tool


2. Make your app stateless for true horizontal scaling.

Statelessness means each server doesn’t need to remember user sessions. It’s like not relying on one cashier to complete your shopping, where any cashier can help you.

This approach makes it easier to add or remove servers without disrupting the app, since servers don't depend on each other to function.

It's more useful for apps that expect traffic spikes, like Spotify during music premieres or e-commerce sites (like Amazon) during sales events.

 

🎯 How to Implement Statelessness?

To achieve this, session management must be offloaded. You can use:

  • JWT (JSON Web Tokens) : a simple way to pass user data between requests without storing it on the server.

  • Redis : for session management like storing temporary session data in memory, and caching frequently accessed data, to minimize database calls and thus improve performance. You can also use Memcached instead of Redis.
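To make the idea concrete, here is a minimal sketch of a stateless, HMAC-signed token in TypeScript. It is hand-rolled for illustration (the secret and payload are made up); in a real app, use a vetted library such as jsonwebtoken or jose.

```typescript
import { createHmac } from "node:crypto";

// Assumption: every server in the fleet shares this secret.
const SECRET = "demo-secret";

const b64url = (s: string) => Buffer.from(s).toString("base64url");

function sign(payload: object): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = b64url(JSON.stringify(payload));
  const sig = createHmac("sha256", SECRET)
    .update(`${header}.${body}`)
    .digest("base64url");
  return `${header}.${body}.${sig}`;
}

function verify(token: string): object | null {
  const [header, body, sig] = token.split(".");
  const expected = createHmac("sha256", SECRET)
    .update(`${header}.${body}`)
    .digest("base64url");
  // Any server holding the secret can verify the token -- no shared
  // session store is needed, which is what makes the design stateless.
  if (sig !== expected) return null;
  return JSON.parse(Buffer.from(body, "base64url").toString());
}
```

Because the user's identity travels inside the signed token, the load balancer can send each request to any server without session affinity.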

For example, Netflix uses stateless designs extensively for user sessions, which leads to smooth streaming regardless of server loads.

Also, REST APIs are typically stateless by design, which is why they scale so well.

Cheatsheet diagram on stateless architecture by ByteByteGo

3. Smart traffic distribution that can save the day.

In this section, we will cover the topic of Load balancing. It's the process of distributing incoming requests evenly across multiple servers to make sure no single server is overwhelmed, which can otherwise lead to slowdowns or crashes.

It improves availability, fault tolerance and, of course, scalability.

load balancer

There are different algorithms, each targeting a different traffic pattern:

Round Robin : assigns requests sequentially to each server in a loop. It's simple but may struggle with uneven loads.

Least Connections : directs traffic to the server with the fewest active connections.

Weighted Round Robin : weights are assigned to each server based on capacity, allowing servers with higher weights to handle more traffic.

IP Hash : uses the hashed value of the client's IP address to decide which server fulfills the request, so the same client consistently lands on the same server.
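The first two algorithms above can be sketched in a few lines of TypeScript (the server names and connection counts are made up for illustration):

```typescript
interface Server {
  name: string;
  activeConnections: number;
}

const servers: Server[] = [
  { name: "s1", activeConnections: 0 },
  { name: "s2", activeConnections: 0 },
  { name: "s3", activeConnections: 0 },
];

// Round Robin: hand requests to each server in turn, wrapping around.
let rrIndex = 0;
function roundRobin(pool: Server[]): Server {
  const server = pool[rrIndex % pool.length];
  rrIndex++;
  return server;
}

// Least Connections: pick the server with the fewest in-flight requests.
function leastConnections(pool: Server[]): Server {
  return pool.reduce((best, s) =>
    s.activeConnections < best.activeConnections ? s : best
  );
}
```

Round robin is stateless apart from a counter, while least connections needs live connection counts, which is why real load balancers track per-backend state.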

There is an interesting article by Equinix that breaks down how load balancing algorithms work with visual diagrams.

 

🎯 Tools to Implement Load Balancers?

Here are some of the most widely used tools to implement this:

  • NGINX : open source web server and reverse proxy, very flexible in handling load balancing and caching.

  • AWS ELB : cloud-managed load balancer that automatically distributes incoming traffic across instances and scales based on demand.

  • HAProxy : lightweight and highly reliable open source load balancer to handle large volumes of traffic with minimal latency.

There are more tools beyond these that you can find in online guides.

Tip: You could combine load balancers with Content Delivery Networks (CDNs) like Cloudflare to reduce the load on servers.


4. Don't let your database be the bottleneck.

Your database can turn into a major problem as your app grows (as it's usually never optimized). Slow queries and delays can frustrate users and drag everything down.

There are smart strategies you can use to handle the load properly.

🎯 Common tips for developers to optimize your database.

Indexing : adding indexes to frequently queried columns for faster lookups. For instance, e-commerce apps can index product IDs for quicker search results.

Caching : storing frequently accessed data in memory, reducing the number of direct hits to the database. Redis or Memcached can be useful here.

Database sharding : for large-scale apps, splitting the database into smaller, more manageable pieces. Each piece (known as a shard) lives on a separate server, so each server handles only part of the data.

Database partitioning : a similar idea within a single database instance, splitting tables into smaller, manageable pieces (like dividing customers by region) to distribute the load.

Connection pooling : using a pooling library to reuse database connections efficiently instead of opening a new connection per request, which can overwhelm the database.
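As a rough illustration of sharding, here is a TypeScript sketch that routes a user ID to one of four hypothetical shards using a stable hash (real systems often use consistent hashing so shards can be added with minimal data movement):

```typescript
import { createHash } from "node:crypto";

// Hypothetical shard names; in practice these would be connection
// strings or database hosts.
const SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"];

// A stable hash of the shard key (the user ID) decides which shard
// holds that user's rows, so every server routes the same way.
function shardFor(userId: string): string {
  const digest = createHash("sha256").update(userId).digest();
  // Interpret the first 4 bytes of the digest as an unsigned integer.
  const n = digest.readUInt32BE(0);
  return SHARDS[n % SHARDS.length];
}
```

The same function run on any app server maps a given user to the same shard, which is what keeps routing decisions consistent across the fleet.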

There are many more concepts like query optimization and denormalization, but I'm not covering everything here (otherwise this would get very long).

If your app grows big, it's recommended to use distributed databases like MongoDB, Amazon Aurora or CockroachDB, which handle larger datasets and traffic in a much better way.

If you're looking to learn more, I recommend checking out:

Database scaling cheatsheet by ByteByteGo


5. Smart monitoring and logging to prevent backend meltdowns.

As your app scales, smart monitoring and logging can help you stay on top of your backend’s health and catch problems before they get out of control.

Imagine your app goes viral overnight, traffic surges and everything seems fine at first. Then, users start reporting errors. Without proper monitoring and logs, you will be left in the dark to figure out what actually went wrong.

Real-time insights and detailed records let you identify bottlenecks and debug faster, which makes your work a lot easier, especially during unexpected traffic spikes.

 

🎯 What is Monitoring?

Monitoring tools track your app's performance in real time. They give you metrics like server load (resource usage), response times and error rates so you can spot trends. You can use tools like:

  • Datadog : gives you visibility across your entire stack, with log management, application performance monitoring (APM) and real-time dashboards.

  • Prometheus : open source monitoring tool that collects metrics from configured targets at specified intervals. Plus, it integrates well with Grafana for visualizing data.

  • New Relic : another monitoring tool with insights related to transaction times, error rates and database query performance.

You can even set up alerts to notify your team the moment something looks off.

 

🎯 What is Logging?

Logging records every significant event in your backend like requests, errors or warnings. It’s like a diary for your app, helping you debug when things break.

  • Centralized logging using tools like ELK Stack or AWS CloudWatch makes it easy to search and analyze logs across all services.

  • Structured logging (JSON format) makes logs machine-readable, which makes debugging noticeably easier.
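A structured logger can be as simple as printing one JSON object per event. A minimal sketch (the field names here are arbitrary):

```typescript
type Fields = Record<string, unknown>;

// Emit one machine-parseable JSON line per event, with a level,
// message, timestamp, and arbitrary key-value fields.
function logLine(
  level: "info" | "warn" | "error",
  msg: string,
  fields: Fields = {}
): string {
  const entry = { level, msg, time: new Date().toISOString(), ...fields };
  const line = JSON.stringify(entry);
  console.log(line);
  return line;
}
```

A call like `logLine("info", "user logged in", { userId: 42 })` produces a line that log aggregators such as the ELK Stack or CloudWatch can index and query by field.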

 

🎯 What is Tracing?

Tracing helps track a single request across multiple services, pinpointing slow or failing parts. Tools like Jaeger or OpenTelemetry are great for this.
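The core idea behind tracing is simple: assign one trace ID at the entry point and carry it through every downstream call. A toy TypeScript sketch (the service names are made up; real systems use OpenTelemetry SDKs, which also record timings and parent-child span relationships):

```typescript
import { randomUUID } from "node:crypto";

interface Span {
  traceId: string;
  name: string;
}

// Collected spans; a real tracer would export these to a backend
// like Jaeger instead of keeping them in memory.
const spans: Span[] = [];

function startSpan(traceId: string, name: string): Span {
  const span = { traceId, name };
  spans.push(span);
  return span;
}

// Simulated request flow: gateway -> auth service -> database call.
function handleRequest(): string {
  const traceId = randomUUID(); // assigned once, at the entry point
  startSpan(traceId, "gateway");
  startSpan(traceId, "auth-service");
  startSpan(traceId, "db-query");
  return traceId;
}
```

Because every span carries the same trace ID, a tracing backend can stitch them into one timeline and show exactly where a slow request spent its time.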

 

To get the maximum benefit, you can combine monitoring, logging and tracing. You can implement all of these with Encore.

✅ Encore provides distributed tracing to track requests across your app and infrastructure.

As opposed to the intensive instrumentation you would normally need to go through to use tracing, Encore automatically captures traces for your entire application, in all environments. Uniquely, this means you can use tracing even for local development to help debug and speed up iterations.

You get access to a lot of information such as stack traces, structured logging, HTTP requests, network connection information, API calls, database queries and more.

encore distributed tracing

 

✅ In the Encore cloud, you also get built-in support for keeping track of key metrics. It's also easy to define custom metrics for your app.

key metrics encore

 

✅ Encore offers built-in support for Logging, which combines a free-form log message with structured and type-safe key-value pairs. This makes logs easier for a computer to parse, analyze and index.

You just need to import it in your module: import log from "encore.dev/log";.

Then you can call any of the logging functions such as error, warn, info, debug or trace to emit a log message.

const logger = log.with({is_subscriber: true})
logger.info("user logged in", {login_method: "oauth"}) // includes is_subscriber=true

You can also live-stream your logs directly to your terminal by running: encore logs --env=prod. Read more on the docs.

In short, Encore provides a lot of things including a way to visualize your cloud microservices architecture and automatically get a Service Catalog with complete API docs.

If you're interested in learning more, A Guide to Observability Principles: Significance, Practical Implementations, and Best Practices is a good place to start.


6. Automate the infrastructure provisioning and DevOps.

Manually setting up servers, configuring environments or deploying code can slow you down as your app scales. Automation eliminates most of the repetitive tasks and improves consistency.

🎯 What is Infrastructure Automation?

Infrastructure automation means using tools to manage and provision your servers, networks, and databases automatically, without manual intervention.

  • Tools like Terraform or AWS CloudFormation let you define your infrastructure as code (IaC).

  • With IaC, you can spin up identical environments for development, testing, and production with just one command.

Pulumi is one such modern IaC tool that uses programming languages you already know, like JavaScript or Python.

 

🎯 Why should you try to automate DevOps?

DevOps automation improves CI/CD pipelines, deployment processes and monitoring tasks.

  • Tools like Jenkins, GitHub Actions, or GitLab CI help in faster, error-free code deployments.

  • Rolling updates or blue-green deployments can happen easily without downtime.

For instance, Kubernetes automates container orchestration by managing your app’s scaling and load balancing on its own.

 

Let's see how Encore helps in this department.

Encore lets you define infrastructure as type-safe objects in your app, unifying your infra with your application code. Encore can then automate the infrastructure provisioning and DevOps, by parsing the application code.

Encore Cloud automatically provisions all necessary infrastructure, in all environments and across all major cloud providers, without requiring application code changes.

infrastructure provisioning

With Encore you do not define any cloud service specifics in the application code. This means that after deploying, you can safely use your cloud provider's console to modify the provisioned resources or use the built-in configuration UI in the Encore Cloud dashboard. Encore Cloud takes care of syncing the changes automatically in both directions.

There are many other things that get easier with Encore Cloud, as it can fully automate DevOps and infrastructure on AWS and GCP, reducing DevOps work by almost 90%.

If you're interested in learning more, check out the Encore docs.


7. Caching secrets no one talks about.

Caching just might be the most important part of backend scalability, often overlooked but capable of drastically reducing response times and server load.

The secret lies in using caching smartly for the parts of your app that demand it the most.

🎯 What is Caching?

Caching is used to store frequently accessed data temporarily so that future requests can be served faster without repeatedly querying the database or reprocessing the information.

For instance, think about an e-commerce app like Amazon during Black Friday sales. Millions of users search for trending deals and best-sellers.

Instead of querying the database for every search, the app can cache the most popular products and their details for a few minutes. This not only improves response times for users but also reduces database strain during peak traffic.

caching

 

🎯 Caching techniques you should know about.

Application-Level Caching : frameworks like Django or Express.js support in-memory caching for specific business logic.

In-Memory Caching : ideal for session data, user profiles, or any frequently accessed data that requires low latency. Redis or Memcached is perfect for this.

Content Delivery Networks : services like Cloudflare or Akamai cache static files (e.g., images, CSS, JavaScript) closer to the user’s location. For instance, Amazon uses CDNs to load product images almost instantly.

Client-side caching : storing data on the client's device, like a web browser. This can lead to issues with stale data, as the cached copy may not always be up to date.

Distributed caching : allows multiple servers to share the workload of storing and retrieving data.

caching

 

Then there are also caching strategies to improve performance like:

Cache Aside (lazy loading) : first checks the cache for data. If the data is not present (a cache miss), it retrieves it from the database and populates the cache for future requests. This minimizes cache size and makes sure only frequently used data is stored.

Cache Invalidation : to update stale data, by using time-based expiration or event-driven updates.

Data Partitioning : partitioning your cache based on specific criteria (like user ID or geographical location).

Write Through : When data is updated, it is written to both the cache and the database at the same time.

Write Behind : data is written to the cache first and then to the database at a later time. This could lead to inconsistencies if the cache is not managed properly.

Adaptive Caching : dynamically adjusts caching policies based on real-time data access patterns, traffic load or resources used.

Cache Aggregation : combines multiple cache entries into a single, consolidated response, reducing the overhead of multiple lookups. Mainly useful for APIs or microservices.

Cache Stampede Prevention : makes sure that multiple requests for the same data don't overwhelm the backend when a cache entry expires. Techniques like locking, request collapsing or early recomputation can help.

There are many more strategies, such as content-aware caching, multi-level caching and cache warm-up, that not many developers are aware of.

If you're interested in learning more, there is a very interesting article on Cache strategies that gives you types of caching, strategies, real life examples and even code samples using those techniques. Highly recommended!

There is a very good blog that covers 9 Caching Strategies for System Design Interviews.

By using caching strategically, you can scale your backend without burning through server resources.


😮 Whoa! That's a lot to implement just to scale a backend. Is it really worth it? The short answer is “Yes”.

I hope these will give you a solid foundation to scale your backend.

Have a great day! Until next time :)

I also run an international community for developers and technical writers (250+ members) where I share everything I learn. You can join at dub.sh/opensouls.

If you loved this,
please follow for more :)
Thank you for reading, Anmol 🥰