Web applications are often divided into multiple deployment units. These are often called microservices, although most of the time they are not really microservices. An architecture that is divided into multiple deployment units is called a distributed architecture. A deployment unit is a self-contained package of software components that can be deployed individually. An example is a single web application with its database. In web applications, deployment units are connected through protocols such as REST, SOAP, events or others. An architecture that is not divided this way is called a monolithic architecture.
Advantages of distributed architectures over monolithic architectures
- Deploy applications over several machines and you'll have much more computing power available.
- When designed well, applications won't go down completely if a part of the application goes down.
- Multiple teams can each work more independently on their own part of the application.
Sounds great, right? Well, there are a lot of challenges and disadvantages to face if we choose a distributed architecture. In the next section I will describe them.
The fallacies of distributed computing
The fallacies of distributed computing are a set of eight false assumptions programmers and architects often make. They were formulated by L Peter Deutsch and others at Sun Microsystems in 1994. The next sections describe them and provide options to deal with the problems these fallacies might cause.
Fallacy 1: The network is reliable.
If service 2 works perfectly well but is not accessible to service 1 due to network issues, service 2 is still unavailable from the point of view of service 1. This is why timeouts, circuit breakers and retry policies exist. A great tool for .NET to handle common network issues is Polly, but even when using a tool like this, the network is still not completely reliable.
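To make the retry and breaker ideas above a bit more concrete, here is a minimal hand-rolled sketch in Python. It is not Polly's API (Polly is a .NET library); the names `CircuitBreaker`, `CircuitOpenError` and `call_with_retry` are hypothetical, and the thresholds are arbitrary. The breaker follows the pattern usually called a circuit breaker: after a few consecutive failures it stops calling the remote service for a while and fails fast instead.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call without trying the network."""


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, allows a trial call after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open, failing fast")
            # Cool-down has passed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def call_with_retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry `func` with exponential backoff and re-raise the last error."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # do not hammer a service the breaker has given up on
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A caller could combine the two as `call_with_retry(lambda: breaker.call(fetch_orders))`, where `fetch_orders` stands for whatever function performs the network call. Timeouts are left out here; they would normally be set on the HTTP client itself.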
Fallacy 2: Latency is zero
If component 1 calls component 2 within a monolith, the latency is almost zero. If a network call has to be made, the call will take much longer to complete, especially if many network calls have to be made. If the latency of a particular call is 100 ms and we chain ten calls, latency adds one second to the business process. Options to reduce the impact of latency:
- Reduce the number of network calls. Instead of sending multiple pieces of data individually, send them in the same request.
- Latency could be reduced by moving the data closer to the client. If the client is in West Europe, make sure your data is in West Europe as well.
- Temporarily caching data could reduce the number of network calls, reducing the latency to zero if the data has already been fetched. Storing data locally at the client (with a pub/sub model, for example) could also be an option to reduce the latency to zero.
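The latency arithmetic and the caching option above can be sketched as follows. This is a minimal illustration under assumptions: `TtlCache` and `chained_latency_ms` are hypothetical names, the 60 second lifetime is arbitrary, and `fetch` stands for whatever function performs the slow network call.

```python
import time


class TtlCache:
    """Keeps fetched values for `ttl` seconds so repeated reads skip the network."""

    def __init__(self, fetch, ttl=60.0, clock=time.monotonic):
        self.fetch = fetch      # function that performs the (slow) network call
        self.ttl = ttl
        self.clock = clock
        self._entries = {}      # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self.clock():
            return entry[1]     # cache hit: no network call
        value = self.fetch(key)
        self._entries[key] = (self.clock() + self.ttl, value)
        return value


def chained_latency_ms(calls, latency_ms=100):
    """Latency added by `calls` sequential network calls."""
    return calls * latency_ms
```

With these definitions, `chained_latency_ms(10)` gives the 1000 ms from the example, and every cache hit removes one of those calls. The trade-off is that cached data can be up to `ttl` seconds out of date.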
Fallacy 3: Bandwidth is infinite
Let's say component 1 fetches 500 KB of data from service 2. That doesn't sound like much, but if that happens 2,000 times, 1 GB of bandwidth will be used. This could cause increased latency and bottlenecks. Therefore, monitoring bandwidth is probably a good idea in a distributed architecture. Ways to reduce the amount of bandwidth used are:
- Caching
- Storing data locally
- Compression
- GraphQL
- Field selectors
- You could also use lightweight data formats like JSON or a binary serialization format.
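As a rough illustration of the bandwidth arithmetic and of two options from the list (field selectors and compression), here is a small sketch using only the Python standard library. The function names and the `customer` record are made up for the example, and the sizes it prints depend entirely on that made-up data.

```python
import gzip
import json


def total_transfer_gb(payload_kb, requests):
    """Total traffic in GB (decimal units: 1 GB = 1,000,000 KB)."""
    return payload_kb * requests / 1_000_000


def select_fields(record, fields):
    """Return only the requested fields, like a field selector on an API."""
    return {name: record[name] for name in fields if name in record}


def encode(record, compress=False):
    """Serialize to JSON bytes, optionally gzip-compressed."""
    body = json.dumps(record).encode("utf-8")
    return gzip.compress(body) if compress else body


# Hypothetical record with one large field the caller does not need.
customer = {"id": 42, "name": "Ada", "notes": "lorem ipsum " * 500}

full = encode(customer)
trimmed = encode(select_fields(customer, ["id", "name"]))
compressed = encode(customer, compress=True)

# Compare payload sizes in bytes: full, field-selected, compressed.
print(len(full), len(trimmed), len(compressed))
```

`total_transfer_gb(500, 2000)` reproduces the 1 GB from the example above. Field selection avoids sending data at all, while compression trades bandwidth for CPU time on both sides.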
Fallacy 4: The network is secure
The attack surface of distributed applications is much bigger than the attack surface of monolithic applications. Every single component should be secured, because there are many ways your application can be attacked: XSS, vulnerabilities in operating systems and libraries, and DDoS attacks, just to name a few.
Fallacy 5: The topology never changes
This fallacy is about every network component, like routers, servers, firewalls and proxy servers. The topology changes all the time. Updated network components could make services unavailable. If a component breaks, it will be replaced. If a server can't handle the load anymore, it could be replaced, or a load balancer and an extra server could be added. With modern technology like Kubernetes, Docker and Azure App Service, virtual machines or containers can even be added or removed dynamically. Ways to deal with a changing topology:
- Use host names instead of hard-coded IP addresses.
- If that's not enough, use discovery services.
- Service bus frameworks could also help, because every component communicates only with the service bus.
- Automate as much as possible so you can replace a server as quickly as possible.
- Monitor all services.
Fallacy 6: There is only one admin
Distributed architectures are complex, especially when they get big. They can't be maintained by a single administrator, so it takes a lot of communication between teams to make everything work correctly. This makes decoupling, release management and monitoring extra important.
Fallacy 7: Transport cost is zero
This fallacy is about money. With a distributed architecture you will need extra servers, extra proxies, firewalls and so on, which makes a distributed architecture more expensive. If we want to cache data, as discussed in fallacies 2 and 3, we might need extra server memory or a Redis cluster. If we use compression, as discussed in fallacy 3, we need more computing power to compress the data. Extra resources are also needed for serializing and deserializing data. These things might seem cheap, but at large scale they could become very expensive.
Fallacy 8: The network is homogeneous
Networks consist of different components from different vendors which have to be compatible with each other. In a distributed architecture many different combinations of components can be used, and not all of them are fully tested. We also don't control which browsers and devices connect to our service. This fallacy isn't only about hardware. It's also about software.
- Try to use open and popular standards like JSON or XML.
- Using PaaS or IaaS providers will take some hardware challenges away.
Other challenges
Monitoring
Finding bugs in a distributed architecture is hard. A monolithic application has one log, while a distributed application has several. Combining all these logs is necessary to trace what happened when an error occurred. There are tools for this, but it's still much more difficult than reading a single log.
Contract versioning
When multiple components talk to each other they need to understand each other. A data contract is used for this purpose. A data contract describes the messages being sent from one component to another. It specifies which standard is being used (XML or JSON, for example), the properties, their data types and the structure of the data. Contracts can't simply be changed, because a change might cause another component to break. Therefore contracts need versioning, so components can migrate to a new version of a contract. Changes in contracts must also be communicated to other development teams, and there should be an overview of which deployment unit uses which contract so you know when you can remove an old contract.
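One possible way to handle contract versions on the receiving side is to put an explicit version number in each message and dispatch on it. The sketch below assumes a JSON envelope with `version` and `data` fields and a hypothetical customer contract whose second version splits `name` into `first_name` and `last_name`; none of these names come from a real system.

```python
import json


def parse_customer_v1(payload):
    # v1 carried a single "name" field.
    return {"name": payload["name"]}


def parse_customer_v2(payload):
    # v2 split the name in two; map it back to the internal shape.
    return {"name": f'{payload["first_name"]} {payload["last_name"]}'}


# Every contract version that is still supported by this component.
PARSERS = {1: parse_customer_v1, 2: parse_customer_v2}


def read_message(raw):
    """Dispatch on the contract version; treat a missing version as v1."""
    message = json.loads(raw)
    version = message.get("version", 1)
    try:
        parser = PARSERS[version]
    except KeyError:
        raise ValueError(f"unsupported contract version: {version}")
    return parser(message["data"])
```

As long as both parsers stay registered, senders can migrate from v1 to v2 at their own pace. Removing the v1 entry is the step that requires knowing no deployment unit still sends it.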
Deployment
Many components have to be deployed in a distributed architecture.
- Make sure components are loosely coupled so each can be deployed individually.
- Automate deployments to reduce the amount of work needed to deploy all components.
Distributed transactions
Transactions are easy in monolithic applications: begin transaction -> do stuff -> commit or roll back the transaction. But what if the stuff you want to do requires actions in multiple components? Technologies exist to handle these situations, but they are still much more complex than transactions that are not distributed. Distributed architectures often rely on eventual consistency instead.
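One technique that is sometimes used here, and that this post does not cover further, is to give every step a compensating action that undoes it, often called the saga pattern. The sketch below is a deliberately simplified in-process illustration; `run_saga` is a hypothetical helper, and a real implementation would also have to deal with failing compensations, persistence and retries.

```python
def run_saga(steps):
    """Run (action, compensation) pairs; undo completed steps in reverse on failure."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # A step failed: compensate everything that already succeeded.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

Each action would typically be a call to another component, for example reserving stock, creating an order and charging a payment. Between a failure and the end of the compensations the system is temporarily inconsistent, which is the eventual consistency mentioned above.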
Local development
Local development with a distributed architecture can be done in two different ways. The first is to set up all the components of the application (or only the subset a developer needs) on the developer's machine. The larger the application gets, the harder and more time-consuming this becomes. The other way is to set up an extra environment for development purposes. This environment must be maintained and comes with the cost of extra hardware. Infrastructure as code or scripts make it easier to set up new environments. Debugging is also much harder in a distributed environment.
What kind of application could be suitable for a distributed architecture?
- Applications with a huge code base. I used to work at a company where I had some colleagues who worked on a huge monolithic application which took hours to compile. In this scenario a distributed architecture could be a good idea.
- An application built in a big company with a lot of developers.
- Applications which need to be very scalable.
- Applications where availability is very important.
In all other scenarios a monolithic architecture is probably the best approach.
I hope that by now you can choose whether your next application will be monolithic or distributed, but that doesn't mean you're done yet. There are different types of distributed and monolithic architectures, each with their own advantages and disadvantages. It's important to know about them and make a good decision.
Details of these types are out of the scope of this post, so I will only mention some of them.
Monolithic architecture types
- Layered architecture
- Pipeline architecture
- Microkernel architecture
Distributed architecture types
- Service oriented architecture (SOA)
- Microservices
- Serverless
I hope you enjoyed reading this and I would love to hear any feedback!
Top comments (5)
Nice post. I think microservices got a bit overhyped, though they certainly do have merits and use cases. Once a developer admitted to me that they went with a microservice architecture because "microservices are cool" (I think they were criticizing their past self when they said that :)
I think the vast majority of projects should be started pretty monolithically, but you can and should design the monolith to be modular and easy to break out into microservices eventually should the circumstances require it. If you take pains to write modular code, you can get a lot of the benefits of microservices without a lot of the drawbacks.
Thank you. I absolutely agree that the vast majority of projects should start monolithically. I think most of them will never have to change to a microservices approach. The reason I often hear for using microservices is the need for scalability. When I talk further, I often see that the need for scalability is overrated. I also see a lack of knowledge about how to make a monolithic architecture scalable. That's why I just published this.
But there are always more project-specific options to make an application more scalable. I used to work on an application which was a SPA and an API. We also needed to process a lot of messages. We handled that with a separate Azure Function App. That way we had the scalability we needed, but the architecture didn't even come close to a microservices architecture.
If you're interested, check out PURISTA.
It's a TypeScript backend framework which addresses the things you mentioned.
I'm the initiator of it.
It allows you to implement your logic with the option to decide how you want to deploy it (later).
The range goes from a single monolith, through microservice style and lambda style, up to a mix of them.
I've taken a look at the documentation and I have to say that I like the idea very much. I think it should get more mature before I'd use it for serious clients, because it's new and there is only one developer.
Thanks for your feedback!
Sure, it is still young and currently not really production ready.
The biggest pain currently is that I'm the only developer. At the moment, I invest more time in getting feedback and preparing documentation, slides, articles and so on, to get more people engaged and to find other developers who are willing to join.