It is very easy to talk about your current systems and code as if they are old and legacy. "That Monolith" is legacy. It is old and the code is written in a way new joiners might not like. Even the programming language or framework might no longer be in the top charts of the GitHub Octoverse.
But more often than not, these systems, these monoliths, are still the moneymaker (€€) for your company. The same goes for us at Coolblue.
Let me explain the problem we believe we have with at least one of our monoliths, based on two concerns.
As written earlier, our vision is to place Guided Ownership as low as possible in the organisation: in the Development teams. A monolith that you want to improve, or that needs maintenance, runs counter to that vision.
Shared ownership tends to lead to no ownership.
We are probably not the first department that sees this issue 1.
Some practical issues can also arise when multiple teams work on a single solution. Most of these can be addressed with proper release management, good automation and a mature CI/CD platform.
One of our two monoliths was written almost two decades ago. We used Delphi to write an application to handle all aspects of our business processes.
Delphi in itself is not the problem, even though its usage in the market is declining. For us the more pressing reason to actively address this monolith is that the application is written as a Windows desktop application.
The biggest concern we have is the lack of clear and separated business logic in the application design. Logic resides in either a button click/screen or in a trigger in the data layer.
We also built a single database and data model for this monolith to work on. Here the lack of clear ownership is becoming more and more visible. Services created by other teams use data from tables across the database, which makes it hard to innovate and to change your own schema.
I am not going to air our laundry too much here; I am just setting the scene for why we are moving forward with the following approach.
# What are we going to do about it? String.Replace("monolith","guided ownership");
The basis of our approach to solving the described problems is going to be: replace by re-architecting. Every time we want to improve or change a process, we will build a new solution and make it replace part of the monolith.
This approach allows us to work piece by piece, carving out features, processes and data as we go, and it allows for MVP-like implementations first. The downside is that in many cases part of your logic and data lives in one system (the monolith) and part in a newer replacement system. For the data, this means we have the constant challenge of keeping the data warehouse accurate and consistent.
We made the following choices to help us do exactly this:
Domain Driven Design; together with our tech principle design-to-encapsulate-volatility, using a pattern such as Ports and Adapters allows our code to separate business logic from infrastructure-specific code (e.g. Oracle-specific queries). This allows us to isolate the parts we are re-architecting that still need access to the monolith or its data.
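To make the Ports and Adapters idea a bit more concrete, here is a minimal sketch in Python. It is not our actual code: the `Order` type, the repository names and the `'Y'`/`'N'` flag are all hypothetical, and a dict stands in for the Oracle queries a real adapter would contain. The point is only that the business rule depends on the port and never on a specific store.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Order:
    order_id: str
    total_cents: int
    paid: bool


class OrderRepository(Protocol):
    """Port: what the domain logic needs, with no storage details."""

    def get(self, order_id: str) -> Order: ...


class MonolithOrderRepository:
    """Adapter for the monolith's database (hypothetical).

    A real adapter would hold the Oracle-specific SQL; here a dict of
    raw rows stands in for the shared schema.
    """

    def __init__(self, rows: dict[str, tuple[int, str]]) -> None:
        self._rows = rows

    def get(self, order_id: str) -> Order:
        total_cents, paid_flag = self._rows[order_id]
        # The legacy 'Y'/'N' flag is translated here and never leaks out.
        return Order(order_id, total_cents, paid_flag == "Y")


class InMemoryOrderRepository:
    """Adapter for the re-architected store (also a stand-in)."""

    def __init__(self, orders: dict[str, Order]) -> None:
        self._orders = orders

    def get(self, order_id: str) -> Order:
        return self._orders[order_id]


def can_ship(order_id: str, orders: OrderRepository) -> bool:
    """Business rule: depends only on the port, not on any adapter."""
    order = orders.get(order_id)
    return order.paid and order.total_cents > 0
```

With this shape, swapping the monolith adapter for the new one is a change at the place where the application is wired together, and `can_ship` stays untouched.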
Events and Event Driven Architecture; to decouple the different processes and their supporting apps and services we are moving towards Event Driven Architecture. This way we can abstract the data leaving the bounded context of an application from its technical form or technology. This helps to hide the underlying situation of having two data stores during the transition period (the monolith on the one hand, and the new re-architected one on the other) from the systems receiving these events.
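As an illustration of hiding the two data stores behind one event, the sketch below maps a monolith row and a record from a new service onto the same event contract. Everything in it is an assumption made for the example: the `OrderShipped` event, its fields, and the column names `ORD_NR` and `SHIP_DT` are invented, not taken from our systems.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class OrderShipped:
    """Technology-neutral event contract (hypothetical fields)."""

    order_id: str
    shipped_at: str  # ISO 8601 timestamp


def from_monolith_row(row: dict) -> OrderShipped:
    # Legacy column names stay inside this translation function.
    return OrderShipped(order_id=str(row["ORD_NR"]), shipped_at=row["SHIP_DT"])


def from_new_service(record: dict) -> OrderShipped:
    # The new service's field names are translated in the same way.
    return OrderShipped(order_id=record["orderId"], shipped_at=record["shippedAt"])


def to_message(event: OrderShipped) -> str:
    """Serialise the event; consumers only ever see this shape."""
    payload = {"type": type(event).__name__, "data": asdict(event)}
    return json.dumps(payload, sort_keys=True)
```

A consumer that receives the output of `to_message` cannot tell which store the event came from, which is exactly the property we want during the transition period.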
We also see that, with more and more apps and services emitting and relying on events, we will need to support that. We chose to begin with Apache Kafka as the platform to facilitate these event streams. This allows our Data Warehouse to tap into these streams and enables teams to rely on Kafka to stream between apps (bounded contexts).
Let me be clear, we are not going to replace all inter-service communication with events and Kafka. For some processes a batch approach, e.g. via Airflow, is still a valid and great choice. The "it depends" strikes again.
This brings a new problem to the table for our teams: setting up producers, topics and consumers in this new Kafka platform. We have to help the teams here by enabling them.
Here are three examples of how we have been and will be enabling our teams for this new direction.
In a previous post I already hinted that in 2023 we are looking to invest in and develop the tools, templates and processes that make deploying an app or service with queues, Kafka topics and BigQuery staging tables, alongside the needed compute components on AWS, easy and self-service.
In the past few months we have deliberately done a few things to build the knowledge needed. We sent a small delegation to the DDD Europe event and the Event Storming workshop that was held then (summer 2022). This group soaked up the knowledge and then helped teams across our tech organisation run event storming sessions for their own domains, both to learn the technique and to use the output in the design of their next solution. This was a joint effort between Principal Developers, Data Engineering and Development teams.
We have also done some research and experiments, and based on that we created a solution block that implements the Transactional Outbox pattern in one of our core technologies: .NET/C#. We will add this to our template for creating new apps and services. This way teams can use it right away and can see how it integrates into their new solution.
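For readers new to the Transactional Outbox pattern, the sketch below shows the core idea. It is not our .NET solution block: it is a small Python illustration using the standard `sqlite3` module, and the table names, the `OrderPlaced` event and the `publish` callback are assumptions made for the example. The state change and the event are written in one database transaction, and a separate relay step publishes the pending events afterwards.

```python
import json
import sqlite3
from typing import Callable


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " topic TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " published INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn: sqlite3.Connection, order_id: str) -> None:
    """Write the state change and its event in ONE transaction."""
    with conn:  # commits on success, rolls back on any exception
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,)
        )
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("orders", json.dumps({"type": "OrderPlaced", "order_id": order_id})),
        )


def relay_outbox(
    conn: sqlite3.Connection, publish: Callable[[str, str], None]
) -> int:
    """Publish pending rows in insertion order; mark each one afterwards.

    A crash between publish and the update re-sends the row on the next
    run, so delivery is at-least-once and consumers must be idempotent.
    """
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)  # in real code: a call to a Kafka producer
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because the event row only exists if the order row was committed, the service never announces a change that did not happen, and never loses an announcement for a change that did.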
Going forward, we will have to keep checking whether Kafka is the right fit for us to do more decoupling via Events. We chose this technology based on a proof of concept we did almost 2 years ago.
We need to keep an eye on how the carving up of this monolith is progressing. Many parts are already separate applications focussed on an Employee Activity. But turning a monolith off completely is the biggest challenge.
And lastly, whatever we do or change in our technology vision going forward, we need to make sure we always enable the teams. Give them the Guided Ownership they need to realise our vision.