DEV Community

mafflerbach

Maintaining and Supporting Over 50 Services with Java, Apache Camel, and ActiveMQ/Kafka.

Maintaining and supporting more than 50 services is challenging, but with the right tools and processes in place it becomes manageable. In this post, we share some of our general experience with maintaining and supporting these services.

Our services are written in Java on the Apache Camel framework and use ActiveMQ or Kafka as the message bus. Every service follows the Dead Letter Queue (DLQ) pattern, which is implemented at the framework level. The modules of each service are strictly divided by role: filter, enrich, transform, or send.
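To make the module split concrete, here is a minimal sketch in plain Java, without Camel, of one message passing through the four roles in order. The class `StagePipeline`, the `Message` type, the `source` header, and the example rules are all hypothetical names chosen for illustration; the post does not show the actual framework code, so treat this as one possible shape rather than the real implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical sketch, not the actual framework code: one message passes
// through the four module roles (filter, enrich, transform, send) in order.
final class StagePipeline {

    // Minimal immutable message: a body plus string headers.
    static final class Message {
        final String body;
        final Map<String, String> headers;

        Message(String body, Map<String, String> headers) {
            this.body = body;
            this.headers = new HashMap<>(headers);
        }

        Message withHeader(String key, String value) {
            Map<String, String> copy = new HashMap<>(headers);
            copy.put(key, value);
            return new Message(body, copy);
        }

        Message withBody(String newBody) {
            return new Message(newBody, headers);
        }
    }

    private final Predicate<Message> filter;
    private final UnaryOperator<Message> enrich;
    private final UnaryOperator<Message> transform;
    private final Consumer<Message> send;

    StagePipeline(Predicate<Message> filter, UnaryOperator<Message> enrich,
                  UnaryOperator<Message> transform, Consumer<Message> send) {
        this.filter = filter;
        this.enrich = enrich;
        this.transform = transform;
        this.send = send;
    }

    // Returns true if the message was sent, false if the filter dropped it.
    boolean process(Message in) {
        if (!filter.test(in)) {
            return false;
        }
        send.accept(transform.apply(enrich.apply(in)));
        return true;
    }

    // Example wiring; the header name and the rules are invented for illustration.
    static StagePipeline example(List<String> outbox) {
        return new StagePipeline(
            m -> !m.body.isEmpty(),                                    // filter: drop empty payloads
            m -> m.withHeader("source", "orders"),                     // enrich: add context
            m -> m.withBody(m.body.trim().toUpperCase(Locale.ROOT)),   // transform: normalize the body
            m -> outbox.add(m.headers.get("source") + ":" + m.body));  // send: deliver downstream
    }

    public static void main(String[] args) {
        List<String> outbox = new ArrayList<>();
        StagePipeline pipeline = example(outbox);
        pipeline.process(new Message(" hello ", new HashMap<>()));
        pipeline.process(new Message("", new HashMap<>())); // dropped by the filter
        System.out.println(outbox);
    }
}
```

Keeping each role behind its own small function means a failing message can be traced to exactly one stage, which is presumably part of why the split is enforced so strictly.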

One of our support team's daily tasks is to check for failing messages and then contact the stakeholder to decide whether to discard the message or replay it. Communicating with the creator or receiver of a message and waiting for a response can be tedious, because we are just the "man in the middle" and have no ownership of the message content.
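The replay-or-discard workflow can be pictured as a small piece of bookkeeping. The sketch below is a hypothetical helper (`DlqTriage` and its methods are invented names, not part of Camel, ActiveMQ, or Kafka): failed messages stay parked until a stakeholder decision arrives, and only then are they either put back on the input queue or dropped. An in-memory list stands in for the real queue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical support helper, not part of Camel, ActiveMQ, or Kafka.
final class DlqTriage {

    enum Decision { REPLAY, DISCARD }

    // Failed messages waiting for a stakeholder decision, keyed by message id.
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Stand-in for the original input queue that replayed messages return to.
    private final List<String> inputQueue = new ArrayList<>();
    // Ids of discarded messages, kept as a simple audit trail.
    private final List<String> discardedIds = new ArrayList<>();

    // Park a failed message until its owner has answered.
    void park(String id, String body) {
        pending.put(id, body);
    }

    // Apply the stakeholder's decision; returns false if the id is not pending.
    boolean resolve(String id, Decision decision) {
        String body = pending.remove(id);
        if (body == null) {
            return false; // unknown id or already resolved
        }
        if (decision == Decision.REPLAY) {
            inputQueue.add(body);
        } else {
            discardedIds.add(id);
        }
        return true;
    }

    int pendingCount() { return pending.size(); }
    List<String> inputQueue() { return inputQueue; }
    List<String> discardedIds() { return discardedIds; }

    public static void main(String[] args) {
        DlqTriage triage = new DlqTriage();
        triage.park("msg-1", "order 42");
        triage.park("msg-2", "malformed payload");
        triage.resolve("msg-1", Decision.REPLAY);
        triage.resolve("msg-2", Decision.DISCARD);
        System.out.println(triage.inputQueue() + " " + triage.discardedIds());
    }
}
```

Recording the decision per message id, rather than acting on the whole queue at once, matches the situation described above, where each message may belong to a different owner.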

Our centralized logging and monitoring setup has proven very useful here. Logging is built on Log4j and Fluentd, and the logs are searchable in OpenSearch, which lets us identify issues quickly. Monitoring is handled by Grafana with Prometheus as the data source, while Opsgenie serves as our alerting system. With these tools in place, we are alerted as soon as something goes wrong and can address it promptly.

Automated testing is also crucial to maintaining our services. We rely on integration tests based on JBehave, which help us detect issues early in the development cycle. We have also implemented a standard failure-handling mechanism at the framework level. It handles failed messages, puts them on the right dead letter queue, and provides an automated retry mechanism.
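As an illustration of how retries and a dead letter queue can fit together, here is a minimal plain-Java sketch. It assumes that a message is retried a fixed number of times and is dead-lettered only once every attempt has failed; the post does not spell out the ordering or the retry policy, and `RetryingDeadLetterHandler` is a hypothetical name. Camel ships its own Dead Letter Channel error handler, which a real service would more likely configure instead of hand-rolling this logic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Hypothetical sketch of retry-then-dead-letter handling; assumes a fixed
// number of immediate attempts, which may differ from the real framework.
final class RetryingDeadLetterHandler {

    // A failed message together with the context support staff need for triage.
    static final class DeadLetter {
        final String body;
        final int attempts;
        final String reason;

        DeadLetter(String body, int attempts, String reason) {
            this.body = body;
            this.attempts = attempts;
            this.reason = reason;
        }
    }

    private final int maxAttempts;
    // In-memory stand-in for a real dead letter queue on the message bus.
    private final Deque<DeadLetter> deadLetters = new ArrayDeque<>();

    RetryingDeadLetterHandler(int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.maxAttempts = maxAttempts;
    }

    // Returns true if processing succeeded within maxAttempts, false if the
    // message ended up on the dead letter queue.
    boolean handle(String body, Consumer<String> processor) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                processor.accept(body);
                return true;
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        deadLetters.add(new DeadLetter(body, maxAttempts, String.valueOf(last.getMessage())));
        return false;
    }

    Deque<DeadLetter> deadLetters() {
        return deadLetters;
    }

    public static void main(String[] args) {
        RetryingDeadLetterHandler handler = new RetryingDeadLetterHandler(3);
        handler.handle("order 42", body -> {
            throw new IllegalStateException("downstream unavailable");
        });
        DeadLetter failed = handler.deadLetters().peek();
        System.out.println(failed.body + " failed after " + failed.attempts + " attempts: " + failed.reason);
    }
}
```

Storing the failure reason and attempt count next to the message body gives the support team something concrete to show the stakeholder when asking whether to replay or discard.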

In conclusion, maintaining and supporting more than 50 services is challenging, but with the right tools, processes, and mindset in place it becomes manageable. Good communication, automated testing, robust monitoring, and solid failure handling are all essential. By leveraging these tools and processes, we keep our services reliable and performant, and make sure they meet our stakeholders' needs.
