Don't kill the bearer of bad news

#discuss #career #management #leadership

Have you ever had to manage the expectations of people who thought a software project was on track and that delivering the required features would be easy? Have you had to explain technical debt, gaps in feature specifications and risky external dependencies?

It is scary to tell an unpleasant truth. You don't want to sound negative, lower team morale or make managers angry, and some managers or customers do dislike people who bring them bad news. On the other hand, staying silent about problems and risks won't make them disappear. More likely they will grow and blow up when it is too late to adjust plans and mitigate them.

People who deliver bad news early and accurately are important. So are the parts of a system that do the same.

In microservice architectures, and in many other real-world systems that have to be distributed, a service commonly calls the APIs of one or more other services in order to execute a request.

Some architectures even contain dedicated services whose purpose is to translate the communication between two parts of the system, typically between a legacy application and new services or between in-house systems and external (hard-to-influence) services. Such services are often called a gateway, anti-corruption layer, adapter or facade.

It is not enough for these services to translate the data in positive scenarios. If something is not right, forwarding it as-is to the other side or, even worse, ignoring it can lead to issues that are hard to detect and fix.

Imagine a service A calling service B that in turn calls service C using a REST API. What should service B do if service C returns an HTTP 4xx status? Just forwarding the same error to A and/or writing it in a log using a generic exception handler won't do.

+-----------+        REST       +-----------+        REST       +-----------+
| Service A | --------------->  | Service B | --------------->  | Service C |
+-----------+        ??? <----  +-----------+     HTTP 4xx <--  +-----------+

Service B is supposed to translate between the worlds of C and A. It probably adds some more data and context when calling C, so C's response may make no sense to A and will not help A fix the problem.

4xx statuses in HTTP mean client error. But can we be sure it is A's fault if C returned a 4xx error? Service B should expect the most probable errors returned by C, detect them and translate them to responses that A understands.
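A minimal sketch of such a translation, assuming C reports failures as an HTTP status plus a machine-readable error code. The error codes (`CUSTOMER_NOT_FOUND`, `INVALID_CURRENCY`, `TOKEN_EXPIRED`), the `ErrorResponse` type and the function name are hypothetical and only illustrate the idea; a real service C defines its own contract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorResponse:
    """What B sends back to A: a status and a message in A's vocabulary."""
    status: int
    message: str


# Hypothetical errors that B expects from C, keyed by (HTTP status, C's error code).
# Each entry describes the failure from A's point of view.
EXPECTED_C_ERRORS = {
    (404, "CUSTOMER_NOT_FOUND"): ErrorResponse(404, "Unknown customer in the order."),
    (422, "INVALID_CURRENCY"): ErrorResponse(400, "Currency is not supported."),
    # B builds the credentials for C itself, so this 4xx from C is B's fault, not A's.
    (401, "TOKEN_EXPIRED"): ErrorResponse(500, "Internal error while calling a downstream service."),
}


def translate_expected_error(c_status: int, c_code: str) -> Optional[ErrorResponse]:
    """Return the A-facing response for a known C error, or None if B does not recognize it."""
    return EXPECTED_C_ERRORS.get((c_status, c_code))
```

Note the last entry: a 4xx from C does not automatically become a 4xx for A. The table forces B's authors to decide, for each expected error, whose fault it really is.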

For errors returned by C that B does not expect, it is not clear whether they are caused by A or by B. To be on the safe side, it may be more correct to return an HTTP 500 (Internal Server Error) to A, because it is an error that B failed to identify.
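This fallback could be sketched as follows, assuming B keeps a set of C error codes it knows how to handle and logs with Python's standard `logging` module. The code names, the `request_id` parameter and the choice of 400 for every known error are simplifying assumptions made for illustration.

```python
import logging

logger = logging.getLogger("service_b")

# Hypothetical C error codes that B knows how to translate for A.
KNOWN_C_CODES = {"CUSTOMER_NOT_FOUND", "INVALID_CURRENCY"}


def status_for_a(c_status: int, c_code: str, request_id: str) -> int:
    """Pick the HTTP status B returns to A after a failed call to C."""
    if c_code in KNOWN_C_CODES:
        # Expected error: A sent something that C legitimately rejected.
        return 400
    # Unexpected error: B cannot tell whether A or B caused it, so B takes
    # the blame and keeps C's details in its own log for later diagnosis.
    logger.error(
        "Unexpected response from service C: status=%s code=%s request_id=%s",
        c_status, c_code, request_id,
    )
    return 500
```

The log entry matters as much as the status: A only learns that B failed, while B's operators still get C's original status and code to investigate.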

Also, B should validate the data in the request from A. If the data cannot be used to create a valid request for C, B should skip the call to C and return a meaningful validation error to A instead. Skipping the validation may lead to strange, hard-to-understand errors that come from C and are then forwarded to A or written to logs.
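As a rough sketch of validating before calling C, assume A sends an order with a `customer_id` and a `quantity`, and that `call_c` is whatever function B uses to reach C. The field names, the `Response` type and the shape of the request sent to C are all made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Response:
    status: int
    body: dict


def validate_order(order: dict) -> List[str]:
    """Return the problems that would prevent B from building a valid request for C."""
    problems = []
    if not order.get("customer_id"):
        problems.append("customer_id is required")
    quantity = order.get("quantity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity <= 0:
        problems.append("quantity must be a positive integer")
    return problems


def handle_order(order: dict, call_c: Callable[[dict], Response]) -> Response:
    """Validate A's request first; call C only when a valid request can be built."""
    problems = validate_order(order)
    if problems:
        # Fail fast with a message A can act on; C is never contacted.
        return Response(400, {"errors": problems})
    return call_c({"customerId": order["customer_id"], "qty": order["quantity"]})
```

Passing `call_c` in as a parameter keeps the sketch independent of any HTTP client and makes it easy to check that invalid requests never reach C.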

Correct error handling and logging at all levels of a software system is essential to prevent bad things from becoming even worse, especially in distributed systems, where the original context is very often lost.

Top comments (1)

rokoss21 • Dec 15 '25

This applies not only to people, but to systems themselves.

A well-designed system should surface bad news early, explicitly, and in a form the next boundary can actually understand. Swallowing errors, forwarding them blindly, or translating everything into a generic failure just delays the moment when reality catches up.

Gateways, adapters, and anti-corruption layers exist precisely to preserve meaning across boundaries. If they fail to translate errors properly, they don’t just hide problems — they distort them. At that point, the system is lying, not protecting.

In my experience, most “unexpected outages” are simply ignored signals that were already there. The architecture didn’t lack data — it lacked honesty.

This is as much a technical responsibility as it is a leadership one.