DEV Community: Bartek Żyliński

Need to enforce “Package X must never depend on Y”? Just assert it. ArchUnit converts architectural rules into JUnit tests, breaking the build before review. #architecture #testing

Bartek Żyliński — Mon, 25 Aug 2025 15:42:05 +0000

Bartek Żyliński

Aug 15 '25

ArchUnit Guide – How to Unit Test Your Architecture

#java #softwareengineering #archunit #testing

Chatting with Spring & WebSocket

Bartek Żyliński — Mon, 11 Aug 2025 18:55:27 +0000

As you may have already guessed from the title, today's topic is Spring Boot WebSockets. Some time ago I provided an example of a WebSocket chat based on Akka toolkit libraries. This chat, however, will have somewhat more features and quite a different design.

I will skip some parts so as not to duplicate too much content from the previous article. Here you can find a more in-depth intro to WebSockets. Please note that all the code used in this article is also available in the GitHub repository.

Spring Boot WebSocket – Tools Used

Let's start the technical part of this text with a description of the tools used to implement the whole application. As I cannot fully grasp how to build a real WebSocket API with classic Spring and its STOMP overlay, I decided to go for Spring WebFlux and make everything reactive.

Spring Boot — no modern Java app based on Spring can exist without Spring Boot, all the autoconfiguration is priceless.
Spring WebFlux — reactive version of classic Spring, provides quite a nice and descriptive toolkit for handling both WebSockets and REST. I would dare to say that it is the only way to actually get WebSocket support in Spring.
Mongo — one of the most popular NoSQL databases, I am using it for storing messages history.
Spring Reactive Mongo — Spring Boot starter for handling Mongo access in reactive fashion. Using reactive in one place but not the other is not the best idea. Thus, I decided to make DB access reactive as well.

Let’s start the implementation!

Spring Boot WebSocket – Implementation

Dependencies and config.

pom.xml

<dependencies>
    <!--Compile-->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-webflux</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-mongodb-reactive</artifactId>
    </dependency>
</dependencies>

application.properties

spring.data.mongodb.uri=mongodb://chats-admin:admin@localhost:27017/chats

I prefer .properties over .yml. IMHO, YAML becomes unreadable and unmaintainable at a bigger scale.

WebSocketConfig

@Configuration
class WebSocketConfig {

    @Bean
    ChatStore chatStore(MessagesStore messagesStore) {
        return new DefaultChatStore(Clock.systemUTC(), messagesStore);
    }

    @Bean
    WebSocketHandler chatsHandler(ChatStore chatStore) {
        return new ChatsHandler(chatStore);
    }

    @Bean
    SimpleUrlHandlerMapping handlerMapping(WebSocketHandler wsh) {
        Map<String, WebSocketHandler> paths = Map.of("/chats/{id}", wsh);
        return new SimpleUrlHandlerMapping(paths, 1);
    }

    @Bean
    WebSocketHandlerAdapter webSocketHandlerAdapter() {
        return new WebSocketHandlerAdapter();
    }
}

And surprise, all the four beans defined here are very important.

ChatStore – a custom bean for operating on chats; I will go into more detail on this bean in the following steps.
WebSocketHandler – this bean holds all the logic related to handling WebSocket sessions.
SimpleUrlHandlerMapping – responsible for mapping URLs to the correct handler. The full URL for this one will look more or less like this: ws://localhost:8080/chats/{id}.
WebSocketHandlerAdapter – a kind of capability bean; it adds WebSocket handling support to the Spring WebFlux DispatcherHandler.

ChatsHandler

class ChatsHandler implements WebSocketHandler {

    private final Logger log = LoggerFactory.getLogger(ChatsHandler.class);

    private final ChatStore store;

    ChatsHandler(ChatStore store) {
        this.store = store;
    }

    @Override
    public Mono<Void> handle(WebSocketSession session) {
        String[] split = session.getHandshakeInfo()
            .getUri()
            .getPath()
            .split("/");
        String chatIdStr = split[split.length - 1];
        int chatId = Integer.parseInt(chatIdStr);
        ChatMeta chatMeta = store.get(chatId);
        if (chatMeta == null) {
            return session.close(CloseStatus.GOING_AWAY);
        }
        if (!chatMeta.canAddUser()) {
            return session.close(CloseStatus.NOT_ACCEPTABLE);
        }

        String sessionId = session.getId();
        store.addNewUser(chatId, session);
        log.info("New User {} join the chat {}", sessionId, chatId);
        return session
                .receive()
                .map(WebSocketMessage::getPayloadAsText)
                // Persist the message first, then fan it out to the other sessions of the chat.
                .flatMap(message -> store.addNewMessage(chatId, sessionId, message))
                .flatMap(message -> broadcastToSessions(sessionId, message, store.get(chatId).sessions()))
                .doFinally(sig -> store.removeSession(chatId, session.getId()))
                .then();
    }

    private Mono<Void> broadcastToSessions(String sessionId, String message, List<WebSocketSession> sessions) {
        return sessions
                .stream()
                // Do not echo the message back to its sender.
                .filter(session -> !session.getId().equals(sessionId))
                .map(session -> session.send(Mono.just(session.textMessage(message))))
                .reduce(Mono.empty(), Mono::then);
    }
}

As I mentioned above, here you can find all the logic related to handling WebSocket sessions. First, we parse the ID of the chat from the URL to get the target chat. Then we respond with different close statuses depending on the state of that particular chat: GOING_AWAY when the chat does not exist and NOT_ACCEPTABLE when it cannot accept another user.

Additionally, I am also broadcasting the message to all the sessions related to the particular chat, so that users can actually exchange messages. I have also added a doFinally trigger that clears closed sessions from the chatStore, to reduce redundant communication.

As this whole code is reactive, there are some restrictions I need to follow. I have tried to make it as simple and readable as possible; if you have any ideas on how to improve it, I am open to them.
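One thing worth noting: the handler calls Integer.parseInt directly on the last path segment, so a URL such as /chats/abc would throw a NumberFormatException before any close status is sent. Below is a minimal sketch of a more defensive parser. ChatIdParser is a hypothetical helper, not part of the repository, and it assumes chat IDs are always positive, which matches the counter in DefaultChatStore starting from 1.

```java
import java.net.URI;
import java.util.OptionalInt;

// Hypothetical helper, not part of the original repository.
final class ChatIdParser {

    private ChatIdParser() {
    }

    // Returns the last path segment as a positive chat ID,
    // or an empty result when the segment is missing or not a number.
    static OptionalInt parse(URI uri) {
        String path = uri.getPath();
        if (path == null || path.isEmpty()) {
            return OptionalInt.empty();
        }
        String lastSegment = path.substring(path.lastIndexOf('/') + 1);
        try {
            int id = Integer.parseInt(lastSegment);
            // Assumption: valid chat IDs start from 1.
            return id > 0 ? OptionalInt.of(id) : OptionalInt.empty();
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }
}
```

In handle, an empty result could then be mapped to a close status of your choice, for example CloseStatus.BAD_DATA, instead of letting the exception propagate.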

ChatRouter

@Configuration(proxyBeanMethods = false)
class ChatRouter {

    private final ChatStore chatStore;

    ChatRouter(ChatStore chatStore) {
        this.chatStore = chatStore;
    }

    @Bean
    RouterFunction<ServerResponse> routes() {
        return RouterFunctions
                .route(POST("api/v1/chats/create"), e -> create(false))
                .andRoute(POST("api/v1/chats/create-f2f"), e -> create(true))
                .andRoute(GET("api/v1/chats/{id}"), this::get)
                .andRoute(DELETE("api/v1/chats/{id}"), this::delete);
    }

    // create, get and delete handler functions omitted
}

The WebFlux approach to defining REST endpoints is quite different from classic Spring. Above you can see the definition of 4 endpoints for managing chats. As in the case of the Akka implementation, I want to have a REST API for managing chats and a WebSocket API for actually handling them.

I will skip the implementations of these functions as they are pretty trivial; you can see them on GitHub.

ChatStore

First the interface

public interface ChatStore {

    int create(boolean isF2F);

    void addNewUser(int id, WebSocketSession session);

    Mono<String> addNewMessage(int id, String userId, String message);

    void removeSession(int id, String session);

    ChatMeta get(int id);

    ChatMeta delete(int id);
}

Then the implementation

public class DefaultChatStore implements ChatStore {

    private final Map<Integer, ChatMeta> chats;
    private final AtomicInteger idGen;
    private final MessagesStore messagesStore;
    private final Clock clock;

    public DefaultChatStore(Clock clock, MessagesStore store) {
        this.chats = new ConcurrentHashMap<>();
        this.idGen = new AtomicInteger(0);
        this.clock = clock;
        this.messagesStore = store;
    }

    @Override
    public int create(boolean isF2F) {
        int newId = idGen.incrementAndGet();
        ChatMeta chatMeta = chats.computeIfAbsent(newId, id -> {
            // Pick the factory that matches the requested chat type.
            if (isF2F) {
                return ChatMeta.ofIdF2F(id);
            }
            return ChatMeta.ofId(id);
        });
        return chatMeta.id;
    }

    @Override
    public void addNewUser(int id, WebSocketSession session) {
        chats.computeIfPresent(id, (k, v) -> v.addUser(session));
    }

    @Override
    public void removeSession(int id, String sessionId) {
        chats.computeIfPresent(id, (k, v) -> v.removeUser(sessionId));
    }

    @Override
    public Mono<String> addNewMessage(int id, String userId, String message) {
        ChatMeta meta = chats.get(id);
        if (meta != null) {
            Message messageDoc = new Message(id, userId, meta.offset.getAndIncrement(), clock.instant(), message);
            return messagesStore.save(messageDoc)
                    .map(Message::getContent);
        }
        return Mono.empty();
    }
    // omitted
}

The base of ChatStore is a ConcurrentHashMap (CHM) that holds the metadata of all open chats. Most of the methods from the interface are self-explanatory and there is nothing special behind them.

create – creates a new chat, with a boolean attribute denoting whether the chat is f2f or group.
addNewUser – adds a new user to an existing chat.
removeSession – removes a user's session from an existing chat.
get – gets the metadata of the chat with the given ID.
delete – deletes the chat from the CHM.

The only complex method here is addNewMessage. It increments the message counter within the chat and persists the message content in MongoDB, for durability.

MongoDB

Message Entity

public class Message {
   @Id
   private String id;
   private int chatId;
   private String owner;
   private long offset;
   private Instant timestamp;
   private String content;
}

A model for message content stored in the database. There are three important fields here:

chatId – represents the chat in which a particular message was sent.
owner – the userId of the message sender.
offset – the ordinal number of the message within the chat, used for retrieval ordering.

MessagesStore

public interface MessagesStore extends ReactiveMongoRepository<Message, String> {}

Nothing special: a classic Spring repository, but in reactive fashion. It provides a set of features comparable to JpaRepository and is used directly in ChatStore.

Additionally, in the main application class, WebsocketsChatApplication, I am activating reactive repositories by using @EnableReactiveMongoRepositories. Without this annotation, the MessagesStore from above would not work.

And here we go, we have the whole chat implemented. Let’s test it!

Spring Boot WebSocket – Testing

For tests, I’m using Postman and Simple Web Socket Client.

I’m creating a new chat using Postman. In the response body, I get a WebSocket URL to the newly created chat.
Now it is time to use it and check if users can communicate with one another. Simple Web Socket Client comes into play here: I am using it to connect to the newly created chat.
Here we are, everything is working and users can communicate with each other.

There is one last thing to do. Let’s spend a moment to look at things that can be done better.

What Can Be Done Better

As what I have just built is the most basic chat app, there are a few (or in fact quite a lot) things that may be done better. Below, I listed the things I find worthy of improving:

Authentication and rejoining support – right now, everything is based on sessionId. It is not an optimal approach; it would be better to have some authentication in place and actual rejoining based on user data.
Sending attachments – for now, the chat only supports simple text messages. While texting is the basic function of a chat, users enjoy exchanging images and audio files, too.
Tests – there are no tests for now, but why leave it like this? Tests are always a good idea.
Overflow in offset – currently it is a simple int. If we track the offset for a very long time, it will overflow sooner or later.
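To illustrate the last point, here is a small sketch of what happens at the boundary. OffsetOverflowDemo is a hypothetical class. I am assuming the per-chat counter is an AtomicInteger, as the getAndIncrement call in addNewMessage suggests, and that swapping it for an AtomicLong would be the simplest mitigation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class, not part of the original repository.
final class OffsetOverflowDemo {

    private OffsetOverflowDemo() {
    }

    // An int counter silently wraps to a negative value after Integer.MAX_VALUE.
    static int nextIntOffset(AtomicInteger counter) {
        return counter.incrementAndGet();
    }

    // A long counter keeps growing past the int range,
    // which pushes the overflow far beyond any realistic chat lifetime.
    static long nextLongOffset(AtomicLong counter) {
        return counter.incrementAndGet();
    }
}
```

Since the Message entity already stores the offset as a long, only the counter type would need to change under this assumption.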

Summary

Et voilà! The Spring Boot WebSocket chat is implemented, and the main task is done. You now have some ideas on what to develop in the next steps.

Please keep in mind that this chat case is very simple, and it will require lots of changes and development for any type of commercial project.

Anyway, I hope that you learned something new while reading this article.

Thank you for your time.

Might interest you:

Blog Chatting with Spring & WebSocket from Pask Software.

Availability – Theory, Problems, Tools and Best Practices

Bartek Żyliński — Wed, 09 Jul 2025 14:59:58 +0000

Availability is the measure of a system’s ability to stay up and running despite the failures of its parts. Today, I will explore this core trait of distributed systems. I will cover theory, challenges, tools and best practices to ensure your system stays up and running against all odds.

Let’s start with theory.

What Is Availability?

Availability describes how our systems handle failures and determines the system’s uptime. Usually, we describe the availability of a system in “nines” notation. 99% availability allows a maximum of 14.4 minutes of downtime per day, while 99.999% (the so-called five nines) reduces this to 864 milliseconds.

Most cloud services have an SLA with three (99.9%) to five (99.999%) nines of availability guaranteed for end users.

Availability (%)	Downtime per day (~)	Downtime per month (~)	Downtime per year (~)
90	144 minutes (2.4 hours)	73 hours	36.53 days
99	14 minutes	7 hours	3.65 days
99.9	1.5 minutes	44 minutes	8.77 hours
99.99	9 seconds	4.4 minutes	52.6 minutes
99.999	864 milliseconds	26 seconds	5.3 minutes
99.9999	86.40 milliseconds	2.6 seconds	31.5 seconds

Additionally, the term high availability or HA is used to describe services that have at least 3 nines of availability guarantees.
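The numbers in the table follow from one formula: allowed downtime = (1 - availability) × period. Below is a minimal sketch of that calculation. DowntimeBudget is a hypothetical helper name, and BigDecimal is used only to keep floating-point noise out of the result.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.Duration;

// Hypothetical helper illustrating the "nines" table.
final class DowntimeBudget {

    private static final BigDecimal HUNDRED = BigDecimal.valueOf(100);

    private DowntimeBudget() {
    }

    // allowed downtime = (1 - availability) * period, rounded to whole milliseconds.
    // The availability is passed as a percentage string, e.g. "99.999".
    static Duration allowedDowntime(String availabilityPercent, Duration period) {
        BigDecimal unavailableFraction = HUNDRED
                .subtract(new BigDecimal(availabilityPercent))
                .divide(HUNDRED);
        BigDecimal millis = unavailableFraction
                .multiply(BigDecimal.valueOf(period.toMillis()))
                .setScale(0, RoundingMode.HALF_UP);
        return Duration.ofMillis(millis.longValueExact());
    }
}
```

For example, allowedDowntime("99.999", Duration.ofDays(1)) yields 864 milliseconds, and allowedDowntime("99", Duration.ofDays(1)) yields 14.4 minutes.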

There is a famous struggle related to availability and consistency. The common notion is that in case of a failure, we can have either one or the other. While in most cases this is true, the topic as a whole is vastly more nuanced and complex. For example, CRDTs put this whole statement into question; the same is true for Google’s internal Spanner.

Moreover, we can use various techniques to balance both of these traits. A system may favor one over the other in certain places while not in others. Just remember: this struggle exists and is one of the most important cases of study in distributed systems research.

How To Measure Availability

Availability is probably the simplest trait to measure – at least for a single service. You probably already have uptime or downtime metrics in one of your dashboards. Just divide the daily uptime value you have there by the length of a day in the matching unit: 24 (hours), 1440 (minutes), or 86400 (seconds). Et voilà, you have your service's daily uptime percentage ready, and you can easily see how many nines you achieved.
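If your dashboard tracks downtime instead of uptime, the same division works in reverse. A minimal sketch, assuming downtime and the measurement window are available as durations; UptimeCalculator is a hypothetical name.

```java
import java.time.Duration;

// Hypothetical helper: turns a measured downtime into an uptime percentage.
final class UptimeCalculator {

    private UptimeCalculator() {
    }

    // uptime % = 100 * (1 - downtime / window)
    static double uptimePercent(Duration downtime, Duration window) {
        if (window.isZero() || window.isNegative()) {
            throw new IllegalArgumentException("window must be positive");
        }
        double downtimeFraction = (double) downtime.toMillis() / window.toMillis();
        return 100.0 * (1.0 - downtimeFraction);
    }
}
```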

Things get more complicated when our service has multiple dependencies, or when we want to measure availability on the scale of the whole system.

As an example, consider service A with two dependencies: a DB and an Email Service.

Service A has uptime of 99.99%.
DB has uptime of 99.9%.
Email Service has uptime of 99%.

Thus, the availability of service A is not 99.99% but in fact about 98.89%.

0.9999 × 0.999 × 0.99 = 0.9889 => 98.89%.

In more readable format:

Component	SLA	Availability (decimal)
Service A	99.99%	0.9999
Database	99.9%	0.9990
Email service	99%	0.9900
Composite A	98.89%	0.9999 × 0.9990 × 0.9900 ≈ 0.9889

While the final difference is not big, it clearly illustrates the point. The availability of a service is not a standalone value but a product of the availabilities of all its dependencies.
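The calculation is simple enough to keep next to your metrics. A minimal sketch, assuming every dependency sits on the critical path (serial composition); CompositeAvailability is a hypothetical name.

```java
// Hypothetical helper: serial composition of availabilities.
final class CompositeAvailability {

    private CompositeAvailability() {
    }

    // Multiplies the availabilities (as fractions, e.g. 0.999) of all components
    // that must be up at the same time for a request to succeed.
    static double of(double... availabilities) {
        double product = 1.0;
        for (double availability : availabilities) {
            product *= availability;
        }
        return product;
    }
}
```

Calling CompositeAvailability.of(0.9999, 0.999, 0.99) reproduces the roughly 0.9889 from the example above.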

The same principle applies to a system. The availability of a system as a whole is a product of all its services and tools. Even a single poorly available component can bring the whole system down.

Weakest link	Best product you can ever reach
99 % (two nines)	< 99 %
99.9 % (three nines)	< 99.9 %
99.99 % (four nines)	< 99.99 %

Here is a quick note on how you can structure your Availability related metrics:

Tier	Example in an availability context
SLI (Indicator)	`http_request_success_ratio` = successful requests ÷ total requests
SLO (Objective)	`http_request_success_ratio ≥ 99.95 % over 30 days`
SLA (Agreement)	“We guarantee 99.9 % monthly availability; otherwise you get service credits.”
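To make the SLI and SLO rows more tangible, here is a minimal sketch of how the indicator could be computed and compared with the objective. SuccessRatioSlo and its method names are hypothetical; treating a window with no traffic as compliant is my assumption, not a standard.

```java
// Hypothetical helper: computes a success-ratio SLI and checks it against an SLO.
final class SuccessRatioSlo {

    private SuccessRatioSlo() {
    }

    // SLI: successful requests divided by total requests.
    static double successRatio(long successful, long total) {
        if (total <= 0) {
            return 1.0; // assumption: no traffic counts as meeting the objective
        }
        return (double) successful / total;
    }

    // SLO check: the objective is a fraction, e.g. 0.9995 for 99.95 %.
    static boolean meetsObjective(long successful, long total, double objective) {
        return successRatio(successful, total) >= objective;
    }
}
```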

Signs That System Has Poor Availability

There are a couple of behaviors we can notice, which indicate availability problems of our service. Additionally, some of those are similar to the signs of poor scalability.

Low uptime percentage – the most obvious of all; directly shows that the service is down and users cannot access it.
Service “flapping” – the service oscillates between up and down as automated restarts or failovers repeatedly flip it in and out.
Health-check failures – persistent probe timeouts under normal load mean the service is down or will be down in the near future.
High Mean Time To Recover – outages last hours before the team can resolve them and bring the system back online.
Traffic suddenly drops to zero – the service is either down or users gave up attempting to connect.
Direct feedback – an important client is calling the CTO/CIO (or whoever else) complaining that everything is down, alerts start firing, and other interesting events follow.

The Availability Game Changers

In my opinion, the game changer for availability is automatic and graceful failover. While it sounds simple, it is actually quite complex. To achieve it, we need to combine multiple different concepts and make them work together. Nonetheless, it is crucial for providing a zero-downtime experience.

The anatomy of a state-of-the-art zero-downtime failover:

Stage	What happens	Typical target time
1. Detect	Health probe sees anomalies (5× timeouts/60 s).	≤ 5 s
2. Decide	Orchestrator marks node unhealthy, stops scheduling it.	≤ 1 s
3. Redirect	Load balancer removes endpoint from pool; sticky sessions migrate.	≤ 2 s
4. Restore	Replacement pod/VM starts and passes readiness checks.	≤ 40 s (hot standby: ≈ 0 s)
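The Detect stage can be sketched as a sliding-window counter. The class below is a hypothetical illustration of the "5 timeouts within 60 s" rule from the table; it assumes timeouts are recorded in chronological order and that the caller supplies the timestamps.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window detector for probe timeouts.
final class TimeoutWindowDetector {

    private final int threshold;
    private final Duration window;
    private final Deque<Instant> timeouts = new ArrayDeque<>();

    TimeoutWindowDetector(int threshold, Duration window) {
        this.threshold = threshold;
        this.window = window;
    }

    // Records a probe timeout observed at the given instant.
    synchronized void recordTimeout(Instant at) {
        timeouts.addLast(at);
    }

    // True when at least `threshold` timeouts fall within the window ending at `now`.
    synchronized boolean isUnhealthy(Instant now) {
        Instant cutoff = now.minus(window);
        // Drop timeouts that are older than the window.
        while (!timeouts.isEmpty() && timeouts.peekFirst().isBefore(cutoff)) {
            timeouts.removeFirst();
        }
        return timeouts.size() >= threshold;
    }
}
```

With new TimeoutWindowDetector(5, Duration.ofSeconds(60)), a node is flagged once five probe timeouts accumulate within a minute and recovers automatically as old timeouts slide out of the window.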

Of course, automatic failover is not a silver bullet and comes with drawbacks. The two most significant ones are a more complex design and increased costs. Redundancy is responsible for the increased costs, while failover itself adds the complexity.

It may sound bad but, unfortunately, without such a mechanism we will not be able to provide high availability.

Tools For Availability

I have already covered automatic failover as the key tool for building available systems. However, it is not the only concept. There are more, and you can find them below.

Replication

Replication is a method of implementing redundancy. The key difference is that redundancy impacts all layers of our system, from software to hardware, while replication is mostly related to the data layer.

We keep multiple up-to-date copies of the same data set, usually spread across multiple nodes. Thus, if one of the nodes fails, the data is still available to the user.

There are two main types of Replication:

Single-master/single-leader – only one of the replica nodes, the leader, handles incoming writes. The rest of the nodes provide read access and can be used to offload part of the incoming traffic. The leader propagates changes to the other nodes, usually using some type of Write-Ahead Log (WAL). If the leader node fails or becomes unavailable for some reason, a leader election takes place and a new leader is selected from the up-and-running nodes.
Multi-master/multi-leader – all the nodes accept both reads and writes at the same time. Writes are then propagated to the other nodes. The biggest problem in this case is that conflicting writes to the same data can end up on two different nodes at the same time. Thus, it requires a separate conflict resolution mechanism.

The concept of replication is a very extensive one. A good walkthrough and comparison of even these two approaches is out of the scope of this article. However, I promise to dive deeper into replication in a separate article.

For now, remember the following table:

Single-master	Multi-master
Only one node accepts writes	Multiple nodes accept writes
Propagation via WAL	Conflict resolution and propagation

Automatic Failover

An automatic and graceful (not noticeable by the user) failover mechanism is the key to availability.

To build a good automatic failover, we need to combine at least three concepts:

Redundancy – we need more than one node to even start thinking of building any failover.
Health checks – we need properly defined health checks to detect whether nodes are down or should not handle user requests.
Load balancer/actual failover – we need a way to swap out the failing components and redirect the traffic to the up-and-running ones.

Each piece alone is insufficient; all must work together.
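How the three pieces fit together can be sketched in a few lines. The class below is a hypothetical, in-memory illustration: the node list stands for redundancy, the predicate stands for the health check, and the round-robin selection stands for the load balancer doing the actual failover.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical health-aware round-robin balancer.
final class HealthAwareBalancer {

    private final List<String> nodes;            // redundancy: more than one node
    private final Predicate<String> isHealthy;   // health check supplied by the caller
    private final AtomicInteger cursor = new AtomicInteger();

    HealthAwareBalancer(List<String> nodes, Predicate<String> isHealthy) {
        this.nodes = List.copyOf(nodes);
        this.isHealthy = isHealthy;
    }

    // Round-robin over the pool, skipping nodes whose health check fails.
    // Returns empty when no node is currently healthy.
    Optional<String> next() {
        for (int attempt = 0; attempt < nodes.size(); attempt++) {
            int index = Math.floorMod(cursor.getAndIncrement(), nodes.size());
            String candidate = nodes.get(index);
            if (isHealthy.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Remove any one of the three inputs and the sketch stops being useful, which is the point of the paragraph above.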

Isolating failure

Another way to increase the availability of our system is to isolate failures. By doing so, we can ensure that a failure of one component will not cause a cascading failure of the other components involved in the same processing flow.

As with most concepts from this section, there is no single tool or method to achieve that. Instead, we can follow one of the patterns below. We can also mix different patterns.

Let’s dive into them below:

Circuit breaker – one of the most common microservices patterns in existence. It implements the fail-fast concept in a way similar to an electrical circuit breaker. If multiple consecutive calls to another service fail within a certain period, the circuit breaker trips. Then, for the duration of a timeout period, all attempts to invoke that service fail immediately. This reduces the load on the possibly faulty service and gives it time to recover. It also avoids introducing potential timeouts in other stages of the flow.
Bulkhead – according to this pattern, components and resources in our system should be compartmentalized. Partitioning should be done in such a way that components do not share any resources. For example, each partition should have its own thread pools, connection pools, and CPU or memory limits. Such a split decreases the chance that overuse (high resource utilization) by one component impacts the other components in the system.
Error Kernel – we split our system into two types of components: core and side ones. The core ones must not fail, whatever the reason. The side ones may fail, and we should be able to easily restart them. Then we can move the side ones to the “outskirts” of the system. Thus, we end up with a reliable core and easy-to-restart leaf components.
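The circuit breaker is the easiest of the three to show in code. Below is a minimal sketch under a few assumptions: failures are counted consecutively, the breaker allows one trial call once the timeout elapses, and the whole call is synchronized for brevity (a production breaker would not hold a lock during the remote call). SimpleCircuitBreaker is a hypothetical class; in a real project a library such as Resilience4j would be the more sensible choice.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical, simplified circuit breaker for illustration only.
final class SimpleCircuitBreaker {

    private final int failureThreshold;
    private final Duration openTimeout;
    private final Clock clock;
    private int consecutiveFailures;
    private Instant openedAt; // null while the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, Duration openTimeout, Clock clock) {
        this.failureThreshold = failureThreshold;
        this.openTimeout = openTimeout;
        this.clock = clock;
    }

    synchronized <T> T call(Supplier<T> remoteCall) {
        if (isOpen()) {
            // Fail fast without touching the possibly faulty service.
            throw new IllegalStateException("circuit open: failing fast");
        }
        try {
            T result = remoteCall.get();
            // Success closes the circuit and resets the failure counter.
            consecutiveFailures = 0;
            openedAt = null;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.instant(); // trip (or re-trip) the breaker
            }
            throw e;
        }
    }

    // Open means: tripped and still within the timeout period.
    synchronized boolean isOpen() {
        return openedAt != null && clock.instant().isBefore(openedAt.plus(openTimeout));
    }
}
```

A caller would wrap each outbound request, for example breaker.call(() -> client.fetch()), where client.fetch() stands for any remote call of your own.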

Multi-Region or Multi-Cloud Deployment

Multi-Availability Zone or Multi-Region Deployment will protect us from the least expected type of failures: the ones that wipe out a whole datacenter or multiple datacenters located in a particular region, like the fire at the OVH datacenter in France or the GCP electrical problem in Iowa.

We can go even further and build a Multi-Cloud failover. If your core cloud provider is down, you can switch to a backup. While it adds a ton of extra complexity to your system, it reduces the probability of a system-wide failure even further. Region-wide failures are rare by themselves. Provider-wide failures are even rarer. Nevertheless, both may happen. Being able to handle them probably will not decide the difference between 99.99% and lower availability tiers.

However, being able to handle such events has a couple of advantages:

You stay alive when others are down.
It indicates how good your architecture is.

Chaos Engineering/ Fault Injection

Chaos engineering will not actually help you build an available system by itself. Rather, it helps you verify that your system is in fact available. By introducing deliberate and trackable failures, you can identify weaknesses and problems that would not show up in any other way. I also mentioned this concept here.

Just remember it is not fully safe, so double-check that your system will be able to handle it.

Why We Fail To Achieve High Availability

After the what, how, and why, it is time for why we fail. In my opinion and experience, there are a few factors that lead to our failure in building available systems.

Some reasons will be the same as in my article on scalability.

Ignoring the trade-offs – every decision we make has short- and long-lasting consequences we have to be aware of. Of course, we can ignore them; still, we have to know them first and be conscious as to why we are ignoring some potential drawbacks.
Incorrect health checks – they react either too slowly or too quickly. Restarting a service too early or too late increases the likelihood of users experiencing the failure.
Lack of redundancy – critical components do not have properly configured redundancy.
Badly designed failover – we are unable to redirect the traffic to the up-and-running nodes fast enough.

Below is a simple checklist on how to increase the chance of not failing at availability:

Do today	Impact
Add a health check to every component	30 min of work slashes 502 errors during deploys/failovers.
Track availability product	Makes hidden single points of failure painfully obvious.
Set a written SLO	Aligns the team on what “good enough” means.
Run a failover drill	Checks your design in practice.

Summary

I have shared a number of concepts and approaches for building highly available systems.

Let’s do a quick recap of key takeaways:

Making highly available systems requires mixing different concepts like redundancy, health checks, and failover.
Proper health checks will help you keep up with the state of your components.
Isolating failures and preventing their propagation will keep the system running even if some components fail.
Multi-region deployment will save you in the most unexpected moment.

Some concepts discussed here can’t be implemented using a single tool. They require architectural thinking and coordination across layers of the stack.

Concept	Tool
Replication	Usually part of database product you are using
Automatic failover	K8s probes, Cloud autoscaling products
Failure isolation	Resilience4j, K8s Namespaces
Multi AZ	Cloud providers Availability Zones

🚀 High availability isn't just a metric - it is a mindset. Build for failure. Monitor everything. And treat availability as a first-class feature.

I wish you luck on your struggle with availability.

Thank you for your time.

Blog Availability – Theory, Problems, Tools and Best Practices from Pask Software.

From REST To Message Queue – 7 Ways To Build APIs

Bartek Żyliński — Tue, 17 Jun 2025 15:19:15 +0000

There are multiple ways to build an API. I have already mentioned and described some of them in different articles.

This text is a kind of One Ring article: one to rule them all. I want you to have a single place where you can find a comparison of all the approaches, done in a clear and consistent manner. Thus, I have put all the previous comparisons here and added some more.

I will compare a total of 7 ways across 10 axes.

Tools:

REST
gRPC
WebSockets
SSE
GraphQL
Webhooks
Message-Queue Based

Axes:

Communication Direction
Underlying protocols
Message Structure
Complexity
Security
Data size
Throughput
Latency
Ease of adoption
Tooling

Let’s start today’s journey with REST. As it is probably the most common way of building APIs, I will use it as a baseline for all the comparisons.

Ways To Build API

1. REST

REST, or Representational State Transfer, is an architectural style. It uses HTTP as the underlying communication medium; thus it is stateless by nature and can benefit from all the advantages of HTTP, like caching. It can utilize both HTTP/1.1 and HTTP/2.

Shines when: you need a simple, cache-friendly, web-native API consumed by every language or tool.
Struggles when: you need full-duplex streams or extremely low latency.

2. gRPC

gRPC is probably the most modern implementation of a relatively old concept: Remote Procedure Call. gRPC uses Google’s Protocol Buffers as a serialization tool. By default, it utilizes HTTP/2 as the transport medium and exchanges data in a binary format.

Shines when: services need compact binary messages, strong typing, and bidirectional streaming.
Struggles when: you must expose an API directly to browsers without extra tooling.

3. WebSockets

WebSocket provides bidirectional communication between a server and a client over a single long-lasting TCP connection. Thanks to this feature, the data is exchanged between interested parties in “real time”. Each message is sent as a binary frame or as Unicode text. While WebSockets use a custom protocol most of the time, they still use HTTP for the initial handshake.

Shine when: both client and server need real-time, low-latency push in either direction.
Struggle when: intermediaries (CDNs, firewalls) or strict request-response patterns dominate.

4. SSE

SSE (Server-Sent Events) is a technology that allows a web server to send updates to a web page. It is a part of the HTML5 specification and utilizes a single long-lived HTTP connection to send data in “real time”. It can use both HTTP/1.1 and HTTP/2. SSE also has its own MIME type: text/event-stream. The important thing here is that it can only be used for server-to-browser communication.

Shines when: the server pushes one-way event streams to browsers with a minimal setup.
Struggles when: you need client-to-server push or non-browser consumers.
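To show how little setup the text/event-stream format needs, here is a minimal sketch of serializing a single event. SseFormatter is a hypothetical helper; the "event:" and "data:" field names and the blank line that terminates an event come from the SSE wire format itself.

```java
// Hypothetical helper: serializes one event for a text/event-stream response body.
final class SseFormatter {

    private SseFormatter() {
    }

    // eventName may be null or empty, in which case the browser dispatches a plain "message" event.
    static String format(String eventName, String data) {
        StringBuilder sb = new StringBuilder();
        if (eventName != null && !eventName.isEmpty()) {
            sb.append("event: ").append(eventName).append('\n');
        }
        // Each line of a multi-line payload needs its own "data:" field.
        for (String line : data.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        // A blank line terminates the event.
        return sb.append('\n').toString();
    }
}
```

On the browser side, an EventSource pointed at the endpoint would receive each formatted block as one event.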

5. GraphQL

GraphQL is a query language for your API. It allows clients to request only the subset of data they need. The client knows the server’s endpoint, and the endpoint provides a schema. The schema defines the communication protocol, inputs and outputs. Each request is validated against the schema; thus a malformed request would be rejected by the server with a proper error message.

Shines when: clients must trim over-fetching and compose rich queries from one endpoint.
Struggles when: the workload is write-heavy or teams cannot maintain a strict schema discipline.

6. Webhooks

Webhooks address the question: has anything changed yet? A webhook is a push-style HTTP callback that lets another service tell your app when something relevant happens. It works in “real time”: no polling, no sockets, just an outbound POST.

Shines when: you want lightweight server-to-server notifications without polling.
Struggles when: delivery guarantees must be exactly-once, or the receiver is offline for long periods.

7. Message-Queue

The message queue-based approach uses middleware as a way to communicate between services, usually in the form of a platform like Kafka or RabbitMQ. The main idea behind message queues is to provide asynchronous communication. Messages are sent to a queue, which acts as a buffer between the sender and receiver. This decouples the sender and receiver and allows them to operate independently of each other.

Shine when: you need high-throughput, decoupled, async processing with replay.
Struggle when: you require simple, stateless request-response or openly exposed public APIs.

Ways To Build API – Comparison

Below is the most important takeaway from this article. For traits scaled from Very Low to Very High, I am using REST as a baseline. Thus, for example, if the complexity of one of the tools is described as Very High, you can think of it as far more complex than REST.

Technology	Communication Direction	Underlying Protocols	Message Structure	Complexity	Security	Data Size	Throughput	Latency	Ease of Adoption	Tooling
REST	Unidirectional	HTTP/1.1 or HTTP/2	Mostly JSON or XML (text)	Very Low	HTTPS	Large	Moderate	High	Very High	Extensive (Postman, Swagger/OpenAPI, cURL)
gRPC	Unidirectional or Bi-directional	HTTP/2	Protocol Buffers (binary)	Moderate	mTLS	Very Small – binary	Very High	Very Low	Moderate but growing	Decent (gRPCurl, Postman)
WebSockets	Bi-directional	WebSocket (upgrade from HTTP/1.1 or HTTP/2)	Text or binary frames	Low	WSS	Small – binary frames	High	Very Low	Moderate	Good
SSE (Server-Sent Events)	Unidirectional	HTTP/1.1 or HTTP/2	`text/event-stream` (UTF-8)	Very Low	HTTPS	Moderate – plain text	Moderate	Low	Moderate	Good
GraphQL	Unidirectional, Bi-directional with subscriptions	HTTP/1.1, HTTP/2; WebSocket for subscriptions	JSON (text)	Moderate	HTTPS	Large	Variable; fewer round-trips	Medium–high	Moderate but rapidly growing	Very Good (Apollo, GraphiQL)
Webhooks	Unidirectional	HTTP/1.1 or HTTP/2	JSON, form-encoded, or custom	Low	HTTPS	Large	Depends on event volume (bursty)	Moderate	Very High	Good (ngrok, request-bin)
Message-Queue	Unidirectional	Kafka TCP, AMQP, MQTT, NATS, …	Binary records, Avro, JSON …	Very High	ACLs, TLS	Small to medium – highly tunable	Very High	Very Low	Low	Extensive

Summary

There is no good or bad here, nor any silver bullets; these are just a couple of tools you can use.

Nonetheless, if you want to know my recommendations here they are:

Scenario / Tech	REST	gRPC	WebSockets	SSE	GraphQL	Webhooks	Message Queue
Mobile → backend	✓	✓			✓		✓
IoT / edge device		✓	✓				✓
Service-To-Service	✓	✓	✓		✓	✓	✓
Browser push / live UI			✓	✓
3rd-party integration	✓				✓	✓
Streaming pipelines							✓

Remember, if you are building anything more complex than a very focused microservice, the odds are that you will be using more than one approach. Mixing different tools to get the best design is part of our daily struggle; just please do not overengineer your solution.

Thank you for your time.

Blog From REST To Message Queue – 7 Ways To Build APIs from Pask Software.

7 API Integration Patterns: REST, gRPC, SSE, WS & Queues

Bartek Żyliński — Mon, 16 Jun 2025 18:19:15 +0000

There are multiple API integration patterns. I have already mentioned and described some of them in different articles: gRPC vs REST, WebSockets vs SSE.

I will compare a total of 7 API integration patterns along 10 axes.

Tools:

REST
gRPC
WebSockets
SSE
GraphQL
Webhooks
Message-Queue Base

Axes:

Communication Direction
Underlying protocols
Message Structure
Complexity
Security
Data size
Throughput
Latency
Ease of adoption
Tooling

Let’s start today’s journey with REST. As it is probably the most common way of building APIs, I will use it as a baseline for all the comparisons.

API Integration Patterns

1. REST

Shines when: you need a simple, cache-friendly, web-native API consumed by every language or tool.
Struggles when: you need full-duplex streams or extremely low latency.

2. gRPC

Shines when: services need compact binary messages, strong typing, and bidirectional streaming.
Struggles when: you must expose an API directly to browsers without extra tooling.

3. WebSockets

WebSockets provide real-time bidirectional communication between a server and a client over a single long-lived TCP connection.

Thanks to this feature, the data is exchanged between interested parties in “real-time”. Each message is sent as binary frame data or Unicode text. While WebSockets use a custom protocol most of the time, they still use HTTP for the initial handshake.

Shine when: both client and server need real-time, low-latency push in either direction.
Struggle when: intermediaries (CDNs, firewalls) or strict request-response patterns dominate.

4. SSE

SSE is a technology that allows a web server to send updates to a web page. It is part of the HTML5 specification and uses a single long-lived HTTP connection to send data in “real-time”. It can use both HTTP/1.1 and HTTP/2. SSE also has its own MIME type: text/event-stream. The important thing here is that it can only be used for server-to-browser communication.

Shines when: the server pushes one-way event streams to browsers with a minimal setup.
Struggles when: you need client-to-server push or non-browser consumers.
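
To make the text/event-stream format more concrete, here is a minimal sketch of how a single event could be serialized on the server side. The `SseFrame` class and its `format` method are hypothetical names used only for illustration; the field names `event:` and `data:` and the terminating blank line come from the event stream format itself.

```java
class SseFrame {

    // Formats one event in the text/event-stream wire format.
    static String format(String event, String data) {
        StringBuilder sb = new StringBuilder();
        sb.append("event: ").append(event).append('\n');
        // Each line of the payload gets its own "data:" field.
        for (String line : data.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        sb.append('\n'); // a blank line terminates the event
        return sb.toString();
    }
}
```

A server would write such frames to a response with the `Content-Type: text/event-stream` header and keep the connection open between events.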

5. GraphQL

Shines when: clients must trim over-fetching and compose rich queries from one endpoint.
Struggles when: the workload is write-heavy or teams cannot maintain a strict schema discipline.

6. Webhooks

Shines when: you want lightweight server-to-server notifications without polling.
Struggles when: delivery guarantees must be exactly-once, or the receiver is offline for long periods.

7. Message-Queue

Shines when: you need high-throughput, decoupled, async processing with replay.
Struggles when: you require simple, stateless request-response or openly exposed public APIs.

API Integration Patterns – Comparison

Technology	Communication Direction	Underlying Protocols	Message Structure	Complexity	Security²	Data Size	Throughput	Latency	Ease of Adoption	Tooling
REST	Unidirectional	HTTP/1.1 or HTTP/2	Mostly JSON or XML (text)	Very Low	HTTPS	Large	Moderate	High	Very High	Extensive (Postman, Swagger/OpenAPI, cURL)
gRPC	Unidirectional or Bi-directional	HTTP/2	Protocol Buffers (binary)	Moderate	mTLS	Very Small – binary	Very High	Very Low	Moderate but growing	Decent (gRPCurl, Postman)
WebSockets	Bi-directional	WebSocket (upgrade from HTTP/1.1 or HTTP/2)	Text or binary frames	Low	WSS	Small – binary frames	High	Very Low	Moderate	Good
SSE (Server-Sent Events)	Unidirectional	HTTP/1.1 or HTTP/2	`text/event-stream` (UTF-8)	Very Low	HTTPS	Moderate – plain text	Moderate	Low	Moderate	Good
GraphQL	Unidirectional, Bi-directional with subscriptions	HTTP/1.1, HTTP/2; WebSocket for subscriptions	JSON (text)	Moderate	HTTPS	Large	Variable; fewer round-trips	Medium–high	Moderate but rapidly growing	Very Good (Apollo, GraphiQL)
Webhooks	Unidirectional	HTTP/1.1 or HTTP/2	JSON, form-encoded, or custom	Low	HTTPS	Large	Depends on event volume (bursty)	Moderate	Very High	Good (ngrok, request-bin)
Message-Queue	Unidirectional	Kafka TCP, AMQP, MQTT, NATS, …	Binary records, Avro, JSON …	Very High	ACLs, TLS	Small to medium – highly tunable	Very High	Very Low	Low	Extensive

Summary

There is no good or bad here, nor any silver bullets; these are just a couple of tools you can use.

Nonetheless, if you want to know my recommendations here they are:

Scenario / Tech	REST	gRPC	WebSockets	SSE	GraphQL	Webhooks	Message Queue
Mobile → backend	✓	✓			✓		✓
IoT / edge device		✓	✓				✓
Service-To-Service	✓	✓	✓		✓	✓	✓
Browser push / live UI			✓	✓
3rd-party integration	✓				✓	✓
Streaming pipelines							✓

Thank you for your time.

Blog 7 API Integration Patterns: REST, gRPC, SSE, WS & Queues from Pask Software.

ArchUnit Guide – How to Unit Test Your Architecture

Bartek Żyliński — Mon, 16 Jun 2025 14:57:31 +0000

Enforcing a specific package structure or architecture is very important, especially in Java, where some things must be public to work correctly or to be available outside their package. ArchUnit is an open-source library that will help you whenever the compiler is not enough.

All the code examples from this article are available in my GitHub repo.

What is ArchUnit?

ArchUnit is an open-source library for writing and enforcing architecture rules within your project. There is no use of facades and encapsulations when you can just reach out and take what you want. It is even more significant when you are trying to follow ports and adapters or other approaches that impose very strict restrictions on which classes should use others and in what way.

That is the moment where ArchUnit comes into play. It can easily check the dependencies between packages and classes — which class is using/importing the other. With this “simple” feature we can easily set up the set of rules that will put restrictions on how our classes can interact with one another. Later we can easily add these rules to our test suite and by extension unit test our architecture.

For all intents and purposes, the rules are normal tests and can be easily run by any unit test library/framework. Besides the “simple” package and class dependency checks mentioned above, it can also check dependencies between layers and slices, check for cyclic dependencies, and more.

We can create the following rules with the help of ArchUnit:

“Classes in package X should only depend on classes in package Y.”
“Classes in the service layer should not access controller layer classes.”
“No cyclic dependencies should exist among these packages.”
Prevent field- and setter-based injection
Ensure @Transactional annotation is used only in the service layer
Enforce @Repository and @Service annotation usage in specific packages
….

Without going into much detail, ArchUnit works by reading and analyzing bytecode, not the source code itself. Thus, our rules are not applied to the source code per se but rather to the compiled bytecode.

ArchUnit-Junit

The ArchUnit-JUnit artifact is part of the wider ArchUnit framework. It makes the tests more descriptive and smaller by removing a lot of JUnit-related boilerplate code. The most important part of this package is the ArchTest annotation. With it, we can declare rules as static fields or methods instead of full JUnit tests. The artifact takes care of converting them into proper JUnit tests.

ArchUnit-Junit

@ArchTest
static final ArchRule classesInXShouldOnlyDependOnClassesInY =
        ArchRuleDefinition
                .classes()
                .that().resideInAPackage("..x..")
                .should().onlyDependOnClassesThat()
                .resideInAnyPackage(
                        "..y..",
                        "java.."
                );

JUnit

@Test
void testClassesInXShouldOnlyDependOnClassesInY() {
    ArchRule rule = classes()
            .that().resideInAPackage("..x..")
            .should().onlyDependOnClassesThat()
            .resideInAnyPackage(
                    "..y..",
                    "java.."
            );

    rule.check(IMPORTED_CLASSES);
}

The difference does not seem like a big deal; however, I can see potential benefits if you have a lot of tests. Personally, I prefer the classic JUnit way.

All the tests written here follow the JUnit way, without the archunit-junit5-engine. Nevertheless, you can find examples written with the archunit-junit lib in the repo.

ArchUnit Examples

Let’s start with implementation of all the rules from above. Then I will move on to presenting rules that will ensure your ports & adapters setup remains unchanged.

Classes in package X should only depend on classes in package Y.

@Test
void testClassesInXShouldOnlyDependOnClassesInY() {
    // Given: Define a rule that restricts classes in package '..x..' 
    // to depend only on classes in '..y..' or standard Java packages.
    ArchRule rule = classes()
            .that().resideInAPackage("..x..")
            .should().onlyDependOnClassesThat()
            .resideInAnyPackage(
                    "..y..", // Allow dependency on package '..y..'
                    "java.." // Allow dependency on Java standard library
            );

    // Then
    rule.check(IMPORTED_CLASSES);
}

Classes in the service layer should not access controller layer classes.

@Test
void testServiceLayerShouldNotAccessControllers() {
    // Given: Define a rule that prevents the service layer
    // from depending on classes in the controller layer.
    ArchRule rule = noClasses()
            .that().resideInAPackage("..service..")
            .should().dependOnClassesThat()
            .resideInAPackage("..controller..");

    // Then
    rule.check(IMPORTED_CLASSES);
}

No cyclic dependencies should exist among packages.

@Test
void testNoCyclicDependencies() {
    // Given: Define a rule to ensure there are no cyclic dependencies
    // between modules grouped by their first-level sub-packages under 'org.ps'.
    ArchRule rule = SlicesRuleDefinition.slices()
            .matching("org.ps.(*)..") // Define slices by sub-packages under 'org.ps'
            .should()
            .beFreeOfCycles(); // Ensure there's no cyclic dependency between them
    // Then
    rule.check(IMPORTED_CLASSES);
}

Prevent field- and setter-based injection.

@Test
void testNoFieldInjection() {
    // Given: Define a rule that disallows field injection using @Autowired.
    ArchRule noFieldInjectionRule = noFields()
            .should().beAnnotatedWith(Autowired.class)
            .because("Use constructor injection instead of field injection.");

    // Also define a rule that disallows setter injection using @Autowired.
    ArchRule noSetterInjectionRule = noMethods()
            .that().haveNameMatching("set[A-Z].*")
            .should().beAnnotatedWith(Autowired.class)
            .because("Use constructor injection instead of setter injection.");

    // When: Combine both rules into one composite rule.
    ArchRule compositeRule = CompositeArchRule.of(noFieldInjectionRule).and(noSetterInjectionRule);

    // Then
    compositeRule.check(IMPORTED_CLASSES);
}

Ensure @Transactional annotation is used only in the service layer.

@Test
void testTransactionalAnnotationOnlyInService() {
    // Given: Define a rule that ensures classes annotated with @Transactional
    // are located in the service layer.
    ArchRule classLevelTransactional = classes()
            .that().areAnnotatedWith(Transactional.class)
            .should().resideInAPackage("..service..")
            .because("Class-level @Transactional belongs in the service layer only.");

    // Also define a rule for methods annotated with @Transactional
    // to be declared only in service layer classes.
    ArchRule methodLevelTransactional = methods()
            .that().areAnnotatedWith(Transactional.class)
            .should().beDeclaredInClassesThat().resideInAPackage("..service..")
            .because("Method-level @Transactional belongs in the service layer only.");

    // When: Combine both rules into one composite rule.
    ArchRule compositeRule = CompositeArchRule.of(classLevelTransactional).and(methodLevelTransactional);

    // Then
    compositeRule.check(IMPORTED_CLASSES);
}

Enforce @Repository and @Service annotation usage in specific packages.

@Test
void testRepositoryAnnotationInRepositoryPackage() {
    // Given: Define a rule that ensures @Repository-annotated classes
    // are only located in the repository package.
    ArchRule rule = classes()
            .that().areAnnotatedWith(Repository.class)
            .should().resideInAPackage("..repository..");

    // Then
    rule.check(IMPORTED_CLASSES);
}

@Test
void testServiceAnnotationInServicePackage() {
    // Given: Define a rule that ensures @Service-annotated classes
    // are only located in the service package.
    ArchRule rule = classes()
            .that().areAnnotatedWith(Service.class)
            .should().resideInAPackage("..service..");

    // Then
    rule.check(IMPORTED_CLASSES);
}

ArchUnit & Hexagonal Architecture

Here the setup is somewhat more complex: a complete set of ArchUnit tests for hexagonal architecture.

Due to the Java packaging model, enforcing proper class and method visibility is sometimes impossible. Thus, someone may easily use a class outside its intended scope and, by extension, break the encapsulation and our beautiful separation of domain and infrastructure.

Let’s consider the following setup:

org.ps
├─ domain
│  └─ ... (domain models and services)
├─ application
│  ├─ port
│  │  ├─ in
│  │  │  └─ ... (interfaces for incoming requests and messages)
│  │  └─ out
│  │     └─ ... (interfaces for outgoing requests and messages)
│  └─ ... (application services, use case implementations)
├─ adapters
│  ├─ in (incoming requests and messages)
│  └─ out (outgoing requests and messages)
├─ infrastructure
│  └─ ... (external setups - DB connections, queues, metrics)
└─ config
   └─ ... (configuration classes for all other packages)

This is the closest to a recommended package structure for hexagonal architecture I managed to get. It seems that there is no single general way; almost every article pushes its own version.

We want our structure to obey the following set of rules which, as far as I have managed to understand, reflect the industry-wide standard for hexagonal architecture:

Domain may not access any layer but can be accessed by the Application and Adapters layers.
Application may access the Config and Domain layers and can be accessed only by the Adapters layer.
Adapters may access the Application, Config, Domain and Infrastructure layers but cannot be accessed by other layers.
Infrastructure can only access the Config layer and can be accessed only by Adapters.
Config may not access any layer but can be accessed by Application, Adapters, Domain and Infrastructure.

Below is how we may test and enforce it with the help of ArchUnit. The test is quite lengthy but particular rules are clearly split from one another.

@Test
public void hexagonArchTest() {
    // Given
    JavaClasses importedClasses = new ClassFileImporter().importPackages("org.ps.hexagon");
    LayeredArchitecture portsAndAdaptersLayers = layeredArchitecture()
            .consideringOnlyDependenciesInLayers()
            // Define each "layer" by its package
            .layer("Adapters").definedBy("..adapters..")
            .layer("Application").definedBy("..application..")
            .layer("Config").definedBy("..config..")
            .layer("Domain").definedBy("..domain..")
            .layer("Infrastructure").definedBy("..infrastructure..")
            // Domain may not access any layer but can be accessed by the Application and Adapters layers.
            .whereLayer("Domain").mayNotAccessAnyLayer()
            .whereLayer("Domain").mayOnlyBeAccessedByLayers("Application", "Adapters")
            // Application may access the Config and Domain layers and can be accessed only by the Adapters layer.
            .whereLayer("Application").mayOnlyAccessLayers("Config", "Domain")
            .whereLayer("Application").mayOnlyBeAccessedByLayers("Adapters")
            // Adapters may access Infrastructure, Config, Application and Domain but cannot be accessed by other layers.
            .whereLayer("Adapters").mayOnlyAccessLayers("Infrastructure", "Config", "Application", "Domain")
            .whereLayer("Adapters").mayNotBeAccessedByAnyLayer()
            // Infrastructure can only access the Config layer and can be accessed only by Adapters.
            .whereLayer("Infrastructure").mayOnlyAccessLayers("Config")
            .whereLayer("Infrastructure").mayOnlyBeAccessedByLayers("Adapters")
            // Config may not access any layer but can be accessed by Application, Adapters, Domain and Infrastructure.
            .whereLayer("Config").mayNotAccessAnyLayer()
            .whereLayer("Config").mayOnlyBeAccessedByLayers("Application", "Adapters", "Domain", "Infrastructure");

    // Then
    portsAndAdaptersLayers.check(importedClasses);
}

Summary

Here we are, that is all I wanted to share with you today. If you want some more examples you can find them either on ArchUnit GitHub or in their docs.

On the other hand, you can just start typing and see where the API will guide you.

Thank you for your time.

Blog ArchUnit Guide – How to Unit Test Your Architecture from Pask Software.

Lock-Free Programming – From Primitives To Working Structures

Bartek Żyliński — Sun, 15 Jun 2025 19:05:39 +0000

Working with multiple threads is one of the most complex problems we may encounter in our daily work. When put against the wall of multithreading, most people right away reach for blocking approaches. In Java, this takes the form of the synchronized keyword or some other, less painful mechanism like ReentrantLock. Locks are not the only option: lock-free programming is also a way.

In this text, I will show problems, techniques and best practices related to Lock-Free Programming. I will also provide a real life example of how to implement a Lock-Free stack. Besides, I will share common patterns on moving from Lock-Free to Wait-Free.

Here you can find only the most interesting code samples. The full source code is available in my GitHub repository.

What is Lock Free Programming?

Guarantee: Lock-free algorithms guarantee system-wide progress. Within a finite number of steps taken by all threads, at least one thread completes its operation.

Typical techniques: atomic primitives CAS, LL/SC, FAA.

Pros:

Scales well under contention because there is no kernel blocking or context-switch overhead.
Eliminates deadlocks and livelocks.

Cons:

Harder to design and reason about.
Higher cost under very low contention compared with a simple mutex.
Starvation is still possible—one unlucky thread might repeatedly fail while others succeed, but the program as a whole keeps moving forward.

What is Wait-Free Programming?

Guarantee: Wait-Free algorithm provides a per-thread progress bound. Every thread finishes its operation in a finite number of its own steps, regardless of what the other threads do.

Typical techniques: Per-thread operation descriptors and finite step helper functions

Pros:

All from Lock-Free Programming
Real-time friendly
Eliminates the starvation problem

Cons:

Significantly more memory and bookkeeping.
Often slower in the uncontended “happy path”
Harder (sometimes impossible) to create for complex data structures.

Blocking vs Lock-Free vs Wait-Free – The Progress Guarantees

Deadlock

Deadlock is probably the most common error in multithreading. Two (or more) threads each hold resources the others need and wait forever for those resources to be released. Our threads will not be able to move forward with processing. Thus, we end up at a standstill: a deadlock.

    Thread A: lock(m1); lock(m2);
    Thread B: lock(m2); lock(m1);

If A acquires m1 and B acquires m2, both block forever on the second lock().
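
For contrast, here is a minimal sketch of the usual blocking-world fix: both threads take the locks in the same order, so the circular wait cannot form. The class name `LockOrderingDemo` and the shared counter are assumptions made up for this illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockOrderingDemo {
    private static final ReentrantLock m1 = new ReentrantLock();
    private static final ReentrantLock m2 = new ReentrantLock();
    private static int counter = 0; // guarded by m1 and m2

    // Both threads take m1 before m2, so neither can hold m2 while waiting for m1.
    private static void work() {
        for (int i = 0; i < 10_000; i++) {
            m1.lock();
            try {
                m2.lock();
                try {
                    counter++;
                } finally {
                    m2.unlock();
                }
            } finally {
                m1.unlock();
            }
        }
    }

    // Runs threads A and B to completion and returns the final counter value.
    static int runBothThreads() {
        counter = 0;
        Thread a = new Thread(LockOrderingDemo::work, "A");
        Thread b = new Thread(LockOrderingDemo::work, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter;
    }
}
```

Swapping the lock order in only one of the threads would reproduce the A/B scenario shown above. Lock ordering works, but it relies on discipline across the whole code base, which is one of the motivations for the lock-free techniques that follow.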

Livelock

A livelock is motion without progress, like two people in a hallway sidestepping left and right in perfect sync and never getting past each other. Threads are active (spinning, retrying, sending messages), but the system never completes a useful operation. It is usually caused by over-reactive collision avoidance or continual retries without an eventual terminal state.

Starvation

Starvation means a particular thread never gets the resources or CPU time it needs, even though the system as a whole keeps making progress. If other threads hit the same data structure with high frequency, one thread may have to retry many times because its compare-and-swap keeps failing. Such high contention can cause that thread to starve.

Problem	Lock-Free Programming	Wait-Free Programming
Deadlock	Impossible. No thread ever waits while holding a lock, so circular-wait conditions can’t arise.	Impossible. Same lock-free property plus bounded completion time for every thread.
Livelock	Prevented. At least one operation must complete in a finite number of steps, so the system can’t get stuck in endless mutual retries.	Prevented. Every operation finishes in a bounded number of steps, so the whole system and each thread move forward.
Starvation	Possible. System makes progress, but an unlucky thread may be perpetually overtaken by others.	Impossible. Each thread completes within a fixed bound, so no one can be starved.

Atomic Primitives That Power Non-Blocking Algorithms

Compare-and-Swap (CAS)

Compare-and-Swap (CAS) is probably the single most famous atomic instruction. In one atomic step, the processor:

Reads a memory word
Checks whether that word still holds an expected value
Only if the comparison succeeds, replaces it with a new value.

That simple “check-and-set” sequence lets a thread act as if it briefly owned the variable without ever locking it.

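Most, if not all, non-blocking data structures rely on this instruction combined with a loop to work correctly. You will see one of them below, in the form of LockFreeStack. However, there is a catch or two hidden here. First, when contention rises, the loop takes more and more time to complete. The other is known as the A-B-A problem: a thread reads value A, another thread changes it to B and back to A, and the first thread's CAS succeeds even though the state changed in between.

In Java, all Atomic classes expose compareAndSet methods. Additionally, you can run CAS on a plain volatile field through VarHandle or the AtomicXxxFieldUpdater classes; volatile alone only guarantees visibility, not an atomic compare-and-set.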
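
A minimal sketch of the "CAS plus loop" pattern, using `AtomicInteger.compareAndSet` from the JDK. The `CasMax` class, which tracks a running maximum, is a hypothetical example chosen only because the update cannot be expressed as a simple addition.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasMax {
    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    // Raises the stored maximum to candidate, retrying if another thread changed it in between.
    void offer(int candidate) {
        int current;
        do {
            current = max.get();
            if (candidate <= current) {
                return; // nothing to update, no CAS needed
            }
        } while (!max.compareAndSet(current, candidate));
    }

    int get() {
        return max.get();
    }
}
```

Every failed `compareAndSet` means another thread won the race, so the loop re-reads the value and tries again. Under heavy contention, this retry is where the extra cost comes from.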

Load-Link / Store-Conditional (LL/SC)

One of the ways to address the A-B-A problem from above is the LL/SC instruction pair. It splits the operation into two steps that together behave atomically with respect to other threads, effectively creating a “split CAS.”

Step	What happens
LL	Load the word at addr and place the CPU into a reservation state for that cache line.
SC	If the reservation is still valid (no intervening write by another core), stores the new value; otherwise it does nothing and returns false.

The main difference between CAS and LL/SC is the reservation mechanism: SC fails if the location was written at all since LL, even if the original value was restored. Thus, it dodges the ABA problem. Unfortunately, LL/SC is not a silver bullet and has drawbacks. The biggest one is that the reservation can be lost for reasons other than contention. In such a case we lose all the benefits of using this primitive.

In Java the closest thing to LL/SC is AtomicStampedReference.
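
A minimal single-threaded sketch of how `AtomicStampedReference` catches an A-B-A change. The class name `StampedAbaDemo` and the values "A" and "B" are illustrative assumptions; the interleaving of a "slow" thread is simulated by hand.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class StampedAbaDemo {

    // Returns whether a CAS that still holds the pre-ABA stamp succeeds.
    static boolean staleCasSucceeds() {
        String a = "A";
        String b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);

        int staleStamp = ref.getStamp(); // a slow thread remembers (A, 0)

        // Meanwhile the value goes A -> B -> A; each change bumps the stamp.
        ref.compareAndSet(a, b, 0, 1);
        ref.compareAndSet(b, a, 1, 2);

        // The reference is A again, but the stamp is now 2, so the stale CAS is rejected.
        return ref.compareAndSet(a, b, staleStamp, staleStamp + 1);
    }
}
```

A plain `AtomicReference` would accept the last CAS, because it only compares the reference. The stamp plays the role of the version tag that makes the intermediate change visible.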

Fetch-and-Add (FAA)

Fetch-and-Add, or AtomicAdd, performs the following operations in a single atomic step:

Reads the current value at addr.
Adds a delta k.
Stores the result back.
Returns the old value (some variants return the new one).

There are other variants of this primitive like: Fetch-and-Subtract. FAA is used as a base in more complex techniques like Fetch-And-Store or Test-And-Set.

The main disadvantage of this primitive is what the name suggests: FAA only supports arithmetic transforms of the form old + k. To do anything more complex you will still need CAS or LL/SC.

In Java, the numeric Atomic classes (AtomicInteger, AtomicLong) expose getAndAdd, getAndIncrement and incrementAndGet methods.
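
A minimal sketch of FAA in Java, using `AtomicLong`. The `TicketDispenser` class is a hypothetical example in the spirit of a ticket counter; only `getAndIncrement` and `getAndAdd` are actual JDK methods.

```java
import java.util.concurrent.atomic.AtomicLong;

class TicketDispenser {
    private final AtomicLong next = new AtomicLong();

    // getAndIncrement is fetch-and-add with k = 1: returns the old value, stores old + 1.
    long take() {
        return next.getAndIncrement();
    }

    // getAndAdd is the general form: reserves a block of k consecutive tickets.
    long takeBlock(long k) {
        return next.getAndAdd(k);
    }
}
```

Unlike a CAS loop, there is no retry in user code: every caller gets a distinct value in a single call, no matter how many threads compete.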

Double-Width CAS

Double-Width CAS (or DCAS) extends the regular CAS idea to two adjacent machine words treated as a single 128-bit (or larger) unit. Both sub-words must match their expected values for the store to occur.

In pseudocode:

    cas2(expected_lo, expected_hi, new_lo, new_hi);

It allows you to store a pointer in the low word and a tag in the high word. Both words must match, so resurrecting an old pointer with the same address but a different tag fails, thus solving the A-B-A problem.

DCAS enables some lock-free data structures to swap pairs (pointer/counter or value/status) atomically. The main drawback is a heavier micro-architectural cost than single-word CAS (it locks a 16-byte cache line).

Primitives Summary

Primitive	General form	Typical uses	Common pitfalls
CAS	`old ⇒ new`	Universal lock-free building block	ABA problem, retry cost under contention
DW-CAS	`(old_lo, old_hi) ⇒ (new_lo, new_hi)`	Pointer + tag pairs, ABA defense	Heavier; locks a 16-byte cache line
LL/SC	`LL; …; SC` split	Portable lock-free ops when CAS not present	Reservation loss → more retries
FAA	`new = old + k`	Counters, ticket locks, ref-counts	Cache-line ping-pong, limited to arithmetic ops

Beside these 4 primitives we have more complex ones like Atomic Exchange / Test-and-Set (TAS), Fetch-and-store (FAS).

Lock-Free Programming In Java – Lock-Free Stack

Base

The simplest lock-free stack has a single atomic top pointer. The push and pop CAS loops operate on that pointer. It is a singly linked stack, thus the Node class only holds a value and a next reference.

import java.util.concurrent.atomic.AtomicReference;

// Members of the LockFreeStack<E> class
private record Node<E>(E value, Node<E> next) {

}

private final AtomicReference<Node<E>> top = new AtomicReference<>();

Lock-Free Push

Here we have a lock free push. You can see the CAS loop wrap around the top pointer.

/* ---------------- push ----------------------- */
public void push(E item) {
    Node<E> newNode;
    Node<E> oldTop;
    do {
        oldTop = top.get();
        newNode = new Node<>(item, oldTop);
    } while (!top.compareAndSet(oldTop, newNode));
}

What’s happening:

Read the current head pointer.
Build the replacement node – safe because nothing else can see newNode yet.
CAS attempt cas(oldTop, newNode).
If no one has changed top in the meantime, the CAS succeeds, and the loop exits.
If another thread slipped in, the CAS fails, we loop to re-snapshot the new top.

Lock-Free Pop

/* ---------------- pop ----------------------- */
public E pop() {
    Node<E> oldTop;
    Node<E> newTop;
    do {
        oldTop = top.get();
        if (oldTop == null) return null;
        newTop = oldTop.next();
    } while (!top.compareAndSet(oldTop, newTop));
    return oldTop.value();
}

What’s happening:

Read the current head pointer.
Null check: nothing to pop, so return immediately; no CAS needed.
Assign a new top.
CAS from oldTop -> newTop.
If no one has changed top in the meantime, the CAS succeeds, we atomically detach oldTop.
If another thread slipped in, the CAS fails, we loop to re-snapshot the new top.
Return — safe because oldTop is now private to this thread, GC will eventually reclaim it.
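
Putting the fragments above together gives a compilable class. The sketch below repeats the push and pop loops with the generic type parameters spelled out and adds a hypothetical `LockFreeStackDemo` helper that hammers the stack from four threads; the thread and item counts are arbitrary assumptions.

```java
import java.util.concurrent.atomic.AtomicReference;

class LockFreeStack<E> {
    private record Node<E>(E value, Node<E> next) {}

    private final AtomicReference<Node<E>> top = new AtomicReference<>();

    public void push(E item) {
        Node<E> oldTop;
        Node<E> newNode;
        do {
            oldTop = top.get();
            newNode = new Node<>(item, oldTop);
        } while (!top.compareAndSet(oldTop, newNode));
    }

    public E pop() {
        Node<E> oldTop;
        Node<E> newTop;
        do {
            oldTop = top.get();
            if (oldTop == null) return null; // empty stack
            newTop = oldTop.next();
        } while (!top.compareAndSet(oldTop, newTop));
        return oldTop.value();
    }
}

class LockFreeStackDemo {
    // Four threads push 1_000 items each; afterwards everything is popped and counted.
    static int pushConcurrentlyAndCount() {
        LockFreeStack<Integer> stack = new LockFreeStack<>();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1_000; i++) stack.push(i);
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int count = 0;
        while (stack.pop() != null) count++;
        return count;
    }
}
```

No push is lost even though the threads never block each other: a failed CAS only costs the losing thread one more trip around the loop. Note that `null` doubles as the "empty" signal here, so the stack should not store null items.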

From Lock-Free Programming to Wait-Free Programming

I won't be presenting a complete Wait-Free implementation. It is long, complex and explaining it correctly will probably double the size of this blog. However, there are a few good sources on building Wait-Free datastructures.

Queues
- Scalable Synchronous Queue
- Wait-Free Queue
Stack

There are a couple of common points shared between all Wait-Free implementations:

Op Log — Each thread writes a small record describing what it wants to do.
Per-thread slot reservation → Thread grabs a unique index with FAA; writes its OpLogs there.
Phase (ticket) numbers → Every request gets a monotonically increasing “phase”.
Bounded Help loops → A helper will finish at most k foreign operations before returning to its own.
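
The first three points can be sketched as plain bookkeeping. This is not a wait-free data structure and it contains no helping logic; it only shows, under assumed names (`AnnounceTable`, `OpLog`, `reserveSlot`, `announce`), how a slot reservation with FAA and a monotonically increasing phase number might look.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

class AnnounceTable {
    // Hypothetical op log: what the thread wants to do and in which phase it asked.
    record OpLog(String operation, long phase) {}

    private final AtomicReferenceArray<OpLog> slots;
    private final AtomicInteger nextSlot = new AtomicInteger();
    private final AtomicLong nextPhase = new AtomicLong();

    AnnounceTable(int maxThreads) {
        slots = new AtomicReferenceArray<>(maxThreads);
    }

    // Each thread calls this once; FAA hands out a unique index without a CAS loop.
    int reserveSlot() {
        return nextSlot.getAndIncrement();
    }

    // Publishes the request so that other threads can see it; returns its phase number.
    long announce(int slot, String operation) {
        long phase = nextPhase.getAndIncrement();
        slots.set(slot, new OpLog(operation, phase));
        return phase;
    }

    // What a helper would read when scanning for pending operations.
    OpLog read(int slot) {
        return slots.get(slot);
    }
}
```

In a full implementation, a helper would scan the slots, pick pending operations with a phase lower than or equal to its own, and complete at most k of them before returning to its own work.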

Summary

As you could see, both lock-free and wait-free are not simple things. However, with a couple of good tools and techniques, they can be much easier.

Key takeaways:

Definitions
- Lock-free programming → from all active threads, at least one makes progress in a finite number of steps.
- Wait-free programming → every operation by every thread completes in a bounded number of its own steps.
Atomic Primitives
- CAS
- LL/SC
- FAA
- Double-width CAS
Best practises in implementing Lock-Free and Wait-Free structures:
- Design first, code later → Draw the state diagram and identify every location that can change concurrently.
- One atomic word to one logical invariant → Keep each CAS/LL-SC operating on the entire state you need to test.
- Use DCAS → rather than splitting an invariant across two variables.
- Write the fast path first → Attempt a cheap single-CAS path first; after N failures publish a descriptor and switch to the helping (“slow”) path.
- Bounded helping (Wait-Free) → Ensure a helper can finish at most k foreign operations before returning to its own.

Thank you for your time.

Blog Lock-Free Programming – From Primitives To Working Structures from Pask Software.

Software Engineering Trade-Offs

Bartek Żyliński — Sat, 14 Jun 2025 15:43:55 +0000

In a couple of my last articles, I emphasized the importance of different software engineering trade-offs, for example here. I have been trying to point out that focusing on maxing out just one trait can cause problems in others. I believe that the main part of our job as software engineers should be to min-max different software engineering trade-offs and even the trade-offs of different combinations of trade-offs.

Software engineering is an art of constantly balancing all these things. Below you can find eight trade-offs, plus their pros and cons. I will also share a very simple framework for navigating software engineering trade-offs.

First, a reality check: perfection is impossible — min-maxing is the way.

We Cannot Build Perfect

In the perfect case, we could build a system that matches each and every requirement. It could also handle all the possible edge cases and yet be simple and easy to maintain. Well, reality is often disappointing: each new case our system can handle increases its complexity. Each new fancy tech, tool or concept we introduce will do the same.

If we are choosing a data transfer format, we can pick one of the few, but not all of them. Of course, we can decide to add support for all the formats, but again the complexity increases.

If we want our system to be based only on stateful operations, we cannot expect that the system will be easily scalable. We can then off-load part of stateful processing to other services or tools, but again complexity follows.

Unfortunately, software engineering is far from perfect. Same as in life, each action or decision has a consequence, either short or long-lasting. We cannot run away from that fact. Luckily for us, in software engineering the boundaries are much more flexible, and the consequences are not as dire as in personal life.

In the worst case, we can always build something from scratch. It will not be cheap, easy or even fast, but it is always a possibility.

Software Engineering Trade-Offs

Let’s start with my favourite trade-off: Complexity vs Everything.

Complexity vs Everything

This one is as simple as it can be. I wrote a lot about it in the section above, and I do not want to repeat myself. Almost every decision we make increases complexity, and that is it.

With time, the complexity grows, and the growth speed only increases. The system is complex enough to begin with, and we want it to support newer and newer use cases.

As software engineers, we have to keep complexity as low as possible. In the ideal case, we should also leave some margin for future changes and requirements.

Cons of high complexity:

Increased RTB (Run the Business) costs
Increased onboarding cost
Increased cost of every new change
A chance that the system will become either unmaintainable or unreplaceable (at least without a huge investment of time and money)

If you discover any real benefit of increased complexity, I owe you a coffee.

To be 100% clear, complexity is not something we can fully run away from. It is a trait of every system. We should just be aware of it and balance our choices accordingly.

Simplicity vs Flexibility

Simplicity is key, always and everywhere.

I guess that most of us prefer to work with systems that are easy to grasp and easy to maintain. I also guess that most of us prefer to design systems that are just like that. However, we must not oversimplify our architectures. We should always leave some design margin for future changes.

Yet, making the system too flexible is also a no-go, at least in my opinion. There is no point in making your system capable of handling all possible future scenarios from the start. Half of what you expected will never occur, and the other half will be significantly different from what you expected. To quote a well-known principle: no big design up front.

Side	Simplicity	Flexibility
Pros	- Lower onboarding time - Fewer things can fail - Easier to reason about and maintain	- Easier to extend - Easier to cover unexpected requirements
Cons	- May require architecture rewrites sooner rather than later - Less open to change	- Harder to reason about - Potentially harder to test - “Just-in-case” code bloats the codebase

Time-to-Market vs Technical Debt

Time-to-Market vs Technical Debt is probably the most crucial trade-off when it comes to actually delivering software.

Even the most beautiful and perfect code does not matter if competitors are already there, stealing away our would-be customers. In more corporate settings, it does not matter if we continuously fail to meet our deadlines and deliver on time.

On the other hand, time to market by itself does not bring any value. I know that everyone wants to go viral from day one, but a cascading software failure is probably not the desired way to achieve it. Our code has to actually work and meet customer expectations. Also, the code itself is not the only source of tech debt. Gaps in observability, security, and tests are sources too.

Maybe polishing the code yet another time is not the best use of the time left. Instead, it may be better to focus on building a good observability pipeline or doing some performance tests.

Side	Time-to-Market	Technical Debt
Pros	- Reach customers sooner and seize fleeting market opportunities - Collect real-world feedback earlier to refine product–market fit - Generate revenue (or demonstrate traction to investors) faster	- Clean, well-tested architecture lowers long-term costs - Greater reliability, performance, and security from day one - Future features ship faster because the foundation is solid
Cons	- Technical debt raises future maintenance and refactor costs - Increased likelihood of bugs, outages, and security gaps - Major rewrites can disrupt roadmaps and morale	- Slower initial launch may cede market share to faster rivals - Delayed revenue and user feedback increase business risk - Risk of over-engineering before proving product–market fit

Horizontal vs Vertical Scaling

If you are not sure what either of them means, I recommend reading my text on Scalability.

Picking how we scale our application is probably one of the most crucial choices we can make while designing it. It shapes all the core design choices in our system and has long-lasting consequences.

This choice is not set in stone; you can change the approach later down the road. However, all the architecture changes required to make an application horizontally scalable will probably make the whole undertaking long, painful, and expensive.

The same is true the other way around, when migrating from horizontal to vertical. In both cases, it will probably end with rewriting the system from scratch, or with a similar level of change.

Horizontal scaling also has drawbacks, and you can achieve great performance with vertical scaling alone.

Side	Vertical	Horizontal
Pros	- Smaller ops overhead - Easier state management - Lower coordination overhead	- Practically unbounded scale - Inherent redundancy - Supports geo-distribution
Cons	- Hard upper limit - Single point of failure	- Higher ops overhead - Susceptible to network-related problems - Must be designed with distribution in mind

Latency vs Throughput

This trade-off may seem strange. One would think that optimizing latency (the processing time of a single request) would always improve overall throughput (the number of requests we can handle per unit of time).

Surprise, surprise: after a certain point, that is no longer the case.

Optimizing and fine-tuning for latency tends to concentrate extra CPU cycles, cache space, or memory bandwidth on a single request. While it may yield great results initially, after a certain, non-arbitrary threshold the gains tend to diminish. Past that point, achieving any measurable gains can even require hardware or architectural changes.

When optimizing for throughput, we tend to split the resources proportionally. We focus on the average processing time across multiple requests instead of the absolute latency of any single request.

Side	Latency	Throughput
Pros	- Better tail behavior - More predictable	- Steady hardware utilization - Less complex (in theory)
Cons	- Computation-heavy - Extra resources tied to a single request - Throughput ceiling	- Tail spikes / less predictable UX - Slower single-request response - Susceptible to back-pressure problems

As this trade-off can be somewhat tricky, I recommend deciding based on what your use case needs. If you have a mixed use case, or the focus point is not clear, then I would recommend starting from your latency SLO (e.g. p99 latency) and optimizing just enough to meet it. Only then focus on throughput, subject to that SLO.
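To make the SLO idea more tangible, below is a minimal sketch of how a p99 latency could be computed from a batch of measurements with the nearest-rank method. The class name LatencyPercentiles and the sample values are assumptions made purely for illustration; in a real system, a metrics library would usually compute percentiles for you.

```java
import java.util.Arrays;

// Hypothetical helper for this article, not part of any library.
final class LatencyPercentiles {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] latenciesMs, double p) {
        if (latenciesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up data: 98 fast requests and 2 slow ones.
        long[] samples = new long[100];
        Arrays.fill(samples, 10L);
        samples[3] = 500L;
        samples[42] = 500L;

        System.out.println("p50 = " + percentile(samples, 50) + " ms");
        System.out.println("p99 = " + percentile(samples, 99) + " ms");
    }
}
```

With this data the median stays at 10 ms while p99 jumps to 500 ms, which is why an SLO expressed as a tail percentile tells you more about user experience than an average does.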

Stateful vs Stateless

To be honest, we cannot truly run away from stateful processing. Unless we have a very specific use case, we will need some form of state. The real trade-off here is whether to store state in our service, close to our logic, or to offload it to some 3rd-party tool or platform.

Like some of the other software engineering trade-offs, this one will also have a major impact on the final design of our system. Among others, it will impact areas like scalability, load balancing, and the overall complexity of the system. I will dive deeper into this topic in a separate text.

Side	Stateful	Stateless
Pros	- Easier to build for strong consistency - Less communication (state is on the server)	- Elastic scaling - Fault tolerance by default - Open to composability
Cons	- Harder failover - Can only scale vertically - Higher operational overhead	- Added complexity - More communication required - More complex retries and deduplication
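The difference can be sketched in a few lines of Java. Everything below is hypothetical and exists only for this illustration: StatefulVisitCounter keeps its state inside the instance, while StatelessVisitCounter delegates to a VisitStore, which here is an in-memory stand-in for an external database or cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// All names in this sketch are made up for illustration.
final class StateDemo {

    // Stateful variant: the counters live inside the service instance.
    static final class StatefulVisitCounter {
        private final Map<String, Integer> visits = new ConcurrentHashMap<>();

        int recordVisit(String userId) {
            return visits.merge(userId, 1, Integer::sum);
        }
    }

    // Port to wherever the state actually lives (a database, a cache, ...).
    interface VisitStore {
        int increment(String userId);
    }

    // In-memory stand-in for an external store, so the example runs on its own.
    static final class InMemoryVisitStore implements VisitStore {
        private final Map<String, Integer> visits = new ConcurrentHashMap<>();

        @Override
        public int increment(String userId) {
            return visits.merge(userId, 1, Integer::sum);
        }
    }

    // Stateless variant: the instance keeps nothing between calls.
    static final class StatelessVisitCounter {
        private final VisitStore store;

        StatelessVisitCounter(VisitStore store) {
            this.store = store;
        }

        int recordVisit(String userId) {
            return store.increment(userId);
        }
    }

    public static void main(String[] args) {
        // Two stateful instances each see only their own traffic.
        StatefulVisitCounter a = new StatefulVisitCounter();
        StatefulVisitCounter b = new StatefulVisitCounter();
        a.recordVisit("alice");
        System.out.println("stateful, second instance: " + b.recordVisit("alice"));

        // Two stateless instances share one store and agree on the count.
        VisitStore shared = new InMemoryVisitStore();
        StatelessVisitCounter c = new StatelessVisitCounter(shared);
        StatelessVisitCounter d = new StatelessVisitCounter(shared);
        c.recordVisit("alice");
        System.out.println("stateless, second instance: " + d.recordVisit("alice"));
    }
}
```

The stateless instances are interchangeable, so a load balancer can send a request to any of them. The price is the extra call to the store on every request, which matches the "more communication required" entry in the table above.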

Sync (Blocking) vs Async (Non Blocking)

Every network call, every disk seek, and every RPC happens asynchronously, in the background, at the hardware level. This is a fact we cannot run away from.

The real trade-off is whether we expose that fact and make our stack non-blocking (async), or hide it behind a blocking (sync) API.

Unlike the other trade-offs here, this one has a relatively small impact on the overall architecture. However, it has a more significant impact on our codebase and on how our code works.

Side	Sync (Blocking)	Async (Non-Blocking)
Pros	- Easier to reason about - Easier to debug - Easier to set up	- Better resource utilization - Well-suited for concurrency/multithreading - Efficient at handling multiple I/O calls
Cons	- Wasted resources while idle - Poor performance with multiple I/O operations	- Harder to reason about - Risk of callback hell - More complex to set up and debug

While non-blocking may seem to be the clear winner here, it is not that simple. The complexity introduced by async may not seem so bad. Nonetheless, it is a totally different programming model from what we are used to. In most cases, it will require a completely new mindset.

Beware of the tricky part: async models do not always outperform sync models for CPU-bound tasks.

I think that a good approach is to use a sync model in the core of the codebase and an async model at the edges, where you need to handle I/O tasks. I believe this mix will get most of the pros of both approaches. Besides, it will also leave our core/domain pure and play very nicely with hexagonal architecture.
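A minimal sketch of this split, using only CompletableFuture from the JDK. The names SyncCoreAsyncEdge, priceWithDiscountCents, fetchBasePriceCents, and quote, as well as the hard-coded price, are assumptions invented for this example; a real adapter would call a database or an HTTP client at the edge.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical example: a synchronous domain core wrapped by an async edge.
final class SyncCoreAsyncEdge {

    // Core/domain: plain synchronous logic with no I/O, trivial to unit test.
    static long priceWithDiscountCents(long basePriceCents, int discountPercent) {
        if (discountPercent < 0 || discountPercent > 100) {
            throw new IllegalArgumentException("discount must be between 0 and 100");
        }
        return basePriceCents * (100 - discountPercent) / 100;
    }

    // Edge: stands in for a non-blocking I/O call. The value is hard-coded
    // here so that the example runs on its own.
    static CompletableFuture<Long> fetchBasePriceCents(String productId) {
        return CompletableFuture.supplyAsync(() -> 2_000L);
    }

    // The async edge composes the I/O call with the synchronous core.
    static CompletableFuture<Long> quote(String productId, int discountPercent) {
        return fetchBasePriceCents(productId)
                .thenApply(base -> priceWithDiscountCents(base, discountPercent));
    }

    public static void main(String[] args) {
        // join() blocks only here, at the very outer edge of the program.
        System.out.println(quote("book-42", 25).join());
    }
}
```

The domain function knows nothing about futures, so it can be tested and reasoned about as ordinary code, while the asynchronous plumbing stays confined to the adapter.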

Coupling and Cohesion

We are used to thinking of them when talking about microservices. In fact, these two metrics can be used to describe any type of architecture, no matter its size. We can even use them to describe relations between classes in the source code of a particular service.

In short:

Coupling describes the interdependence between two modules.
Cohesion describes how well the elements of a module belong together.

It is not a trade-off per se, more like a target we should aim for. No matter where we apply these two concepts, the relation between them should be the same. Our entities should have high cohesion and low (loose) coupling. Any other relation between the concepts is unhealthy and will cause problems.

Our job is to correctly adjust the levels of coupling and cohesion and not to overdo either of them.

Side	Coupling	Cohesion
Pros	- Fault isolation - Independent deployment and scaling	- Focused services/modules - Higher stability (fewer sources of change)
Cons	- Nano-services: too low coupling - Big ball of mud: too high coupling	- Unrelated domains mix, higher volatility (too low cohesion) - Potential duplication and limited code reuse (too high cohesion)
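At the class level, the same target can be sketched as follows. The types PaymentGateway and OrderService are hypothetical and invented only for this example: the order logic stays cohesive (it only places orders) and is loosely coupled to payments through a narrow interface.

```java
// Hypothetical example of high cohesion and loose coupling between classes.
final class CouplingDemo {

    // Narrow port: the only thing the order logic knows about payments.
    interface PaymentGateway {
        boolean charge(String customerId, long amountCents);
    }

    // Cohesive: everything in this class is about placing an order.
    static final class OrderService {
        private final PaymentGateway payments;

        OrderService(PaymentGateway payments) {
            this.payments = payments;
        }

        String placeOrder(String customerId, long amountCents) {
            if (amountCents <= 0) {
                return "REJECTED";
            }
            return payments.charge(customerId, amountCents) ? "CONFIRMED" : "PAYMENT_FAILED";
        }
    }

    public static void main(String[] args) {
        // Any gateway implementation can be swapped in without touching OrderService.
        // This fake one accepts charges up to 10 000 cents.
        OrderService service = new OrderService((customer, amount) -> amount <= 10_000);
        System.out.println(service.placeOrder("alice", 2_500));
        System.out.println(service.placeOrder("bob", 50_000));
    }
}
```

Replacing the payment provider, or faking it in a test as done in main, requires no change to OrderService, which is the practical benefit of keeping the coupling low.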

Other

These are not the only software engineering trade-offs out there; there are many more. In fact, most, if not all, of the decisions we make while designing a system are trade-offs.

Below are a few examples:

Consistency vs Availability (probably the most famous)
Microservice vs Monolith
3rd-party Tools vs In-house
Cloud vs On-premise
Security vs Usability
Read vs Write Optimization

Navigating Software Engineering Trade-Offs

While the rules below are neither long nor complex, and do not cover every possible edge case, they are simple, cohesive, and easy to follow.

Evaluate the short-term and long-term impact of decisions

The first and most important rule: aim for the long term.

Short-term gains are tempting — however they may have hidden costs and cause a lot of pain later on.
Estimate lifetime — services/modules/systems may not live long enough to see long-term at all.
Use data whenever you have them — without data you are just another person with an opinion.

Identify key stakeholders, their needs and act accordingly.

If you have ever worked on a project of more than average complexity, then you probably know that there are multiple people interested in its success (or failure). It is impossible to meet everyone's expectations.

Thus:

Map all interested parties.
Prioritize the critical few who will approve or reject the outcome.
Capture their expectations and success criteria.
Favor the side of each trade-off that best meets their expectations.

Research — clarify goals and hard constraints

Requirements are not always clear, verify them.
When you have high level requirements, try to come up with an initial design.
Show your design to stakeholders and go through the requirements again.
Attempt to quantify metrics like: latency, throughput, storage whenever possible.

If you can, try running an Event Storming session. Crucial information has a tendency to show up at the most unexpected times and places.

Remember: the more knowledge you gather, the easier it will be for you to navigate your landscape of trade-offs.

Document trade-offs, and their rationale (ADR)

Document, document and once again DOCUMENT. While it may sound trivial and repetitive it is probably the single most important thing you can do for your future coworkers.

Leave behind even the simplest Architecture Decision Record (ADR), with:

What you chose
Why you did it
Pros and cons (optionally)
Alternatives considered and rejected

Such a document will ease many, many things, not to mention building your team's reputation among whoever comes next to the project.

In the worst case, it spares future engineers from head-scratching and from muttering unspeakables at 2 a.m., which is probably the best measure of code quality.

Prototype or spike the extreme options to expose hidden pitfalls early.

If you think that you are lacking knowledge in some particular topic, or you are unsure how a solution would work, try to spend some time preparing a POC or doing a spike around the topic.

Better to drop some approaches sooner rather than later. It will cost less and be less painful. Just remember not to spend too much time on this: it should be a POC, not a fully working system, so keep it simple.

Focus on simple solutions then optimize for the future.

As a final piece of advice:

Start from the simplest solution that meets current requirements
Optimize and make it more extensible only after completing the previous step.

In this way, you should end up with a well min-maxed system. It should meet all the requirements, be slightly optimized, and have some free design space.

Summary

All of this may seem complex, hard, or even overwhelming. Yes, it is complex; there are a lot of software engineering trade-offs. However, there are multiple guides and best practices on how to navigate the problems of system design. I have even shared my own.

As with many other things, practice makes perfect. There are multiple case studies, books, and articles on how to approach design challenges in different types of systems. I mention one of them here.

I believe that after some practice, all the problems here will sound significantly less scary.

Thank you for your time.

Blog Software Engineering Trade-Offs from Pask Software.

Building Software Engineer Library

Bartek Żyliński — Thu, 17 Apr 2025 21:21:06 +0000

I believe that every one of us, software engineers, should have our own personal library of software engineer books. Whether in old plain-text book form or in a newer, more eco-friendly electronic one is an open question. The important thing is to actually have one.

I am one of those strange people who believe that people in general should read books. Doing so has multiple benefits, but let’s not dive too deep into this and focus on software engineering.

Well, there are a couple of problems with software engineer books:

They get old rather quickly
There are a lot of them
They are expensive
They have varying levels of quality

Given our limited time, the obvious conclusion is that it is hard to find a book worthy of reading, one we will not waste our money on. Here comes this article. It will be the first in a series focused on what books I recommend you include in your professional library.

This particular blog covers books that focus on the softer parts of our job:

How to grow your career
How to approach your work
Various other problems that we encounter on our professional journey

I have also included one more technical book, which I believe will be a good starting point in software architecture.

As to why I do not recommend any algorithmic books, it is simple – I just do not like them. Better go to LeetCode; they have a very good crash course on DSA. I tried it myself, and I strongly recommend it.

Disclaimer

This article is not sponsored in any way, shape or form.
I have read most of the books in this and the following articles. If this is not the case for one of the books, I will explicitly mention it.

Software Engineer Books

Software Craftsman

The Software Craftsman: Professionalism, Pragmatism, Pride by Sandro Mancuso lands as my first recommendation.

It is one of my favorite job-related books. Inside, the author speaks a lot about the importance of professionalism and continuous learning in our work. He also describes how applying both of these concepts can help us grow and be better engineers. Additionally, the author advocates for treating software engineering as a craft, following high coding standards, and taking pride in our work.

All of this is interspersed with retrospectives from his professional experience and situations in his life. Interestingly, the author puts the emphasis on being a pragmatist first and an idealist later. While idealism is important, the book stresses the need for practical solutions that actually work in real-world scenarios.

For me, one of the most important takeaways from this book is: Balancing idealism with pragmatism is essential for success.

Clean Coder

The Clean Coder: A Code of Conduct for Professional Programmers by Robert C. Martin gets the second place on my recommendation list.

This book is very similar to the previous one. What I mean by this is that it focuses on professionalism, praises continuous learning and discipline, and advocates for keeping a high quality of work. Similar to Software Craftsman, the author enriches the text with anecdotes from his own life experiences.

This is interesting in and of itself, at least for me, as some situations are pretty old, and we may see how programming looked when Uncle Bob was starting his career.

Overall, The Clean Coder serves as both a practical guide and a motivational book.

The Software Engineer’s Guidebook

Next, we have The Software Engineer’s Guidebook, by Gergely Orosz. Its main point of focus is career growth and professional development. This time there are no anecdotes and retrospectives, but we still get a lot of useful recommendations.

While this book seems more focused on growing and advancing your career as a Senior Engineer, there are a couple of chapters that can also be interesting for those of you with less experience. Especially chapters 1 and 2, which focus mostly on how to navigate your career and how to be a good Engineer.

Additionally, the third chapter speaks about the qualities of a good Senior Engineer, which may also be helpful while planning your career as a junior or mid-level programmer.

This book provides one of the best career-building frameworks I have ever seen. Despite the fact that I put it here as the 3rd one, I actually recommend you read it first.

System Design Interview

With System Design Interview – An insider’s guide by Alex Xu, we move away from career development books for software engineers to more technical ones.

This book is probably the simplest and most clearly laid-out intro to system design and/or architecture I have seen in a long time. Besides that, it also provides some insight into how some of the systems we use in our everyday life actually work.

In addition to the obvious interview tips, like back-of-the-envelope estimates or a framework for approaching the interview itself, it also gives a lot of useful information on how to scale your system, and introduces quite interesting concepts like consistent hashing or different rate-limiting algorithms. You know, the things that you would rather not meet in day-to-day work.

I strongly recommend reading it even if you are not actively preparing for an interview. I am sure it will be time well spent; besides, it is also quite pleasant to read. There is also a second volume of this book; however, I recommend starting with the first one, as the second volume sometimes mentions concepts introduced in the first.

While the books above are my personal favorites, and I strongly recommend starting with them, the books below may also be interesting and informative to read. Nevertheless, I would give them a somewhat smaller priority.

Mythical Man-Month

The Mythical Man-Month: Essays on Software Engineering by Frederick P. Brooks

This book is a perfect example that some things in this world do not change, no matter how much time passes. For some companies, all the problems described in this book are still valid, even decades after its first publication. There are still people who believe in the old phrase: “nine women can have a baby in one month.”

If you look carefully enough, you may even notice some of these problems in your organization.

This fact alone is more than enough for me to recommend this book. Its main lesson—that adding more people to a project does not automatically make it faster—is still one of the most important factors that one must take into consideration while planning anything serious.

The Staff Engineer’s Path

The Staff Engineer’s Path: A Guide for Individual Contributors Navigating Growth and Change by Tanya Reilly

This book is quite a challenge for me. It is widely recommended as one of the best books focused on career development and the challenges we can face while trying to expand beyond coding. Yet it did not do the trick for me at all; I stopped reading around the halfway mark.

For me, the biggest problem at the time of reading was the feeling that you need to work in a specific environment and have specific opportunities for some recommendations from this book to actually be useful. Nevertheless, I recommend you at least give it a try, especially if you aim to become a Staff Software Engineer. Maybe you will find it more meaningful than I did.

There is one important thing about this book: it is aimed mostly at senior software engineers. While you may find it beneficial nonetheless, I believe it still makes sense to add this note.

Gang of Four

Now we have Design Patterns: Elements of Reusable Object-Oriented Software (Gang of Four) by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.

This one is more of an honorable mention, earned by its impact on software engineering as a whole more than anything else.

While the patterns are still valid and remain virtually unchanged through the 30-odd years since the book’s publication in 1994, most of us will probably use a tiny subset of them. I believe that knowing all 23 patterns described in the book is not the best use of your mental real estate. Besides, there are easier ways to get familiar with design patterns, and they do not involve reading code samples in C++.

Despite all of this, I think that reading it may be somewhat interesting and insightful. At least it was for me a few years ago when I was reading it.

Pragmatic Programmer

Last but not least, we have The Pragmatic Programmer: Your Journey To Mastery, by Andrew Hunt and David Thomas.

While this book aged better than the previous one, it still may feel a little dated, mostly because a lot of the concepts and approaches described inside, considered novelties at the time of publishing, are now widely adopted standards that you probably already know about.

You may find some aspects described in the book insightful and thought-provoking, but I would not recommend prioritizing it highly. There are software engineering books out there that are more worthy of your time.

Software Engineer Books – Summary

Here we are. Below, you can see the table with the order in which I recommend reading the books from above. Of course, do not be afraid to change it as you see fit; it is just written, not set in stone.

Book
The Software Engineer’s Guidebook
Software Craftsman
System Design Interview
Clean Coder
The Staff Engineer’s Path
Mythical Man-Month
Pragmatic Programmer
Gang of Four

Thank you for your time.

Blog Building Software Engineer Library from Pask Software.

Test Pyramid: Best Practices For A Reliable Test Suite

Bartek Żyliński — Thu, 17 Apr 2025 20:25:07 +0000

Testing is essential for maintaining the high quality of our code. In the long term, tests are crucial to ensure that we have maintainable software at all. Today I will dive into the Test Pyramid and present a way you can structure your tests to get the most out of them. If you want to know other best practices for tests, check FIRST.

However, before we dive into the Test Pyramid, let’s take a look at different types of tests that we have.

Tests Taxonomy

Unit Test

Simplest test intended to verify correctness for singular methods or functions in isolation.
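A minimal sketch of what such a test might look like. It is written without a test framework so that it runs on its own; in a real project this would typically be a JUnit test. The slugify function and the check helper are hypothetical and exist only for this example.

```java
import java.util.Locale;

// Hypothetical example: a pure function and a framework-free unit test for it.
final class SlugifyUnitTest {

    // Unit under test: no I/O, no dependencies, the same output for the same input.
    static String slugify(String title) {
        return title.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-") // collapse anything else into a dash
                .replaceAll("^-|-$", "");      // drop a leading or trailing dash
    }

    // Tiny assertion helper standing in for a test framework.
    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but got <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        check("test-pyramid", slugify("Test Pyramid"));
        check("e2e-tests", slugify("  E2E   Tests! "));
        check("", slugify("???"));
        System.out.println("all unit checks passed");
    }
}
```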

Integration Test

Verify the interaction between different modules our applications have, usually one at a time, identifying issues at the interfaces between integrated parts.

E2E Tests

High-level tests that verify the whole flow correctness, from providing input to validating output on the opposite end. They validate if the application works well as a whole.

Smoke Tests

Very simple tests that run on an up-and-running system, usually just after deploying a new version, to ensure that the most critical features are working as expected—a kind of sanity check of our system.

Contract Tests

Validate if two sides of some arbitrary interaction are compatible with one another. They check whether the responses from one side of the interaction match the expectations of the other, and vice versa.
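The core idea can be sketched in plain Java: the consumer declares which fields it relies on, and the check verifies a provider response against that declaration. This is a simplified, hypothetical illustration (the ContractCheck class and the field names are made up); real projects usually rely on a dedicated contract-testing tool rather than hand-rolled checks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified contract check between a consumer and a provider.
final class ContractCheck {

    // Returns a list of violations; an empty list means the response
    // satisfies everything the consumer expects.
    static List<String> violations(Map<String, Class<?>> expectedFields,
                                   Map<String, Object> providerResponse) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Class<?>> field : expectedFields.entrySet()) {
            Object value = providerResponse.get(field.getKey());
            if (value == null) {
                problems.add("missing field: " + field.getKey());
            } else if (!field.getValue().isInstance(value)) {
                problems.add("wrong type for field: " + field.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // What the consumer relies on.
        Map<String, Class<?>> consumerExpects = Map.of("id", String.class, "total", Long.class);
        // What the provider currently returns; extra fields are allowed.
        Map<String, Object> response = Map.of("id", "order-1", "total", 1999L, "currency", "EUR");

        System.out.println(violations(consumerExpects, response));
    }
}
```

Note that extra fields in the response do not break the contract; only a missing or retyped field that the consumer depends on does.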

Performance Tests

This type of test verifies if the performance of our applications meets the requirements, usually done on a setup as similar to production as possible and in the scope of the whole system.

Pen-Test/Security

A very diverse catch-all term for all the checks and tests that verify the security of our system.

Chaos Testing/Engineering

It is more an approach than an actual test. Chaos Engineering is aimed at testing system resilience by extreme measures. It works by introducing unpredictable but intentional and traceable failures into the working environment.
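Real chaos experiments usually target infrastructure such as networks, nodes, or whole zones, but the principle can be shown with an in-process toy. The sketch below wraps a call so that it fails with a configured probability; the FaultInjection class and the withFailures method are hypothetical names, and the seeded random generator is an assumption made to keep the injected failures reproducible.

```java
import java.util.Random;
import java.util.function.Supplier;

// Hypothetical in-process fault injector, a toy version of the chaos idea.
final class FaultInjection {

    // Wraps a call so that it fails with the given probability.
    // The seeded Random keeps the injected failures reproducible.
    static <T> Supplier<T> withFailures(Supplier<T> call, double failureRate, long seed) {
        Random random = new Random(seed);
        return () -> {
            if (random.nextDouble() < failureRate) {
                throw new IllegalStateException("injected failure");
            }
            return call.get();
        };
    }

    public static void main(String[] args) {
        Supplier<String> flaky = withFailures(() -> "ok", 0.3, 42L);
        int failures = 0;
        for (int i = 0; i < 1_000; i++) {
            try {
                flaky.get();
            } catch (IllegalStateException e) {
                failures++;
            }
        }
        System.out.println("injected failures: " + failures + " of 1000");
    }
}
```

The interesting part of a chaos experiment is not the injector itself but observing whether retries, timeouts, and fallbacks in the calling code behave as intended while the failures are happening.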

These are not all the types of tests out there, but the exact list depends on whom you ask and how far into categorizing you are willing to get. I believe that the types mentioned above are the most crucial ones, and we will focus on them in today’s text. I also believe that they are the reasonable ones.

Original Test Pyramid

The Test Pyramid is a concept used to visually describe the test setup to which a system should aspire. It consists of different types of tests, sorted so that the base is the test type with the highest number of tests. Moving higher in the pyramid, each level is represented by a type with a lower number of tests in the overall set.

In my opinion, the best representation of this test pyramid is presented by Robert C. Martin in his book, The Clean Coder: A Code of Conduct for Professional Programmers.

Basically, we should have a high number of unit tests as a base, while having only a small set of integration and E2E tests. Performance and security tests are included under System tests.

This approach has a few good points like:

Fast and cost-effective feedback

Unit tests are fairly easy to set up and, at least by the book, should run quickly, reducing the feedback loop for the developer.

It is CI/CD friendly

Having fewer complex tests like E2E and integration tests promises that it would be simpler to set up CI/CD jobs. Besides, CI runs faster with fewer integration and E2E tests.

Reliability

Unit, component and integration tests are less flaky and less complex than full E2E tests. Thus, we have smaller chances of any non-deterministic errors while introducing new tests and/or changing our test environment.

Additionally, as a whole, the Test Pyramid provides a clear and ready framework on how one should structure tests to get a more reliable system.

Still, while having all these benefits, it is not free of drawbacks, which I will describe in the following paragraph.

Why It Is Not Enough

Well, the first and most important problem in terms of the original test pyramid is the over-reliance on Unit Tests. Such over-reliance introduces a set of problems to our application:

Striving for high unit test coverage in your applications may not necessarily be a good idea. While unit tests are fast and easy to build, it is very easy to dig too deep into unit testing your code. In such a case, any further changes related to the tested component may require a lot of additional work.
Unit tests are not suitable for every project life cycle phase; sometimes even writing proper unit tests may not be possible at all, thus you will have to heavily rely on mocks.
The current shape of the pyramid can give a false sense of security, as you have only a few tests that actually test the “living, breathing” system. While on the unit and integration levels all things may appear right, they may not work correctly as a whole.
In its current shape, we do not have a large space for non-functional tests, like security tests or performance tests. It also does not mention contract or smoke tests.

Last but not least, remember that the test pyramid is a concept, and as with every concept, there is no need to blindly adhere to it if it does not fit your case. Remove one or more layers of the pyramid if they do not make sense for you.

Test Pyramid Per Use Case

If the original test pyramid is not enough, and I still want to have some guidelines for tests, what then? Well, let’s throw the test pyramid away and just make a priority list of tests. Let’s iterate from the most to the least important type of tests that you need to have. Additionally, let’s make it on a case-by-case basis.

Change Heavy

Let’s start from the change-heavy case. It does not have to be a startup; it can be any type of greenfield project or just a new service. Well, here you can go with even zero tests; you probably need velocity and quick customer feedback, not tests. You need the freedom to break stuff and rebuild it quickly, not to rewrite all the tests from the ground up.

Here I would recommend focusing on E2E tests for the paths that are the most crucial for you: the paths that are your main selling points and competitive advantages. While such tests may slow you down when you need more velocity, I believe this setup will benefit you the most and will give you feedback on the operation of your most important parts.

I would recommend some unit tests if you have some algorithm-heavy or complex logic inside your codebase, especially if it is crucial for your operations and impacts customers directly.

What is more, I would suggest doing some performance tests before going live—going viral on day one in this way is probably not a desired result.

If, by some miracle, you still have time to spare, set up some monitoring for the service. Trust me, it will be worth the time and the effort.

Stable

Unlike the change-heavy case, where everything may need to be changed and rewritten from scratch, here we have a system without such events, at least not frequently. Changes are infrequent, or they impact only a small subset of features.

In such a case, I would recommend the following structure: integration tests (required), E2E tests, smoke tests, maybe security and performance tests, plus contract tests if you are exposing an API.

Following such a structure will give you:

Real-life guarantees as to your system’s operations.
Freedom to change underlying implementation without the need to change your tests.
A tool for finding problems in your integrations with 3rd party providers.
A tool to quickly ensure your system is working correctly after deploying the system.
A lot of insight from security and performance tests.

Service Oriented Architecture

This case is kind of a tricky one, as different services may be owned by different teams, and in general, it should be their decision how they want to test their component. However, I believe that there should be a recommendation or best practice to have contract tests for every component that exposes any type of API. Thanks to this, you will have extra guarantees after any type of change in one of your services.
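
To make the idea of a contract test more concrete, here is a minimal sketch in plain Java. It assumes a hypothetical consumer that needs only the `id` and `status` fields from a provider response; the class name, the field names, and the `Map`-based response are all assumptions made for illustration. In a real project you would more likely reach for a dedicated tool such as Pact or Spring Cloud Contract.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical consumer-side contract: field name -> expected Java type.
class OrderContract {
    static final Map<String, Class<?>> REQUIRED_FIELDS = Map.of(
            "id", Long.class,
            "status", String.class);

    // Returns a list of violations; an empty list means the response honors the contract.
    static List<String> violations(Map<String, Object> providerResponse) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Class<?>> field : REQUIRED_FIELDS.entrySet()) {
            Object value = providerResponse.get(field.getKey());
            if (value == null) {
                problems.add("missing field: " + field.getKey());
            } else if (!field.getValue().isInstance(value)) {
                problems.add("wrong type for field: " + field.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Extra fields are fine; the consumer only checks what it actually uses.
        Map<String, Object> response = Map.of("id", 42L, "status", "PAID", "note", "gift");
        System.out.println(violations(response).isEmpty()); // prints "true"
    }
}
```

The provider team can run such a check against its real responses on every build, so a renamed or removed field is caught before the consumer notices it in production.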

If your design is mature enough, you can try introducing chaos engineering and see what results it will yield. System-wide pen-tests can also be a good idea, better done collectively rather than individually, as some problems only show up in the system as a whole.

Besides that, I would recommend having system-wide requirements for observability: maybe some preset dashboards, alerts, and system-wide best practices. I think that it will give the teams some frameworks they can easily adopt for their unique cases.

As for the individual services, I would not recommend anything specific; pick the tests that suit your use case best.

Monolith

This case is a kind of mix of all the previous ones. I recommend choosing your approach based on how frequent the changes are and what is changing. Remember to take into consideration the coupling between different components inside the monolith.

If you frequently change the inside of the monolith, not the interface, then go for E2E tests. On the other hand, if you frequently revise the API, then go for whatever is as close to unit tests as you can get. Do the same if you cannot set up E2E tests in any way, or if the setup is too complex to be worth it.

If there is a high coupling between different components, or the boundaries between them are blurry, maybe try writing something akin to “E2E tests” on a higher component level.

If it is not there yet, try to set up well-defined logs, metrics, and possibly alerts, as close to a per-component basis as possible.

Test Pyramid Common Parts

Besides structures that I mentioned before, there are a couple of different tools that may help you build more reliable systems. Not all of them are mandatory—maybe besides monitoring (this one, in my opinion, is a must-have). Pick the ones that you think will help you.

However, try to think through all of them; I believe that it will be time well spent regardless.

Performance Tests

While not all systems and modules have strict performance requirements, it may be beneficial to have some performance tests.

They can provide additional insights for our product or business:

We know how far we can scale if the need arises at some point.
We can notice that some feature negatively impacts our performance.

I know it may not be the most crucial part for non-critical systems. However, at least we know about the issue and can make a decision on what to do with it instead of just letting it through.

Pen-tests / Security Tests

Again, as with performance tests, not all services and systems require these. Nevertheless, it may be beneficial to at least entertain the idea. You may find some interesting insights along the way. The exact scope and scale greatly depend on a number of factors. If you want to know more about security, I write on this topic in more detail elsewhere.

ArchUnit Tests

I think that for all four cases it may be worth trying to write some tests in ArchUnit fashion, at least once your code structure stabilizes. While it may seem like wasted time, it will for sure help you keep your code in shape for longer.

Observability

Tests are not the only thing that you will need to create robust systems. The whole infrastructure part around your system may be even more crucial than the tests in ensuring flawless operation of your systems.

As an addition to your tests, you should also have good logging, metrics, and possibly alerts. They will give you additional insight into the operations of your systems. They will also polish some rough edges around your tests and may help identify some bottlenecks not caught in the tests.

Chaos Engineering/Testing

Probably the most complex concept to implement correctly. While deliberately introducing any type of disruption or failure into an otherwise perfectly working system may not seem like the brightest idea, it can help identify weaknesses and problems that will not show up in any other case.

However, this type of “test” is very, very complex. Introducing failures, no matter if they are intentional or not, is never fully safe. Before going head-on with this, double-check that your software and infrastructure are actually ready to survive it.

Test Pyramid Trade-off & Considerations

Before we jump to the conclusion, there are a couple of trade-offs and assumptions that I think you should take into consideration while picking the tests that you want to use:

Time limits

One of the considerations when picking which tests to focus on is time restrictions. If you have very strict limitations on how long your tests can run, then focusing on unit tests and some integration tests would be better than going for a full E2E test set, and vice versa.

Integration tests

In my opinion, a database is not a good case for integration tests nowadays. Integration tests should be used only for 3rd-party services that have complex behavior and cannot be easily tested in E2E tests. If you have such dependencies in your system, then that is, in my opinion, the only valid point to write integration tests. The database layer can be tested in the E2E test layer.

Unit tests

I believe that unit tests should only cover the algorithm/logic-heavy pieces of code. There is no point in trying to reach higher coverage tiers with unit tests. In my opinion, it is better to focus on E2E tests. Sometimes, especially for poorly designed architectures, writing actual unit tests is much harder than it looks.

Setup complexity

In some cases, it may not be an option to create E2E or unit tests. In such a case, pick the one that is easier to set up and maintain and gives you more reliability. It may also be reasonable to change your architecture/design to be more testable.

Over-reliance on mocks

While writing any type of test, be careful not to overuse mocking and/or stubbing. You can easily start testing mock and stub behaviors instead of the actual code.

Test implementation

For unit tests, do not go too deep into testing your behavior. Try to test interfaces, not the content of your methods. For E2E tests, try to use as many of the actual components as you can. Do not write your own stubs until you have to; Testcontainers may come in very handy here.
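
As a rough illustration of testing through an interface without leaning on mocks, consider the sketch below. All names (`UserRepository`, `InMemoryUserRepository`, `RegistrationService`) are hypothetical and exist only for this example. The test-side code uses a small in-memory fake and checks only the observable result, not which internal calls were made.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical port that the service depends on.
interface UserRepository {
    void save(String id, String email);
    Optional<String> findEmail(String id);
}

// A small in-memory fake: real behavior, no per-call stubbing.
class InMemoryUserRepository implements UserRepository {
    private final Map<String, String> emails = new HashMap<>();

    @Override
    public void save(String id, String email) {
        emails.put(id, email);
    }

    @Override
    public Optional<String> findEmail(String id) {
        return Optional.ofNullable(emails.get(id));
    }
}

// Code under test: normalizes the email before storing it.
class RegistrationService {
    private final UserRepository repository;

    RegistrationService(UserRepository repository) {
        this.repository = repository;
    }

    void register(String id, String email) {
        repository.save(id, email.trim().toLowerCase(Locale.ROOT));
    }
}

class RegistrationDemo {
    public static void main(String[] args) {
        UserRepository repository = new InMemoryUserRepository();
        new RegistrationService(repository).register("u1", "  Alice@Example.COM ");
        // Assert on the outcome visible through the interface, not on internal calls.
        System.out.println(repository.findEmail("u1").orElse("none")); // prints "alice@example.com"
    }
}
```

Because the check goes through `findEmail`, the internals of `register` can be rewritten freely as long as the stored result stays the same.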

Summary

Let’s start with a table to show concepts from previous paragraphs in a clear and concise manner.

Per Type Of Environment You Want To Run Your Tests

Type	Base	Optional
Change Heavy	- E2E for crucial parts of API - Good observability pipeline (from logs to alerts) - Smoke tests for crucial paths	- Performance tests for crucial parts - Security tests - Unit tests for logic/algorithm-heavy parts - Integration tests for 3rd-party services
Stable	- E2E - Good observability pipeline (from logs to alerts) - Integration tests for 3rd-party services - Unit tests for logic/algorithm-heavy parts	- Performance tests for crucial parts - Security tests - Consider if you need Smoke Tests and their scope
Service Oriented Architecture	- Contract tests for services exposing APIs used by other services - Choose exact test setup per service - Design base observability approaches for each team to adopt and extend	- System-wide Performance tests - System-wide Security tests - Consider Chaos Engineering
Monolith	- Pick the tests that are easier to set up and maintain - Good observability pipeline (from logs to alerts) - Smoke tests	- System-wide Performance tests - System-wide Security tests

Per Test Type

Test Type / Environment	Change Heavy	Stable	Service Based	Monolith
Unit	No	Logic heavy methods	Per service basis	Depends on the setup cost
Integration	Consider for 3rd party service	3rd party service	Per service basis	3rd party service
E2E	For critical paths	Mandatory	Per service basis	Depends on the setup cost
Contract	No	When and where applicable	Recommended for all services	No
Performance	For consideration	Yes	Per service basis	System wide
Smoke	Consider for critical path	Consider for critical path	Per service basis	Consider for critical path
Security	For consideration	Yes	System wide	System wide
Observability	Yes	Yes	Predefined rules	Yes

This is not a silver bullet for every case; there is no such thing. Everything here is based on different trade-offs, some of which are mentioned in the paragraphs above.

My final recommendation is: Just write the best tests that you can, given your design and possibilities.

Thank you for your time.

Blog Test Pyramid: Best Practices For A Reliable Test Suite from Pask Software.

Monolith: The Good, The Bad and The Ugly

Bartek Żyliński — Sun, 13 Apr 2025 18:10:31 +0000

After an initial very warm welcome and a wave of hype, microservices are no longer considered a silver bullet for all software pitfalls. The plain old monolith approach has started to get mainstream attention once again, especially the modular monolith, which seems to mix both approaches in the best way. That is why today I would like to bring it, and other monolith subtypes, to your attention in more detail.

Why It Matters?

Well, there are a couple of possible traps that you may fall prey to while working with a monolith. These traps and their possible consequences are strongly related to the monolith subtypes I will cover below. These subtypes have their respective names, but I prefer to name them the good, the bad, and the ugly, according to the number of headaches they may cause for their maintainers and owners.

Under their proper names, they are as follows:

The good – the modular monolith
The bad – the distributed monolith
The ugly – the traditional monolith

The Ugly

The plain old monolith, or the ugly. It has its own problems and quirks: it is hard to maintain and deploy, and it has some performance bottlenecks, but at the end of the day, it is doing its job. That’s why I call it ugly. Not the prettiest, but the job is done.

It is the most common case among monolith implementations, as it is a kind of natural way of software progress when there is not much time to think about long-term problems and consequences, and “now” is more important than “later”.

Some traits of this subtype:

Single codebase and deployment unit.
In-system communications, no outgoing requests.
Tightly coupled but simpler to understand, up to a certain point.

The Bad

The distributed monolith or the bad. It is probably the most common anti-pattern when dealing with migration of a monolith to microservices. In particular, it is a result of failed transitions to microservices.

Essentially, it is a monolith, but split across multiple services. There may be more pitfalls hidden behind this, but this one is the most important. While in some cases a distributed monolith may still do its designed job, it is more likely that it will fail.

Some traits of this subtype:

Split into multiple services.
Communication over the network, usually a synchronous one.
Still tightly coupled, like a monolith, with badly designed responsibilities inside each service.

The Good

The modular monolith, or the good. It is a monolith with clearly designed boundaries between the modules inside, hence the name modular monolith. Different modules can basically be developed independently, using only in-process interfaces to call one another.

It is the best subtype of monolith that exists; usually, it is also ready to be split into microservices when the need arises. It turns some of a monolith’s disadvantages into advantages, yet keeps the whole application as a single deployable unit.

Compared to microservices, such an approach greatly reduces the cognitive and economic burden of managing and running the system. This architecture is easier to develop, test, and deploy while providing a clear path for gradual evolution into microservices if needed.

Some traits of this subtype:

Single codebase and deployment unit.
High cohesion inside the modules and loose coupling between the modules, at least as far as possible inside a de facto single service.
In-system communications, no outgoing requests.
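
A module boundary of this kind can be sketched in a few lines of Java. The example below is only an illustration under my own assumptions: the `InventoryApi` interface and the `InMemoryInventory` and `OrderModule` classes are hypothetical names, not taken from any particular codebase. The point is that the orders module knows only the interface, while the call itself stays in-process.

```java
import java.util.HashMap;
import java.util.Map;

// Public API of the (hypothetical) inventory module: the only thing other modules may use.
interface InventoryApi {
    boolean reserve(String sku, int quantity);
}

// Internal implementation; other modules never reference this class directly.
class InMemoryInventory implements InventoryApi {
    private final Map<String, Integer> stock = new HashMap<>();

    InMemoryInventory(Map<String, Integer> initialStock) {
        stock.putAll(initialStock);
    }

    @Override
    public boolean reserve(String sku, int quantity) {
        int available = stock.getOrDefault(sku, 0);
        if (quantity <= 0 || quantity > available) {
            return false;
        }
        stock.put(sku, available - quantity);
        return true;
    }
}

// The orders module depends on the interface only, so the call is in-process today
// and could be swapped for a remote client if the module is ever split out.
class OrderModule {
    private final InventoryApi inventory;

    OrderModule(InventoryApi inventory) {
        this.inventory = inventory;
    }

    String placeOrder(String sku, int quantity) {
        return inventory.reserve(sku, quantity) ? "ACCEPTED" : "REJECTED";
    }

    public static void main(String[] args) {
        OrderModule orders = new OrderModule(new InMemoryInventory(Map.of("A", 7)));
        System.out.println(orders.placeOrder("A", 5)); // prints "ACCEPTED"
        System.out.println(orders.placeOrder("A", 3)); // prints "REJECTED"
    }
}
```

In a real project, each module would usually live in its own package, with only the API interface visible to the others.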

Monolith Migration Tips & Tricks

If you ever decide to migrate from a monolith to microservices, here are a few tips to help you avoid ending up as "the bad".

Foremost, Analyze: only after carefully analyzing the different flows and dependencies will you be able to make correct decisions later.
Migrate By Domain, Not Layer: it will make the whole process less overwhelming while also reducing the chances of creating high coupling between components.
Do It Step By Step: do not try to migrate everything at once. Do it domain by domain, or by any other approach that comes to your mind; maybe a migration to "The Good" is a good idea for a start.
Contract Test Right Away: think about contract tests, and all the necessary setup, from the start.
Talk, Talk and Talk Even More: be over-communicative about what you are doing, and be 100% sure that all the involved teams know what you are doing and how it affects their work.
Build Clear Ownership: each service/module needs to have a clearly assigned owner; there should be no situation in which multiple teams are responsible for a single service.
Know When To Stop: chances are that you are not Netflix and probably do not need an architecture of 100+ microservices. Think carefully about what needs to be a separate microservice and what may be merged with some other service.

Summary

Putting the whole text into a more readable and understandable format:

Monolith Subtype	Deployment	Communication	Coupling	Complexity
Traditional Monolith (ugly)	Single deployment unit.	In-process communication.	Tight coupling between different components.	Grows with the growth of the code base.
Distributed Monolith (bad)	Multiple services – changes in one often require redeploying others.	Synchronous communication over the network.	Tight coupling between services.	High complexity due to distributed nature of deployment.
Modular Monolith (good)	Single deployment unit, but internally organized into well-defined modules.	In-process communication.	Loosely coupled modules with clear boundaries.	Moderately complex due to strong modularity and clear responsibilities.

Now you know more about different approaches to building and working with monoliths.

Thank you for your time.

Blog Monolith: The Good, The Bad and The Ugly from Pask Software.

ACID vs BASE: Choosing the Right Transactional Model

Bartek Żyliński — Sun, 13 Apr 2025 17:35:50 +0000

ACID and BASE are the two main approaches to handling transactions. All other approaches are just variations of the two; we can even say that, to a certain degree, BASE is a variation of ACID. Furthermore, some databases may choose to support ACID transactions for part of their operations, while not providing the same guarantees for others, as MongoDB does.

In today’s text, I will cover the description of both abbreviations and their use cases, closing with an in-depth summary of the differences between them. For now, let’s say that the biggest difference between the two is that ACID prioritizes consistency over availability, while BASE prioritizes availability over consistency.

Why Databases Are Important?

Databases are all around us, just like distributed systems. Whether a particular database is more of a relational or non-relational kind does not matter, as we barely see them. We only see our graph of friends on Facebook, our history of messages in Messenger, or “just” our money in a bank account. We do not see all the processing and storage behind them. Essentially, we are focused on some pretty things that are not always stored and processed in such a pretty fashion.

Transactions, and more specifically the transaction models that we will cover today, are the cornerstone of how our data are processed and, to a degree, stored. But what even is a transaction? The definition here is straightforward. From the database perspective, a transaction is any set of operations that can be performed as a single logical unit of work. Besides the classic case of a bank transfer between accounts, processing an order or submitting a review are also valid transaction examples.

Transaction models are one of the most important things for database engines. They are a kind of very low level (or high level) API of our database. Following a particular set of principles will result in our database exposing a certain set of features and behaviors. In the end, a poorly matched approach can limit the database's use cases and make our system more complex than it should be. Knowing the trade-offs of transaction models is an advantage, as we can pick the best tool for the job.

Moreover, while ACID is quite common, and most software engineers have at least heard of it, BASE is quite the opposite – and knowledge is power.

What Is ACID?

ACID is an abbreviation for Atomicity, Consistency, Isolation, Durability.

Atomicity

A transaction must either fully succeed or fully fail, with no intermediate results. If a transaction changes state in multiple tables, then in case of failure none of the changes should be persisted.

Following the example from above, assume that we want to process an order for a customer. To do this, we need to perform two operations: update the stock quantity of the ordered items and create a new order record. If we fail on the second task of the transaction, then the results of the first should be rolled back.

For example, if after updating the item quantity we fail to create a new order record in our database, then the first part of the transaction should be rolled back – item quantity should remain unchanged.
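
The all-or-nothing behavior can be sketched with a tiny in-memory stand-in for a database. This is only an illustration: the `ShopDatabase` class and its two "tables" are hypothetical, and the snapshot-and-restore rollback is a simplification I chose for brevity. Real databases rely on mechanisms such as transaction logs instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a database with two "tables": stock and orders.
class ShopDatabase {
    final Map<String, Integer> stock = new HashMap<>();
    final List<String> orders = new ArrayList<>();

    // Runs both steps as one unit: either both are applied or neither is.
    boolean placeOrder(String item, int quantity, String orderId) {
        // Snapshot the state so it can be restored on failure.
        Map<String, Integer> stockBefore = new HashMap<>(stock);
        List<String> ordersBefore = new ArrayList<>(orders);
        try {
            int available = stock.getOrDefault(item, 0);
            if (quantity > available) {
                throw new IllegalStateException("not enough stock");
            }
            stock.put(item, available - quantity); // step 1: update item quantity

            if (orderId == null || orders.contains(orderId)) {
                throw new IllegalStateException("cannot create order record");
            }
            orders.add(orderId);                   // step 2: create order record
            return true;                           // "commit"
        } catch (IllegalStateException failure) {
            // "rollback": restore the snapshot taken before the transaction
            stock.clear();
            stock.putAll(stockBefore);
            orders.clear();
            orders.addAll(ordersBefore);
            return false;
        }
    }

    public static void main(String[] args) {
        ShopDatabase db = new ShopDatabase();
        db.stock.put("A", 7);
        System.out.println(db.placeOrder("A", 5, "order-1")); // prints "true"
        // The duplicate order id fails in step 2, so step 1 is undone as well.
        System.out.println(db.placeOrder("A", 1, "order-1")); // prints "false"
        System.out.println(db.stock.get("A"));                // prints "2"
    }
}
```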

Consistency

Each transaction should preserve all integrity constraints, entity relationships, and business rules present in the database. All these mechanisms should work regardless of the number of concurrent transactions or possible failures that may occur in the meantime.

It means that any transaction should move the database from one valid version of state to the other.

For example, before the order creation, the quantity of items was A=7 and B=10, and the customer ordered 5 pieces of A and 3 pieces of B. Then, after the transaction, we should have 2 pieces of A and 7 pieces of B in store. The total number of items, those left in store plus those ordered, should be the same as the stock before the transaction.
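
The numbers from this example can be checked with a short sketch. The `StockInvariant` class and the "stock may never go negative" rule are assumptions added for illustration; a real database would enforce such rules through constraints rather than application code.

```java
import java.util.HashMap;
import java.util.Map;

class StockInvariant {
    // Applies an order to a copy of the stock and rejects it if any rule would be violated.
    static Map<String, Integer> applyOrder(Map<String, Integer> stock, Map<String, Integer> order) {
        Map<String, Integer> result = new HashMap<>(stock);
        for (Map.Entry<String, Integer> line : order.entrySet()) {
            int remaining = result.getOrDefault(line.getKey(), 0) - line.getValue();
            if (remaining < 0) {
                // Assumed business rule: stock may never go negative, so the order is refused.
                throw new IllegalArgumentException("insufficient stock for " + line.getKey());
            }
            result.put(line.getKey(), remaining);
        }
        return result;
    }

    static int total(Map<String, Integer> quantities) {
        return quantities.values().stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = Map.of("A", 7, "B", 10);
        Map<String, Integer> order = Map.of("A", 5, "B", 3);
        Map<String, Integer> after = applyOrder(stock, order);
        System.out.println(after.get("A") + " " + after.get("B"));       // prints "2 7"
        // Items left in store plus items ordered equal the stock before the transaction.
        System.out.println(total(after) + total(order) == total(stock)); // prints "true"
    }
}
```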

Isolation

In a concurrent environment, transactions should not interfere with one another. It means that changes from one transaction will not be visible to another transaction before it commits. Such an approach gives us the impression that the transactions are executed one by one, while underneath they are executed simultaneously.

Additionally, there is one more thing that needs to be mentioned here: the isolation level. It is a setting that describes how much a transaction is affected by other transactions. In most databases, we can tune the isolation level up or down, often per session or per transaction. Typically, the higher we go with isolation, the more we degrade database performance, and vice versa.

The SQL standard defines four isolation levels:

READ UNCOMMITTED – Transactions can view data changed but not yet committed by other transactions. In such a case, when a rollback occurs, we might use data that does not exist in our database. Usually, you should not use this Isolation Level. It only makes sense when you are querying a dataset that will not change at all in any way.
READ COMMITTED – Transactions can only read data that are committed at the moment they are read. It offers a good balance between consistency and performance in most of the use cases. Thus, it is the most common default Isolation Level used in multiple databases.
REPEATABLE READ – This Isolation Level aims to address the issue of different read values of the same queries within a single transaction. It is an ideal Isolation Level for read-only transactions, as it guarantees that if the row is read twice in the same transaction, it will return the same value each time.
SERIALIZABLE – It is the highest Isolation Level; here all the transactions are completely isolated from one another, supporting the most demanding consistency guarantees. However, it effectively makes transactions behave as if they were executed serially, and it may produce more transaction retry errors coming from interference between them. In most cases, the combination of these two factors results in a significant performance decrease.

Each of the levels after READ UNCOMMITTED aims to prevent specific read phenomena, with the higher levels also covering the issues handled by their predecessors.

Isolation Level	Read Phenomena Prevented
READ COMMITTED	Dirty Reads
REPEATABLE READ	Dirty Reads, Non-repeatable Reads
SERIALIZABLE	Dirty Reads, Non-repeatable Reads, Phantom Reads

The detailed description of these problems is kind of out of the scope of this text, but you can read more here.
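
The relation between isolation levels and read phenomena can also be written down as a small lookup. The sketch below is my own illustration: the `IsolationLevel` and `ReadPhenomenon` enums are hypothetical helper types, while the `Connection.TRANSACTION_*` constants come from the standard JDBC API. The mapping reflects the minimum guarantees of the SQL standard; particular databases may be stricter.

```java
import java.sql.Connection;
import java.util.EnumSet;
import java.util.Set;

enum ReadPhenomenon { DIRTY_READ, NON_REPEATABLE_READ, PHANTOM_READ }

enum IsolationLevel {
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED,
            EnumSet.noneOf(ReadPhenomenon.class)),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED,
            EnumSet.of(ReadPhenomenon.DIRTY_READ)),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ,
            EnumSet.of(ReadPhenomenon.DIRTY_READ, ReadPhenomenon.NON_REPEATABLE_READ)),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE,
            EnumSet.allOf(ReadPhenomenon.class));

    final int jdbcConstant;              // value accepted by Connection.setTransactionIsolation
    final Set<ReadPhenomenon> prevented; // phenomena this level rules out

    IsolationLevel(int jdbcConstant, Set<ReadPhenomenon> prevented) {
        this.jdbcConstant = jdbcConstant;
        this.prevented = prevented;
    }

    // Picks the weakest level that prevents all the given phenomena.
    static IsolationLevel weakestPreventing(Set<ReadPhenomenon> required) {
        for (IsolationLevel level : values()) { // values() is ordered from weakest to strongest
            if (level.prevented.containsAll(required)) {
                return level;
            }
        }
        throw new IllegalStateException("unreachable: SERIALIZABLE prevents everything");
    }
}
```

With such a helper, `IsolationLevel.weakestPreventing(EnumSet.of(ReadPhenomenon.NON_REPEATABLE_READ))` yields `REPEATABLE_READ`, and its `jdbcConstant` could then be passed to `Connection.setTransactionIsolation` on a real connection.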

Durability

After the commit of the transaction, its results are guaranteed to be permanent. Even in the case of system failure, crash or power loss, there should be no after-commit data loss. Usually, it is done by saving the results on some form of persistent storage; in most cases, it is a plain old hard drive.

What Is BASE?

BASE is an abbreviation for Basically Available, Soft State, and Eventually Consistent. The acronym was chosen to highlight the contrast with ACID: in chemistry, a base is the opposite of an acid.

Basically Available

In a concurrent environment, the database guarantees availability. It will try to respond to the request even with somewhat stale or incomplete data. What this means is that in the case of a sudden spike of requests, it may choose to focus on handling read requests while slowing down updates and inserts.

Soft State

The state of the system may change even without an immediate reason, due to the approach built over the eventual consistency model. When multiple applications update the database, a particular record (or records) may be in an intermediate state. The final consistent state will be calculated when all transactions for a particular record are complete. All intermediate states are also visible to all interleaving reads or writes.

Eventually Consistent

The database will eventually reach a consistent state once no more updates arrive. At any given point, we may see some inconsistencies, but they should be resolved over time. Consistency is not guaranteed at a transaction level.

For example, in the case of a geo-distributed database, when some record is updated, the new state may not initially be visible to users in different regions due to network latency. However, after some time, the new state will be propagated to all nodes and thus the change becomes visible.
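
The geo-distributed scenario can be simulated in a few lines. This is a simplified sketch under my own assumptions: the `Replica` and `VersionedValue` classes are hypothetical, replication is modeled as a plain list of pending updates, and conflicts are resolved with a naive "highest version wins" rule. Real databases use far more elaborate mechanisms.

```java
import java.util.ArrayList;
import java.util.List;

// One versioned value; a higher version means a newer write.
class VersionedValue {
    final String value;
    final long version;

    VersionedValue(String value, long version) {
        this.value = value;
        this.version = version;
    }
}

// A replica in one region; it answers reads from local, possibly stale, state.
class Replica {
    private VersionedValue current = new VersionedValue("initial", 0);

    String read() {
        return current.value;
    }

    // Accepts an update only if it is newer than what the replica already has.
    void apply(VersionedValue update) {
        if (update.version > current.version) {
            current = update;
        }
    }
}

class EventualConsistencyDemo {
    public static void main(String[] args) {
        Replica europe = new Replica();
        Replica asia = new Replica();
        List<VersionedValue> inFlight = new ArrayList<>();

        VersionedValue write = new VersionedValue("updated", 1);
        europe.apply(write); // the write lands in one region first
        inFlight.add(write); // replication to the other region is delayed

        System.out.println(europe.read() + " / " + asia.read()); // prints "updated / initial"

        for (VersionedValue pending : inFlight) {
            asia.apply(pending); // the delayed update finally arrives
        }
        System.out.println(europe.read() + " / " + asia.read()); // prints "updated / updated"
    }
}
```

Between the two printouts, a user in the second region reads stale data; that window is exactly the inconsistency that BASE accepts in exchange for availability.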

ACID vs BASE

Data Integrity

ACID properties provide a more strict guarantee as to the consistency of our state. Random occurrences of some data inconsistency are very rare. Also, the potential chances for data loss are relatively low. Though they still exist, it is just quite hard to cause them – at least without some bad luck.

BASE, on the other hand, provides little to no guarantees as to the state of our database. We only know that at some point in the future it will be consistent, but when that will be, nobody knows. All intermediate state changes are still visible and may impact other queries. BASE also does not guarantee data durability by principle. Without such a guarantee, partial or total data loss is more likely to happen.

Integration Complexity

ACID, because of its strong consistency guarantees, takes a lot of the burden off software engineers’ shoulders. With ACID, we usually do not have to worry too much about inconsistency between transactions, “seemingly random” data changes, or losing part of our data on the fly.
With BASE, we have to cover most of these cases ourselves. We have to know that some inconsistencies may, and thus will, occur, so we must implement their resolution accordingly.

Performance

ACID-compliant databases usually use locks to synchronize access to particular resources in a concurrent environment. As the transactions on conflicting records are processed in a strict order, you should be ready for some delays in transaction resolution.
BASE-compliant systems are somewhat simpler, as they usually avoid locks. The database synchronizes the operations eventually, without hard time guarantees. Nevertheless, the time of transaction resolution may also become quite long.

Scalability

By principle, BASE-compliant databases scale noticeably better than ACID ones, mostly due to a more relaxed consistency model. In general, most BASE databases are designed to support horizontal scaling, while ACID ones are limited to vertical scaling with the possibility of read replicas.

BASE vs ACID In A Single Database

Fully combining both models in a single database is virtually impossible due to their vastly different consistency models. Additionally, in the case of a network partition, we can choose to be either available or consistent; we cannot be both at the same time.

Despite this fact, there are a couple of NoSQL databases that try to mix both paradigms, exposing ACID properties in certain parts or for certain features. The list is quite long, so I will not include all of it, but it contains databases like MongoDB, Cassandra (ACID support for single row operations), and Couchbase.

Summary

As always, the summary is quick and simple — just look below.

Feature	ACID	BASE
Data Integrity	High consistency and durability	Eventual consistency, no durability guarantee
Integration Complexity	Lower, most of the inconsistencies are handled on the DB side.	Higher, we have to be aware of inconsistencies and their impact.
Performance	May degrade rapidly in concurrent environment	Should handle more load without performance degradation
Scalability	Vertical scaling with read replicas	Horizontal scaling

Please keep in mind that this comparison is only focused on the theoretical difference between both approaches. It aims to deepen your knowledge of both of them.

When choosing an exact database, I would recommend focusing more on a product-to-product comparison. Most of the features and behaviors may differ depending on the exact database implementation.

Thank you for your time.

Blog ACID vs BASE: Choosing the Right Transactional Model from Pask Software.