When I started this project, I wanted to learn how the backend systems behind payment processing apps work by building one myself.
I wanted to build something that made me think carefully about throughput, ordering, idempotency, auditability, and failure boundaries together.
That led me to HVTP (High Volume Transaction Processor) — a portfolio-grade, event-driven transaction processor that behaves like a small transaction backend rather than a single demo service.
What made this project valuable for me wasn’t just wiring Kafka into a system.
It was learning how to shape the system so the right work happened in the right place.
What the project actually is
At a practical level, HVTP is a signed transaction ingestion pipeline.
A merchant client sends a transaction request over HTTP. The system validates the request at ingress, accepts it quickly, and then hands it off for asynchronous processing.
From there, the system:
- validates and processes the transaction
- enforces idempotency
- persists ledger state
- stores immutable audit events
- exposes a status API
- supports reconciliation between stores
- emits terminal outcomes through downstream flows
The stack looks like this:
- Kafka for event flow
- Valkey (open-source Redis fork) for idempotency and some read-path control
- PostgreSQL for the ledger / queryable durable state
- MongoDB for immutable audit events
- Spring Boot services split by responsibility
- k6 for ingress load testing
This project is not about reproducing a regulated payments platform.
It is about building a system shape where correctness, isolation of responsibility, and observable behavior matter.
Why I kept the request path small
One option was to do everything in the request path:
- Receive HTTP request
- Validate everything in the same service
- Write directly to PostgreSQL
- Also write to MongoDB
- Return success
That would have been simpler to build at first.
But for this project, I wanted to separate request acceptance from downstream processing. I wanted the ingress layer to stay focused on validating, accepting, and handing work off quickly, instead of taking on ledger writes, audit writes, and every other downstream concern synchronously.
That decision shaped the rest of the architecture.
The architecture I ended up with
I split the write path into a small event-driven pipeline:

`api-service` → `transaction_requests` → `processor-service` → `transaction_log` → `ledger-writer-service` / `audit-service`
That split gave each service one main responsibility:
- `api-service` → signed ingress + fast acceptance
- `processor-service` → validation + idempotency
- `ledger-writer-service` → durable ledger persistence
- `audit-service` → immutable audit history
What I liked about this structure was that each boundary had a clear reason to exist.
Main architecture and design decisions
1) Keep the API fast
The api-service just accepts the request and returns 202 Accepted.
I did this to keep the HTTP layer an intake boundary, rather than a place where the full transaction gets processed.
In HVTP, the ingress path is intentionally limited to:
- validate request shape
- verify signature
- publish to Kafka
- return acceptance
That means the API is not waiting on:
- Idempotency checks
- PostgreSQL ledger persistence
- MongoDB audit writes
- downstream webhook behavior
This was one of the most important decisions in the project because it kept the front door responsive even when downstream work had different timing characteristics.
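Condensed into plain Java, the ingress decision flow looks roughly like this. This is a sketch, not the project's code: the real `api-service` is a Spring Boot controller, and the helper names (`verifySignature`, `publishToKafka`) are illustrative stand-ins.

```java
// Sketch of the ingress flow: validate shape, verify signature,
// hand off to Kafka, return acceptance. Helper names are illustrative.
public class IngressSketch {

    static int accept(String payload, String signature) {
        if (payload == null || payload.isBlank()) {
            return 400;                      // request shape invalid
        }
        if (!verifySignature(payload, signature)) {
            return 401;                      // signature check failed
        }
        publishToKafka("transaction_requests", payload); // async handoff
        return 202;                          // accepted; no downstream waits
    }

    static boolean verifySignature(String payload, String signature) {
        // Placeholder: the real service verifies a cryptographic signature.
        return signature != null && !signature.isBlank();
    }

    static void publishToKafka(String topic, String payload) {
        // Placeholder: the real service publishes via a Kafka producer.
    }

    public static void main(String[] args) {
        System.out.println(accept("{\"amount\":100}", "sig")); // prints 202
        System.out.println(accept("", "sig"));                 // prints 400
    }
}
```

Note what is absent: no ledger write, no audit write, no idempotency lookup. Everything after the `publishToKafka` call happens on the other side of the queue.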
2) Use Kafka for decoupling
I used Kafka because I wanted request acceptance, transaction processing, ledger persistence, and audit persistence to move at different speeds without being tightly bound to one another.
HVTP currently uses:
- `transaction_requests`
- `transaction_log`
- dead-letter topics for failure paths
That gave me a few concrete benefits:
- the API can accept requests without waiting for downstream writes
- the ledger writer and audit service can scale independently
- replay becomes possible
- failure handling becomes clearer
I also used accountId as the key for the main topics.
That was deliberate.
For this project, the ordering boundary I cared about was not global ordering across every transaction.
It was preserving ordering for transactions belonging to the same account.
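The mechanism behind that choice: Kafka's default partitioner hashes the record key, so every record with the same key lands on the same partition, and a single partition is consumed in order. The stand-in below uses `String.hashCode` rather than Kafka's murmur2 hash, but it relies on the same property.

```java
// Why keying by accountId preserves per-account ordering: same key ->
// same partition -> consumed in order. Simplified stand-in for Kafka's
// default partitioner (which uses murmur2, not String.hashCode).
public class AccountKeying {

    static int partitionFor(String accountId, int numPartitions) {
        // Mask off the sign bit so the result is a valid partition index.
        return (accountId.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // Two transactions for the same account always map to the same
        // partition, so they cannot be reordered relative to each other.
        System.out.println(partitionFor("acct-42", 12) == partitionFor("acct-42", 12));
    }
}
```

Different accounts spread across partitions, so throughput still scales; only transactions within one account are serialized.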
3) Treat idempotency as a correctness concern
To make the system idempotent in the face of retries, duplicate submissions, and consumer reprocessing, I used an idempotency key.
Each request sent from the client includes an Idempotency-Key.
Without it, processing the same request twice could result in:
- Duplicate ledger updates
- Duplicate audit events
- Inconsistent downstream outcomes
I used Valkey to store and check this idempotency key in the processor service.
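The core of that check is a set-if-absent: claim the key atomically, and only process the request if the claim succeeded. In HVTP this runs against Valkey (with a TTL on the key); the sketch below substitutes a `ConcurrentHashMap` so the logic is runnable on its own, without the TTL behavior.

```java
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of the idempotency gate. A ConcurrentHashMap stands in
// for Valkey here; the real check is a TTL-backed set-if-absent.
public class IdempotencyGate {
    private final ConcurrentHashMap<String, Boolean> seen = new ConcurrentHashMap<>();

    /** Returns true only for the first submission of a given key. */
    boolean tryClaim(String idempotencyKey) {
        // putIfAbsent is atomic: exactly one caller sees null (the claim).
        return seen.putIfAbsent(idempotencyKey, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        IdempotencyGate gate = new IdempotencyGate();
        System.out.println(gate.tryClaim("key-123")); // true: process it
        System.out.println(gate.tryClaim("key-123")); // false: duplicate, skip
    }
}
```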
One of the most useful mindset shifts from this project was moving from:
“How do I process this request?”
to:
“What must remain true even if this request appears more than once?”
That question improved the architecture more than any individual framework decision.
4) Let PostgreSQL and MongoDB do different jobs
I used two stores intentionally because the write patterns and query needs are different.
PostgreSQL is the ledger
PostgreSQL stores the durable transaction state that the system can query through the status path.
It holds the queryable record of a transaction in a ledger-style structure, including fields like:
`transaction_id`, `idempotency_key`, `merchant_id`, `account_id`, `amount`, `currency`, `type`, `status`, `processed_at`
That is the durable store for the transaction state I want to query directly.
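For illustration, that row maps naturally onto a value type like the one below. The field names follow the list above; the Java types are my assumptions, since the actual schema is a PostgreSQL table, not this class.

```java
import java.math.BigDecimal;
import java.time.Instant;

// The ledger row sketched as a Java record. Types are assumptions;
// the real schema lives in PostgreSQL.
public record LedgerEntry(
        String transactionId,
        String idempotencyKey,
        String merchantId,
        String accountId,
        BigDecimal amount,
        String currency,
        String type,
        String status,
        Instant processedAt
) {}
```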
MongoDB is the audit trail
MongoDB stores immutable audit events, including values such as:
- transaction IDs
- merchant/account IDs
- correlation IDs
- statuses
- source topic
- timestamps
These stores answer different questions.
The ledger answers:
“What is the durable transaction state?”
The audit store answers:
“What happened around this transaction over time?”
Separating those concerns made the model cleaner and easier to reason about.
5) Design for replay and reconciliation
The ledger writer and audit service consume from the same event stream, but they write to different storage systems.
That means there is always some possibility of drift, timing gaps, or mismatched writes across stores.
So I added reconciliation support.
The project includes a reconciliation model that compares recent ledger and audit state and records summary runs like:
- audit count
- ledger count
- missing in ledger
- run status
- notes
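The comparison at the heart of a run like that is a set difference: which transaction IDs appear in the audit store for the window but are absent from the ledger. A sketch under that assumption (the real run also records a status and notes):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the reconciliation comparison: IDs present in the audit
// store but missing from the ledger over some window.
public class ReconciliationSketch {

    static Set<String> missingInLedger(Set<String> auditIds, Set<String> ledgerIds) {
        Set<String> missing = new HashSet<>(auditIds);
        missing.removeAll(ledgerIds);   // present in audit, absent in ledger
        return missing;
    }

    public static void main(String[] args) {
        Set<String> audit = Set.of("t1", "t2", "t3");
        Set<String> ledger = Set.of("t1", "t3");
        System.out.println(missingInLedger(audit, ledger)); // prints [t2]
    }
}
```

An empty result means the two stores agree for the window; a non-empty one names exactly which transactions need replay or investigation.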
I also wanted replay support to exist in the architecture before it became necessary.
That decision made the system feel more operationally realistic.
It shifted the design from “write to multiple places” toward “write, verify, and recover.”
6) Measure ingress behavior under overload
I also ran k6 load tests against the signed transaction ingestion endpoint at multiple offered rates, including 50K RPS and 100K RPS.
The purpose was not to describe the whole system as completing transactions at those rates end to end.
The goal was more specific:
“How does the ingress layer behave when offered far more traffic than the machine can sustain?”
That framing was important to me because it matched what I was actually measuring.
What the numbers showed
In local testing on a single machine, the API maintained:
- 0% HTTP failure rate
- 100% `202 Accepted` for completed HTTP requests
- accepted ingress throughput that leveled off around 3.1K–3.2K req/sec
A few highlights:
- At 1K offered RPS, it handled 60,001 accepted requests in 60s
- At 50K offered RPS, accepted throughput peaked at about 3,172.5 req/sec
- At 100K offered RPS, it still completed 189,936 accepted requests in 60s
- P95/P99 latency increased under overload, but the HTTP layer remained responsive
What I liked about that result was not the raw offered rate, but the saturation behavior.
The ingress layer stayed usable, throughput leveled off in a predictable way, and latency rose before failure.
That is a useful property in an asynchronous system.
The important caveat
These are HTTP ingress acceptance results, not end-to-end transaction completion metrics.
So the correct interpretation is:
- the API accepted the requests
- downstream completion happens asynchronously
- the numbers describe front-door behavior, not full workflow completion
For this project, that was the honest and useful performance story to tell.
Real implementation friction and subtle problems
The architecture diagram is the clean version.
Implementation is where the edge cases become visible.
1) 202 Accepted creates a visibility obligation
Returning 202 Accepted simplified the ingress path, but it also meant the system needed to answer follow-up questions such as:
- did it persist?
- did it fail?
- was it rejected?
- is it still in-flight?
That is why HVTP includes:
- a status endpoint
- correlation IDs
- downstream event flow for terminal outcomes and tracing
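One way to make those answers concrete is a small, closed set of states for the status endpoint to report. The names below are illustrative, not HVTP's actual enum; the point is that `202 Accepted` maps to an in-flight state, never to a final outcome.

```java
// An illustrative closed set of states for an asynchronously accepted
// request. Names are assumptions; "202 Accepted" corresponds to PENDING.
public enum TransactionStatus {
    PENDING,     // accepted at ingress, still in-flight downstream
    COMPLETED,   // ledger and audit writes finished
    FAILED,      // downstream processing failed (e.g. routed to a DLQ)
    REJECTED;    // failed validation or an idempotency conflict

    /** Terminal states need no further polling by the client. */
    public boolean isTerminal() {
        return this != PENDING;
    }
}
```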
Moving work out of the synchronous path reduced coupling, but it also increased the need for visibility.
2) Ordering had to be defined carefully
Early on, I had to be specific about what “ordering” meant in this system.
For HVTP, global ordering across all transactions was not the target.
Per-account ordering was the meaningful boundary.
That is why Kafka messages are keyed by accountId.
It gives the ordering guarantee I actually needed without forcing all traffic through one serialized path.
3) Multi-store systems introduce operational edges
Using PostgreSQL for ledger state and MongoDB for audit events was the right choice for this project.
It also meant I had to care whether both stores continued to reflect the same logical transaction stream.
That is why reconciliation became part of the design rather than an afterthought.
There was also a useful implementation lesson here: the Mongo mapping used for reconciliation has to stay aligned with the collection the audit service is actually writing to.
That kind of mismatch does not always fail loudly.
It can quietly reduce trust in operational checks.
4) Performance framing matters
Once I added the higher offered-RPS tests, I spent time thinking about how to describe the results precisely.
The more useful framing was not a headline number.
It was explaining what the tests actually demonstrated:
- the ingress layer remains stable under overload
- throughput saturates at a predictable point
- latency rises as load increases
- the asynchronous boundary protects the front door on this hardware
That framing is more useful because it stays aligned with what the measurements actually represent.
What changed in how I think
Before building this, I mostly thought about high throughput as a performance problem.
After building it, I think about it much more as a boundary design problem.
The question that stayed with me was not:
“How fast can one service go?”
It was:
“Where should work happen, where should it not happen, and what must remain true when parts of the system are delayed, retried, duplicated, or partially broken?”
That shift changed how I think about backend systems.
A few things became much clearer to me:
- Async systems need strong visibility
- Idempotency is part of the design, not just an implementation detail
- Storage choices should follow write semantics
- Graceful saturation is a useful success condition
- Good architecture is often about clean responsibility boundaries
One practical lesson from this project was that precision matters.
202 Accepted should mean something specific.
A benchmark should measure something specific.
And each service should have a clearly defined responsibility.
That mindset ended up being one of the most useful outcomes of the project.
Final takeaway
If I had to compress the whole project into one sentence, it would be this:
I built HVTP to practice designing a system that can accept load quickly while keeping correctness, separation of concerns, and recovery paths in view.
That is what this project gave me.
It helped me think more clearly about how to keep the front door fast, how to handle duplicates intentionally, how to separate durable state from audit history, and how to design for verification instead of assuming everything will always stay aligned.
For me, that was far more valuable than just assembling a stack.
Final thoughts
If you’ve built something in this space, I’d be genuinely interested in how you approached trade-offs around:
- `202 Accepted` vs synchronous confirmation
- Redis idempotency boundaries
- ledger vs audit store separation
- what you consider a useful throughput benchmark
Those design choices ended up being the most interesting part of the project for me.
If you want to explore the implementation, docs, and load tests, the full repo is here:
`kaustubh-26/high-volume-transaction-processor`: event-driven transaction processor with signed ingress, Kafka workflows, ledger persistence, audit storage, and webhook notifications.

A production-style, event-driven transaction pipeline showcasing signed API ingestion, asynchronous Kafka processing, Redis idempotency, PostgreSQL ledger writes, and MongoDB audit persistence.
The repository is structured like a small payment platform:
- signed transaction ingestion over HTTP
- asynchronous processing over Kafka
- Redis-backed idempotency protection
- ledger persistence in PostgreSQL
- immutable audit persistence in MongoDB
- dead-letter topics for failed records
- webhook notifications for transaction state changes
- Actuator and Prometheus endpoints on every service
What This Project Demonstrates
- Event-driven microservices with clearly separated write responsibilities
- Per-account ordering by using `accountId` as the Kafka message key
- Idempotency enforcement in the processor with Redis TTL-backed keys
- PostgreSQL as the ledger source of truth for persisted transactions
- MongoDB as an append-only audit store
- Reconciliation between the audit store and the ledger
- Replay support for rebuilding ledger state from `transaction_log`
- Signed ingress requests and API-key-protected status…

Top comments (1)
the decision to keep the API path thin and push complexity downstream is underrated. most early-stage systems do the opposite, synchronously write to everything in the request handler, and then wonder why latency spikes under load.
the idempotency key treatment stood out too. treating it as a correctness concern rather than a best-effort retry mechanism is a meaningful distinction. once you model it that way, the architecture shifts toward asking "what does processing this twice actually break?" which surfaces failure modes early.
curious how you handle visualization of the event flows as the system evolves. the architecture diagram looks clean, but keeping those updated as services drift is a common pain point.