I’m designing audit logging for a microservices platform running on Kubernetes with Go services communicating via REST and gRPC. I’m researching how teams typically implement audit logging across distributed systems.
The requirements are simple: capture who triggered an action, what changed, the before/after state, a timestamp, and the result.
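To make those requirements concrete, here is a minimal sketch of what one audit record could look like in Go. Every name in it (`AuditEvent`, `NewAuditEvent`, the field names and JSON keys) is an assumption chosen for illustration, not an established schema.

```go
package main

import (
	"encoding/json"
	"time"
)

// AuditEvent is a hypothetical record shape covering the stated requirements.
// All field names are illustrative assumptions.
type AuditEvent struct {
	Timestamp time.Time       `json:"timestamp"`        // when the action happened (UTC)
	Actor     string          `json:"actor"`            // who triggered the action
	Service   string          `json:"service"`          // which service recorded it
	Action    string          `json:"action"`           // what was done, e.g. "order.update"
	Resource  string          `json:"resource"`         // what was changed
	Before    json.RawMessage `json:"before,omitempty"` // state before the change
	After     json.RawMessage `json:"after,omitempty"`  // state after the change
	Result    string          `json:"result"`           // e.g. "success" or "denied"
}

// snapshot serializes a state value to JSON; nil means "no state captured".
func snapshot(v any) (json.RawMessage, error) {
	if v == nil {
		return nil, nil
	}
	b, err := json.Marshal(v)
	return b, err
}

// NewAuditEvent builds an event, stamping it with the current UTC time and
// storing the before/after states as raw JSON so the record stays schema-agnostic.
func NewAuditEvent(actor, service, action, resource string, before, after any, result string) (AuditEvent, error) {
	b, err := snapshot(before)
	if err != nil {
		return AuditEvent{}, err
	}
	a, err := snapshot(after)
	if err != nil {
		return AuditEvent{}, err
	}
	return AuditEvent{
		Timestamp: time.Now().UTC(),
		Actor:     actor,
		Service:   service,
		Action:    action,
		Resource:  resource,
		Before:    b,
		After:     a,
		Result:    result,
	}, nil
}
```

Keeping the before/after state as raw JSON means each service can record its own entity shape without a shared type, at the cost of weaker validation on the consuming side.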
Areas I’m exploring:
- Capture point
  - At the API Gateway?
  - Inside each microservice?
  - A hybrid?
- Delivery model
  - Synchronous writes?
  - Asynchronous pipelines (Kafka, NATS, SQS, etc.)?
- Aggregation
  - Central audit logging service
  - Shared database
  - Event log / stream
- Failure strategies
  - Fail the business operation?
  - Buffer and retry?
- Performance
  - Avoiding bottlenecks
  - Batching, buffering, and backpressure patterns
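To illustrate how the delivery, failure, and performance options above can interact, here is a rough sketch of one possible combination: an in-process emitter that buffers events in a bounded channel, flushes them in batches, retries a failed batch a few times, and reports a full buffer to the caller instead of blocking. It is only one point in the design space, not a recommendation. `Sink`, `Emitter`, `MemorySink`, `ErrBufferFull`, the retry count, and the backoff values are all hypothetical; a real sink would wrap Kafka, NATS, SQS, or a central audit service.

```go
package main

import (
	"context"
	"errors"
	"sync"
	"time"
)

// ErrBufferFull signals backpressure: the caller decides whether to fail
// the business operation, drop the event, or fall back to a synchronous write.
var ErrBufferFull = errors.New("audit buffer full")

// Sink is a hypothetical destination for batches of serialized events.
type Sink interface {
	WriteBatch(ctx context.Context, batch [][]byte) error
}

// MemorySink is an in-memory stand-in used only for demonstration.
type MemorySink struct {
	mu     sync.Mutex
	events [][]byte
}

func (s *MemorySink) WriteBatch(_ context.Context, batch [][]byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.events = append(s.events, batch...)
	return nil
}

func (s *MemorySink) Count() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.events)
}

// Emitter buffers events and flushes them by size or by interval.
type Emitter struct {
	queue      chan []byte
	sink       Sink
	batchSize  int
	flushEvery time.Duration
	done       chan struct{}
}

func NewEmitter(sink Sink, capacity, batchSize int, flushEvery time.Duration) *Emitter {
	e := &Emitter{
		queue:      make(chan []byte, capacity),
		sink:       sink,
		batchSize:  batchSize,
		flushEvery: flushEvery,
		done:       make(chan struct{}),
	}
	go e.run()
	return e
}

// Emit never blocks; it returns ErrBufferFull when the queue is at capacity.
func (e *Emitter) Emit(event []byte) error {
	select {
	case e.queue <- event:
		return nil
	default:
		return ErrBufferFull
	}
}

// Close flushes whatever is buffered and waits for the worker to exit.
// Callers must not call Emit after Close.
func (e *Emitter) Close() {
	close(e.queue)
	<-e.done
}

func (e *Emitter) run() {
	defer close(e.done)
	ticker := time.NewTicker(e.flushEvery)
	defer ticker.Stop()
	batch := make([][]byte, 0, e.batchSize)
	flush := func() {
		if len(batch) == 0 {
			return
		}
		// Up to 3 attempts with a short linear backoff. After that the batch is
		// dropped in this sketch; a real system would likely spill it to disk
		// or a dead-letter queue rather than lose audit data.
		for attempt := 1; attempt <= 3; attempt++ {
			if err := e.sink.WriteBatch(context.Background(), batch); err == nil {
				break
			}
			time.Sleep(time.Duration(attempt) * 100 * time.Millisecond)
		}
		batch = make([][]byte, 0, e.batchSize)
	}
	for {
		select {
		case ev, ok := <-e.queue:
			if !ok { // queue closed and drained
				flush()
				return
			}
			batch = append(batch, ev)
			if len(batch) >= e.batchSize {
				flush()
			}
		case <-ticker.C:
			flush()
		}
	}
}
```

The main trade-off here is that events still sitting in the buffer are lost if the pod dies, so this shape only suits cases where a small loss window is acceptable. If every audit record must survive, the usual alternatives are a synchronous write or writing the event in the same database transaction as the business change (an outbox-style approach).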
If you’ve built audit logging across multiple services, I’d appreciate insights on what worked well.