In Part Two, we integrated Postgres using GORM (along with migrations and seeders), discussed the service layer pattern, built a UserService example wired into the SayHello RPC, and ended the article with an integration test using Testcontainers.
In this part, we will look at Redis integration for caching, healthchecks for service monitoring, and observability with Prometheus metrics and OpenTelemetry tracing, all aimed at making our microservice production-ready. We’ll also walk through a sample Grafana dashboard for metrics visualization and spin up a second service to demonstrate end-to-end distributed tracing in action.
Redis Integration
Caching is an important part of scaling any service. It reduces database load and speeds up response times by storing frequently accessed data in a fast in-memory store. We’ll use Redis, the most popular in-memory key–value store, for this purpose.
Redis supports far more than simple caching — features like Pub/Sub messaging, geospatial queries, and time series are built in — but for this project, we’ll keep things focused on basic caching functions that can easily be extended later.
Setup
We start by updating our config.go with a new Redis struct:
type Config struct {
GRPCServer *GRPCServer `validate:"required"`
Database *Database `validate:"required"`
Redis *Redis `validate:"required"`
}
type Redis struct {
Addr string `validate:"required"`
Password string `validate:"required"`
DB int `validate:"gte=0"`
DialTimeout time.Duration `validate:"gte=0"`
ReadTimeout time.Duration `validate:"gte=0"`
WriteTimeout time.Duration `validate:"gte=0"`
PoolSize int `validate:"gte=0"`
MinIdleConns int `validate:"gte=0"`
}
Then we load it inside NewConfigWithOptions():
cfg := &Config{
Redis: &Redis{
Addr: getEnv("REDIS_ADDR", "localhost:6379"),
Password: getEnv("REDIS_PASSWORD", "default"),
DB: getEnvInt("REDIS_DB", 0),
DialTimeout: getEnvDuration("REDIS_DIAL_TIMEOUT", time.Second*5),
ReadTimeout: getEnvDuration("REDIS_READ_TIMEOUT", time.Second*3),
WriteTimeout: getEnvDuration("REDIS_WRITE_TIMEOUT", time.Second*3),
PoolSize: getEnvInt("REDIS_POOL_SIZE", 20),
MinIdleConns: getEnvInt("REDIS_MIN_IDLE_CONNECTIONS", 5),
},
//...
}
Next, we’ll create internal/cache/cache.go where our Redis implementation will live:
package cache
import (
"fmt"
"time"
"context"
"github.com/redis/go-redis/v9"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/config"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/logger"
)
type CacheService interface {
Set(ctx context.Context, key string, value interface{}, expiration time.Duration) error
Get(ctx context.Context, key string) (string, error)
Delete(ctx context.Context, key string) error
Ping(ctx context.Context) error
Close() error
}
type RedisCache struct {
client *redis.Client
}
type Opts struct {
Config *config.Redis
Logger logger.Logger
}
func NewRedisCache(ctx context.Context, opts *Opts) (CacheService, error) {
cfg := opts.Config
rdb := redis.NewClient(&redis.Options{
Addr: cfg.Addr,
Password: cfg.Password,
DB: cfg.DB,
DialTimeout: cfg.DialTimeout,
ReadTimeout: cfg.ReadTimeout,
WriteTimeout: cfg.WriteTimeout,
PoolSize: cfg.PoolSize,
MinIdleConns: cfg.MinIdleConns,
})
r := &RedisCache{client: rdb}
if err := r.Ping(ctx); err != nil {
return nil, fmt.Errorf("failed to connect to redis: %v", err)
}
opts.Logger.Info("Redis connected")
return r, nil
}
func (r *RedisCache) Set(ctx context.Context, key string, value interface{}, expiration time.Duration) error {
return r.client.Set(ctx, key, value, expiration).Err()
}
func (r *RedisCache) Get(ctx context.Context, key string) (string, error) {
return r.client.Get(ctx, key).Result()
}
func (r *RedisCache) Delete(ctx context.Context, key string) error {
return r.client.Del(ctx, key).Err()
}
func (r *RedisCache) Ping(ctx context.Context) error {
return r.client.Ping(ctx).Err()
}
func (r *RedisCache) Close() error {
if r == nil || r.client == nil {
return fmt.Errorf("cannot close: redis is not initialized")
}
return r.client.Close()
}
Here, we’ve followed the same modular pattern used across other components. We define a CacheService interface that can be easily mocked in unit tests, and a concrete RedisCache struct implementing basic caching functions along with Ping for health checks and Close for graceful shutdown. Finally, we provide a NewRedisCache() constructor to instantiate the cache service cleanly.
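Because callers depend only on the CacheService interface rather than the concrete Redis client, unit tests don’t need a running Redis at all; a tiny in-memory fake is enough. Here’s a minimal sketch (the mockCache type is illustrative, not part of the repo):

package service_test

import (
    "context"
    "errors"
    "fmt"
    "time"
)

// mockCache is an in-memory stand-in that satisfies cache.CacheService.
type mockCache struct {
    store map[string]string
}

func newMockCache() *mockCache {
    return &mockCache{store: map[string]string{}}
}

func (m *mockCache) Set(_ context.Context, key string, value interface{}, _ time.Duration) error {
    switch v := value.(type) {
    case []byte:
        m.store[key] = string(v)
    case string:
        m.store[key] = v
    default:
        m.store[key] = fmt.Sprint(v)
    }
    return nil
}

func (m *mockCache) Get(_ context.Context, key string) (string, error) {
    v, ok := m.store[key]
    if !ok {
        return "", errors.New("cache: key not found") // callers treat any error as a cache miss
    }
    return v, nil
}

func (m *mockCache) Delete(_ context.Context, key string) error {
    delete(m.store, key)
    return nil
}

func (m *mockCache) Ping(_ context.Context) error { return nil }

func (m *mockCache) Close() error { return nil }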
Implementation
Now we initialize our cache service in main.go:
package main
import (
"os"
"context"
"os/signal"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/config"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/cache"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/logger"
)
func main() {
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
defer stop()
log := logger.NewZerologLogger("info", os.Stderr)
cfg, err := config.NewConfig(log)
if err != nil {
log.Fatal(err.Error())
}
redisCache, err := cache.NewRedisCache(ctx, &cache.Opts{
Config: cfg.Redis,
Logger: log,
})
if err != nil {
log.Fatal(err.Error())
}
//...Database, gRPC server etc
<-ctx.Done()
// Gracefully close redis client
if err := redisCache.Close(); err != nil {
log.Error("failed to close cache client", logger.Field{Key: "error", Value: err.Error()})
}
}
This setup ensures that Redis starts alongside our other dependencies and closes gracefully when the service shuts down.
To integrate caching into our application logic, we’ll update the UserService.FindByID method. It will now first look for the user data in Redis before querying the database. If the record isn’t found in cache, it’s fetched from the database and then written back to Redis for subsequent requests.
Since we’ve added another dependency, we’ll extend the UserService constructor to accept the cache service:
package service
import (
"gorm.io/gorm"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/cache"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/database"
)
type userService struct {
database *gorm.DB
cache cache.CacheService
}
type UserServiceOpts struct {
Database database.DatabaseService
Cache cache.CacheService
}
func NewUserService(opts *UserServiceOpts) UserService {
return &userService{
database: opts.Database.DB(),
cache: opts.Cache,
}
}
Next, we update our gRPC server to pass the cache instance from main.go:
type Opts struct {
Config *config.GRPCServer
Logger logger.Logger
Database database.DatabaseService
Cache cache.CacheService
}
func NewServer(opts *Opts) *GRPCServer {
srv := grpc.NewServer(grpc.UnaryInterceptor(interceptor.LoggerInterceptor(opts.Logger)))
helloworld.RegisterGreeterServer(srv, handler.NewGreeterServer(
service.NewUserService(&service.UserServiceOpts{
Database: opts.Database,
Cache: opts.Cache,
}),
))
return &GRPCServer{
Server: srv,
Config: opts.Config,
Logger: opts.Logger,
}
}
Finally, here’s how caching looks inside FindByID:
func (s *userService) FindByID(ctx context.Context, id uint) (*model.User, error) {
cacheKey := fmt.Sprintf("user:%d", id)
// Try cache
if cached, err := s.cache.Get(ctx, cacheKey); err == nil && cached != "" {
var u model.User
if err := json.Unmarshal([]byte(cached), &u); err != nil {
return nil, err
}
return &u, nil
}
u := &model.User{}
if err := s.database.First(u, id).Error; err != nil {
return nil, err
}
data, _ := json.Marshal(u)
if err := s.cache.Set(ctx, cacheKey, data, time.Minute); err != nil { // cache for 1 minute
return nil, err
}
return u, nil
}
Now, when you call the SayHello RPC, the user data will come from Redis on the second request (after being cached during the first). The cache expires after one minute, after which a new request will repopulate it. You can also log cache hits/misses to observe behavior.
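If you want to make hits and misses visible, one option is to log them inside FindByID. A short sketch, assuming we also inject a logger into userService (the s.logger field below is hypothetical, not part of the code above):

// Inside FindByID, assuming userService has a `logger logger.Logger` field.
if cached, err := s.cache.Get(ctx, cacheKey); err == nil && cached != "" {
    s.logger.Info("cache hit", logger.Field{Key: "key", Value: cacheKey})
    var u model.User
    if err := json.Unmarshal([]byte(cached), &u); err != nil {
        return nil, err
    }
    return &u, nil
}
s.logger.Info("cache miss", logger.Field{Key: "key", Value: cacheKey})
// ...fall through to the database query and cache write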
Integration Test
Since we added caching to UserService, we’ll update our integration tests accordingly. To ensure realistic conditions, we’ll spin up a real Redis instance using Testcontainers. Let’s start by creating a helper function SetupRedis in testutils:
package testutils
import (
"io"
"fmt"
"log"
"time"
"testing"
"context"
"github.com/stretchr/testify/require"
"github.com/testcontainers/testcontainers-go"
"github.com/testcontainers/testcontainers-go/modules/redis"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/cache"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/config"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/logger"
)
func SetupRedis(t *testing.T) cache.CacheService {
ctx := context.Background()
redisContainer, err := redis.Run(ctx,
"redis:7.2",
redis.WithSnapshotting(10, 1),
redis.WithLogLevel(redis.LogLevelVerbose),
)
require.NoError(t, err)
t.Cleanup(func() {
if err := testcontainers.TerminateContainer(redisContainer); err != nil {
log.Printf("failed to terminate container: %s", err)
}
})
host, err := redisContainer.Host(ctx)
require.NoError(t, err)
port, err := redisContainer.MappedPort(ctx, "6379")
require.NoError(t, err)
redisCache, err := cache.NewRedisCache(ctx, &cache.Opts{
Config: &config.Redis{
Addr: fmt.Sprintf("%s:%s", host, port.Port()),
Password: "",
DB: 0,
DialTimeout: time.Second * 5,
ReadTimeout: time.Second * 3,
WriteTimeout: time.Second * 3,
PoolSize: 20,
MinIdleConns: 5,
},
Logger: logger.NewZerologLogger("info", io.Discard),
})
require.NoError(t, err)
return redisCache
}
And then update the FindByID test to include Redis:
package service_test
import (
"context"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/service"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/database/model"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/tests/testutils"
)
func TestUserService_FindByID(t *testing.T) {
db := testutils.SetupPostgres(t)
redis := testutils.SetupRedis(t)
// Seed test data
u := &model.User{Name: "Alice", Email: "alice@example.com"}
require.NoError(t, db.DB().Create(u).Error)
userService := service.NewUserService(&service.UserServiceOpts{
Database: db,
Cache: redis,
})
got, err := userService.FindByID(context.Background(), u.ID)
require.NoError(t, err)
assert.Equal(t, "Alice", got.Name)
assert.Equal(t, "alice@example.com", got.Email)
}
For this test, we only need to confirm that FindByID behaves correctly — whether the data came from the database or cache isn’t important here. The goal is to ensure the service works seamlessly with both persistence and caching layers under test conditions.
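That said, if you do want to assert the caching path explicitly, you can extend the same test: call FindByID once, check that the key landed in Redis, then call it again. A sketch (appended inside TestUserService_FindByID; fmt would need to be imported):

// The first FindByID call above should have written the user to Redis.
cached, err := redis.Get(context.Background(), fmt.Sprintf("user:%d", u.ID))
require.NoError(t, err)
assert.NotEmpty(t, cached)

// A second call should succeed as well, now served from the cache.
got2, err := userService.FindByID(context.Background(), u.ID)
require.NoError(t, err)
assert.Equal(t, got.Name, got2.Name)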
Healthchecks
Service monitoring is critical in production environments. gRPC does include a built-in health-check protocol, but it exposes health through a single RPC service and requires a gRPC-aware client, which makes it less convenient when you want to differentiate between basic process liveness and full readiness to serve traffic.
In this project, we’ll implement two simple REST endpoints — /livez and /readyz — for clarity and flexibility.
We’ll start by defining a HealthService in internal/service/health.go. This service will be used by our /readyz endpoint to verify that all dependencies (like the database and cache) are healthy before declaring the service ready.
package service
import (
"context"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/cache"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/database"
"gorm.io/gorm"
)
type HealthService interface {
Check(ctx context.Context) HealthStatus
}
type HealthStatus struct {
Status string `json:"status"`
Details map[string]string `json:"details,omitempty"`
}
type healthService struct {
database *gorm.DB
cache cache.CacheService
}
type HealthServiceOpts struct {
Database database.DatabaseService
Cache cache.CacheService
}
func NewHealthService(opts *HealthServiceOpts) HealthService {
return &healthService{
database: opts.Database.DB(),
cache: opts.Cache,
}
}
func (h *healthService) Check(ctx context.Context) HealthStatus {
status := HealthStatus{Status: "ready", Details: map[string]string{}}
if err := h.database.WithContext(ctx).Exec("SELECT 1").Error; err != nil {
status.Status = "unready"
status.Details["database"] = err.Error()
} else {
status.Details["database"] = "ok"
}
if err := h.cache.Ping(ctx); err != nil {
status.Status = "unready"
status.Details["cache"] = err.Error()
} else {
status.Details["cache"] = "ok"
}
return status
}
The Check method inside HealthService validates the health of connected dependencies and returns their status, which we can later expose through our API.
Next, we’ll implement the REST handlers for /livez and /readyz inside internal/transports/http/server/handler/health.go:
package handler
import (
"encoding/json"
"net/http"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/logger"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/service"
)
type Opts struct {
HealthService service.HealthService
Logger logger.Logger
}
type HealthHandler struct {
healthService service.HealthService
logger logger.Logger
}
func NewHealthHandler(opts *Opts) *HealthHandler {
return &HealthHandler{healthService: opts.HealthService, logger: opts.Logger}
}
func (h *HealthHandler) Livez(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK)
writeJSON(w, map[string]string{"status": "ok"}, h.logger)
}
func (h *HealthHandler) Readyz(w http.ResponseWriter, r *http.Request) {
status := h.healthService.Check(r.Context())
w.Header().Set("Content-Type", "application/json")
if status.Status == "ready" {
w.WriteHeader(http.StatusOK)
} else {
w.WriteHeader(http.StatusServiceUnavailable)
}
writeJSON(w, status, h.logger)
}
// common JSON helper (can be moved to a separate file if you add more APIs)
func writeJSON(w http.ResponseWriter, data any, log logger.Logger) {
w.Header().Set("Content-Type", "application/json")
if err := json.NewEncoder(w).Encode(data); err != nil {
log.Error("failed to write JSON: " + err.Error())
}
}
The /livez handler simply returns "ok" with an HTTP 200 — it’s only responsible for confirming that the service process is running.
The /readyz handler, on the other hand, uses the HealthService to confirm that all required dependencies (like the database and Redis) are operational.
Now, let’s wire up our HTTP server in internal/transports/http/server/server.go:
package server
import (
"net"
"net/http"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/cache"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/config"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/database"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/logger"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/service"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/transports/http/server/handler"
)
type Opts struct {
Config *config.HTTPServer
Logger logger.Logger
Database database.DatabaseService
Cache cache.CacheService
}
type HTTPServer struct {
Config *config.HTTPServer
Server *http.Server
Logger logger.Logger
}
func NewServer(opts *Opts) *HTTPServer {
mux := http.NewServeMux()
healthService := service.NewHealthService(&service.HealthServiceOpts{
Database: opts.Database,
Cache: opts.Cache,
})
healthHandler := handler.NewHealthHandler(&handler.Opts{
HealthService: healthService,
Logger: opts.Logger,
})
mux.HandleFunc("/livez", healthHandler.Livez)
mux.HandleFunc("/readyz", healthHandler.Readyz)
return &HTTPServer{
Config: opts.Config,
Server: &http.Server{
Addr: opts.Config.URL,
Handler: mux,
},
Logger: opts.Logger,
}
}
func (h *HTTPServer) ServeListener(listener net.Listener) error {
h.Logger.Info("HTTP server started", logger.Field{Key: "address", Value: listener.Addr().String()})
if err := h.Server.Serve(listener); err != nil && err != http.ErrServerClosed {
h.Logger.Error("HTTP server failed", logger.Field{Key: "error", Value: err.Error()})
return err
}
return nil
}
func (h *HTTPServer) Serve() error {
listener, err := net.Listen("tcp", h.Config.URL)
if err != nil {
h.Logger.Error("Failed to create HTTP listener",
logger.Field{Key: "address", Value: h.Config.URL},
logger.Field{Key: "error", Value: err.Error()},
)
return err
}
return h.ServeListener(listener)
}
This follows the same structure as our GRPCServer. We define both ServeListener (for custom listeners) and Serve (for normal TCP-based startup from configuration).
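This split also comes in handy in tests: you can bind to an ephemeral port (or any custom net.Listener) and let the OS pick a free one. A rough sketch, assuming httpServer was built with NewServer as above (the test name is illustrative):

func TestHTTPServer_Livez(t *testing.T) {
    // Bind to an ephemeral port so tests never collide on a fixed address.
    listener, err := net.Listen("tcp", "127.0.0.1:0")
    require.NoError(t, err)

    go func() {
        _ = httpServer.ServeListener(listener)
    }()

    resp, err := http.Get("http://" + listener.Addr().String() + "/livez")
    require.NoError(t, err)
    defer resp.Body.Close()
    require.Equal(t, http.StatusOK, resp.StatusCode)
}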
Next, update your configuration to fetch the HTTP server’s host and port from environment variables:
type Config struct {
GRPCServer *GRPCServer `validate:"required"`
HTTPServer *HTTPServer `validate:"required"`
Database *Database `validate:"required"`
Redis *Redis `validate:"required"`
}
//Define struct
type HTTPServer struct {
URL string `validate:"required,hostname_port"`
}
//Load from env
func NewConfigWithOptions(opts LoaderOptions) (*Config, error) {
//...
cfg := &Config{
HTTPServer: &HTTPServer{
URL: getEnv("HTTP_SERVER_URL", ":4000"),
},
//...
}
//...
}
Finally, let’s initialize and start the HTTP server in main.go:
package main
func main() {
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
defer stop()
log := logger.NewZerologLogger("info", os.Stderr)
cfg, err := config.NewConfig(log)
if err != nil {
log.Fatal(err.Error())
}
//...
httpServer := httpserver.NewServer(&httpserver.Opts{
Config: cfg.HTTPServer,
Logger: log,
Database: db,
Cache: redisCache,
})
go func() {
if err := httpServer.Serve(); err != nil && !errors.Is(err, http.ErrServerClosed) {
stop()
}
}()
<-ctx.Done()
//shutdown grpc, db, redis etc
// Shut down the health server last so it can continue responding to liveness checks
// (e.g., /livez) while marking the service as not ready (/readyz) during shutdown.
// Note that ctx is already canceled at this point, so we use a fresh context with a timeout.
shutdownCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
if err := httpServer.Server.Shutdown(shutdownCtx); err != nil {
log.Error("failed to close http server", logger.Field{Key: "error", Value: err.Error()})
}
}
Once everything is wired up, we can test the endpoints locally with curl:
Livez:
curl localhost:4000/livez
Expected response:
{ "status": "ok" }
Readyz:
curl localhost:4000/readyz
Expected response:
{ "status": "ready", "details": { "cache": "ok", "database": "ok" } }
With these two endpoints, you now have both basic and dependency-aware healthchecks that can be used by Kubernetes probes, load balancers, or monitoring tools.
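For example, if you deploy to Kubernetes, the probes could point at these endpoints roughly like this (a sketch; the port assumes the default HTTP_SERVER_URL of :4000):

livenessProbe:
  httpGet:
    path: /livez
    port: 4000
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /readyz
    port: 4000
  initialDelaySeconds: 5
  periodSeconds: 10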
Observability (Metrics and Tracing)
In Part 1, we added structured logging to capture what’s happening inside our services. Logs are great for understanding what happened in a specific instance — but in a distributed system, we also need to understand how things behave and how requests flow across multiple services.
That’s where metrics and tracing come in.
- Metrics capture quantitative data about our services — like request counts, error rates, and latency. They’re lightweight, easy to visualize, and ideal for alerting and performance dashboards.
- Tracing gives us a request-level view of our system — showing how a request moves between microservices and how much time is spent in each hop.
Together, logging, metrics, and tracing give us full observability into our system’s behavior. In this part, we’ll implement metrics and tracing, then spin up Prometheus, Grafana, and Jaeger to see them in action.
Prometheus Metrics
Prometheus is a popular open-source monitoring system that scrapes metrics from your services, stores them as time-series data, and makes them queryable through PromQL. We'll use the official prometheus/client_golang library to instrument our services and expose metrics for scraping.
We’ll begin by creating a MetricsService inside internal/observability/metrics/metrics.go. This service will handle Prometheus setup, expose default metrics, and provide a simple way to register custom collectors.
package metrics
import (
"net/http"
"github.com/prometheus/client_golang/prometheus"
promcollectors "github.com/prometheus/client_golang/prometheus/collectors"
"github.com/prometheus/client_golang/prometheus/promhttp"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/config"
)
type MetricsService interface {
Register(collectors ...MetricCollector)
RegisterDefault()
Handler() http.Handler
}
type metricsService struct {
Config *config.Metrics
Registry *prometheus.Registry
}
func NewMetricsService(cfg *config.Metrics, collectors ...MetricCollector) MetricsService {
registry := prometheus.NewRegistry()
m := &metricsService{
Registry: registry,
Config: cfg,
}
if cfg.EnableDefaultMetrics {
m.RegisterDefault()
}
m.Register(collectors...)
return m
}
func (m *metricsService) Register(collectors ...MetricCollector) {
for _, c := range collectors {
c.Register(m.Registry)
}
}
func (m *metricsService) RegisterDefault() {
m.Registry.MustRegister(
promcollectors.NewGoCollector(),
promcollectors.NewProcessCollector(promcollectors.ProcessCollectorOpts{}),
)
}
func (m *metricsService) Handler() http.Handler {
return promhttp.HandlerFor(
m.Registry,
promhttp.HandlerOpts{},
)
}
To keep metrics modular, we define a small MetricCollector interface. Each component — like gRPC, HTTP, or Redis — can implement this interface to expose its own metrics while keeping code organized.
type MetricCollector interface {
Register(r *prometheus.Registry)
}
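As an example of how another component could plug in, a hypothetical collector for cache metrics might look like this (the metric and type names are illustrative, not part of the repo):

package metrics

import "github.com/prometheus/client_golang/prometheus"

var CacheOperationCounter = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "cache_operations_total",
        Help: "Total cache operations, labeled by operation and result (hit, miss, error).",
    },
    []string{"operation", "result"},
)

type CacheMetrics struct{}

func (CacheMetrics) Register(r *prometheus.Registry) {
    r.MustRegister(CacheOperationCounter)
}

Any such collector can then be passed to NewMetricsService alongside the others, or registered later via Register().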
Next, we add gRPC-specific metrics inside a new metrics/grpc.go file. This includes a counter for request counts and a histogram for request latency, useful for tracking performance trends across gRPC calls.
package metrics
import "github.com/prometheus/client_golang/prometheus"
var (
GRPCRequestCounter = prometheus.NewCounterVec(
prometheus.CounterOpts{
Name: "grpc_requests_total",
Help: "Total number of gRPC requests received, labeled by method and status.",
},
[]string{"method", "status"},
)
GRPCRequestLatency = prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Name: "grpc_request_duration_seconds",
Help: "Histogram of gRPC request latencies (seconds).",
Buckets: prometheus.DefBuckets,
},
[]string{"method"},
)
)
type GRPCMetrics struct{}
func (GRPCMetrics) Register(r *prometheus.Registry) {
r.MustRegister(GRPCRequestCounter, GRPCRequestLatency)
}
We then create a gRPC interceptor that automatically records these metrics for every incoming request. This way, we don’t need to manually instrument every handler — metrics are collected transparently as part of the request lifecycle.
package interceptor
import (
"context"
"time"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/observability/metrics"
"google.golang.org/grpc"
"google.golang.org/grpc/status"
)
func MetricsInterceptor() grpc.UnaryServerInterceptor {
return func(
ctx context.Context,
req interface{},
info *grpc.UnaryServerInfo,
handler grpc.UnaryHandler,
) (interface{}, error) {
start := time.Now()
res, err := handler(ctx, req)
method := info.FullMethod
statusCode := status.Code(err).String()
metrics.GRPCRequestCounter.WithLabelValues(method, statusCode).Inc()
metrics.GRPCRequestLatency.WithLabelValues(method).Observe(time.Since(start).Seconds())
return res, err
}
}
We then attach the interceptor to the gRPC server via grpc.ChainUnaryInterceptor:
srv := grpc.NewServer(
grpc.ChainUnaryInterceptor(
interceptor.LoggerInterceptor(opts.Logger),
interceptor.MetricsInterceptor(),
),
)
Once metrics are collected, we need a way for Prometheus to access them. We expose a /metrics endpoint in our HTTP server, which Prometheus will scrape periodically.
package server
type Opts struct {
Config *config.HTTPServer
Logger logger.Logger
Database database.DatabaseService
Cache cache.CacheService
Metrics metrics.MetricsService
}
func NewServer(opts *Opts) *HTTPServer {
mux := http.NewServeMux()
healthService := service.NewHealthService(&service.HealthServiceOpts{
Database: opts.Database,
Cache: opts.Cache,
})
healthHandler := handler.NewHealthHandler(&handler.Opts{
HealthService: healthService,
Logger: opts.Logger,
})
mux.HandleFunc("/livez", healthHandler.Livez)
mux.HandleFunc("/readyz", healthHandler.Readyz)
mux.Handle("/metrics", opts.Metrics.Handler())
return &HTTPServer{
Config: opts.Config,
Server: &http.Server{
Addr: opts.Config.URL,
Handler: mux,
},
Logger: opts.Logger,
}
}
Now let's create the metrics service and pass it to the HTTP server in main.go:
package main
import (...)
func main() {
//...
metricsService := metrics.NewMetricsService(cfg.Metrics, metrics.GRPCMetrics{})
httpServer := httpserver.NewServer(&httpserver.Opts{
Config: cfg.HTTPServer,
Logger: log,
Database: db,
Cache: redisCache,
Metrics: metricsService,
})
go func() {
if err := httpServer.Serve(); err != nil && !errors.Is(err, http.ErrServerClosed) {
stop()
}
}()
}
Finally, we update our configuration to include a METRICS_ENABLE_DEFAULT_METRICS option. This flag lets us control whether we also expose Go’s built-in runtime metrics like garbage collection and goroutine counts — useful in production for monitoring resource usage.
type Metrics struct {
EnableDefaultMetrics bool
}
func NewConfigWithOptions(opts LoaderOptions) (*Config, error) {
cfg := &Config{
Metrics: &Metrics{
EnableDefaultMetrics: getEnvBool("METRICS_ENABLE_DEFAULT_METRICS", false),
},
//...
}
//...
}
//New helper for parsing boolean values
func getEnvBool(key string, defaultVal bool) bool {
if val, err := strconv.ParseBool(os.Getenv(key)); err == nil {
return val
}
return defaultVal
}
With this in place, our services now export rich, structured metrics that can be scraped by Prometheus and visualized in Grafana. These metrics will serve as the backbone for our dashboards and alerts once we deploy to production.
OpenTelemetry Tracing
Tracing is the third pillar of observability, alongside metrics and logs. While metrics tell you how your system behaves overall, and logs describe what happened at a specific moment, tracing shows you how a request flows across services. It’s especially valuable in microservice architectures where a single client request might touch multiple services, databases, or queues before completing.
We’ll use OpenTelemetry, a popular open-source standard for collecting distributed traces, and connect it to Jaeger for visualization. You can also integrate it later with other backends like Zipkin, Tempo, or AWS X-Ray if needed.
Let’s start by defining our TracerService in internal/observability/tracing/tracing.go:
package tracing
import (
"context"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/propagation"
"go.opentelemetry.io/otel/sdk/resource"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/config"
"github.com/sagarmaheshwary/go-microservice-boilerplate/internal/logger"
sdktrace "go.opentelemetry.io/otel/sdk/trace"
semconv "go.opentelemetry.io/otel/semconv/v1.24.0"
)
type Opts struct {
Config *config.Tracing
Logger logger.Logger
}
type TracerService struct {
Config *config.Tracing
Logger logger.Logger
tp *sdktrace.TracerProvider
}
func NewTracerService(ctx context.Context, opts *Opts) (*TracerService, error) {
cfg := opts.Config
exporter, err := otlptrace.New(
ctx,
otlptracehttp.NewClient(
otlptracehttp.WithEndpoint(cfg.CollectorURL),
otlptracehttp.WithInsecure(),
),
)
if err != nil {
return nil, err
}
res, err := resource.New(ctx,
resource.WithAttributes(semconv.ServiceName(cfg.ServiceName)),
)
if err != nil {
return nil, err
}
tp := sdktrace.NewTracerProvider(
sdktrace.WithBatcher(exporter),
sdktrace.WithResource(res),
)
otel.SetTracerProvider(tp)
otel.SetTextMapPropagator(propagation.TraceContext{})
opts.Logger.Info("Tracing initialized (exporter: otlptracehttp)",
logger.Field{Key: "serviceName", Value: cfg.ServiceName},
)
return &TracerService{
Config: cfg,
Logger: opts.Logger,
tp: tp,
}, nil
}
func (t *TracerService) Shutdown(ctx context.Context) error {
return t.tp.Shutdown(ctx)
}
Next, we’ll update our configuration in config.go to include tracing settings such as the service name and collector endpoint:
type Tracing struct {
ServiceName string `validate:"required"`
CollectorURL string `validate:"required,hostname_port"`
}
func NewConfigWithOptions(opts LoaderOptions) (*Config, error) {
cfg := &Config{
Tracing: &Tracing{
ServiceName: getEnv("TRACING_SERVICE_NAME", "go-microservice-boilerplate"),
CollectorURL: getEnv("TRACING_COLLECTOR_URL", "localhost:4318"),
},
//...
}
}
Now that the config is ready, we can initialize the tracer in main.go so that every part of our application has access to it:
func main() {
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
defer stop()
//...
tracerService, err := tracing.NewTracerService(ctx, &tracing.Opts{
Config: cfg.Tracing,
Logger: log,
})
if err != nil {
log.Fatal(err.Error())
}
//...
<-ctx.Done()
// Gracefully shut down the tracer; ctx is already canceled here, so use a fresh context with a timeout
shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
if err := tracerService.Shutdown(shutdownCtx); err != nil {
log.Error("failed to close tracing client", logger.Field{Key: "error", Value: err.Error()})
}
}
To propagate trace context across service boundaries, we add a gRPC StatsHandler. This ensures that when a request comes into our service, it either starts a new trace or continues an existing one passed from another service:
srv := grpc.NewServer(
grpc.ChainUnaryInterceptor(
interceptor.LoggerInterceptor(opts.Logger),
interceptor.MetricsInterceptor(),
),
grpc.StatsHandler(otelgrpc.NewServerHandler(
otelgrpc.WithTracerProvider(otel.GetTracerProvider()),
otelgrpc.WithPropagators(otel.GetTextMapPropagator()),
)),
)
From now on, every gRPC call will automatically create a trace that’s sent to Jaeger.
Jaeger UI: http://localhost:16686/search
Each trace is composed of multiple spans — where a trace represents the entire journey of a request (for example, a user calling your API), and spans represent individual operations within that request (like “fetch user from DB” or “send email”).
To demonstrate this, let’s add spans to our SayHello handler and the UserService. Each span will capture the work done by that specific function, and together they’ll form a full picture of a single request’s flow through the system.
func (g *GreeterServer) SayHello(ctx context.Context, in *helloworld.SayHelloRequest) (*helloworld.SayHelloResponse, error) {
tr := otel.Tracer("hello_world.Greeter")
ctx, span := tr.Start(ctx, "helloworld.SayHello")
defer span.End()
//...
}
func (s *userService) FindByID(ctx context.Context, id uint) (*model.User, error) {
tr := otel.Tracer("UserService")
ctx, span := tr.Start(ctx, "FindByID")
span.SetAttributes(attribute.String("UserId", strconv.Itoa(int(id))))
defer span.End()
//...
}
When you look at the trace in Jaeger, you’ll see a clear chain of execution — the main request trace leading into the SayHello span, which then calls into the UserService span.
The SetAttributes method lets you attach useful context (like user IDs, order IDs, or error states) that can make debugging much easier when viewing a trace.
Context propagation here is critical — notice that we always pass ctx forward. This context carries the traceID created by our gRPC StatsHandler, ensuring that every downstream function and service is linked to the same request chain.
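For error states specifically, a span can record the error and be marked as failed, which Jaeger then highlights in the UI. A sketch of how the database branch of FindByID could do this (otelcodes here is go.opentelemetry.io/otel/codes):

u := &model.User{}
if err := s.database.First(u, id).Error; err != nil {
    span.RecordError(err)                        // attach the error as a span event
    span.SetStatus(otelcodes.Error, err.Error()) // mark the span as failed
    return nil, err
}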
End-to-End Observability in Action
The repo comes with a ready-to-use observability setup using Docker Compose, available in the examples branch. If you want to follow along, clone the repo and switch to that branch.
We’ll first bring up our core application stack — which includes the service itself, PostgreSQL, and Redis:
docker compose up
Once the application is running, we can start the observability stack. This will spin up Grafana, Prometheus, and Jaeger — all preconfigured to work with our service.
docker compose -f docker-compose.observability.yml up
Jaeger runs via the all-in-one image using its default in-memory storage to keep things lightweight and simple. Prometheus uses a basic scrape configuration that pulls metrics from our service’s /metrics endpoint every eight seconds:
global:
  scrape_interval: 8s

scrape_configs:
  - job_name: go-microservice-boilerplate
    static_configs:
      - targets:
          - "go-microservice-boilerplate:4000"
Grafana comes with a sample dashboard already wired up to Prometheus, showcasing the gRPC metrics we defined earlier. Open a browser and navigate to http://localhost:3000 (default credentials are admin / admin).
The dashboard has four panels visualizing data from the grpc_requests_total and grpc_request_duration_seconds metrics:
- Average gRPC Request Duration — time series
- gRPC Request Latency (95th Percentile) — time series
- gRPC Requests per Method — time series
- Total gRPC Requests — counter/gauge
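If you’d like to reproduce panels like these yourself, PromQL queries along these lines would back them (a sketch; the exact expressions in the sample dashboard may differ):

# Average gRPC request duration
rate(grpc_request_duration_seconds_sum[5m]) / rate(grpc_request_duration_seconds_count[5m])

# 95th percentile latency per method
histogram_quantile(0.95, sum(rate(grpc_request_duration_seconds_bucket[5m])) by (le, method))

# Requests per method
sum(rate(grpc_requests_total[5m])) by (method)

# Total requests
sum(grpc_requests_total)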
This sample dashboard offers a quick glimpse into request throughput and latency trends — perfect for demonstration and local testing. In a real production environment, you’d typically design a more comprehensive dashboard tailored to your service’s specific KPIs, error rates, and resource utilization patterns.
Now that we’ve seen metrics in action, let’s turn to tracing. We already explored what a trace looks like for a single service, but the real power of tracing emerges when requests flow through multiple microservices. Distributed traces help you visualize the entire request path, pinpoint slow hops, and quickly identify where failures occur.
To demonstrate distributed tracing, let’s spin up a second instance of our service that will act as another microservice in the request chain. Clone the same repository inside your project directory, switch to the examples branch, and create an environment file:
git clone https://github.com/SagarMaheshwary/go-microservice-boilerplate.git go-microservice-boilerplate2
cd go-microservice-boilerplate2
git checkout examples
cp .env.example .env
Now let’s add this new service to our main docker-compose.yml file:
services:
  app2:
    build:
      context: ./go-microservice-boilerplate2
      target: development
    container_name: go-microservice-boilerplate2
    ports:
      - 4001:4000
      - 5001:5000
    volumes:
      - ./go-microservice-boilerplate2:/app
    networks:
      - microservices_net
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
Next, open the SayHello() handler of our main service and replace it with the following code.
This is just for demonstration. In a real-world setup, you would create a dedicated client inside transports/grpc/client and reuse it across the service instead of creating a new connection for every request, since gRPC connections are meant to be long-lived (see the sketch after this handler):
func (g *GreeterServer) SayHello(ctx context.Context, in *helloworld.SayHelloRequest) (*helloworld.SayHelloResponse, error) {
tr := otel.Tracer("hello_world.Greeter")
ctx, span := tr.Start(ctx, "helloworld.SayHello")
defer span.End()
conn, err := grpc.NewClient(
"go-microservice-boilerplate2:5000",
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithStatsHandler(otelgrpc.NewClientHandler()),
)
if err != nil {
return nil, status.Error(codes.Internal, err.Error())
}
defer conn.Close()
greeter := helloworld.NewGreeterClient(conn)
res, err := greeter.SayHello(ctx, &helloworld.SayHelloRequest{UserId: in.UserId})
if err != nil {
return nil, status.Error(codes.Internal, err.Error())
}
return res, nil
}
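For reference, a dedicated, long-lived client as mentioned above could look roughly like this (the package layout and the helloworld import path are assumptions):

package client

import (
    "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"

    "github.com/sagarmaheshwary/go-microservice-boilerplate/proto/hello_world/helloworld" // assumed path to the generated package
)

type GreeterClient struct {
    conn    *grpc.ClientConn
    Greeter helloworld.GreeterClient
}

// NewGreeterClient dials once at startup; the connection is then reused for every request.
func NewGreeterClient(addr string) (*GreeterClient, error) {
    conn, err := grpc.NewClient(
        addr,
        grpc.WithTransportCredentials(insecure.NewCredentials()),
        grpc.WithStatsHandler(otelgrpc.NewClientHandler()), // propagates trace context on outgoing calls
    )
    if err != nil {
        return nil, err
    }
    return &GreeterClient{conn: conn, Greeter: helloworld.NewGreeterClient(conn)}, nil
}

func (c *GreeterClient) Close() error {
    return c.conn.Close()
}

In SayHello you would then call the shared client's Greeter.SayHello(ctx, ...) instead of dialing a new connection per request.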
Note: In the internal/observability/tracing/tracing.go file, make sure to include otel.SetTextMapPropagator(propagation.TraceContext{}). This line is missing from the part-3 source, and without it, traces between multiple services won’t connect properly. You can use either the master or examples branch, as both already include this fix.
Now, let’s trigger a request from the first service, which will internally call the second service. We’ll use grpcurl for this:
grpcurl -d '{"user_id": 1}' -proto ./proto/hello_world/hello_world.proto -plaintext localhost:5000 hello_world.Greeter/SayHello
Open Jaeger in your browser, and you should now see a single trace spanning two services: one initiating the request, and the other handling it downstream.
Clicking into the trace reveals the chain of spans showing each step of the process.
This connected trace gives a clear view of how the request travels through the system — from the first service’s SayHello() call, to the gRPC client invocation, and finally to the second service’s SayHello() and its database lookup via UserService.
And that wraps up our Observability section.
We’ve now covered all three pillars — logging, metrics, and tracing — giving you complete visibility into your services. With this setup, you can detect issues early, measure system health, and trace requests across microservices, building a solid foundation for a production-grade monitoring stack.
Wrapping Up the Series
In this final part, we extended our microservice with Redis integration, health checks, and full observability using Prometheus metrics and OpenTelemetry tracing. We visualized key performance metrics in Grafana and explored distributed tracing with Jaeger — connecting multiple services to see how requests flow end-to-end.
With this, our Designing Production-Ready Microservices in Go series comes to a close. You now have a solid foundation that combines clean project structure, containerization, configuration management, database integration, service-to-service communication, and observability — everything you need to kickstart a production-grade Go microservice project.
The repository is meant to be a reusable boilerplate for spinning up new Go microservices — giving you a clean, production-ready starting point for any project.
The master branch contains a clean, production-ready setup, while the examples branch includes all the working examples and demos featured throughout the series.
You can check out the complete code for Part Three here.
Thanks for reading! I hope you found this series helpful. If you have any questions, feedback, or suggestions, feel free to drop a comment — I’d love to hear your thoughts.