DEV Community

Cover image for Everyone Logs Wrong with slog. 7 Patterns for 3 AM
Gabriel Anhaia
Gabriel Anhaia

Posted on

Everyone Logs Wrong with slog. 7 Patterns for 3 AM


The aggregator returns two thousand lines that all say msg="error processing request". No user. No order. No upstream. No trace id. The pager has been buzzing for half an hour, the checkout error rate is climbing past 4%, and the playbook said "check the logs."

You scroll. You give up. You roll back the last deploy because you cannot tell from the logs which code path is failing.

That is not a logging library problem. That is a logging discipline problem. log/slog has shipped in the standard library since Go 1.21. It produces structured JSON. The primitives for making 3 AM logs answerable are all there. Most teams still use it like log with extra steps — free-form messages, no context, secrets accidentally serialized, debug levels either silent or drowning everyone.

Pattern 1: Stop using the package-level logger as if it were log

slog ships with a default logger at slog.Default() and convenience functions like slog.Info that call into it. They feel like drop-in replacements for log.Println. That is the trap.

Anti-pattern:

package payments

import "log/slog"

func Charge(card Card, cents int) error {
    slog.Info("charging card",
        "last4", card.Last4,
        "amount", cents)
    // ...
    return nil
}
Enter fullscreen mode Exit fullscreen mode

This works. It produces JSON. It also discards every piece of context the caller was carrying: request id, user id, trace id, deadline. The handler has no idea this line came from inside a request that started 80ms ago.

Idiomatic:

package payments

import (
    "context"
    "log/slog"
)

func Charge(ctx context.Context, log *slog.Logger, card Card, cents int) error {
    log.InfoContext(ctx, "charging card",
        slog.String("last4", card.Last4),
        slog.Int("amount_cents", cents),
    )
    return nil
}
Enter fullscreen mode Exit fullscreen mode

Two small changes. InfoContext passes the context to the handler, which is how a context-aware handler pulls request-scoped attributes back out (pattern 3 below). The slog.String / slog.Int constructors skip the any boxing the variadic key-value form does on every call.

Why it matters at 3 AM: when the line you need is buried in a million siblings, the difference between grep "charging card" and jq 'select(.request_id == "abc-123")' is the difference between finding the bug and rolling back blind. InfoContext is what makes the second query possible.

Pattern 2: Make the handler the configuration boundary

The biggest waste in a slog migration is sprinkling format concerns through the application: every call deciding whether to log JSON, whether to add the source file, whether to redact a key. That is not how slog is meant to work.

Anti-pattern:

// scattered across the codebase
slog.Info("user signed in",
    "user", user.ID,
    "json_ts", time.Now().Format(time.RFC3339),
    "src", "auth/handler.go:42",
)
Enter fullscreen mode Exit fullscreen mode

Configure once, in main, and never again:

package main

import (
    "log/slog"
    "os"
)

func newLogger() *slog.Logger {
    level := new(slog.LevelVar) // dynamic, see pattern 6
    level.Set(slog.LevelInfo)

    opts := &slog.HandlerOptions{
        AddSource:   true,
        Level:       level,
        ReplaceAttr: redactSensitive, // see pattern 4
    }

    h := slog.NewJSONHandler(os.Stdout, opts)
    return slog.New(h)
}
Enter fullscreen mode Exit fullscreen mode

HandlerOptions.AddSource attaches file:line to every record (docs). Level accepts a *slog.LevelVar so the level is mutable at runtime. ReplaceAttr runs once per attribute and lets you transform or drop keys before they reach the wire.

Why it matters at 3 AM: when a customer reports leaked PII, you want one place to grep, one place to fix, one deploy — not five hundred call sites to audit. The handler is that one place. Push policy down into it.

Pattern 3: Carry the logger in context.Context for request-scoped attributes

This is the pattern that pays the biggest 3 AM dividend, and the one most slog tutorials skip. The middleware generates a request id, attaches it to the context, and attaches a logger that already has that id baked in. Every downstream call gets a logger that knows who is asking.

Without context, the request id is a magic string everyone has to remember to pass:

func (h *OrderHandler) Create(w http.ResponseWriter, r *http.Request) {
    reqID := r.Header.Get("X-Request-ID")
    slog.Info("creating order", "request_id", reqID, "path", r.URL.Path)

    if err := h.svc.Create(r.Context(), reqID, /* ... */); err != nil {
        slog.Error("create failed", "request_id", reqID, "err", err)
    }
}
Enter fullscreen mode Exit fullscreen mode

Every layer wanting the request id takes it as an extra parameter. Half forget. The ones that do not forget will pass the wrong id when they fan out to a worker.

Compare the version where the logger lives in the context:

package logctx

import (
    "context"
    "log/slog"
)

type ctxKey struct{}

func With(ctx context.Context, log *slog.Logger) context.Context {
    return context.WithValue(ctx, ctxKey{}, log)
}

func From(ctx context.Context) *slog.Logger {
    if l, ok := ctx.Value(ctxKey{}).(*slog.Logger); ok {
        return l
    }
    return slog.Default()
}
Enter fullscreen mode Exit fullscreen mode
// middleware
func RequestLogger(base *slog.Logger) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            reqID := r.Header.Get("X-Request-ID")
            if reqID == "" {
                reqID = newRequestID()
            }
            log := base.With(
                slog.String("request_id", reqID),
                slog.String("path", r.URL.Path),
                slog.String("method", r.Method),
            )
            ctx := logctx.With(r.Context(), log)
            log.InfoContext(ctx, "request received")
            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}
Enter fullscreen mode Exit fullscreen mode

Now any service or repository pulls the logger back out and every line it emits already has the request id, path, method, and anything else the middleware attached:

func (s *OrderService) Create(ctx context.Context, req CreateOrderRequest) error {
    log := logctx.From(ctx)
    log.InfoContext(ctx, "validating order", slog.Int("items", len(req.Items)))
    // ...
}
Enter fullscreen mode Exit fullscreen mode

The Better Stack contextual-logging guide walks the same shape, and veqryn/slog-context wraps it as a handler if you would rather not write the helpers yourself.

When a request fails, every line it produced is one jq query away. You do not need to know which functions ran or which goroutines fanned out. The context carries the id, and every line emits it.

Pattern 4: Make redaction a property of the type

The naive setup, where the SDK author has String()-ed the whole struct:

type APIToken struct {
    Value     string
    ExpiresAt time.Time
}

// and then somewhere
slog.Info("calling vendor", "token", token)
Enter fullscreen mode Exit fullscreen mode

That serializes token.Value straight into the JSON. The vendor's full token is now in your log aggregator, indexed and replicated, and archived for as long as your retention policy keeps it — often years.

What slog actually wants you to do is implement slog.LogValuer on the secret type itself:

type APIToken struct {
    Value     string
    ExpiresAt time.Time
}

func (t APIToken) LogValue() slog.Value {
    return slog.GroupValue(
        slog.String("value", "REDACTED"),
        slog.Time("expires_at", t.ExpiresAt),
    )
}
Enter fullscreen mode Exit fullscreen mode

Now slog.Info("calling vendor", "token", token) produces "token":{"value":"REDACTED","expires_at":"..."} no matter who calls it, no matter how deep the call site is. The standard library's own example uses this exact pattern, and Arcjet's redaction post walks the trade-offs.

Pair it with a ReplaceAttr for cases where the secret is on a primitive (a string field called password on a generic map you do not control):

func redactSensitive(groups []string, a slog.Attr) slog.Attr {
    switch a.Key {
    case "password", "authorization", "api_key", "ssn":
        return slog.String(a.Key, "REDACTED")
    }
    return a
}
Enter fullscreen mode Exit fullscreen mode

The next engineer who copies slog.Info("payment", "card", card) is not going to think about whether card has a LogValue method. The type does the thinking for them.

Pattern 5: Discipline the levels

slog ships four levels with deliberate gaps: Debug=-4, Info=0, Warn=4, Error=8 (docs). Most teams need to use the four they have correctly before adding custom ones.

A typical, undisciplined codebase logs everything as Info, with the occasional Error for "something went wrong-ish":

slog.Info("found 0 results")               // not info-worthy
slog.Info("retrying after timeout")        // this is a warn
slog.Error("user submitted invalid email") // this is a 400, not a page
Enter fullscreen mode Exit fullscreen mode

The cost: the dashboard alerts on level=error at 4 AM because someone sent a malformed JSON body. Legitimate errors drown out, the team stops trusting the alarm.

Idiomatic discipline:

Level When to use it What it should mean to on-call
Debug Loop bodies, per-row trace, off in prod by default. Nothing. Off.
Info One line per request lifecycle event: received, accepted, completed. The system is working.
Warn Recoverable degradation. Retry succeeded after one attempt. Cache miss for a hot key. Something is suboptimal. Look in business hours.
Error The request failed in a way the user will notice and a human needs to know about. Page-worthy in aggregate.

A caller-fault 4xx is Info. A backend-fault 5xx is Error:

log.InfoContext(ctx, "rejected invalid email",
    slog.String("input", req.Email),
    slog.String("reason", err.Error()),
)

log.ErrorContext(ctx, "database write failed",
    slog.String("table", "orders"),
    slog.Any("err", err),
)
Enter fullscreen mode Exit fullscreen mode

The on-call cost is direct: when your alert is count(level=error) > 50 in 5m, levels are the only thing deciding what wakes you up. If a 400 is an Error, the alert is broken on day one. Treat levels as part of the contract between your service and the person carrying the pager.

Pattern 6: Make the level dynamic with slog.LevelVar

You will eventually need Debug logging on one pod for ten minutes without redeploying. slog.LevelVar is built for this.

The brittle version hardcodes the level as a constant in main:

opts := &slog.HandlerOptions{Level: slog.LevelInfo}
Enter fullscreen mode Exit fullscreen mode

A Heisenbug fires at 3 AM. You redeploy with LevelDebug. The bug disappears for forty minutes because the deploy itself reset state. You go back to bed. The bug returns.

Bind a *slog.LevelVar instead and expose a knob:

var logLevel = new(slog.LevelVar) // defaults to Info

func init() {
    if v := os.Getenv("LOG_LEVEL"); v != "" {
        var l slog.Level
        if err := l.UnmarshalText([]byte(v)); err == nil {
            logLevel.Set(l)
        }
    }
}

// admin endpoint, behind auth, ideally on a separate port
func setLevelHandler(w http.ResponseWriter, r *http.Request) {
    var l slog.Level
    if err := l.UnmarshalText([]byte(r.URL.Query().Get("level"))); err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    logLevel.Set(l)
    fmt.Fprintf(w, "level set to %s\n", l)
}
Enter fullscreen mode Exit fullscreen mode

The Level field on HandlerOptions accepts any slog.Leveler, and *slog.LevelVar satisfies the interface. Setting the var is goroutine-safe and takes effect on the next log call. Curl the admin endpoint, set debug, capture two minutes, set it back. The pod never restarted.

Why it matters at 3 AM: redeploying to flip a config bit during an active incident adds a variable. Removing variables is the whole job. A live toggle keeps the deploy off the table.

Pattern 7: Group attributes per subsystem so keys do not collide

Two subsystems both log user_id. One means the authenticated principal, the other means the user the operation targets. Both end up in the same JSON object. The alert query is wrong about which one it is filtering.

Anti-pattern:

log.InfoContext(ctx, "admin updated user",
    slog.String("user_id", admin.ID),  // who did it
    slog.String("user_id", target.ID), // who got updated — overwrites
)
Enter fullscreen mode Exit fullscreen mode

JSON does not enforce unique keys at the producer. The handler emits both, and the consumer (Loki, Elasticsearch, CloudWatch and similar) typically keeps one of the values — which one depends on the parser, and you cannot rely on it.

WithGroup qualifies every attribute with a namespace:

log.With(slog.String("event", "user.updated")).
    WithGroup("actor").With(slog.String("id", admin.ID), slog.String("role", admin.Role)).
    WithGroup("target").With(slog.String("id", target.ID)).
    InfoContext(ctx, "admin updated user")
Enter fullscreen mode Exit fullscreen mode

The JSON handler emits this as nested objects:

{
  "level": "INFO",
  "msg": "admin updated user",
  "event": "user.updated",
  "actor": { "id": "u_admin_1", "role": "support" },
  "target": { "id": "u_target_42" }
}
Enter fullscreen mode Exit fullscreen mode

Now the alert query is actor.role == "support" and there is no ambiguity. From the package docs: "A group's keys qualify the keys of the attributes added after it."

When a security review asks "show every action a support agent took on a customer record in 24 hours," the answer is one query against actor.role. Without groups it is a regex that mis-matches and an apology email.

A handler skeleton that ties it together

The main.go shape that uses all seven patterns:

package main

import (
    "context"
    "log/slog"
    "net/http"
    "os"
    "time"
)

var logLevel = new(slog.LevelVar)

func main() {
    logLevel.Set(parseLevel(os.Getenv("LOG_LEVEL"), slog.LevelInfo))

    h := slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
        AddSource:   os.Getenv("LOG_SOURCE") == "true",
        Level:       logLevel,
        ReplaceAttr: redactSensitive,
    })
    base := slog.New(h).With(
        slog.String("service", "checkout"),
        slog.String("version", buildVersion),
    )
    slog.SetDefault(base) // for libraries that fall back to Default

    mux := http.NewServeMux()
    mux.HandleFunc("/admin/loglevel", setLevelHandler)

    handler := RequestLogger(base)(mux)
    srv := &http.Server{
        Addr:              ":8080",
        Handler:           handler,
        ReadHeaderTimeout: 5 * time.Second,
    }
    ctx := context.Background()
    base.InfoContext(ctx, "server starting", slog.String("addr", srv.Addr))
    if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
        base.ErrorContext(ctx, "server exited", slog.Any("err", err))
        os.Exit(1)
    }
}

func parseLevel(s string, fallback slog.Level) slog.Level {
    var l slog.Level
    if err := l.UnmarshalText([]byte(s)); err == nil {
        return l
    }
    return fallback
}
Enter fullscreen mode Exit fullscreen mode

The base logger has service and version baked in, so every line is attributable. slog.SetDefault makes third-party libraries inherit the same handler. The level var is exposed via env and admin endpoint, giving both deploy-time and runtime control.

The one-line test for each log call

When you write a new slog line, ask:

If this line is the only thing on screen at 3 AM, does it answer who, what, where, and why?

If the answer is "I would have to grep three other lines," the line is broken. Fix it before the merge.

Find the noisiest log call in your service and walk through the seven patterns against that one line. You will probably fix three. Then run the same exercise on whatever feeds your error rate alert — that one is the one waking you up.

slog will not save you from a 3 AM page. It will determine whether the page is forty minutes of scrolling or four minutes of jq.

Question for the comments: what is the worst log line you have had to debug from at 3 AM, and what attribute would have made it answerable in one query?


For more production-Go discipline, the Complete Guide to Go Programming is the long version: how context, slog, errors, and the runtime fit together under load. Its companion, Hexagonal Architecture in Go, takes over once the language stops being the bottleneck and the service layout starts.

Thinking in Go — the 2-book series on Go programming and hexagonal architecture

Top comments (0)