Gabriel Anhaia

Streaming JSON in Go With encoding/json/v2: What's Different, What's Faster


You have a 4 GB JSON log file. The shape is the obvious one: a top-level array with a few million objects in it. The job is to read each object, look at one field, and either drop the object or stream it onward.

The first version of the code reaches for json.Unmarshal into a []Event. The process OOMs. You switch to json.NewDecoder(r).Decode(&v) in a loop and reach for dec.Token() to skip past the opening bracket.

It works. It is also slower than reading the file twice with bufio and writing your own scanner, which is the moment a lot of Go code I've read stops trusting the standard library for big JSON.

encoding/json/v2 is the part of the standard library that is trying to fix that. It is available in Go 1.25 today behind GOEXPERIMENT=jsonv2, with a sibling encoding/json/jsontext package. The design notes are explicit about what v1 got wrong. As of this writing the API is still experimental and outside the Go 1 compatibility promise, so treat what is in this post as preview, not production lock-in. The proposal lives at golang/go#71845 and the Go-team writeup is the one to bookmark: "A new experimental Go API for JSON".

What v2 changes that actually matters

The package splits into two. encoding/json/v2 does the value-to-Go reflection work. encoding/json/jsontext does the token-level read and write, and that is where streaming lives. jsontext.Encoder and jsontext.Decoder are the new types you call directly when you want token-by-token control.

Every entry point now takes an Options slice. The shape goes from Marshal(v) to Marshal(v, opts...), and the same for Unmarshal, plus three streaming pairs:

// Buffer-mode (what you already know).
func Marshal(in any, opts ...Options) ([]byte, error)
func Unmarshal(in []byte, out any, opts ...Options) error

// Reader/writer wrappers.
func MarshalWrite(out io.Writer, in any, opts ...Options) error
func UnmarshalRead(in io.Reader, out any, opts ...Options) error

// Direct streaming through jsontext.
func MarshalEncode(out *jsontext.Encoder, in any, opts ...Options) error
func UnmarshalDecode(in *jsontext.Decoder, out any, opts ...Options) error

The streaming pair is the one to learn. UnmarshalDecode reads exactly one JSON value out of the decoder and stops. Run it in a loop and you have a streaming consumer that does not buffer the whole document.

Defaults are stricter. v2 rejects duplicate keys by default, where v1 silently accepted them. Case-insensitive name matching is off by default, where v1 fell back to it without asking. Unknown fields can be made fatal with a single option instead of a manual second-pass check. v1 had no switch at all for the first two; v2 makes the strict behavior the default and gives you options to relax it.

Streaming a giant array, one element at a time

This is the canonical case. A top-level array, millions of items, you want to walk it without reading the whole thing into memory. The v2 shape is a jsontext.Decoder, a ReadToken call to consume the [, then UnmarshalDecode per element until you hit ].

package logscan

import (
    "encoding/json/jsontext"
    "encoding/json/v2"
    "io"
)

type Event struct {
    ID     string `json:"id"`
    Status string `json:"status"`
    Bytes  int64  `json:"bytes,omitzero"`
}

func StreamFailures(r io.Reader, out chan<- Event) error {
    dec := jsontext.NewDecoder(r)

    if _, err := dec.ReadToken(); err != nil {
        return err // expected '['
    }

    for {
        kind := dec.PeekKind()
        if kind == ']' {
            break
        }
        if kind == 0 {
            // Decoder hit an error (e.g. truncated input); the final
            // ReadToken below returns it instead of looping forever.
            break
        }
        var e Event
        if err := json.UnmarshalDecode(dec, &e); err != nil {
            return err
        }
        if e.Status == "failed" {
            out <- e
        }
    }

    _, err := dec.ReadToken() // consume ']'
    return err
}

A few things are worth stopping on. PeekKind looks at the next token without consuming it, which is the loop guard you reach for. UnmarshalDecode advances the decoder by exactly one JSON value, so the next iteration starts on the next element or on the closing bracket. The Go-team blog post describes this as the part that gives you fully streaming decoding. The v1 Decoder.Decode only ever streamed inside a top-level array if you wrote the bracket-handling yourself, and the v1 decoder's reflect path was famously expensive per-element.

PeekKind returns 0 — the kind for an unrecoverable decoder state — when the input is malformed or truncated. The loop has to treat that as a stop condition too: a guard that only checks for ']' will spin forever on truncated input instead of surfacing the error through the final ReadToken.

omitzero (added to v1 in Go 1.24, carried into v2) is the tag worth knowing. It checks the value's IsZero() method if present, otherwise the Go zero value. omitempty lies on time.Time (the zero value does not look "empty" to v1), and omitzero is the fix. You can also pass OmitZeroStructFields as an option to the marshal call if you want it applied globally without changing every tag.

Encoding a bounded-buffer HTTP response

Same idea, the other direction. You have a handler that needs to stream a long array out without holding the whole result set in memory. v1 lets you do this too, but you write the brackets, the commas, and the encoder dance yourself; v2 gives you jsontext.Encoder which keeps the nesting honest.

package handler

import (
    "context"
    "encoding/json/jsontext"
    "encoding/json/v2"
    "net/http"
)

type Row struct {
    ID    string `json:"id"`
    Email string `json:"email"`
}

func ListRows(rows func(context.Context) (<-chan Row, error)) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        ch, err := rows(r.Context())
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }

        w.Header().Set("Content-Type", "application/json")
        enc := jsontext.NewEncoder(w)

        if err := enc.WriteToken(jsontext.BeginArray); err != nil {
            return // writer failed before the body started; nothing useful to send
        }
        for row := range ch {
            if err := json.MarshalEncode(enc, row); err != nil {
                return // mid-stream failure: the array is already partially written
            }
        }
        _ = enc.WriteToken(jsontext.EndArray)
    }
}

(API names match the design as of Go 1.25.x — check pkg.go.dev/encoding/json/jsontext for the current shape before you ship.)

The jsontext.Encoder keeps the brace and bracket nesting honest. If you forget the closing ], the encoder's own state machine catches it before the client does. Back-pressure works: the handler stops reading when the client stops, the channel send blocks the producer, the producer stops pulling from the database.

The performance numbers, hedged

The Go team is careful with the framing, and so is this section. The blog post describes marshal performance in v2 as roughly at parity with v1 (sometimes faster, sometimes slower). Unmarshal is where the wins are: the same writeup reports improvements of up to 10x on representative benchmarks, and a Go discussion thread on switching from UnmarshalJSON to the streaming UnmarshalJSONFrom interface describes orders-of-magnitude improvements from killing the recursive parse-then-reparse pattern v1 forced on custom unmarshalers.

What you should take from those numbers: encode-heavy workloads might not move much. Decode-heavy workloads are where v2 is worth measuring on your own data: log ingestion, event replay, anything that pulls big JSON documents apart and uses a few fields. The benchmarks live at go-json-experiment/jsonbench and the Go 1.25 release notes link the official numbers. Run them on your shape, not on a synthetic struct, before you quote a multiplier in a Slack thread.

There is also at least one open performance regression worth tracking: golang/go#75026 reports a memory-allocation jump in some encode paths when jsonv2 is enabled. The proposal at golang/go#71845 lays out a path where v1's encoding/json is eventually reimplemented on top of v2 once the experiment stabilises. Until then, gate v2 behind a build tag and benchmark the path that matters to you.

What to actually do on Monday

Run your existing tests with GOEXPERIMENT=jsonv2 go test ./... and see what fails. Most code keeps working, because v2 keeps the v1 entry points. The failures you do see are usually a duplicate key your test fixture happened to contain or a case-insensitive field match you were depending on without knowing it. Both are bugs the v1 default was hiding.

Pick the hottest decode path you have and write a parallel handler against UnmarshalDecode plus jsontext.Decoder. Bench both. If the path is allocation-heavy or wraps a custom UnmarshalJSON, this is where v2 has the most room to win. If the path is a small struct unmarshaled once per request, the win is probably noise.

Keep v2 out of your public APIs and stable contracts for now. The package is experimental, the option set is still moving, and the eventual plan is for encoding/json itself to be reimplemented on top of v2. When that happens you get the wins for free. Reach for v2 today for the streaming shape and the stricter defaults. The multiplier is not the part to depend on.

If this was useful

The streaming JSON path is one of those places where Go's standard library finally caught up to what people had been writing by hand for more than a decade. Thinking in Go (the 2-book series linked below) covers the io.Reader/Writer composition, struct tags, the reflect cost model, and the architectural call on when streaming earns the extra ceremony.

Thinking in Go — the 2-book series on Go programming and hexagonal architecture
