James Lee

Posted on May 19

client-go Deep Dive: Indexer — Fast In-Memory Resource Storage with Custom Indexing

#architecture #go #kubernetes #performance

In the previous article we saw that HandleDeltas routes every event to two destinations, one of which is Indexer. Indexer is the local in-memory cache that makes resource queries fast and keeps read traffic off the API Server. In this article we'll look at how it's built and how to extend it with custom index functions.

Source: k8s.io/client-go/tools/cache/index.go

1. What Is Indexer?

Indexer is the local read cache of the Informer framework. Once the initial List completes and the cache is synced, reads made through a Lister are served from Indexer and never reach the API Server.

┌─────────────────────────────────────────────────────┐
│                   Indexer                           │
│                                                     │
│  ┌───────────────────────────────────────────────┐  │
│  │              ThreadSafeMap                    │  │
│  │  (concurrent-safe storage backend)            │  │
│  │                                               │  │
│  │  items: map[string]interface{}                │  │
│  │  key = "namespace/name"  value = resource obj │  │
│  └───────────────────────────────────────────────┘  │
│                                                     │
│  ┌───────────────────────────────────────────────┐  │
│  │           Index Layer (on top)                │  │
│  │  Indexers: map[indexName]IndexFunc            │  │
│  │  Indices:  map[indexName]Index                │  │
│  │  Index:    map[indexKey]sets.String           │  │
│  └───────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────┘

Indexer has two layers:

ThreadSafeMap — the raw concurrent-safe key/value store
Index layer — pluggable custom index functions for efficient filtered queries

2. ThreadSafeMap: The Storage Backend

Source: k8s.io/client-go/tools/cache/thread_safe_store.go

type threadSafeMap struct {
    lock     sync.RWMutex          // read-write lock for concurrent safety
    items    map[string]interface{} // primary storage: key → resource object
    indexers Indexers               // registered index functions
    indices  Indices                // computed index data
}

Key format

The default key function is MetaNamespaceKeyFunc, which produces keys in the format:

namespace/name    →  "default/nginx-pod"
cluster-scoped    →  "my-node"   (no namespace prefix)
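
To make the key shape concrete, here is a minimal stdlib-only sketch. The helpers objectKey and splitObjectKey are hypothetical names used for illustration and are not the client-go implementation; client-go itself provides cache.MetaNamespaceKeyFunc and cache.SplitMetaNamespaceKey for the same job.

```go
package main

import (
	"fmt"
	"strings"
)

// objectKey builds a key in the shape described above: "namespace/name"
// for namespaced objects and just "name" for cluster-scoped ones.
// Hypothetical helper, not the client-go code.
func objectKey(namespace, name string) string {
	if namespace == "" {
		return name
	}
	return namespace + "/" + name
}

// splitObjectKey reverses objectKey and rejects keys with more than one "/".
func splitObjectKey(key string) (namespace, name string, err error) {
	parts := strings.Split(key, "/")
	switch len(parts) {
	case 1:
		return "", parts[0], nil
	case 2:
		return parts[0], parts[1], nil
	}
	return "", "", fmt.Errorf("unexpected key format: %q", key)
}

func main() {
	fmt.Println(objectKey("default", "nginx-pod")) // default/nginx-pod
	fmt.Println(objectKey("", "my-node"))          // my-node

	ns, name, _ := splitObjectKey("default/nginx-pod")
	fmt.Println(ns, name) // default nginx-pod
}
```

A controller typically receives only the key from its work queue, so splitting it back into namespace and name is usually the first step of a reconcile call.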

CRUD operations

ThreadSafeMap exposes standard storage operations, all protected by sync.RWMutex:

Operation	Lock type	What it does
`Add(key, obj)`	Write	Insert object + update all index entries
`Update(key, obj)`	Write	Replace object + rebuild index entries
`Delete(key)`	Write	Remove object + clean up index entries
`Get(key)`	Read	Fetch single object by key
`List()`	Read	Return all objects
`ListKeys()`	Read	Return all keys
`ByIndex(indexName, indexKey)`	Read	Return all objects matching an index query

Write path (Add/Update/Delete):
┌─────────────────────────────────────────────────┐
│  lock.Lock()                                    │
│  ├── modify items map                           │
│  ├── updateIndices() or deleteFromIndices()     │
│  └── lock.Unlock()                              │
└─────────────────────────────────────────────────┘

Read path (Get/List/ByIndex):
┌─────────────────────────────────────────────────┐
│  lock.RLock()   ← multiple readers in parallel  │
│  ├── read from items or indices                 │
│  └── lock.RUnlock()                             │
└─────────────────────────────────────────────────┘
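
The two lock paths above can be sketched with nothing but the standard library. The store type below is a hypothetical, stripped-down stand-in for threadSafeMap: it keeps the RWMutex and the items map but leaves out index maintenance entirely.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// store is a simplified stand-in for threadSafeMap: one RWMutex guarding
// one map. Hypothetical type for illustration only.
type store struct {
	lock  sync.RWMutex
	items map[string]interface{}
}

func newStore() *store {
	return &store{items: map[string]interface{}{}}
}

// Add takes the write lock, so no reader or other writer runs concurrently.
func (s *store) Add(key string, obj interface{}) {
	s.lock.Lock()
	defer s.lock.Unlock()
	s.items[key] = obj
}

// Delete also takes the write lock.
func (s *store) Delete(key string) {
	s.lock.Lock()
	defer s.lock.Unlock()
	delete(s.items, key)
}

// Get takes the read lock, so many Gets can proceed in parallel.
func (s *store) Get(key string) (interface{}, bool) {
	s.lock.RLock()
	defer s.lock.RUnlock()
	obj, ok := s.items[key]
	return obj, ok
}

// ListKeys returns a copy of the keys, sorted only to keep output stable.
func (s *store) ListKeys() []string {
	s.lock.RLock()
	defer s.lock.RUnlock()
	keys := make([]string, 0, len(s.items))
	for k := range s.items {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	s := newStore()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) { // concurrent writers are serialized by the write lock
			defer wg.Done()
			s.Add(fmt.Sprintf("default/pod-%d", i), i)
		}(i)
	}
	wg.Wait()
	s.Delete("default/pod-0")
	fmt.Println(s.ListKeys())
}
```

An RWMutex suits this workload because a controller reads its cache far more often than the informer writes to it.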

3. The Four Index Data Structures

Indexer's power comes from its pluggable index system. Four types work together:

// A registry of named index functions
type Indexers map[string]IndexFunc

// An index function: given an object, return the index keys it belongs to
type IndexFunc func(obj interface{}) ([]string, error)

// A registry of computed index data (one per registered IndexFunc)
type Indices map[string]Index

// The actual index: maps an index key to the set of object keys that match
type Index map[string]sets.String

How they relate

Indexers["byNamespace"] = namespaceIndexFunc
Indexers["byLabel"]     = labelIndexFunc
      │
      │  when an object is added/updated, each IndexFunc is called
      ▼
Indices["byNamespace"] = Index{
    "default":     {"default/nginx", "default/redis", "default/frontend"},
    "kube-system": {"kube-system/coredns"},
}
Indices["byLabel"] = Index{
    "app=web":   {"default/nginx", "default/frontend"},
    "app=cache": {"default/redis"},
}
      │
      │  query: ByIndex("byLabel", "app=web")
      ▼
returns: [nginx object, frontend object]   ← O(1) index lookup, then O(k) object fetch
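
The relationship between the four types can be reproduced in a few dozen lines. The sketch below is a simplification under stated assumptions: a plain map stands in for sets.String, there is no locking, and the pod and indexedStore types are hypothetical. It only handles first-time adds; re-adding an object with changed labels would leave stale index entries.

```go
package main

import (
	"fmt"
	"sort"
)

// pod is a hypothetical stand-in for a Kubernetes object.
type pod struct {
	Namespace, Name string
	Labels          map[string]string
}

// The four types, with a plain map standing in for sets.String.
type (
	IndexFunc func(obj interface{}) ([]string, error)
	Indexers  map[string]IndexFunc
	Index     map[string]map[string]struct{}
	Indices   map[string]Index
)

type indexedStore struct {
	items    map[string]interface{}
	indexers Indexers
	indices  Indices
}

func newIndexedStore(indexers Indexers) *indexedStore {
	return &indexedStore{items: map[string]interface{}{}, indexers: indexers, indices: Indices{}}
}

// add stores obj under key, then records key under every index key that
// each registered IndexFunc returns for obj.
func (s *indexedStore) add(key string, obj interface{}) error {
	s.items[key] = obj
	for name, fn := range s.indexers {
		indexKeys, err := fn(obj)
		if err != nil {
			return err
		}
		if s.indices[name] == nil {
			s.indices[name] = Index{}
		}
		for _, ik := range indexKeys {
			if s.indices[name][ik] == nil {
				s.indices[name][ik] = map[string]struct{}{}
			}
			s.indices[name][ik][key] = struct{}{}
		}
	}
	return nil
}

// byIndex looks up the set of object keys, then fetches each object.
// Keys are sorted only to keep the output deterministic.
func (s *indexedStore) byIndex(indexName, indexKey string) []interface{} {
	keys := make([]string, 0)
	for k := range s.indices[indexName][indexKey] {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := make([]interface{}, 0, len(keys))
	for _, k := range keys {
		out = append(out, s.items[k])
	}
	return out
}

func main() {
	s := newIndexedStore(Indexers{
		"byNamespace": func(obj interface{}) ([]string, error) {
			return []string{obj.(pod).Namespace}, nil
		},
		"byLabel": func(obj interface{}) ([]string, error) {
			var out []string
			for k, v := range obj.(pod).Labels {
				out = append(out, k+"="+v)
			}
			return out, nil
		},
	})
	for _, p := range []pod{
		{"default", "nginx", map[string]string{"app": "web"}},
		{"default", "frontend", map[string]string{"app": "web"}},
		{"default", "redis", map[string]string{"app": "cache"}},
		{"kube-system", "coredns", nil},
	} {
		_ = s.add(p.Namespace+"/"+p.Name, p)
	}
	for _, obj := range s.byIndex("byLabel", "app=web") {
		fmt.Println(obj.(pod).Name) // frontend, then nginx
	}
}
```

Note that an IndexFunc may return several index keys for one object, which is how the byLabel function above files a pod under each of its labels.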

Visual summary

┌──────────────────────────────────────────────────────────────┐
│  Indexers (function registry)                                │
│  ┌────────────────┬──────────────────────────────────────┐   │
│  │ "byNamespace"  │ func(obj) → [namespace]              │   │
│  │ "byLabel/foo"  │ func(obj) → [label value of "foo"]   │   │
│  └────────────────┴──────────────────────────────────────┘   │
│                          │ applied on every Add/Update        │
│                          ▼                                    │
│  Indices (computed data)                                      │
│  ┌────────────────┬──────────────────────────────────────┐   │
│  │ "byNamespace"  │ {"ns": {"ns/pod1","ns/pod2"}}        │   │
│  │ "byLabel/foo"  │ {"bar":{"ns/pod1"},"biz":{"ns/pod2"}}│   │
│  └────────────────┴──────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘
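
Because index functions run on every Add and Update, an update has to remove the object's key from the index entries computed for the old object before adding it to the entries for the new one. The sketch below shows that bookkeeping for a single index. The reindex helper is a hypothetical name and a simplification, not the client-go code.

```go
package main

import "fmt"

// Index maps an index key to a set of object keys (plain map as the set).
type Index map[string]map[string]struct{}

// reindex moves objKey from the index keys computed for the old object to
// those computed for the new one. Pass nil oldKeys for an Add and nil
// newKeys for a Delete. Hypothetical helper for illustration.
func reindex(idx Index, objKey string, oldKeys, newKeys []string) {
	for _, k := range oldKeys {
		delete(idx[k], objKey)
		if len(idx[k]) == 0 {
			delete(idx, k) // drop empty sets so the index does not keep dead keys
		}
	}
	for _, k := range newKeys {
		if idx[k] == nil {
			idx[k] = map[string]struct{}{}
		}
		idx[k][objKey] = struct{}{}
	}
}

func main() {
	idx := Index{}
	reindex(idx, "ns/pod1", nil, []string{"bar"})             // Add with foo=bar
	reindex(idx, "ns/pod1", []string{"bar"}, []string{"biz"}) // Update to foo=biz
	fmt.Println(len(idx["bar"]), len(idx["biz"]))             // 0 1
	reindex(idx, "ns/pod1", []string{"biz"}, nil)             // Delete
	fmt.Println(len(idx))                                     // 0
}
```

Skipping the removal step would leave the object reachable under its old label value, so ByIndex would return results that no longer match.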

4. Custom Index Functions in Practice

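The real power of Indexer is that you can register any index function at initialization time. The following example is adapted from the client-go test suite:

// Lives in package cache (k8s.io/client-go/tools/cache), so NewIndexer,
// Indexers and MetaNamespaceKeyFunc need no package prefix. Imports used:
// "testing", v1 "k8s.io/api/core/v1",
// metav1 "k8s.io/apimachinery/pkg/apis/meta/v1".

// Custom index function: index Pods by the value of their "foo" label
func testIndexFunc(obj interface{}) ([]string, error) {
    pod := obj.(*v1.Pod) // panics on non-Pod input; acceptable in a test
    return []string{pod.Labels["foo"]}, nil
}

func TestGetIndexFuncValues(t *testing.T) {
    // Initialize Indexer with:
    //   key function:   MetaNamespaceKeyFunc  → "namespace/name"
    //   index function: testIndexFunc         → registered as "testmodes"
    index := NewIndexer(MetaNamespaceKeyFunc, Indexers{"testmodes": testIndexFunc})

    // Namespace is set explicitly so the keys come out as "default/one" etc.
    // Without it, MetaNamespaceKeyFunc would return just the name.
    pod1 := &v1.Pod{ObjectMeta: metav1.ObjectMeta{
        Namespace: "default", Name: "one", Labels: map[string]string{"foo": "bar"}}}
    pod2 := &v1.Pod{ObjectMeta: metav1.ObjectMeta{
        Namespace: "default", Name: "two", Labels: map[string]string{"foo": "bar"}}}
    pod3 := &v1.Pod{ObjectMeta: metav1.ObjectMeta{
        Namespace: "default", Name: "tre", Labels: map[string]string{"foo": "biz"}}}

    for _, pod := range []*v1.Pod{pod1, pod2, pod3} {
        if err := index.Add(pod); err != nil {
            t.Fatalf("Add(%s): %v", pod.Name, err)
        }
    }

    // List all distinct values produced by the "testmodes" index function
    keys := index.ListIndexFuncValues("testmodes")
    if len(keys) != 2 { // "bar" and "biz", in no guaranteed order
        t.Errorf("expected 2 index values, got %v", keys)
    }

    // Query all Pods where label "foo" == "bar"
    pods, err := index.ByIndex("testmodes", "bar")
    if err != nil {
        t.Fatalf("ByIndex: %v", err)
    }
    if len(pods) != 2 { // pod1 and pod2, in no guaranteed order
        t.Errorf("expected 2 pods, got %d", len(pods))
    }
}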

What happens internally when `index.Add(pod1)` is called

index.Add(pod1)
     │
     ├── items["default/one"] = pod1        ← store in primary map
     │
     └── for each registered IndexFunc:
           testIndexFunc(pod1) → ["bar"]
           Indices["testmodes"]["bar"].Insert("default/one")

After adding all 3 pods:

items = {
    "default/one": pod1,
    "default/two": pod2,
    "default/tre": pod3,
}

Indices["testmodes"] = {
    "bar": {"default/one", "default/two"},
    "biz": {"default/tre"},
}

Query flow: `ByIndex("testmodes", "bar")`

ByIndex("testmodes", "bar")
     │
     ├── look up Indices["testmodes"]["bar"]
     │   → set: {"default/one", "default/two"}
     │
     └── for each key in set:
           items["default/one"] → pod1
           items["default/two"] → pod2

Result: [pod1, pod2] (in no guaranteed order)   ← O(1) index lookup + O(k) object fetch

5. Built-in Index Functions

client-go ships with a standard index function for the most common query pattern:

// MetaNamespaceIndexFunc indexes objects by namespace.
// Generated informers register it under the name "namespace" (cache.NamespaceIndex).
func MetaNamespaceIndexFunc(obj interface{}) ([]string, error) {
    meta, err := meta.Accessor(obj)
    if err != nil {
        return []string{""}, fmt.Errorf("object has no meta: %v", err)
    }
    return []string{meta.GetNamespace()}, nil
}

Usage:

// Get all Pods in the "production" namespace — from local cache, no API call
pods, err := podsLister.Pods("production").List(labels.Everything())

// Internally this calls:
// indexer.ByIndex("namespace", "production")
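
The two steps behind a namespaced List call, an index lookup followed by a selector filter, can be sketched without client-go. Everything in this block is a hypothetical simplification: the namespace index is a plain map from namespace to pods, and the selector is an exact-match label map, where an empty selector plays the role of labels.Everything().

```go
package main

import "fmt"

// pod is a hypothetical stand-in for a Kubernetes object.
type pod struct {
	Namespace, Name string
	Labels          map[string]string
}

// matches reports whether labels contains every key/value pair in selector.
// An empty or nil selector matches everything.
func matches(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// listByNamespace narrows the candidates with the namespace index first,
// then scans only those candidates against the selector.
func listByNamespace(byNamespace map[string][]pod, namespace string, selector map[string]string) []pod {
	var out []pod
	for _, p := range byNamespace[namespace] {
		if matches(selector, p.Labels) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	byNamespace := map[string][]pod{
		"production": {
			{Namespace: "production", Name: "api", Labels: map[string]string{"app": "web"}},
			{Namespace: "production", Name: "db", Labels: map[string]string{"app": "db"}},
		},
		"staging": {
			{Namespace: "staging", Name: "api", Labels: map[string]string{"app": "web"}},
		},
	}
	for _, p := range listByNamespace(byNamespace, "production", map[string]string{"app": "web"}) {
		fmt.Println(p.Namespace + "/" + p.Name) // production/api
	}
}
```

The selector scan is linear in the number of objects in the namespace, so for a label that is queried often a dedicated index such as the "byLabel" example earlier avoids that scan.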

6. Indexer vs Direct API Query

	Indexer (local cache)	Direct API Server call
Speed	Microseconds (in-memory)	Milliseconds (network + etcd)
API Server load	Zero	One request per query
Consistency	Eventually consistent (synced via Watch)	Strongly consistent
Best for	Controller reconciliation loops	One-time admin queries

Rule of thumb: In any controller hot path — reconciliation loops, event handlers, health checks — always use the Lister (backed by Indexer). Only use Clientset for writes (Create/Update/Delete/Patch).

7. Summary

Indexer = ThreadSafeMap + pluggable Index functions

┌──────────────────────────────────────────────────────────┐
│  Write path (from HandleDeltas):                         │
│  Add/Update/Delete → update items + rebuild index data   │
│                                                          │
│  Read path (from your controller via Lister):            │
│  Get(key)              → O(1) direct lookup              │
│  List()                → O(n) full scan                  │
│  ByIndex(name, value)  → O(1) index lookup + O(k) fetch  │
└──────────────────────────────────────────────────────────┘

Concept	Detail
ThreadSafeMap	`sync.RWMutex` + `map[string]interface{}` — the raw storage layer
MetaNamespaceKeyFunc	Default key function: produces `"namespace/name"` keys
Indexers	Registry of named index functions — define what you can query by
IndexFunc	User-defined function: `obj → []indexKey`
Indices	Computed index data: `indexName → indexKey → set of object keys`
ByIndex	Efficient filtered query: O(1) index lookup + O(k) object retrieval

Indexer is what makes Kubernetes controllers fast. By keeping an indexed in-memory copy of the resources a controller watches, it removes the need to query the API Server on every read: what would otherwise be thousands of network calls per second become local memory reads that take microseconds.

Next in this series: WorkQueue: The Reliable Task Queue for Kubernetes Controllers (Part 5)

Follow the series for more deep dives into Kubernetes development.
