
James Lee


client-go Deep Dive: Indexer — Fast In-Memory Resource Storage with Custom Indexing

In the previous article we saw that HandleDeltas routes every event to two destinations, one of which is Indexer. Indexer is the local in-memory cache that makes resource queries fast and keeps read traffic off the API Server. In this article we'll look at how it's built and how to extend it with custom index functions.

Source: k8s.io/client-go/tools/cache/index.go


1. What Is Indexer?

Indexer is the local read cache of the Informer framework. Once the initial List completes and the cache has synced, reads made through a Lister are served from Indexer and never reach the API Server.

┌─────────────────────────────────────────────────────┐
│                   Indexer                           │
│                                                     │
│  ┌───────────────────────────────────────────────┐  │
│  │              ThreadSafeMap                    │  │
│  │  (concurrent-safe storage backend)            │  │
│  │                                               │  │
│  │  items: map[string]interface{}                │  │
│  │  key = "namespace/name"  value = resource obj │  │
│  └───────────────────────────────────────────────┘  │
│                                                     │
│  ┌───────────────────────────────────────────────┐  │
│  │           Index Layer (on top)                │  │
│  │  Indexers: map[indexName]IndexFunc            │  │
│  │  Indices:  map[indexName]Index                │  │
│  │  Index:    map[indexKey]sets.String           │  │
│  └───────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────┘

Indexer has two layers:

  • ThreadSafeMap — the raw concurrent-safe key/value store
  • Index layer — pluggable custom index functions for efficient filtered queries

2. ThreadSafeMap: The Storage Backend

Source: k8s.io/client-go/tools/cache/thread_safe_store.go

type threadSafeMap struct {
    lock     sync.RWMutex          // read-write lock for concurrent safety
    items    map[string]interface{} // primary storage: key → resource object
    indexers Indexers               // registered index functions
    indices  Indices                // computed index data
}

Key format

The default key function is MetaNamespaceKeyFunc, which produces keys in the format:

namespace/name    →  "default/nginx-pod"
cluster-scoped    →  "my-node"   (no namespace prefix)
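The key convention can be sketched with the standard library alone. This is a simplified illustration, not client-go's implementation: `objectKey` and `splitKey` are hypothetical helpers that take the namespace and name directly instead of reading them from object metadata.

```go
package main

import (
	"fmt"
	"strings"
)

// objectKey mirrors the "namespace/name" convention described above.
// Hypothetical stand-in for illustration; not client-go's MetaNamespaceKeyFunc.
func objectKey(namespace, name string) string {
	if namespace == "" {
		return name // cluster-scoped: no namespace prefix
	}
	return namespace + "/" + name
}

// splitKey reverses objectKey. Keys with more than one "/" are rejected.
func splitKey(key string) (namespace, name string, err error) {
	parts := strings.Split(key, "/")
	switch len(parts) {
	case 1:
		return "", parts[0], nil
	case 2:
		return parts[0], parts[1], nil
	}
	return "", "", fmt.Errorf("unexpected key format: %q", key)
}

func main() {
	fmt.Println(objectKey("default", "nginx-pod")) // default/nginx-pod
	fmt.Println(objectKey("", "my-node"))          // my-node

	ns, name, _ := splitKey("default/nginx-pod")
	fmt.Println(ns, name) // default nginx-pod
}
```

In real controllers the reverse direction is handled by cache.SplitMetaNamespaceKey, which is typically called when a key is popped off the work queue.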

CRUD operations

ThreadSafeMap exposes standard storage operations, all protected by sync.RWMutex:

| Operation | Lock type | What it does |
|---|---|---|
| Add(key, obj) | Write | Insert object + update all index entries |
| Update(key, obj) | Write | Replace object + rebuild index entries |
| Delete(key) | Write | Remove object + clean up index entries |
| Get(key) | Read | Fetch single object by key |
| List() | Read | Return all objects |
| ListKeys() | Read | Return all keys |
| ByIndex(indexName, indexKey) | Read | Return all objects matching an index query |

Write path (Add/Update/Delete):
┌─────────────────────────────────────────────────┐
│  lock.Lock()                                    │
│  ├── modify items map                           │
│  ├── updateIndices() or deleteFromIndices()     │
│  └── lock.Unlock()                              │
└─────────────────────────────────────────────────┘

Read path (Get/List/ByIndex):
┌─────────────────────────────────────────────────┐
│  lock.RLock()   ← multiple readers in parallel  │
│  ├── read from items or indices                 │
│  └── lock.RUnlock()                             │
└─────────────────────────────────────────────────┘
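The locking pattern in the two diagrams above can be sketched as follows. This is a minimal model under stated assumptions, not the client-go source: the `store` type and its methods are hypothetical, index maintenance is left out, and `ListKeys` sorts its result only to make the printed output deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// store is a simplified, hypothetical model of the RWMutex-protected map.
type store struct {
	mu    sync.RWMutex
	items map[string]interface{}
}

func newStore() *store { return &store{items: map[string]interface{}{}} }

// Add takes the write lock: no reader or other writer runs concurrently.
func (s *store) Add(key string, obj interface{}) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[key] = obj
}

// Get takes the read lock: many Get/ListKeys calls may run in parallel.
func (s *store) Get(key string) (interface{}, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	obj, ok := s.items[key]
	return obj, ok
}

// ListKeys returns a sorted copy of the keys (sorted for stable output only).
func (s *store) ListKeys() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	keys := make([]string, 0, len(s.items))
	for k := range s.items {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	s := newStore()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) { // concurrent writers are serialized by the write lock
			defer wg.Done()
			s.Add(fmt.Sprintf("default/pod-%d", i), i)
		}(i)
	}
	wg.Wait()
	fmt.Println(s.ListKeys())
}
```

The read lock is what lets many controller workers query the cache at once, while a write from the event-processing path briefly blocks them.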

3. The Four Index Data Structures

Indexer's power comes from its pluggable index system. Four types work together:

// A registry of named index functions
type Indexers map[string]IndexFunc

// An index function: given an object, return the index keys it belongs to
type IndexFunc func(obj interface{}) ([]string, error)

// A registry of computed index data (one per registered IndexFunc)
type Indices map[string]Index

// The actual index: maps an index key to the set of object keys that match
type Index map[string]sets.String

How they relate

Indexers["byNamespace"] = namespaceIndexFunc
Indexers["byLabel"]     = labelIndexFunc
      │
      │  when an object is added/updated, each IndexFunc is called
      ▼
Indices["byNamespace"] = Index{
    "default":     {"default/nginx", "default/redis"},
    "kube-system": {"kube-system/coredns"},
}
Indices["byLabel"] = Index{
    "app=web":   {"default/nginx", "default/frontend"},
    "app=cache": {"default/redis"},
}
      │
      │  query: ByIndex("byLabel", "app=web")
      ▼
returns: [nginx object, frontend object]   ← O(1) index lookup, then O(k) object fetches
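To make the relationship concrete, here is a small sketch of how index data might be kept in step with the primary map. It assumes a simplified `pod` struct and hypothetical names (`indexedStore`, `put`, `byNamespace`, `byLabel`); it models the idea of removing stale entries before inserting new ones and is not the client-go code. Locking is omitted for brevity.

```go
package main

import "fmt"

// pod is a hypothetical stand-in for a Kubernetes object.
type pod struct {
	Namespace, Name string
	Labels          map[string]string
}

type indexFunc func(p pod) []string       // object -> index keys
type index map[string]map[string]struct{} // index key -> set of object keys

type indexedStore struct {
	items    map[string]pod
	indexers map[string]indexFunc
	indices  map[string]index
}

func newIndexedStore(indexers map[string]indexFunc) *indexedStore {
	return &indexedStore{items: map[string]pod{}, indexers: indexers, indices: map[string]index{}}
}

func byNamespace(p pod) []string { return []string{p.Namespace} }
func byLabel(p pod) []string     { return []string{"app=" + p.Labels["app"]} }

// put stores p and refreshes every index: entries derived from the previous
// version of the object are removed before the new ones are inserted.
func (s *indexedStore) put(p pod) {
	key := p.Namespace + "/" + p.Name
	old, existed := s.items[key]
	s.items[key] = p
	for name, fn := range s.indexers {
		idx := s.indices[name]
		if idx == nil {
			idx = index{}
			s.indices[name] = idx
		}
		if existed {
			for _, v := range fn(old) {
				delete(idx[v], key)
				if len(idx[v]) == 0 {
					delete(idx, v) // drop empty sets
				}
			}
		}
		for _, v := range fn(p) {
			if idx[v] == nil {
				idx[v] = map[string]struct{}{}
			}
			idx[v][key] = struct{}{}
		}
	}
}

func main() {
	s := newIndexedStore(map[string]indexFunc{"byNamespace": byNamespace, "byLabel": byLabel})
	s.put(pod{Namespace: "default", Name: "nginx", Labels: map[string]string{"app": "web"}})
	s.put(pod{Namespace: "default", Name: "redis", Labels: map[string]string{"app": "cache"}})
	// Relabel redis: its old "app=cache" entry must disappear.
	s.put(pod{Namespace: "default", Name: "redis", Labels: map[string]string{"app": "web"}})

	fmt.Println(len(s.indices["byLabel"]["app=web"])) // 2
	_, stale := s.indices["byLabel"]["app=cache"]
	fmt.Println(stale) // false
}
```

Cleaning up the old entries on update is the important detail: without it, a relabeled object would keep showing up under its previous index key.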

Visual summary

┌──────────────────────────────────────────────────────────────┐
│  Indexers (function registry)                                │
│  ┌────────────────┬──────────────────────────────────────┐   │
│  │ "byNamespace"  │ func(obj) → [namespace]              │   │
│  │ "byLabel/foo"  │ func(obj) → [label value of "foo"]   │   │
│  └────────────────┴──────────────────────────────────────┘   │
│                          │ applied on every Add/Update       │
│                          ▼                                   │
│  Indices (computed data)                                     │
│  ┌────────────────┬──────────────────────────────────────┐   │
│  │ "byNamespace"  │ {"default": {"ns/pod1","ns/pod2"}}   │   │
│  │ "byLabel/foo"  │ {"bar": {"ns/pod1"},                 │   │
│  │                │  "biz": {"ns/pod3"}}                 │   │
│  └────────────────┴──────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘

4. Custom Index Functions in Practice

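The real power of Indexer is that you can register any index function at initialization time. The example below is adapted from TestGetIndexFuncValues in the client-go test suite (package cache), with an explicit namespace on each Pod and assertions on the results:

// Imports assumed: "testing", v1 "k8s.io/api/core/v1",
// metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Custom index function: index Pods by the value of their "foo" label
func testIndexFunc(obj interface{}) ([]string, error) {
    pod := obj.(*v1.Pod)
    return []string{pod.Labels["foo"]}, nil
}

func TestGetIndexFuncValues(t *testing.T) {
    // Initialize Indexer with:
    //   key function:   MetaNamespaceKeyFunc  → "namespace/name"
    //   index function: testIndexFunc         → registered as "testmodes"
    index := NewIndexer(MetaNamespaceKeyFunc, Indexers{"testmodes": testIndexFunc})

    // Namespace is set explicitly so that the keys are "default/one", etc.
    // Without it, MetaNamespaceKeyFunc returns the bare name ("one").
    pod1 := &v1.Pod{ObjectMeta: metav1.ObjectMeta{
        Namespace: "default", Name: "one", Labels: map[string]string{"foo": "bar"}}}
    pod2 := &v1.Pod{ObjectMeta: metav1.ObjectMeta{
        Namespace: "default", Name: "two", Labels: map[string]string{"foo": "bar"}}}
    pod3 := &v1.Pod{ObjectMeta: metav1.ObjectMeta{
        Namespace: "default", Name: "tre", Labels: map[string]string{"foo": "biz"}}}

    for _, pod := range []*v1.Pod{pod1, pod2, pod3} {
        if err := index.Add(pod); err != nil {
            t.Fatalf("Add(%s) failed: %v", pod.Name, err)
        }
    }

    // List all distinct values produced by the "testmodes" index function
    keys := index.ListIndexFuncValues("testmodes")
    if len(keys) != 2 { // "bar" and "biz", in no guaranteed order
        t.Errorf("expected 2 index values, got %v", keys)
    }

    // Query all Pods where label "foo" == "bar"
    pods, err := index.ByIndex("testmodes", "bar")
    if err != nil {
        t.Fatalf("ByIndex failed: %v", err)
    }
    if len(pods) != 2 { // pod1 and pod2, in no guaranteed order
        t.Errorf("expected 2 pods, got %d", len(pods))
    }
}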

What happens internally when index.Add(pod1) is called

index.Add(pod1)
     │
     ├── items["default/one"] = pod1        ← store in primary map
     │
     └── for each registered IndexFunc:
           testIndexFunc(pod1) → ["bar"]
           Indices["testmodes"]["bar"].Insert("default/one")

After adding all 3 pods:

items = {
    "default/one": pod1,
    "default/two": pod2,
    "default/tre": pod3,
}

Indices["testmodes"] = {
    "bar": {"default/one", "default/two"},
    "biz": {"default/tre"},
}

Query flow: ByIndex("testmodes", "bar")

ByIndex("testmodes", "bar")
     │
     ├── look up Indices["testmodes"]["bar"]
     │     → set: {"default/one", "default/two"}
     │
     └── for each key in set:
           items["default/one"] → pod1
           items["default/two"] → pod2

Result: [pod1, pod2]   ← O(1) index lookup + O(k) object fetch
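The two-step query can be sketched in plain Go. The example also shows that an index function may return several keys for one object, so a single Pod can be found under more than one index value. Everything here is an assumption made for illustration: `pod`, `labelPairs`, `buildIndex`, and `byIndex` are hypothetical names, and results are sorted only so the output is stable (a set has no inherent order).

```go
package main

import (
	"fmt"
	"sort"
)

// pod is a hypothetical stand-in for a Kubernetes object.
type pod struct {
	Name   string
	Labels map[string]string
}

// labelPairs is a hypothetical index function: one index key per label.
func labelPairs(p pod) []string {
	keys := make([]string, 0, len(p.Labels))
	for k, v := range p.Labels {
		keys = append(keys, k+"="+v)
	}
	return keys
}

// buildIndex computes index key -> set of object keys for all items.
func buildIndex(items map[string]pod, fn func(pod) []string) map[string]map[string]struct{} {
	idx := map[string]map[string]struct{}{}
	for key, p := range items {
		for _, v := range fn(p) {
			if idx[v] == nil {
				idx[v] = map[string]struct{}{}
			}
			idx[v][key] = struct{}{}
		}
	}
	return idx
}

// byIndex does one map lookup for the set, then one map lookup per member.
func byIndex(items map[string]pod, idx map[string]map[string]struct{}, value string) []pod {
	set := idx[value]
	out := make([]pod, 0, len(set))
	for key := range set {
		out = append(out, items[key])
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	items := map[string]pod{
		"default/one": {Name: "one", Labels: map[string]string{"foo": "bar", "tier": "web"}},
		"default/two": {Name: "two", Labels: map[string]string{"foo": "bar"}},
		"default/tre": {Name: "tre", Labels: map[string]string{"foo": "biz", "tier": "web"}},
	}
	idx := buildIndex(items, labelPairs)

	for _, p := range byIndex(items, idx, "foo=bar") {
		fmt.Println("foo=bar:", p.Name) // one, then two
	}
	for _, p := range byIndex(items, idx, "tier=web") {
		fmt.Println("tier=web:", p.Name) // one, then tre
	}
}
```

The cost of a query depends on the size of the matching set, not on the total number of cached objects, which is why an index beats filtering the output of List().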

5. Built-in Index Functions

client-go ships with a standard index function for the most common query pattern:

// MetaNamespaceIndexFunc — index objects by namespace
// This is the default index registered in most Informers
func MetaNamespaceIndexFunc(obj interface{}) ([]string, error) {
    meta, err := meta.Accessor(obj)
    if err != nil {
        return []string{""}, fmt.Errorf("object has no meta: %v", err)
    }
    return []string{meta.GetNamespace()}, nil
}

Usage:

// Get all Pods in the "production" namespace — from local cache, no API call
pods, err := podsLister.Pods("production").List(labels.Everything())

// Internally this resolves to a lookup on the "namespace" index,
// roughly equivalent to:
// indexer.ByIndex("namespace", "production")

6. Indexer vs Direct API Query

| | Indexer (local cache) | Direct API Server call |
|---|---|---|
| Speed | Microseconds (in-memory) | Milliseconds (network + etcd) |
| API Server load | Zero | One request per query |
| Consistency | Eventually consistent (synced via Watch) | Strongly consistent |
| Best for | Controller reconciliation loops | One-time admin queries |

Rule of thumb: In any controller hot path — reconciliation loops, event handlers, health checks — always use the Lister (backed by Indexer). Only use Clientset for writes (Create/Update/Delete/Patch).


7. Summary

Indexer = ThreadSafeMap + pluggable Index functions

┌──────────────────────────────────────────────────────────┐
│  Write path (from HandleDeltas):                         │
│  Add/Update/Delete → update items + rebuild index data   │
│                                                          │
│  Read path (from your controller via Lister):            │
│  Get(key)              → O(1) direct lookup              │
│  List()                → O(n) full scan                  │
│  ByIndex(name, value)  → O(1) index lookup + O(k) fetch  │
└──────────────────────────────────────────────────────────┘

| Concept | Detail |
|---|---|
| ThreadSafeMap | sync.RWMutex + map[string]interface{}: the raw storage layer |
| MetaNamespaceKeyFunc | Default key function: produces "namespace/name" keys |
| Indexers | Registry of named index functions that define what you can query by |
| IndexFunc | User-defined function: obj → []indexKey |
| Indices | Computed index data: indexName → indexKey → set of object keys |
| ByIndex | Efficient filtered query: O(1) index lookup + O(k) object retrieval |

Indexer is what makes Kubernetes controllers fast. By keeping an indexed in-memory snapshot of cluster state, kept current through Watch events, it removes the need for controllers to query the API Server on every read, turning what would be thousands of network calls per second into microsecond-scale local memory reads.


Next in this series: WorkQueue: The Reliable Task Queue for Kubernetes Controllers (Part 5)


Follow the series for more deep dives into Kubernetes development.
