Cristian Sifuentes

C# Lists & Dictionaries Mental Model — From `names.Add("Ana")` to LLM‑Ready Collections


Most .NET developers use List<T> and Dictionary<TKey,TValue> every day.

But when you actually sit down to reason about performance, memory layout, or even how to explain your intent to an LLM, the questions start:

  • What really happens when I call Add on a List<T>?
  • Why is Dictionary<TKey,TValue> sometimes slower than a simple List<T>?
  • How do buckets, entries, and hash codes map to CPU caches and branches?
  • When should I use List<T> vs Dictionary<TKey,TValue> in real systems?
  • How can I describe these tradeoffs to an LLM so it generates better code?

In this post we’ll build a systems‑level mental model of List<T> and Dictionary<TKey,TValue>, and then turn that knowledge into LLM‑ready prompts and patterns you can drop into your own codebase and GitHub repos.

If you can write this:

var names = new List<string> { "Ana", "Carlos", "Juan" };
names.Add("Lucia");

var students = new Dictionary<int, string>
{
    [1] = "Ana",
    [2] = "Felipe",
    [3] = "Elena"
};

You can follow this.


Table of Contents

  1. Mental Model: What You Actually Need from Collections
  2. List<T> Internals: Contiguous Memory, Capacity & Growth
  3. Iteration & Bounds Checks: Why Loops Matter
  4. Dictionary<TKey,TValue> Internals: Buckets, Entries & Hash Codes
  5. Advanced Dictionary Patterns: TryAdd, TryGetValue & Counters
  6. List vs Dictionary: Micro‑Benchmarks and Real Tradeoffs
  7. LLM‑Ready Prompts: How to Ask for Collection‑Heavy Code
  8. Full Demo File: CollectionsListDictionaryDeepDive.cs

1. Mental Model: What You Actually Need from Collections

Forget the API surface for a moment.

For day‑to‑day backend and systems work, you really need just three mental models:

  1. List<T> as “Resizable Array”

    • Backed by a T[] on the managed heap.
    • Keeps a Count and a Capacity.
    • Extremely cache‑friendly: data is contiguous in memory.
  2. Dictionary<TKey,TValue> as “Hash Table”

    • Internally holds buckets and entries.
    • Uses GetHashCode() + Equals() to find keys.
    • Great for random key lookups, less great for tight, sequential scans.
  3. JIT & CPU Reality

    • List<T> iteration is usually branch‑predictable and cache‑friendly.
    • Dictionary<TKey,TValue> is pointer chasing + branches + hash cost.
    • Big‑O is not the whole story — constant factors and caches matter.

Conceptually:

var names = new List<string> { "Ana", "Carlos", "Juan" };
names.Add("Lucia");

var students = new Dictionary<int, string>
{
    [1] = "Ana",
    [2] = "Felipe",
    [3] = "Elena"
};

Roslyn (the C# compiler) lowers this to IL types like:

  • System.Collections.Generic.List`1[System.String]
  • System.Collections.Generic.Dictionary`2[System.Int32,System.String]

The CLR then manages arrays, hash tables, and memory under the hood.
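
You can observe this lowering at runtime. A minimal sketch (plain console app):

```csharp
using System;
using System.Collections.Generic;

var names = new List<string> { "Ana", "Carlos", "Juan" };
var students = new Dictionary<int, string> { [1] = "Ana" };

// GetType() exposes the CLR's generic-arity notation (`1, `2):
Console.WriteLine(names.GetType());
// System.Collections.Generic.List`1[System.String]
Console.WriteLine(students.GetType());
// System.Collections.Generic.Dictionary`2[System.Int32,System.String]
```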


2. List<T> Internals: Contiguous Memory, Capacity & Growth

2.1. The Shape of List<T>

Conceptually, List<T> looks like this:

// Rough mental model (not official source code)
class List<T>
{
    private T[] _items;  // backing array on the heap
    private int _size;   // Count
    private int _version; // changes on Add/Remove for enumeration safety
}

The List object itself is tiny.

The T[] backing array is where the real payload lives.

Your basic sample:

List<string> names = new List<string> { "Ana", "Carlos", "Juan" };
names.Add("Lucia");
Console.WriteLine($"Total names: {names.Count}");

2.2. Capacity vs Count

  • Count = logical number of elements.
  • Capacity = size of the backing array (_items.Length).

When you call Add:

  • If Count < Capacity: store the element in _items[Count] and increment _size (O(1), no allocation).
  • If Count == Capacity:
    • Allocate a new array (usually 2 * oldCapacity, or 4 if capacity was 0).
    • Copy existing elements.
    • Point _items to the new array.
    • Then write the new element.

From the demo:

var list = new List<int>(); // default capacity 0
Console.WriteLine($"Initial: Count={list.Count}, Capacity={list.Capacity}");

for (int i = 0; i < 10; i++)
{
    list.Add(i);
    Console.WriteLine($"After Add({i}): Count={list.Count}, Capacity={list.Capacity}");
}

2.3. High‑Performance Pattern: Pre‑Allocate Capacity

If you know you’re going to add ~1,000,000 items, don’t grow from 0:

var bigList = new List<int>(capacity: 1_000_000);

This:

  • Avoids a cascade of re‑allocations + copies.
  • Reduces pressure on the GC.
  • Keeps your profile clean in BenchmarkDotNet & dotnet‑trace.

LLM‑Hint: when asking an LLM to generate high‑performance code, explicitly say:

“Pre‑allocate List<T> with an appropriate capacity to avoid repeated resizing.”


3. Iteration & Bounds Checks: Why Loops Matter

A foreach loop over List<T>:

foreach (int x in list)
{
    Console.WriteLine(x);
}

becoming something like:

for (int i = 0; i < list.Count; i++)
{
    Console.WriteLine(list[i]);
}

Plus version checks to detect modification during enumeration.
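
That version check is not just theory: mutating the list inside a foreach makes the enumerator throw. A minimal sketch:

```csharp
using System;
using System.Collections.Generic;

var list = new List<int> { 1, 2, 3 };

try
{
    foreach (int x in list)
    {
        list.Add(x); // Add() bumps _version; the enumerator detects the mismatch
    }
}
catch (InvalidOperationException)
{
    Console.WriteLine("Collection was modified during enumeration.");
}
```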

From the demo:

Console.WriteLine("foreach version:");
foreach (int x in list)
{
    Console.Write($"{x} ");
}

Console.WriteLine("for version:");
for (int i = 0; i < list.Count; i++)
{
    Console.Write($"{list[i]} ");
}

3.1. Bounds Check Elimination

Normally, list[i] or array[i] emits:

if ((uint)i >= (uint)length) throw;

But in simple, tight loops, the JIT can prove that i < length and remove that check, generating very tight machine code.

High‑performance style:

int count = list.Count;
int sum = 0;
for (int i = 0; i < count; i++)
{
    sum += list[i];
}
Console.WriteLine($"Sum (tight loop) = {sum}");
  • Cache Count in a local (avoids repeated property calls).
  • Allow the JIT to do range‑check elimination.

LLM‑Hint:

“Generate a tight loop over List<T>; cache .Count in a local and rely on the JIT to eliminate bounds checks.”


4. Dictionary<TKey,TValue> Internals: Buckets, Entries & Hash Codes

A Dictionary<TKey,TValue> is a hash table with:

  • int[] buckets
  • Entry[] entries where each entry is roughly:
struct Entry
{
    public int hashCode;
    public int next;
    public TKey key;
    public TValue value;
}

Lookups are basically:

value = map[key];

→ under the hood:

  1. hashCode = key.GetHashCode() & 0x7FFFFFFF;
  2. bucketIndex = hashCode % buckets.Length;
  3. index = buckets[bucketIndex] - 1;
  4. Walk the entries[index].next chain until you find entries[i].key.Equals(key).
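
The first two steps are easy to sketch by hand. Here bucketsLength is a hypothetical bucket-array size; modern Dictionary implementations replace the % with a faster multiply-based reduction, but the idea is the same:

```csharp
using System;

int key = 42;
int bucketsLength = 7; // hypothetical bucket-array size

// Step 1: strip the sign bit so the hash is non-negative.
// (For int, GetHashCode() returns the value itself.)
int hashCode = key.GetHashCode() & 0x7FFFFFFF;

// Step 2: map the hash onto a bucket slot.
int bucketIndex = hashCode % bucketsLength;

Console.WriteLine($"hash={hashCode}, bucket={bucketIndex}"); // hash=42, bucket=0
```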

From the demo:

var map = new Dictionary<int, string>
{
    [1] = "Ana",
    [42] = "Carlos",
    [1001] = "Elena"
};

Console.WriteLine($"Lookup 42 → {map[42]}");
Console.WriteLine($"ContainsKey(1001) = {map.ContainsKey(1001)}");

4.1. Custom Keys & Comparers

For custom keys (like a struct), you must define hashing & equality:

readonly struct Point2D
{
    public readonly int X;
    public readonly int Y;
    public Point2D(int x, int y) => (X, Y) = (x, y);
}

sealed class Point2DComparer : IEqualityComparer<Point2D>
{
    public int GetHashCode(Point2D p) => HashCode.Combine(p.X, p.Y);
    public bool Equals(Point2D p1, Point2D p2) => p1.X == p2.X && p1.Y == p2.Y;
}

var pointDict = new Dictionary<Point2D, string>(new Point2DComparer())
{
    [new Point2D(1, 2)] = "P1",
    [new Point2D(10, 20)] = "P2"
};

Console.WriteLine($"Point (1,2) → {pointDict[new Point2D(1, 2)]}");

4.2. CPU Reality

  • Dictionary lookups are O(1) on average, but involve:
    • Hash computation.
    • Indirections into buckets + entries arrays.
    • Branches for collision chains.
  • Collisions + poor hash distribution = more pointer chasing + cache misses.

This is why, in very hot paths, a sorted array + binary search or a dense List<T> can beat a dictionary.
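
As a sketch of that alternative, a sorted array plus Array.BinarySearch gives O(log N) lookups with no hashing and no pointer chasing:

```csharp
using System;

// Keys must be kept sorted for binary search to work.
int[] sortedKeys = { 1, 42, 1001 };
string[] values = { "Ana", "Carlos", "Elena" }; // parallel array

int idx = Array.BinarySearch(sortedKeys, 42);
Console.WriteLine(idx >= 0 ? $"42 → {values[idx]}" : "not found"); // 42 → Carlos
```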


5. Advanced Dictionary Patterns: TryAdd, TryGetValue & Counters

5.1. Case‑Insensitive Cache

var cache = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

cache.TryAdd("Ana", 1);
cache.TryAdd("ANA", 99); // ignored: comparer is case-insensitive

Console.WriteLine($"Cache['ana'] = {cache["ana"]}");

5.2. Word Counting Pattern

var wordCounts = new Dictionary<string, int>(StringComparer.Ordinal);

void AddWord(string word)
{
    if (wordCounts.TryGetValue(word, out var count))
    {
        wordCounts[word] = count + 1;
    }
    else
    {
        wordCounts[word] = 1;
    }
}

AddWord("csharp");
AddWord("csharp");
AddWord("dotnet");

foreach (var kv in wordCounts)
{
    Console.WriteLine($"{kv.Key} → {kv.Value}");
}

Why TryGetValue?

To avoid doing hashing twice:

// Worse: double hashing
if (dict.ContainsKey(k)) v = dict[k];

// Better: single hashing
if (dict.TryGetValue(k, out var v)) { ... }
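
On .NET 6+ you can go one step further with CollectionsMarshal.GetValueRefOrAddDefault, which finds or creates the entry with a single hash lookup and hands you a writable ref to the value slot. A sketch, not required for correctness:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

var wordCounts = new Dictionary<string, int>(StringComparer.Ordinal);

void AddWord(string word)
{
    // One hash computation: the slot is created with default(int) if missing.
    ref int count = ref CollectionsMarshal.GetValueRefOrAddDefault(wordCounts, word, out _);
    count++;
}

AddWord("csharp");
AddWord("csharp");
Console.WriteLine(wordCounts["csharp"]); // 2
```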

5.3. Read‑Heavy Workloads

For read‑heavy scenarios with mostly static keys, consider:

  • ImmutableDictionary<TKey,TValue>
  • specialized “frozen” dictionaries (precomputed layout)

These can provide better cache behavior and thread‑safety.
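
A minimal sketch of the frozen variant (assumes .NET 8+, where System.Collections.Frozen ships in the base library):

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

var source = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
{
    ["ana"] = 1,
    ["carlos"] = 2
};

// ToFrozenDictionary analyzes the keys once and builds an optimized,
// immutable layout tuned for reads.
FrozenDictionary<string, int> frozen =
    source.ToFrozenDictionary(StringComparer.OrdinalIgnoreCase);

Console.WriteLine(frozen["ANA"]); // 1
```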


6. List vs Dictionary: Micro‑Benchmarks and Real Tradeoffs

We can sketch a micro‑benchmark shape (for illustration only — use BenchmarkDotNet in real life):

const int N = 200_000;
const int targetKey = N - 1;

var list = new List<int>(N);
var dict = new Dictionary<int, int>(N);

for (int i = 0; i < N; i++)
{
    list.Add(i);
    dict[i] = i;
}

// Warm-up JIT
for (int i = 0; i < 5_000; i++)
{
    _ = list[list.Count - 1];
    _ = dict[targetKey];
}

static long MeasureTicks(string label, Action action)
{
    var sw = Stopwatch.StartNew();
    action();
    sw.Stop();
    Console.WriteLine($"{label}: {sw.ElapsedTicks} ticks");
    return sw.ElapsedTicks;
}

// Linear search in list (O(N))
MeasureTicks("List linear search (Contains)", () =>
{
    bool found = list.Contains(targetKey);
    if (!found) throw new Exception("Should find target.");
});

// Direct index in list (O(1))
MeasureTicks("List direct index", () =>
{
    int value = list[targetKey];
    if (value != targetKey) throw new Exception("Wrong value.");
});

// Dictionary key lookup (O(1) average)
MeasureTicks("Dictionary key lookup", () =>
{
    if (!dict.TryGetValue(targetKey, out var v) || v != targetKey)
        throw new Exception("Wrong value.");
});

6.1. Scientist‑Level Takeaways

  • Big‑O is not the whole story:

    • O(1) dictionary lookup still has constant overhead (hashing, branches).
    • O(N) linear scan on a small list can be faster than a dictionary lookup.
  • Design rule:

    • Use Dictionary<K,V> when you truly need random key access over a large set.
    • Use List<T> / arrays when you mostly iterate sequentially or care about raw throughput on dense data.
  • LLM‑Ready Insight:

When prompting an LLM, say things like:

“Use a List<T> for tight, sequential scans over dense data (cache‑friendly).
Use Dictionary<TKey,TValue> for sparse key lookups where O(1) access by key matters.”

This nudges the model toward intentional data‑structure choices, not default “Dictionary everywhere”.


7. LLM‑Ready Prompts: How to Ask for Collection‑Heavy Code

Now that you understand the internals, you can write much more precise prompts.

7.1. Example Prompt: Hot Path Over Dense Data

“Generate a C# method that processes 1M sensor readings using a List<SensorSample>.

Pre‑allocate the list with an appropriate capacity. Use a tight for loop that caches .Count in a local variable, so the JIT can eliminate bounds checks and the loop is cache‑friendly.”

7.2. Example Prompt: ID‑Based Lookup

“I need a lookup from int userId to a small POCO. Use Dictionary<int, UserInfo> with TryGetValue for reads, and don’t double‑hash keys with ContainsKey. Assume keys are moderately sparse.”

7.3. Example Prompt: Custom Struct Key with Good Hashing

“Model 2D grid positions as a readonly struct Point2D and use it as a key in Dictionary<Point2D, Tile>. Implement IEqualityComparer<Point2D> using HashCode.Combine(X, Y) and equality by value. Add a small demo that shows lookups and explains why a good hash is important for performance.”

7.4. Example Prompt: Compare List vs Dictionary

“Show me a small micro‑benchmark that compares linear search in List<int> vs key lookup in Dictionary<int,int> for 200k elements, using Stopwatch for rough timing. Warm up the JIT first.”

Each of these prompts encodes systems‑level detail and intent.

LLMs respond much better when you talk like this instead of just saying “use a dictionary”.


8. Full Demo File: CollectionsListDictionaryDeepDive.cs

Here’s the complete demo file you can drop into a console project and publish to GitHub.

// File: CollectionsListDictionaryDeepDive.cs
// Author: Cristian Sifuentes + ChatGPT
//
// GOAL
// ----
// Explain C# List<T> and Dictionary<TKey,TValue> like a systems / compiler /
// performance engineer, not just as "things that hold objects".
//
// MENTAL MODEL
// ------------
// When you write:
//
//   var names = new List<string> { "Ana", "Carlos", "Juan" };
//   names.Add("Lucia");
//   var students = new Dictionary<int, string>
//   {
//       [1] = "Ana",
//       [2] = "Felipe",
//       [3] = "Elena"
//   };
//
// the following layers are involved:
//
//   1. Roslyn (C# compiler) lowers the syntax into IL, using concrete types
//      like System.Collections.Generic.List`1[System.String] and
//      System.Collections.Generic.Dictionary`2[System.Int32,System.String].
//
//   2. List<T> is essentially:
//        - A reference to a contiguous T[] array on the managed heap.
//        - An integer Count.
//        - A version integer for enumeration safety.
//      Adding elements grows the backing array with Array.Resize and
//      Buffer.Memmove-style copies.
//
//   3. Dictionary<TKey,TValue> is a *hash table* implemented with:
//        - An int[] buckets array (indices into Entries).
//        - An Entry[] entries array (structs storing hashCode, key, value, next).
//        - A load factor threshold that triggers Resize() when exceeded.
//      Lookups use GetHashCode + equality comparison + collision chain walking.
//
//   4. JIT + CPU:
//        - List<T> is cache-friendly (contiguous memory).
//        - Dictionary<TKey,TValue> is cache-unfriendly (pointer chasing).
//        - foreach loops are lowered to index-based for loops (List<T>)
//          or an enumerator struct/class (Dictionary<K,V>).
//
// This file aims to give you a **top 1% engineer** mental model for List/Dictionary:
// syntax → IL → runtime data structures → CPU caches and branches.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;

partial class Program
{
    // ---------------------------------------------------------------------
    // PUBLIC ENTRY POINT FOR THIS MODULE
    // ---------------------------------------------------------------------
    // Call CollectionsListDictionaryDeepDive() from your main Program.
    static void CollectionsListDictionaryDeepDive()
    {
        Console.WriteLine("=== List<T> & Dictionary<TKey,TValue> Deep Dive ===");

        BasicListAndDictionarySample();     // Your original idea, upgraded
        ListInternalsAndCapacity();         // How List<T> actually stores data
        ListIterationAndBoundsChecks();     // foreach → for, bounds-check elimination
        DictionaryInternalsAndHashing();    // Buckets, entries, collisions
        DictionaryAdvancedPatterns();       // TryGetValue, custom comparers, etc.
        ListVsDictionaryMicroBenchmark();   // Rough performance intuition
    }

    // ---------------------------------------------------------------------
    // 0. BASIC SAMPLE – starting from your original example, commented
    // ---------------------------------------------------------------------
    static void BasicListAndDictionarySample()
    {
        Console.WriteLine();
        Console.WriteLine("=== 0. Basic List & Dictionary Sample ===");

        // LIST<T>
        // -------
        // Backed internally by a T[] array in managed heap memory.
        // List<string> has fields (conceptually):
        //    T[] _items;
        //    int _size;      // Count
        //    int _version;   // increments on mutation (Add/Remove)
        //
        // The collection itself (the List object header + fields) is small;
        // the heavy part is the separate T[] backing array.
        List<string> names = new List<string> { "Ana", "Carlos", "Juan" };

        // Add() may or may not allocate:
        //   - If Count < Capacity: append at O(1) time, no allocation.
        //   - If Count == Capacity: allocate new array (typically 2x size),
        //     copy existing items, then append.
        names.Add("Lucia");

        Console.WriteLine($"Total names: {names.Count}");
        foreach (var name in names)
        {
            Console.WriteLine(name);
        }

        // Remove() performs linear search (O(n)) to find the item, then shifts
        // the tail of the array one step left (memmove). On large lists, prefer
        // data structures where removal is cheaper or track indices yourself.
        names.Remove("Ana");

        bool isPresent = names.Contains("Ana"); // linear O(n) search
        Console.WriteLine($"Is Ana in the list? {isPresent}");

        // DICTIONARY<TKey,TValue>
        // -----------------------
        // Implemented as a hash table with separate arrays for buckets + entries.
        Dictionary<int, string> students = new Dictionary<int, string>
        {
            { 1, "Ana" },
            { 2, "Felipe" },
            { 3, "Elena" }
        };

        Console.WriteLine($"The student with ID 1 is: {students[1]}");

        foreach (var student in students)
        {
            Console.WriteLine($"ID: {student.Key}, Name: {student.Value}");
        }

        // Indexer [] throws if the key is not present.
        // For "maybe present" keys, use TryGetValue to avoid exceptions:
        if (students.TryGetValue(99, out var unknown))
        {
            Console.WriteLine($"ID 99: {unknown}");
        }
        else
        {
            Console.WriteLine("ID 99 not found (TryGetValue avoids the exception).");
        }
    }

    // ---------------------------------------------------------------------
    // 1. LIST<T> INTERNALS – capacity, growth, and memory layout
    // ---------------------------------------------------------------------
    static void ListInternalsAndCapacity()
    {
        Console.WriteLine();
        Console.WriteLine("=== 1. List<T> Internals & Capacity ===");

        var list = new List<int>(); // default capacity is usually 0

        // Capacity is the length of the backing array; Count is how many
        // elements are logically in the list.
        Console.WriteLine($"Initial: Count={list.Count}, Capacity={list.Capacity}");

        for (int i = 0; i < 10; i++)
        {
            list.Add(i);
            Console.WriteLine($"After Add({i}): Count={list.Count}, Capacity={list.Capacity}");
        }

        // PERFORMANCE HACK:
        // If you know roughly how many elements you will add,
        // set Capacity or use the constructor with capacity.
        var bigList = new List<int>(capacity: 1_000_000);
        Console.WriteLine($"BigList: Count={bigList.Count}, Capacity={bigList.Capacity}");

        // This avoids multiple resize/copy cycles and reduces GC pressure.
        // Internally Resize() does something like:
        //
        //   int newCapacity = oldCapacity == 0 ? 4 : oldCapacity * 2;
        //   var newArray = new T[newCapacity];
        //   Array.Copy(_items, 0, newArray, 0, _size);
        //   _items = newArray;
        //
        // JIT & CPU VIEW:
        //   - List<T> gives you a contiguous memory block for elements.
        //   - This is extremely cache-friendly: iterating over List<T> is
        //     similar to iterating over a plain T[] in terms of locality.
        //   - On hot paths, this can be **orders of magnitude faster** than
        //     scattered allocations (e.g., linked lists).
    }

    // ---------------------------------------------------------------------
    // 2. LIST ITERATION & BOUNDS CHECKS – foreach → for
    // ---------------------------------------------------------------------
    static void ListIterationAndBoundsChecks()
    {
        Console.WriteLine();
        Console.WriteLine("=== 2. List Iteration & Bounds Checks ===");

        var list = new List<int>();
        for (int i = 0; i < 10; i++)
            list.Add(i);

        // foreach over List<T> is lowered roughly to:
        //
        //   for (int i = 0; i < list.Count; i++)
        //       Console.WriteLine(list[i]);
        //
        // plus a "version" check to detect modifications during enumeration.
        Console.WriteLine("foreach version:");
        foreach (int x in list)
        {
            Console.Write($"{x} ");
        }
        Console.WriteLine();

        // Manual for-loop is often what the JIT ends up with.
        // BOUNDS CHECK ELIMINATION:
        //   - Normally, list[i] and arr[i] insert a range check:
        //         if ((uint)i >= (uint)length) throw;
        //   - In tight, simple loops the JIT can prove "i < length" and remove
        //     the bounds check, generating extremely tight machine code.
        Console.WriteLine("for version:");
        for (int i = 0; i < list.Count; i++)
        {
            Console.Write($"{list[i]} ");
        }
        Console.WriteLine();

        // High-performance style:
        //   - Cache Count in a local variable (avoids repeated property calls).
        //   - Use index-based access for arrays and lists.
        int count = list.Count;
        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += list[i];
        }
        Console.WriteLine($"Sum (tight loop) = {sum}");
    }

    // ---------------------------------------------------------------------
    // 3. DICTIONARY<K,V> INTERNALS – buckets, entries, hash codes
    // ---------------------------------------------------------------------
    static void DictionaryInternalsAndHashing()
    {
        Console.WriteLine();
        Console.WriteLine("=== 3. Dictionary Internals & Hashing ===");

        var map = new Dictionary<int, string>
        {
            [1] = "Ana",
            [42] = "Carlos",
            [1001] = "Elena"
        };

        // LOGICAL OPERATION:
        //   var value = map[key];
        //
        // INTERNAL STEPS (simplified):
        //   1. Compute hashCode = key.GetHashCode() & 0x7FFFFFFF; // ensure non-negative
        //   2. bucketIndex = hashCode % buckets.Length;
        //   3. index = buckets[bucketIndex] - 1; // head of collision chain
        //   4. Walk entries[index].next chain until key.Equals(entries[i].key).
        //
        //   where:
        //     buckets: int[]    (each entry is index+1 into entries[] or 0 if empty)
        //     entries: Entry[]  (struct { int hashCode; int next; TKey key; TValue value; })
        //
        // Collisions are resolved by chaining through the "next" field.
        //
        Console.WriteLine($"Lookup 42 → {map[42]}");
        Console.WriteLine($"ContainsKey(1001) = {map.ContainsKey(1001)}");

        // HASH CODE QUALITY:
        //   - Good distribution of GetHashCode() is critical.
        //   - For custom types, override GetHashCode & Equals consistently.
        //
        //   struct Point { public int X, Y; }
        //   public override int GetHashCode() => HashCode.Combine(X, Y);
        //
        var pointDict = new Dictionary<Point2D, string>(new Point2DComparer())
        {
            [new Point2D(1, 2)] = "P1",
            [new Point2D(10, 20)] = "P2"
        };

        Console.WriteLine($"Point (1,2) → {pointDict[new Point2D(1, 2)]}");

        // CPU REALITY:
        //   - Dictionary<K,V> accesses are O(1) on average, but with a large
        //     constant factor: multiple array loads, branches, and pointer chasing.
        //   - Collisions and poor hash distribution cause longer chains, more
        //     cache misses, and branch mispredictions.
        //   - In **hot, high-frequency paths**, a carefully designed List<T> or
        //     sorted array + binary search can beat Dictionary<K,V>.
    }

    // Simple value-type key for demonstration.
    readonly struct Point2D
    {
        public readonly int X;
        public readonly int Y;

        public Point2D(int x, int y) => (X, Y) = (x, y);
    }

    // Custom comparer: shows how we control hashing & equality semantics.
    sealed class Point2DComparer : IEqualityComparer<Point2D>
    {
        // HashCode.Combine() is a high-quality, well-distributed hash combiner.
        public int GetHashCode(Point2D p) => HashCode.Combine(p.X, p.Y);

        public bool Equals(Point2D p1, Point2D p2) => p1.X == p2.X && p1.Y == p2.Y;
    }

    // ---------------------------------------------------------------------
    // 4. ADVANCED DICTIONARY PATTERNS – TryAdd, TryGetValue, value semantics
    // ---------------------------------------------------------------------
    static void DictionaryAdvancedPatterns()
    {
        Console.WriteLine();
        Console.WriteLine("=== 4. Advanced Dictionary Patterns ===");

        var cache = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        // TryAdd() → O(1) average insertion but without throwing on duplicates.
        cache.TryAdd("Ana", 1);
        cache.TryAdd("ANA", 99); // ignored because comparer is case-insensitive

        Console.WriteLine($"Cache['ana'] = {cache["ana"]}");

        // PATTERN: Increment counters in a dictionary-like way.
        var wordCounts = new Dictionary<string, int>(StringComparer.Ordinal);

        void AddWord(string word)
        {
            // Using TryGetValue to avoid double hashing (ContainsKey(hash) + indexer(hash)).
            if (wordCounts.TryGetValue(word, out var count))
            {
                wordCounts[word] = count + 1;
            }
            else
            {
                wordCounts[word] = 1;
            }
        }

        AddWord("csharp");
        AddWord("csharp");
        AddWord("dotnet");

        foreach (var kv in wordCounts)
        {
            Console.WriteLine($"{kv.Key} → {kv.Value}");
        }

        // HIGH-PERFORMANCE NOTE:
        //
        //   - Prefer TryGetValue over:
        //        if (dict.ContainsKey(k)) v = dict[k];
        //     because it computes the hash code **once**.
        //
        //   - Use specialized comparers (StringComparer.OrdinalIgnoreCase, etc.)
        //     rather than rolling your own string comparisons.
        //
        //   - For read-heavy workloads where the set of keys is mostly static,
        //     consider ImmutableDictionary or frozen dictionaries for better
        //     cache behavior and thread-safety.
    }

    // ---------------------------------------------------------------------
    // 5. MICRO-BENCHMARK SHAPE – List vs Dictionary lookup cost
    // ---------------------------------------------------------------------
    static void ListVsDictionaryMicroBenchmark()
    {
        Console.WriteLine();
        Console.WriteLine("=== 5. List vs Dictionary Micro-benchmark (Conceptual) ===");

        const int N = 200_000;
        const int targetKey = N - 1;

        var list = new List<int>(N);
        var dict = new Dictionary<int, int>(N);

        for (int i = 0; i < N; i++)
        {
            list.Add(i);
            dict[i] = i;
        }

        // Warm-up JIT
        for (int i = 0; i < 5_000; i++)
        {
            _ = list[list.Count - 1];
            _ = dict[targetKey];
        }

        static long MeasureTicks(string label, Action action)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            Console.WriteLine($"{label}: {sw.ElapsedTicks} ticks");
            return sw.ElapsedTicks;
        }

        // Linear search in list (O(N))
        MeasureTicks("List linear search (Contains)", () =>
        {
            bool found = list.Contains(targetKey);
            if (!found) throw new Exception("Should find target.");
        });

        // Direct index in list (O(1)) – but only if the index is known.
        MeasureTicks("List direct index", () =>
        {
            int value = list[targetKey]; // requires knowing the index
            if (value != targetKey) throw new Exception("Wrong value.");
        });

        // Dictionary key lookup (O(1) average)
        MeasureTicks("Dictionary key lookup", () =>
        {
            if (!dict.TryGetValue(targetKey, out var v) || v != targetKey)
                throw new Exception("Wrong value.");
        });

        // SCIENTIST-LEVEL TAKEAWAYS:
        //
        //   - Big-O is not the whole story:
        //       * O(1) dictionary lookup has non-trivial constant costs
        //         (hashing, pointer chasing, branches).
        //       * O(N) list search can be faster for small N because it is
        //         branch-predictable and cache-friendly.
        //
        //   - DESIGN RULE:
        //       * When you need random key-based access over a large set,
        //         Dictionary<K,V> is the right mental model.
        //       * When you mostly iterate sequentially, List<T> (or T[]) gives
        //         you maximum cache utilization and higher raw throughput.
        //
        //   - LLM-READY INSIGHT:
        //       * When prompting an LLM to generate data-structure-heavy code,
        //         you can explicitly state these tradeoffs:
        //             "Use a List<T> for hot loops over dense data,
        //              use Dictionary<K,V> only for sparse key lookups."
        //         That moves you closer to *intentional* architecture instead of
        //         "Dictionary for everything".
    }
}

Happy optimizing — and may your lists stay contiguous and your hash tables well‑distributed. 🚀
