DEV Community

Cristian Sifuentes


C# Data Structures Mental Model — From `User pedro` to LLM‑Ready Types


Most C# developers know that class, struct, and record exist.

But when you actually sit down to design a model for a real system (or ask an LLM to generate one), the hard questions start:

  • When should I use a class vs a struct vs a record?
  • What really changes at the IL / CLR / CPU level?
  • How do these choices affect GC, cache locality, copying cost, and boxing?
  • How can I tell an LLM to generate good data structures instead of “works but slow” code?

In this post we’ll build a mental model of C# data structures like a systems engineer, using this small example as our playground:

User pedro = new User { Name = "Pedro", Age = 33 };
pedro.Greet();

Point punto = new Point { X = 30, Y = 20 };
Console.WriteLine($"Punto ({punto.X},{punto.Y})");

CellPhone nokia = new CellPhone("Nokia 225", 2024);
Console.WriteLine(nokia);

We’ll connect syntax → IL → CLR → CPU, and then turn that into LLM‑ready prompts you can reuse in your own projects.

If you can write new User { Name = "Pedro" }, you can follow this.


Table of Contents

  1. Mental Model: What You Actually Need from Data Structures
  2. class vs struct vs record: The Real Differences
  3. Reference Types: Heap Layout, Method Tables, and GC
  4. Value Types: Copy Semantics and in / ref
  5. Records: Value‑Like Semantics on Top of Classes
  6. Composition: Structs Inside Classes = Inline Data
  7. Boxing & Interfaces: Hidden Allocations That Hurt Performance
  8. Micro‑Benchmark Shape: Classes vs Structs
  9. LLM‑Ready Patterns: How to Instruct an LLM to Use Data Structures Well
  10. Production Checklist for Data Structure Design

1. Mental Model: What You Actually Need from Data Structures

Forget the syntax for a moment.

For most real systems (and interview questions), you really need just a few clear concepts:

  • Reference type:

    • Lives on the managed heap.
    • Variables hold a pointer to the object.
    • Copying the variable copies the pointer (O(1)).
    • Example: class User
  • Value type:

    • Represents the data itself (the bits).
    • Can live on the stack, in registers, or inline inside other objects.
    • Copying the variable copies the entire value (O(N) in size).
    • Example: struct Point
  • Record:

    • Syntactic sugar for value‑based equality, plus immutability by default for positional record classes.
    • By default record is a reference type; record struct is a value type.
    • Example: record CellPhone(string Model, int Year);

Conceptually:

var pedro = new User { Name = "Pedro", Age = 33 };
var p     = new Point { X = 30, Y = 20 };
var nokia = new CellPhone("Nokia 225", 2024);
  • pedro: stack/register → pointer → [heap: User object]
  • p: stack/register → [X][Y] bits directly
  • nokia: stack/register → pointer → [heap: CellPhone { Model, Year }]

If you keep this mental picture in mind, you’ll make better decisions and you’ll be able to guide an LLM with much more precision.
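
To make the pointer-versus-bits picture concrete, here is a minimal sketch that reuses the User and Point shapes from this post. The variable names alias and copy are only illustrative.

```csharp
using System;

// Copying a class variable copies the pointer; copying a struct copies the bits.
var pedro = new User { Name = "Pedro", Age = 33 };
var alias = pedro;            // same heap object, now reachable through two references
alias.Age = 34;

var p = new Point { X = 30, Y = 20 };
var copy = p;                 // independent 8-byte copy
copy.X = 99;

Console.WriteLine(pedro.Age); // 34: the change made through alias is visible
Console.WriteLine(p.X);       // 30: the original struct is untouched

class User
{
    public string? Name { get; set; }
    public int Age { get; set; }
}

struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
}
```

The same assignment syntax therefore means "share" for a class and "duplicate" for a struct, which is the distinction the rest of the post builds on.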


2. class vs struct vs record: The Real Differences

Let’s start from the teaching sample:

class User
{
    public string? Name { get; set; }
    public int Age { get; set; }

    public void Greet()
    {
        Console.WriteLine($"Hola, soy el usuario {Name} y tengo una edad de {Age} años");
    }
}

struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

record CellPhone(string Model, int Year);

2.1 Reference vs Value at the Language Level

  • class

    • Reference type
    • Default equality: reference equality (ReferenceEquals)
    • Lives on the managed heap, managed by the GC
    • Great for large, shared, mutable objects
  • struct

    • Value type
    • Default equality: field‑by‑field via ValueType.Equals (can be slow; there is no == operator unless you define one)
    • Can live on stack / inline / in registers
    • Great for small, frequently used, often immutable values (e.g., DateTime, Vector3, Point).
  • record

    • By default: record → class; record struct → struct
    • Adds value‑based equality, ToString, deconstruction, with expressions.
    • Great for DTOs, event payloads, and immutable models.
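
The default equality rules listed above can be checked with a short sketch. It assumes the three sample types from this section, trimmed to the members that matter here.

```csharp
using System;

var u1 = new User { Name = "Pedro", Age = 33 };
var u2 = new User { Name = "Pedro", Age = 33 };
Console.WriteLine(u1 == u2);       // False: two different heap objects
Console.WriteLine(u1.Equals(u2));  // False: Object.Equals is reference equality by default

var p1 = new Point { X = 1, Y = 2 };
var p2 = new Point { X = 1, Y = 2 };
Console.WriteLine(p1.Equals(p2));  // True: ValueType.Equals compares the fields
// p1 == p2 would not compile: a struct has no == operator unless you write one.

var c1 = new CellPhone("Nokia 225", 2024);
var c2 = new CellPhone("Nokia 225", 2024);
Console.WriteLine(c1 == c2);                       // True: compiler-generated value equality
Console.WriteLine(object.ReferenceEquals(c1, c2)); // False: still two separate objects

class User
{
    public string? Name { get; set; }
    public int Age { get; set; }
}

struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

record CellPhone(string Model, int Year);
```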

2.2 How Roslyn Sees It

When you compile:

var pedro = new User { Name = "Pedro", Age = 33 };

Roslyn turns it into IL using concrete types like:

  • User → class with fields, methods, and metadata
  • Point → valuetype with a specific layout
  • CellPhone → compiler‑generated class (for record) with extra members

Later, the JIT decides whether a value‑type local lives in a stack slot or in a CPU register. Instances of classes (including record classes) are, as a rule, allocated on the managed heap.


3. Reference Types: Heap Layout, Method Tables, and GC

When you do:

var u = new User { Name = "Carlos", Age = 40 };
u.Greet();

At a high level, the memory layout (64‑bit) looks like this:

[Object Header][Method Table Ptr][Name*][Age]
  • Object header
    • Contains sync block index (used for lock), GC info.
  • Method table pointer
    • Points to metadata for the type: vtable, interface maps, etc.
  • Fields
    • Name = reference (pointer to another object)
    • Age = a 32‑bit integer stored inline

The variable u is just a pointer that lives on the stack (or in a register).

The GC:

  • Allocates objects on the managed heap.
  • Occasionally compacts memory and updates all references:
    • Reduces fragmentation
    • Improves cache locality

When to Prefer Classes

Classes are ideal when:

  • You want shared mutable state.
  • You have large objects (copying would be expensive).
  • You need polymorphism / inheritance (base class, virtual methods).

LLM prompt hint:

“Use class for aggregate entities with shared mutable state and potentially large size (e.g., User, Order, Account).”


4. Value Types: Copy Semantics and in / ref

Now look at the struct path:

Point p = new Point { X = 1, Y = 2 };

MovePoint(p);          // copy
MovePointByRef(ref p); // no copy
MovePointByIn(in p);   // no copy, read‑only
static void MovePoint(Point p)
{
    // Local copy of the struct.
    p.X += 10;
    p.Y += 10;
}

static void MovePointByRef(ref Point p)
{
    // Modifies caller’s instance, no copy.
    p.X += 100;
    p.Y += 100;
}

static void MovePointByIn(in Point p)
{
    // Read‑only ref, no copy, perfect for large readonly structs.
    int lengthSquared = p.X * p.X + p.Y * p.Y;
    Console.WriteLine($"Length² (in) = {lengthSquared}");
}

4.1 Copying Cost

  • Structs are copied by value.
  • Copying a tiny struct like Point (8 bytes) is cheap.
  • Copying a huge struct with 256+ bytes in tight loops can be very expensive.
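
The sketch below ties the copy rules to actual sizes. Block64 and Big are hypothetical types invented here only to reach 256 bytes; Unsafe.SizeOf<T>() reports how many bytes a by-value copy has to move.

```csharp
using System;
using System.Runtime.CompilerServices;

var p = new Point { X = 1, Y = 2 };

MovePoint(p);                                              // works on a copy
Console.WriteLine($"after MovePoint:      ({p.X},{p.Y})"); // (1,2)

MovePointByRef(ref p);                                     // works on the caller's storage
Console.WriteLine($"after MovePointByRef: ({p.X},{p.Y})"); // (101,102)

Console.WriteLine(Unsafe.SizeOf<Point>()); // 8: two 32-bit ints
Console.WriteLine(Unsafe.SizeOf<Big>());   // 256: moved in full on every by-value pass

static void MovePoint(Point p) { p.X += 10; p.Y += 10; }
static void MovePointByRef(ref Point p) { p.X += 100; p.Y += 100; }

struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

// Hypothetical types: 8 longs = 64 bytes, and four of those = 256 bytes.
struct Block64 { public long A, B, C, D, E, F, G, H; }
struct Big { public Block64 A, B, C, D; }
```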

Rule of thumb:

  • Small, immutable, math‑like → struct is excellent.
  • Large, mutable, frequently passed around → prefer class or in / ref parameters.

LLM prompt hint:

“Use small immutable structs (≤ 16 bytes) for math and geometry; use in parameters for large readonly structs to avoid copies.”
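
As an illustration of that hint, here is a sketch of a larger read-only value passed with in. Vec4 and Dot are hypothetical names, not part of the original sample. Marking the struct readonly tells the compiler that no member mutates it, so it never needs to make a defensive copy when members are called through the in reference.

```csharp
using System;

var a = new Vec4(1, 2, 3, 4);
var b = new Vec4(4, 3, 2, 1);
double dot = Dot(in a, in b);
Console.WriteLine(dot); // 20

// `in` passes a read-only reference: neither 32-byte argument is copied.
static double Dot(in Vec4 l, in Vec4 r) =>
    l.X * r.X + l.Y * r.Y + l.Z * r.Z + l.W * r.W;

// Hypothetical 32-byte value type (four doubles).
readonly struct Vec4
{
    public Vec4(double x, double y, double z, double w) => (X, Y, Z, W) = (x, y, z, w);

    public double X { get; }
    public double Y { get; }
    public double Z { get; }
    public double W { get; }
}
```

For something as small as Point, plain by-value passing is usually just as good; in starts to pay off as the struct grows.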


5. Records: Value‑Like Semantics on Top of Classes

Consider:

var phone1 = new CellPhone("Nokia 225", 2024);
var phone2 = new CellPhone("Nokia 225", 2024);

Console.WriteLine(phone1 == phone2); // true

This is possible because record CellPhone(string Model, int Year); expands roughly to:

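// Hand‑written approximation. The real output also includes EqualityContract,
// PrintMembers, and a hidden clone method that `with` expressions call.
class CellPhone : IEquatable<CellPhone>
{
    public CellPhone(string model, int year) => (Model, Year) = (model, year);
    protected CellPhone(CellPhone original) => (Model, Year) = (original.Model, original.Year); // copy constructor behind `with`

    public string Model { get; init; }
    public int Year { get; init; }

    public virtual bool Equals(CellPhone? other) =>
        other is not null && Model == other.Model && Year == other.Year;
    public override bool Equals(object? obj) => Equals(obj as CellPhone);
    public override int GetHashCode() => HashCode.Combine(Model, Year);
    public override string ToString() => $"CellPhone {{ Model = {Model}, Year = {Year} }}";

    public void Deconstruct(out string model, out int year) => (model, year) = (Model, Year);
    public static bool operator ==(CellPhone? a, CellPhone? b) => a is null ? b is null : a.Equals(b);
    public static bool operator !=(CellPhone? a, CellPhone? b) => !(a == b);
}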

So you get:

  • Value‑based equality: two instances with same Model/Year are “equal”.
  • Shallow immutability by default: positional properties get init‑only setters, and with expressions create a modified copy instead of mutating the original.

Example:

var newer = phone1 with { Year = 2025 };

LLM prompt hint:

“Use record for immutable DTOs / messages where equality is by value, not identity. Use record struct if you also want value‑type layout.”
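
To see the record features from this section side by side, here is a short sketch. CellPhone is the record from this post; GeoPoint is a hypothetical record struct added only to illustrate the value-type variant (C# 10 or later).

```csharp
using System;

var phone1 = new CellPhone("Nokia 225", 2024);
var newer = phone1 with { Year = 2025 };  // new object, one property changed

Console.WriteLine(phone1);                // CellPhone { Model = Nokia 225, Year = 2024 }
Console.WriteLine(newer);                 // CellPhone { Model = Nokia 225, Year = 2025 }
Console.WriteLine(phone1 == newer);       // False: Year differs

var (model, year) = newer;                // positional deconstruction
Console.WriteLine($"{model} / {year}");   // Nokia 225 / 2025

// record struct: the same value equality, but with value-type storage.
var g1 = new GeoPoint(1.5, 2.5);
var g2 = new GeoPoint(1.5, 2.5);
Console.WriteLine(g1 == g2);              // True

record CellPhone(string Model, int Year);
readonly record struct GeoPoint(double Lat, double Lon); // hypothetical type
```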


6. Composition: Structs Inside Classes = Inline Data

sealed class EntityWithPosition
{
    public int Id;
    public Point Position; // struct field

    public override string ToString() => $"Entity {Id} at ({Position.X},{Position.Y})";
}

var entity = new EntityWithPosition
{
    Id = 1,
    Position = new Point { X = 5, Y = 10 }
};

On the heap (conceptually):

[hdr][mtbl*][Id][Position.X][Position.Y]
  • Position is inline, not another heap allocation.
  • This is very cache‑friendly compared to class Position { int X; int Y; } which would require one more pointer dereference and another object.

Design pattern:

  • Use structs to embed small, pure data inside larger entities.
  • You get fewer allocations and better locality.

LLM prompt hint:

“Model hot numeric data as structs embedded inside aggregate classes to avoid extra allocations and pointer chasing.”
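
One consequence of inline storage is worth seeing in code: reading the struct out of the object gives you a copy, while writing through the field updates the object in place. A minimal sketch using the EntityWithPosition and Point types from this section (snapshot is just an illustrative local):

```csharp
using System;

var entity = new EntityWithPosition { Id = 1, Position = new Point { X = 5, Y = 10 } };

// Position is a field, so this writes straight into the object's inline storage.
// If Position were a property, this line would not compile: the getter returns a copy.
entity.Position.X = 7;

Point snapshot = entity.Position;   // copies the 8 bytes out of the object
snapshot.Y = 99;                    // changes only the copy

Console.WriteLine(entity);          // Entity 1 at (7,10)
Console.WriteLine(snapshot.Y);      // 99

sealed class EntityWithPosition
{
    public int Id;
    public Point Position; // struct field, stored inline

    public override string ToString() => $"Entity {Id} at ({Position.X},{Position.Y})";
}

struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
}
```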


7. Boxing & Interfaces: Hidden Allocations That Hurt Performance

int value = 42;

object boxed = value;      // boxing → allocates
int unboxed = (int)boxed;  // unboxing → copies bits back
  • Boxing: value type → reference (object / interface)
    • Allocates a new object with a copy of the value.
  • Unboxing: reference → value type
    • Type check + copy the value back.

In collections:

var numbers = new List<int> { 1, 2, 3, 4, 5 };
long sum = 0;

foreach (int n in numbers) // generic List<int>, no boxing
{
    sum += n;
}

But if you did:

var bad = new List<object> { 1, 2, 3, 4, 5 }; // each int boxed

Every integer becomes a separate heap allocation.

LLM prompt hint:

“Avoid boxing in hot paths. Use generic collections (List<T>, Dictionary<TKey,TValue>) instead of ArrayList or List<object> for numeric data.”
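
Boxing through interfaces is easy to miss, so here is a sketch of both the problem and one common way around it. IShape, DescribeBoxed, and DescribeGeneric are hypothetical names used only for this illustration.

```csharp
using System;

int value = 42;
object a = value;   // box #1
object b = value;   // box #2
Console.WriteLine(object.ReferenceEquals(a, b)); // False: two separate heap objects
Console.WriteLine(a.Equals(b));                  // True: same boxed value

var p = new Point { X = 3, Y = 4 };
Console.WriteLine(DescribeBoxed(p));    // (3,4) - the struct is boxed to fit the interface parameter
Console.WriteLine(DescribeGeneric(p));  // (3,4) - the constrained generic call avoids the box

static string DescribeBoxed(IShape s) => s.Describe();
static string DescribeGeneric<T>(T s) where T : struct, IShape => s.Describe();

interface IShape { string Describe(); } // hypothetical interface

struct Point : IShape
{
    public int X { get; set; }
    public int Y { get; set; }
    public string Describe() => $"({X},{Y})";
}
```

Both calls print the same text; the difference is that the interface-typed parameter forces a heap allocation per call, while the generic version is specialized for Point.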


8. Micro‑Benchmark Shape: Classes vs Structs

A simplified conceptual benchmark:

const int N = 200_000;

var classArray = new User[N];
var structArray = new Point[N];

for (int i = 0; i < N; i++)
{
    classArray[i] = new User { Name = "User" + i, Age = i };
    structArray[i] = new Point { X = i, Y = i };
}

long ClassSumAges()
{
    long sum = 0;
    for (int i = 0; i < classArray.Length; i++)
    {
        sum += classArray[i].Age; // pointer load + field load
    }
    return sum;
}

long StructSumXs()
{
    long sum = 0;
    for (int i = 0; i < structArray.Length; i++)
    {
        sum += structArray[i].X; // direct load from contiguous memory
    }
    return sum;
}

Key takeaways:

  • Struct arrays offer excellent locality: all elements are contiguous, which suits numeric/game/physics workloads.
  • Class arrays store pointers. Each element requires a dereference + field access, which tends to mean more cache misses once the objects are scattered across the heap.
  • But:
    • Structs are copied by value, so large structs passed around a lot can hurt performance.
    • Classes introduce GC pressure but avoid expensive value copies.

For real work, use BenchmarkDotNet; here the goal is intuition, not exact numbers.
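
If you only want a quick feel for the shape before setting up a proper benchmark, a Stopwatch sketch like the one below is enough. It is deliberately naive (no warm-up, a single run), so treat the timings as noise-prone hints; no numbers are shown here because they vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;

const int N = 200_000;
var classArray = new User[N];
var structArray = new Point[N];
for (int i = 0; i < N; i++)
{
    classArray[i] = new User { Age = i };
    structArray[i] = new Point { X = i, Y = i };
}

long classSum = 0, structSum = 0;

var sw = Stopwatch.StartNew();
for (int i = 0; i < classArray.Length; i++) classSum += classArray[i].Age;   // pointer load + field load
TimeSpan classTime = sw.Elapsed;

sw.Restart();
for (int i = 0; i < structArray.Length; i++) structSum += structArray[i].X;  // contiguous loads
TimeSpan structTime = sw.Elapsed;

Console.WriteLine($"class:  sum={classSum}, {classTime.TotalMilliseconds} ms");
Console.WriteLine($"struct: sum={structSum}, {structTime.TotalMilliseconds} ms");

class User
{
    public string? Name { get; set; }
    public int Age { get; set; }
}

struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
}
```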


9. LLM‑Ready Patterns: How to Instruct an LLM to Use Data Structures Well

Here are some copy‑paste prompts you can use when working with LLMs:

9.1 For Domain Modeling

“In C#, design the domain model using:

  • class for large aggregate roots with shared mutable state (User, Order, Invoice).
  • Small struct types for math or geometry data (Point, Vector3) and embed them inline where possible.
  • record for immutable DTOs/events with value‑based equality (event payloads, read models).

Avoid unnecessary boxing and use generic collections.”

9.2 For Performance‑Sensitive Code

“Generate C# code optimized for performance:

  • Use struct for small immutable numeric data and keep them under 16 bytes.
  • Avoid boxing; use generic collections like List<int> and Dictionary<TKey,TValue>.
  • Prefer struct fields inline within classes for hot data to improve cache locality.
  • Use in parameters for large readonly structs, and ref only when mutating shared state is required.”

9.3 For Teaching / Documentation

“Explain the generated data structures with comments that describe:

  • Whether each type is a class / struct / record.
  • Where it lives (heap / stack / inline).
  • How copying behaves.
  • Any boxing that might occur in interfaces or collections.”

This is how you move from “LLM wrote some code” to “LLM generated something that respects hardware and the CLR”.


10. Production Checklist for Data Structure Design

Before you call your model “done”, walk through this checklist:

  • [ ] For each type, did I intentionally choose between class, struct, and record?
  • [ ] Are my structs small and focused (ideally ≤ 16–32 bytes)?
  • [ ] Am I embedding small structs inline inside classes where that improves locality?
  • [ ] Have I avoided boxing in hot paths (no List<object> of ints, no ArrayList)?
  • [ ] Do I use record/record struct where value‑based equality is desired?
  • [ ] Are there any very large structs being copied frequently? Should they be classes or passed by in / ref?
  • [ ] Do I understand where my allocation hotspots are (use a profiler / BenchmarkDotNet)?
  • [ ] Did I document the intent so that teammates (and future LLM prompts) keep the same invariants?

Once you’re comfortable with this “hello world” of data structure design, you can move to:

  • Span<T> / Memory<T> and stackalloc for low‑allocation slices.
  • readonly struct and ref struct for high‑performance pipelines.
  • Custom memory layouts for tight numerical kernels or game engines.
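
As a small taste of the first item, here is a sketch of stackalloc combined with Span<T>. The buffer size and values are arbitrary; the point is that neither the buffer nor the slice touches the managed heap.

```csharp
using System;

Span<int> buffer = stackalloc int[4];   // lives on the stack, no GC allocation
for (int i = 0; i < buffer.Length; i++) buffer[i] = i * 10;   // 0, 10, 20, 30

Span<int> middle = buffer.Slice(1, 2);  // a view over {10, 20}, no copy
int sum = 0;
foreach (int n in middle) sum += n;

Console.WriteLine(sum); // 30
```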

The important part is: you now have a mental model from C# → IL → CLR → CPU, and you know how to use LLMs as partners instead of just code generators.

Happy modeling — and may your structs stay small and your GC pauses intentional. 🧠💾
