Basic Concepts of C# Data Types — From Bits to LLM‑Ready Mental Models
Most developers can list C# data types from memory:
`int`, `double`, `bool`, `char`, `string`, `decimal`, `enum`, `struct`, `class`...
But if you ask deeper questions, the room gets quiet:
- How do these types actually map to IL stack types (I4, I8, R4, R8, O)?
- When does a value live in a register, when does it live on the stack, and when on the heap?
- Why does `0.1 + 0.2` misbehave with `double` but not with `decimal`?
- What does the JIT do differently for `List<int>` vs `List<double>` vs `List<object>`?
And, more importantly for this era:
How do I talk about data types with LLMs so they can help me at a systems level, not just tutorial level?
In this post we’ll walk through a DataTypesDeepDive.cs file that treats C# data types like a compiler engineer would — and we’ll connect that to how you can ask better questions to LLMs and expand your understanding.
If you can compile a C# console app, you can follow along.
Table of Contents
- Mental Model: How Any Data Type Travels Through the Stack
- The Demo File: `DataTypesDeepDive.cs`
- Basic Types in C#: What They Really Mean to the CPU
- Integer Types: Bits, Two’s Complement, and Registers
- Floating Point: IEEE‑754, Precision, and Why 0.1 + 0.2 ≠ 0.3
- Boolean: Just a Byte on Top of CPU Flags
- `char` and `string`: UTF‑16, Interning, and Allocations
- Struct Layout & Padding: Why Field Order Can Matter
- Enums: Type‑Safe Names over Raw Integers
- Generics & Reification: Different JIT Code per Type
- Using This Mental Model to Get More from LLMs
- Data‑Type Mastery Checklist (Top‑1% Developer Mindset)
1. Mental Model: How Any Data Type Travels Through the Stack
At a high level, every data type in C# goes through the same pipeline:
```csharp
// File: DataTypesDeepDive.cs
// Author: Cristian Sifuentes + ChatGPT
// Goal: Explain C# data types like a systems / compiler / performance engineer.
//
// High-level mental model (how ANY data type travels through the stack):
//   1. The C# compiler (Roslyn) translates your code into IL (Intermediate Language).
//   2. The JIT compiler (at runtime) translates that IL into machine code for your CPU.
//   3. The CLR runtime + JIT decide how each data type is represented:
//      - Which IL "stack type" it uses (I4, I8, R8, O, etc.).
//      - Whether it lives in a register, stack slot, or on the managed heap.
//   4. The CPU only sees bits: fixed-width integer registers, floating-point registers,
//      and bytes in memory. "int", "double", "string" are abstractions on top of this.
```
Key idea:
`int`, `double`, `string`, `enum`, and `struct` are names for patterns of bits and access rules.
The CPU doesn’t see types — it sees instructions and data widths.
If you want LLMs to act like real systems experts, your questions should reference this pipeline: Roslyn → IL → JIT → CLR → CPU.
2. The Demo File: DataTypesDeepDive.cs
Here’s the “front door” of our demo method:
```csharp
partial class Program
{
    static void DataTypesDeepDive()
    {
        var integer = 42;
        double decimalNumber = 3.1416;
        bool isTrue = true;
        char character = 'C';
        string text = "Hi C#";

        Console.WriteLine($"Int: {integer}, Decimal: {decimalNumber}, Boolean: {isTrue}, Char: {character}, Text: {text}");

        BasicDataTypesIntro();
        IntegerBitLevel();
        FloatingPointInternals();
        BooleanSemantics();
        CharAndStringInternals();
        StructLayoutAndPadding();
        EnumUnderlyingTypes();
        GenericSpecializationDemo();
    }
}
```
This single method calls a set of focused “labs” — each one exploring how a specific group of types behaves internally.
We’ll use this file as the reference artifact you can commit to your GitHub repo and send to LLMs when asking questions.
3. Basic Types in C#: What They Really Mean to the CPU
The first lab, BasicDataTypesIntro, revisits the classic example — but from the IL and CPU point of view:
```csharp
static void BasicDataTypesIntro()
{
    var integer = 42;              // System.Int32
    double decimalNumber = 3.1416; // System.Double
    bool isTrue = true;            // System.Boolean
    char character = 'C';          // System.Char
    string text = "Hi C#";         // System.String

    Console.WriteLine(
        $"[BasicDataTypesIntro] Int: {integer}, Double: {decimalNumber}, " +
        $"Boolean: {isTrue}, Char: {character}, Text: {text}");
}
```
Conceptually, IL will have locals:
```
.locals init (
    [0] int32   integer,
    [1] float64 decimalNumber,
    [2] bool    isTrue,
    [3] char    character,
    [4] string  text
)
```
And to the CPU:
- `int32` → general‑purpose registers (e.g., EAX/RAX).
- `float64` (double) → floating‑point/XMM registers.
- `bool` → just 0 or 1 (often extended to 32 bits in registers).
- `char` → a 16‑bit integer (a UTF‑16 code unit).
- `string` → a pointer to a heap object with layout roughly `[object header][method table pointer][int32 Length][UTF‑16 chars...]`.
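These widths are easy to verify from C# itself, since `sizeof` works on the primitive value types in safe code (a minimal sketch):

```csharp
// Byte widths of the primitive value types (as fields/locals; registers may widen them).
Console.WriteLine($"sizeof(int)    = {sizeof(int)}");    // 4
Console.WriteLine($"sizeof(double) = {sizeof(double)}"); // 8
Console.WriteLine($"sizeof(bool)   = {sizeof(bool)}");   // 1
Console.WriteLine($"sizeof(char)   = {sizeof(char)}");   // 2
```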
Takeaway: The same high-level syntax maps to very different low-level representations and instruction sets.
4. Integer Types: Bits, Two’s Complement, and Registers
IntegerBitLevel() goes deeper into signed/unsigned integers and two’s complement:
```csharp
static void IntegerBitLevel()
{
    sbyte  s8  = -1;
    byte   u8  = 255;
    short  s16 = -12345;
    ushort u16 = 65535;
    int    s32 = -123456789;
    uint   u32 = 4000000000;
    long   s64 = -1234567890123456789L;
    ulong  u64 = 18446744073709551615UL;

    Console.WriteLine($"[IntegerBitLevel] int: {s32}, uint: {u32}");
    // Cast to byte first: it reinterprets the same 8 bits (0xFF), so we print one byte
    // instead of the sign-extended 16-bit value the short overload would produce.
    Console.WriteLine($"sbyte -1 raw bits: {Convert.ToString((byte)s8, 2).PadLeft(8, '0')}");
}
```
Two’s complement refresher
For an N‑bit signed value, the raw bits represent:
- `value = raw_bits` if the sign bit is 0
- `value = -(2^N - raw_bits)` if the sign bit is 1

So `sbyte s8 = -1` has bits:

```
1111 1111 (0xFF)
```
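You can check the formula numerically; a tiny sketch for N = 8:

```csharp
// Two's complement for N = 8: value = -(2^N - raw_bits) when the sign bit is set.
byte raw = 0b1111_1111;                   // 0xFF = 255
Console.WriteLine(-(256 - raw));          // -1, straight from the formula
Console.WriteLine(unchecked((sbyte)raw)); // -1, the runtime's reinterpretation agrees
```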
Performance note
- On 32/64‑bit CPUs, `int` (System.Int32) is the "natural" size for arithmetic.
- The JIT often extends `byte`/`short` to 32 bits in registers anyway.
- Smaller types help mainly with memory footprint & bandwidth (arrays, serialization, network packets).
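The footprint point is easy to make concrete (a minimal sketch; one million elements per array):

```csharp
// Same element count, very different memory traffic.
const int N = 1_000_000;
byte[] bytes = new byte[N];  // ~1 MB of element data
int[]  ints  = new int[N];   // ~4 MB
long[] longs = new long[N];  // ~8 MB
Console.WriteLine($"{N * sizeof(byte)} vs {N * sizeof(int)} vs {N * sizeof(long)} bytes");
```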
💬 LLM prompt idea
“Given this `IntegerBitLevel()` method, explain how the JIT extends smaller integer types to 32 bits in registers and why `int` is usually the most efficient integer type for computation.”
5. Floating Point: IEEE‑754, Precision, and Why 0.1 + 0.2 ≠ 0.3
FloatingPointInternals() tackles the classic trap:
```csharp
static void FloatingPointInternals()
{
    double a = 0.1;
    double b = 0.2;
    double c = a + b;

    Console.WriteLine($"[FloatingPointInternals] 0.1 + 0.2 = {c:R}");
}
```
Why is the result slightly off? Because double is IEEE‑754 binary64:
- 1 bit sign
- 11 bits exponent (biased)
- 52 bits fraction (mantissa)
Values like 0.1 and 0.2 are not exactly representable in base‑2, so the nearest representable numbers are stored, and their sum reflects that rounding error.
The code also inspects the raw bits:
```csharp
long bits = BitConverter.DoubleToInt64Bits(c);
Console.WriteLine($"Bits of (0.1+0.2): 0x{bits:X16}");
```
Then it compares with decimal:
```csharp
decimal d1 = 0.1m;
decimal d2 = 0.2m;
decimal d3 = d1 + d2;
Console.WriteLine($"decimal 0.1m + 0.2m = {d3}");
```
Tradeoff
- `double`: hardware‑accelerated, very fast, but limited to binary fractions.
- `decimal`: software‑implemented, slower, but base‑10 friendly and ideal for money.
💬 LLM prompt idea
“Using the `FloatingPointInternals()` example, explain how IEEE‑754 binary64 encodes 0.1 and 0.2, and how `decimal` differs in representation and performance. Include IL and CPU perspectives.”
6. Boolean: Just a Byte on Top of CPU Flags
BooleanSemantics() reminds us that bool is tiny on the surface but powerful in control flow:
```csharp
static void BooleanSemantics()
{
    bool flag = true;
    Console.WriteLine($"[BooleanSemantics] flag = {flag}");

    int x = 5, y = 10;
    bool less = x < y;
    Console.WriteLine($"x < y = {less}");
}
```
- In IL, `bool` is defined as 1 byte.
- In registers, it's just a 0 or non‑zero integer.
- Comparisons like `x < y` use a CPU instruction (e.g., `cmp`) that sets flags, which then drive conditional jumps.
Lesson: Boolean is a logical view on top of integer bits + CPU status flags.
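You can observe the "just an integer" nature directly (a minimal sketch, assuming the `System.Runtime.CompilerServices.Unsafe` API available in modern .NET):

```csharp
using System.Runtime.CompilerServices;

bool flag = true;
// Reinterpret the bool's single byte as a byte; no conversion logic involved.
byte rawByte = Unsafe.As<bool, byte>(ref flag);
Console.WriteLine(rawByte); // 1 (false prints 0)
```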
7. char and string: UTF‑16, Interning, and Allocations
CharAndStringInternals() explores Unicode and string layout:
```csharp
static void CharAndStringInternals()
{
    char ch = 'C';
    Console.WriteLine($"char: {ch}, code unit: {(int)ch}");

    string s1 = "Hi C#";
    string s2 = "Hi C#";
    Console.WriteLine($"ReferenceEquals(s1, s2): {object.ReferenceEquals(s1, s2)}");
}
```
Key facts:
- `System.Char` is a UTF‑16 code unit (16 bits).
- Strings are immutable, heap‑allocated, and can be interned.
- Identical string literals often share the same instance in the intern pool.
- Layout is roughly `[object header][method table][int Length][chars...]`.
The demo also encodes to UTF‑8:
```csharp
byte[] utf8 = Encoding.UTF8.GetBytes(s1); // requires: using System.Text;
```
Performance note: repeated string concatenation (`+` in loops) produces many allocations. Prefer `StringBuilder` or span‑based APIs for hot paths.
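A minimal before/after sketch of that advice:

```csharp
using System.Text;

// Allocation-heavy: every += materializes a brand-new string.
string slow = "";
for (int i = 0; i < 1_000; i++) slow += i;

// One growable buffer, one final string.
var sb = new StringBuilder();
for (int i = 0; i < 1_000; i++) sb.Append(i);
string fast = sb.ToString();
```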
💬 LLM prompt idea
“Given `CharAndStringInternals()`, explain the in‑memory layout of `string` in .NET, how interning works, and when that matters for performance and memory.”
8. Struct Layout & Padding: Why Field Order Can Matter
With [StructLayout(LayoutKind.Sequential)] you can see how padding works:
```csharp
[StructLayout(LayoutKind.Sequential)]
struct PackedExample1
{
    public bool Flag;    // 1 byte
    public double Value; // 8 bytes, wants an 8-byte boundary
}

[StructLayout(LayoutKind.Sequential)]
struct PackedExample2
{
    public double Value;
    public bool Flag;
}
```
StructLayoutAndPadding() checks their sizes:
```csharp
// Note: Marshal.SizeOf reports the *marshaled* (interop) size;
// Unsafe.SizeOf<T>() would report the managed size the JIT actually uses.
int size1 = Marshal.SizeOf<PackedExample1>();
int size2 = Marshal.SizeOf<PackedExample2>();
Console.WriteLine($"Size1 (bool,double) = {size1} bytes");
Console.WriteLine($"Size2 (double,bool) = {size2} bytes");
```
Due to alignment rules:
- A `double` prefers an 8‑byte boundary.
- The runtime may add padding bytes after `bool` or at the end of the struct.
For arrays of these structs in hot loops, layout can affect:
- How many elements fit in a cache line.
- How many cache misses and TLB misses you incur.
This is a micro‑optimization, but in data‑heavy systems it can matter.
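To see the offsets themselves, `Marshal.OffsetOf` works on these sequential structs (a minimal sketch; it reports the interop view, which matches the managed view here):

```csharp
using System.Runtime.InteropServices;

Console.WriteLine(Marshal.OffsetOf<PackedExample1>(nameof(PackedExample1.Flag)));  // 0
Console.WriteLine(Marshal.OffsetOf<PackedExample1>(nameof(PackedExample1.Value))); // 8 (padding after Flag)
Console.WriteLine(Marshal.OffsetOf<PackedExample2>(nameof(PackedExample2.Value))); // 0
Console.WriteLine(Marshal.OffsetOf<PackedExample2>(nameof(PackedExample2.Flag)));  // 8 (padding at the end)
```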
9. Enums: Type‑Safe Names over Raw Integers
Enums give you names but compile down to integers:
```csharp
enum Status : byte
{
    None      = 0,
    Started   = 1,
    Completed = 2,
    Failed    = 3
}
```
EnumUnderlyingTypes():
```csharp
Status st = Status.Completed;
Console.WriteLine($"Status = {st}, raw = {(byte)st}");
```
IL‑wise, an enum is essentially a struct with an integral `value__` field. At runtime:
- Comparisons on enums are as cheap as integer comparisons.
- Enums can use different underlying sizes to pack data tightly (e.g., `byte` for small state machines).
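Both claims are quick to verify (a minimal sketch, reusing the `Status` enum above):

```csharp
using System.Runtime.CompilerServices;

Console.WriteLine(Unsafe.SizeOf<Status>());                // 1: the enum really is one byte
Console.WriteLine(Enum.GetUnderlyingType(typeof(Status))); // System.Byte
```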
💬 LLM prompt idea
“Using the `Status : byte` enum, explain the IL shape of an enum type and how the underlying integral type affects memory layout and interop.”
10. Generics & Reification: Different JIT Code per Type
Finally, GenericSpecializationDemo() shows how .NET generics are reified:
```csharp
// requires: using System; using System.Runtime.CompilerServices;
class SimpleList<T> where T : struct
{
    private T[] _items = new T[4];
    private int _count;

    public void Add(T item)
    {
        // Grow the backing array when full (doubling strategy, no error handling).
        if (_count == _items.Length)
            Array.Resize(ref _items, _items.Length * 2);
        _items[_count++] = item;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public T Sum()
    {
        // `dynamic` is demo-only (see the warning below).
        dynamic sum = default(T);
        for (int i = 0; i < _count; i++)
        {
            sum += (dynamic)_items[i];
        }
        return (T)sum;
    }
}
```
And in the demo:
```csharp
var listInt = new SimpleList<int>();
listInt.Add(10);
listInt.Add(20);

var listDouble = new SimpleList<double>();
listDouble.Add(3.14);
listDouble.Add(2.71);
```
The JIT will generate:
- Specialized machine code for `SimpleList<int>` using integer registers.
- Different specialized code for `SimpleList<double>` using floating‑point registers.
This is a big reason why `List<int>` and `List<double>` are so efficient compared to "boxed" approaches.
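For contrast, here's roughly what the "boxed" road looks like (a minimal sketch storing `int`s through `object`):

```csharp
// Every int stored through object gets boxed (a heap allocation per element)
// and unboxed on the way out. The specialized SimpleList<int> does neither.
var boxed = new List<object>();
boxed.Add(10); // box: int -> heap object
boxed.Add(20);

int sum = 0;
foreach (object o in boxed)
    sum += (int)o; // unbox on every read
Console.WriteLine(sum); // 30
```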
⚠ The `dynamic` keyword in `Sum()` is only for demo purposes; in real numeric code you'd want a different pattern to avoid dynamic overhead.
11. Using This Mental Model to Get More from LLMs
Now the fun part: how do you use all this with LLMs like ChatGPT or Claude?
11.1 Ask Questions Across Layers
Instead of:
“Explain `double` in C#.”
Ask:
“Using this `FloatingPointInternals()` method from my `DataTypesDeepDive.cs`, explain how Roslyn, IL, the JIT, and the CPU each see the value `0.1`, and why `0.1 + 0.2` is not exactly `0.3`.”
You’re forcing the model to traverse all abstraction layers.
11.2 Bring Your Code as Context
Paste parts of your file and say:
“Given this struct + `StructLayout.Sequential`, draw the field offsets and padding in memory on x64, and reason about cache behavior for an array of a million of these structs.”
Now you’re not getting a generic answer — you’re getting feedback on your codebase.
11.3 Drill into IL and Assembly
Examples:
- “Show the IL for `GenericSpecializationDemo()` and explain how `SimpleList<int>` and `SimpleList<double>` produce different machine code.”
- “Show how the JIT might translate `x < y` into CPU flags and conditional branches.”
- “Given this enum, show how it's represented in memory when used inside a struct.”
11.4 Turn Questions into Experiments
Ask the model for microbenchmark designs:
“Design BenchmarkDotNet tests to measure the difference between `double` and `decimal` for simple additions at scale, and predict the outcome before running.”
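A starting point for that experiment might look like this (a minimal sketch; assumes the BenchmarkDotNet NuGet package):

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class AdditionBenchmarks
{
    private const int N = 10_000;

    [Benchmark(Baseline = true)]
    public double DoubleSum()
    {
        double sum = 0;
        for (int i = 0; i < N; i++) sum += 0.1;
        return sum;
    }

    [Benchmark]
    public decimal DecimalSum()
    {
        decimal sum = 0;
        for (int i = 0; i < N; i++) sum += 0.1m;
        return sum;
    }
}

// Entry point:
// BenchmarkRunner.Run<AdditionBenchmarks>();
```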
You’re using the model as both a teacher and a research collaborator.
12. Data‑Type Mastery Checklist (Top‑1% Developer Mindset)
Use this list to track your progress — and as a prompt basis when you want deep dives from LLMs.
- [ ] I can explain how C# data types map to IL stack types (I4, I8, R4, R8, O).
- [ ] I understand how value types vs reference types impact stack, heap, and registers.
- [ ] I can reason about integer types, two’s complement, and little‑endian byte order.
- [ ] I know how IEEE‑754 floating point works and when to use `decimal` instead of `double`.
- [ ] I can explain how `bool` is implemented on top of CPU flags and integer registers.
- [ ] I know the internal representation of `char` and `string`, and how interning affects memory.
- [ ] I can reason about struct layout, padding, and how field order might affect cache usage.
- [ ] I understand enums as named integer values with well‑defined underlying types.
- [ ] I know that .NET generics are reified and that value types get specialized JIT code.
- [ ] I can ask LLMs questions that explicitly mention Roslyn, IL, JIT, CLR, stack, heap, and CPU behavior.
Once you think like this, you stop seeing int and string as “magic C# types” and start seeing them as bit patterns, layouts, and contracts that travel through a pipeline of compiler and runtime stages.
That’s when LLMs stop being copy‑paste generators and start becoming partners in systems‑level reasoning.
Happy hacking — and may your bits, bytes, and types always line up exactly how you expect. ⚡
