Furkan Kırat

Posted on Dec 11, 2025

The Unsafe Illusion: Benchmarking C# Pointers vs. Safe Arrays in Unity

#csharp #dotnet #gamedev #performance

The Unsafe Illusion: Why I Removed Pointers from My Performance Library

When I started building StructForge (a high-performance data structures library for Unity and .NET), I operated under a common assumption in the C# world:

"If you want maximum speed, turn off bounds checking and use unsafe pointers."

I was wrong. Or rather, my assumptions were outdated.

For the v1.4 update, I decided to benchmark my manual pointer arithmetic against standard .NET array access. The results from BenchmarkDotNet were surprising enough that they forced me to re-architect the entire library.

📺 The Video Analysis

I documented the entire engineering process, benchmarks, and the final architecture decision in this video breakdown:

(Note: If you prefer reading, the technical details and benchmarks are below.)

⚡ Part 1: The Performance Arsenal

Before diving into the "Safe vs Unsafe" discovery, let me explain what StructForge actually does. The library is built on three core pillars: Cache Locality, Bitwise Optimizations, and Zero-Allocation.

Here are the key structures I demonstrated in the video:

1. SfBitArray: 300x Faster with Intrinsics / SWAR

For voxel worlds, standard bool[] arrays are memory-heavy (one byte per flag) and slow to process.

Optimization: On modern .NET 8, we utilize Hardware Intrinsics (PopCount). For older targets (like standard Unity profiles), I implemented a SWAR (SIMD Within A Register) fallback using the Hamming Weight algorithm.
Result: Operations like PopCount run 300x faster than a naive per-element loop over a bool[], because each step processes a full 64-bit word instead of a single flag.
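To make the SWAR idea concrete, here is a minimal sketch of the classic Hamming weight bit trick next to the built-in intrinsic. The names PopCountSketch, SwarPopCount, and CountSetBits are hypothetical and are not StructForge's actual API; only BitOperations.PopCount is a real .NET call.

```csharp
using System;
using System.Numerics;

public static class PopCountSketch
{
    // SWAR Hamming weight: counts the set bits of a 64-bit word
    // with a fixed number of operations and no per-bit loop.
    public static int SwarPopCount(ulong x)
    {
        x -= (x >> 1) & 0x5555555555555555UL;                               // 2-bit partial sums
        x = (x & 0x3333333333333333UL) + ((x >> 2) & 0x3333333333333333UL); // 4-bit partial sums
        x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FUL;                          // 8-bit partial sums
        // The multiply adds all eight bytes into the top byte (wrapping is intended).
        return (int)(unchecked(x * 0x0101010101010101UL) >> 56);
    }

    // Counts set bits across packed 64-bit words, using either the
    // hardware-accelerated intrinsic or the SWAR fallback.
    public static int CountSetBits(ReadOnlySpan<ulong> words, bool useIntrinsic)
    {
        int total = 0;
        foreach (ulong word in words)
            total += useIntrinsic ? BitOperations.PopCount(word) : SwarPopCount(word);
        return total;
    }
}
```

Both paths return the same counts; the fallback only matters on targets where BitOperations is unavailable or not hardware-accelerated.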

Figure 1: SWAR optimization crushing native boolean arrays.

2. SfGrid2D: 1D Flattened Layout

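Native 2D arrays (int[,]) in C# are stored contiguously, but element access is slower: every access checks bounds in each dimension, and the JIT optimizes them less aggressively than single-dimensional arrays. (Jagged arrays, int[][], are the ones that scatter rows across the heap.)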

Optimization: StructForge uses a flattened 1D linear array (_buffer[y * width + x]).
Ref Return: Crucially, the enumerators return by reference (ref T). This allows modifying large structs directly inside a foreach loop without any copying overhead (True Zero-Copy).
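A minimal sketch of how a flattened grid with a ref-returning enumerator can look. FlatGrid<T> and its members are hypothetical stand-ins for illustration, not the real SfGrid2D source.

```csharp
using System;

// Hypothetical minimal grid; illustrates the layout, not StructForge's code.
public sealed class FlatGrid<T>
{
    private readonly T[] _buffer;   // row-major: index = y * Width + x
    public int Width { get; }
    public int Height { get; }

    public FlatGrid(int width, int height)
    {
        if (width <= 0 || height <= 0) throw new ArgumentOutOfRangeException();
        Width = width;
        Height = height;
        _buffer = new T[width * height];
    }

    // Returns a reference so callers can mutate the element in place.
    public ref T this[int x, int y]
    {
        get
        {
            if ((uint)x >= (uint)Width || (uint)y >= (uint)Height)
                throw new ArgumentOutOfRangeException();
            return ref _buffer[y * Width + x];
        }
    }

    public Enumerator GetEnumerator() => new Enumerator(_buffer);

    // Pattern-based enumerator: foreach only needs MoveNext() and Current.
    public struct Enumerator
    {
        private readonly T[] _items;
        private int _index;

        internal Enumerator(T[] items)
        {
            _items = items;
            _index = -1;
        }

        public bool MoveNext() => ++_index < _items.Length;

        // Current returns by reference, which is what enables `foreach (ref var ...)`.
        public ref T Current => ref _items[_index];
    }
}
```

With this in place, `foreach (ref var cell in grid) cell = newValue;` writes straight into the backing array, so large structs are never copied out and back in.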

Figure 2: Cache locality improvements with 1D layout.

3. SfRingBuffer: Zero-Allocation Streaming

Designed for data streaming (logs/packets), it guarantees Zero-Allocation during enqueue/dequeue operations to prevent GC spikes in update loops.
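As a rough illustration of what "zero-allocation enqueue/dequeue" means in practice, here is a fixed-capacity sketch. RingBufferSketch<T> is a hypothetical name, and rejecting writes when full is an assumption of this sketch; the real SfRingBuffer may behave differently (for example by overwriting the oldest entry).

```csharp
using System;

// Hypothetical fixed-capacity ring buffer; the only allocation is in the constructor.
public sealed class RingBufferSketch<T>
{
    private readonly T[] _items;   // allocated once, up front
    private int _head;             // index of the oldest element
    private int _count;

    public RingBufferSketch(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _items = new T[capacity];
    }

    public int Count => _count;

    // Returns false when full instead of growing, so no allocation can happen here.
    public bool TryEnqueue(T item)
    {
        if (_count == _items.Length) return false;
        int tail = _head + _count;
        if (tail >= _items.Length) tail -= _items.Length;   // wrap around
        _items[tail] = item;
        _count++;
        return true;
    }

    public bool TryDequeue(out T item)
    {
        if (_count == 0)
        {
            item = default;
            return false;
        }
        item = _items[_head];
        _items[_head] = default;   // drop the reference so the GC can reclaim it
        if (++_head == _items.Length) _head = 0;
        _count--;
        return true;
    }
}
```

Using Try-style methods instead of exceptions keeps the hot path free of both allocations and exception overhead when the buffer is full or empty.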

Figure 3: Zero-allocation Ring Buffer performance.

4. SfEnumSet: Speed & Memory

Uses a BitArray backend to store enum flags. It provides O(1) operations while consuming 50% less memory than a standard HashSet<T>.
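The idea behind a bit-backed enum set can be sketched with a single 64-bit mask. Everything here is hypothetical (EnumSet64<TEnum>, the DamageType enum) and it assumes an int-backed enum whose values lie in 0..63; the real SfEnumSet uses a BitArray backend and may not share these limits. Unsafe.As is used only to read the enum's numeric value without boxing.

```csharp
using System;
using System.Numerics;
using System.Runtime.CompilerServices;

// Hypothetical example enum for illustration.
public enum DamageType { Fire, Ice, Poison }

// Hypothetical bit-mask set; one bit per enum value.
public struct EnumSet64<TEnum> where TEnum : struct, Enum
{
    private ulong _bits;

    private static int IndexOf(TEnum value)
    {
        // Assumption: the enum is int-backed with values in [0, 63].
        if (Unsafe.SizeOf<TEnum>() != sizeof(int))
            throw new NotSupportedException("This sketch supports int-backed enums only.");
        int index = Unsafe.As<TEnum, int>(ref value);   // reinterpret, no boxing
        if ((uint)index >= 64) throw new ArgumentOutOfRangeException(nameof(value));
        return index;
    }

    // Returns true if the value was not already present.
    public bool Add(TEnum value)
    {
        ulong mask = 1UL << IndexOf(value);
        bool added = (_bits & mask) == 0;
        _bits |= mask;
        return added;
    }

    // Returns true if the value was present.
    public bool Remove(TEnum value)
    {
        ulong mask = 1UL << IndexOf(value);
        bool removed = (_bits & mask) != 0;
        _bits &= ~mask;
        return removed;
    }

    public bool Contains(TEnum value) => (_bits & (1UL << IndexOf(value))) != 0;

    public int Count => BitOperations.PopCount(_bits);
}
```

Add, Remove, and Contains are each a shift plus one bitwise operation, which is where the O(1) behavior and the small footprint come from.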

Figure 4: SfEnumSet provides faster insertions compared to standard HashSet.

📉 Part 2: The "Safe" Surprise (The Engineering Case Study)

This is where things got interesting. My initial implementation of the Grid and List systems relied heavily on Unsafe intrinsics (specifically Unsafe.Add to avoid GC pinning overhead). I assumed this was the only way to beat the overhead of .NET's bounds checking.

I ran benchmarks on .NET 8 (RyuJIT) and here is what happened.

A. List Random Access: The JIT Surprise

I assumed direct pointer access would be faster for random lookups. But benchmarking revealed that SfSafeList (using standard array access) was actually 13% faster than the Unsafe pointer implementation.

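Why? A bounds check is a single compare plus a well-predicted branch, and the JIT removes it entirely wherever it can prove the index is in range. The machine code it generated for plain array access was cleaner than what it produced for my manual pointer arithmetic.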
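For readers who want to reproduce this kind of comparison, here is a minimal sketch of the access patterns involved. It is not the benchmark code from StructForge; AccessPatterns and its method names are hypothetical, and the timings themselves would still need to be measured with a harness such as BenchmarkDotNet.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class AccessPatterns
{
    // Safe sequential loop: the JIT can prove i < values.Length,
    // so the per-element bounds check is typically removed.
    public static long SumSafe(int[] values)
    {
        long sum = 0;
        for (int i = 0; i < values.Length; i++)
            sum += values[i];
        return sum;
    }

    // Safe random access: the index comes from data, so the check on
    // values[...] stays, but it is one compare and a predictable branch.
    public static long SumByIndexSafe(int[] values, int[] indices)
    {
        long sum = 0;
        for (int i = 0; i < indices.Length; i++)
            sum += values[indices[i]];
        return sum;
    }

    // Unchecked random access via Unsafe.Add: no bounds check at all.
    // An out-of-range index reads arbitrary memory instead of throwing.
    public static long SumByIndexUnchecked(int[] values, int[] indices)
    {
        ref int first = ref MemoryMarshal.GetArrayDataReference(values);
        long sum = 0;
        for (int i = 0; i < indices.Length; i++)
            sum += Unsafe.Add(ref first, indices[i]);
        return sum;
    }
}
```

All three return the same result for valid input; the difference is only in what happens on invalid input and in the code the JIT emits.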

⚠️ Update: The Linearization Breakthrough (Span vs Indexers)

After discussing these results with the .NET Runtime team (huge thanks to @TannerGooding), I realized my initial benchmarks missed a crucial nuance: Linearization.

While standard array access is fast, using 2D indexers like grid[x,y] involves math overhead (y * width + x) inside the loop, which the JIT cannot always optimize away.

I ran a new set of benchmarks comparing:

Native 2D Array: int[,]
Manual Indexing: Get(x,y) with unsafe pointers.
Linear Span: foreach over grid.AsSpan().

The Result: Iterating via AsSpan() (Safe) was ~34% faster than Native 2D arrays and ~20% faster than my manual pointer indexing.

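Why? Since SfGrid2D flattens data into a 1D array, exposing it as a Span<T> removes the per-element index math and lets the JIT eliminate bounds checks for the whole loop. The linear memory block is also the layout that SIMD needs, whether it comes from JIT optimizations or from explicit Vector<T> code.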
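The three iteration styles compared above can be sketched roughly like this. GridSum and its methods are hypothetical illustrations of the patterns, not the benchmark code itself, and the indexed variant uses plain array indexing rather than the unsafe pointers from my original implementation.

```csharp
using System;

public static class GridSum
{
    // Native rectangular array: every access goes through the 2D indexer.
    public static long SumRectangular(int[,] grid)
    {
        long sum = 0;
        int height = grid.GetLength(0);
        int width = grid.GetLength(1);
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                sum += grid[y, x];
        return sum;
    }

    // Flattened buffer with manual index math inside the loop.
    public static long SumIndexed(int[] buffer, int width, int height)
    {
        long sum = 0;
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                sum += buffer[y * width + x];
        return sum;
    }

    // Linear span: no index math and a loop shape the JIT handles well.
    public static long SumLinear(ReadOnlySpan<int> cells)
    {
        long sum = 0;
        foreach (int cell in cells)
            sum += cell;
        return sum;
    }
}
```

The linear version only works when the operation does not need the (x, y) coordinates; when it does, the indexed form is still required.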

Figure 5: Safe code (Green) beating Unsafe pointers (Red) in random access.

B. BitArray Operations: The SWAR Power

For bitwise operations, I compared my custom SWAR PopCount (Safe context) against a manual unsafe loop iterating over bits.

The Result: The algorithmic optimization (SWAR) was 4.3x faster than the unsafe loop. This proves that better algorithms often beat "raw" unsafe code.

Figure 6: Custom SWAR Implementation (Safe) vs Manual Bit Loop (Unsafe).

🛠️ The Architecture Change (v1.4)

Based on this data, I released StructForge v1.4 with a Hybrid Architecture:

Safe Read / Native Write: I switched to standard array indexing for read operations to benefit from JIT safety and speed optimizations.
Hybrid Lists: I replaced manual pointer shifting with Array.Copy for bulk operations.
Strict Zero-Alloc: The core philosophy remained the same—preventing GC allocations in hot paths (Update loops).
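To illustrate the "Hybrid Lists" point, here is a minimal sketch of insert and remove on an array-backed list using Array.Copy instead of an element-by-element shift. ListOps, InsertAt, and RemoveAt are hypothetical names for this sketch, not StructForge's actual methods.

```csharp
using System;

public static class ListOps
{
    // Inserts item at index within the first `count` live elements.
    // The caller must ensure there is spare capacity (count < items.Length).
    public static void InsertAt<T>(T[] items, int count, int index, T item)
    {
        if ((uint)index > (uint)count || count >= items.Length)
            throw new ArgumentOutOfRangeException(nameof(index));
        // Array.Copy handles overlapping source and destination ranges correctly.
        Array.Copy(items, index, items, index + 1, count - index);
        items[index] = item;
    }

    // Removes the element at index and shifts the tail left by one.
    public static void RemoveAt<T>(T[] items, int count, int index)
    {
        if ((uint)index >= (uint)count)
            throw new ArgumentOutOfRangeException(nameof(index));
        Array.Copy(items, index + 1, items, index, count - index - 1);
        items[count - 1] = default;   // clear the vacated slot
    }
}
```

Neither method allocates, so this stays consistent with the zero-allocation rule while leaving the bulk move to the runtime's optimized copy routine.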

💻 Code Example: Type-Safe & Fast

Here is how the new Safe Grid calculates indices. It maintains Cache Locality while being fully Type-Safe:

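// Required namespaces:
using System;
using System.Runtime.CompilerServices;

// 1. The Internal Storage (Flattened 1D Array, row-major)
private T[] _buffer;
private int _width;

// 2. Random Access (with Index Calculation)
// Note: AggressiveInlining is kept primarily for Unity IL2CPP aggressive optimization.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public ref T GetUnsafeRef(int x, int y)
{
    // Standard array indexing: the flattened index is bounds-checked against
    // _buffer.Length. x and y are not validated individually, so an
    // out-of-range x can land in a neighboring row instead of throwing.
    return ref _buffer[y * _width + x];
}

// 3. The Performance King: Span Iteration
// Exposes the whole buffer as one linear span, so callers can iterate
// without per-element index math.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public Span<T> AsSpan()
{
    return _buffer.AsSpan();
}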

🏁 Conclusion: The Era of "Safe" Performance

The lesson I learned from this engineering journey is simple: Measure, don't guess.

While these benchmarks showcase the raw power of .NET 8 (RyuJIT), the architectural wins (Spans, linear memory access, bulk operations such as Array.Copy instead of manual loops, and SWAR) are not tied to one runtime. They should carry over to Unity’s IL2CPP pipeline as well, although the numbers above come from CoreCLR.

Writing unsafe code adds complexity, risk, and maintenance debt. In 2025, trusting the compiler often yields better results than trying to outsmart it.

🚀 Try StructForge v1.4

If you are ready to optimize your Unity projects, the library is open-source and waiting for you.

⭐ Star on GitHub: github.com/FurkanKirat/StructForge
📦 Download via NuGet: nuget.org/packages/StructForge
🎮 Install via OpenUPM: openupm.com/packages/com.kankangames.structforge

Discussion: Have you tested JIT bounds check elimination in your own projects? Let me know your results in the comments below! 👇
