I’ve been building systems for a while now, and if there’s one trend in modern software engineering that drives me crazy, it’s the default reaction to "we have data."
Need to log AI agent telemetry, financial ticks, or server metrics? The modern playbook says: spin up a massive Docker container, deploy PostgreSQL, install the TimescaleDB extension, configure connection pools, and pull in a heavy ORM.
TimescaleDB is an incredible piece of engineering—it bridges the gap between fast row-based ingestion and compressed columnar analytics. But why do we have to cross a network boundary, suffer IPC overhead, and serialize data just to do it?
C# and .NET 10 are absolute weapons for data analysis. We don't need to default to heavy database servers or Python for this.
So, I built Glacier.Chrono: an embedded, zero-allocation, in-process time-series database in pure C#. It mirrors the hybrid row-to-columnar architecture of TimescaleDB, implements Facebook's Gorilla compression, and compresses timestamps at over 2 billion values per second with zero heap allocations on the ingest and compression paths.
Here is how I bypassed the bloat and built it to run on the metal.
The Architecture: Row-to-Columnar Hybridity
To build a time-series engine, you have to solve two conflicting problems:
- Ingestion needs to be row-based (Array of Structs) so you can blast data into memory instantly.
- Analytics needs to be columnar (Struct of Arrays) so you can run SIMD instructions across continuous blocks of a single data type without thrashing the CPU cache.
Here is how Glacier.Chrono handles it:
1. The Hot Ingest (Zero-Allocation)
Data is ingested into a pre-allocated, lock-free HotRingBuffer<T>. We restrict T to unmanaged C# structs laid out with [StructLayout(LayoutKind.Sequential)]. Writing to the database is literally just claiming a slot by advancing a shared index with Interlocked.Increment and writing raw bytes into it. Multiple threads can blast telemetry at it simultaneously without taking a single lock and without producing any garbage for the collector.
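To make the slot-claiming idea concrete, here is a minimal sketch. It is not the actual HotRingBuffer<T>: the class name `RingBufferSketch<T>`, its members, and the power-of-two capacity rule are assumptions for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical sketch, not the real Glacier.Chrono HotRingBuffer<T>.
public sealed class RingBufferSketch<T> where T : unmanaged
{
    private readonly T[] _slots;   // allocated once, up front
    private readonly int _mask;    // capacity - 1 (capacity is a power of two)
    private long _next = -1;       // last sequence number handed out

    public RingBufferSketch(int capacityPowerOfTwo)
    {
        if (capacityPowerOfTwo <= 0 || (capacityPowerOfTwo & (capacityPowerOfTwo - 1)) != 0)
            throw new ArgumentException("Capacity must be a power of two.", nameof(capacityPowerOfTwo));
        _slots = new T[capacityPowerOfTwo];
        _mask = capacityPowerOfTwo - 1;
    }

    // Claims a unique slot with one atomic increment, then copies the struct in.
    // No lock is taken and nothing is allocated on this path.
    public long Write(in T value)
    {
        long seq = Interlocked.Increment(ref _next);
        _slots[(int)(seq & _mask)] = value;
        return seq;
    }

    // Number of slots claimed so far.
    public long Count => Interlocked.Read(ref _next) + 1;

    public T Read(long seq) => _slots[(int)(seq & _mask)];
}
```

The sketch leaves out two things a real engine has to handle: telling readers when a claimed slot has actually been written, and stopping writers from wrapping around onto rows that have not been pivoted yet.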
2. The Pivot
When a chunk hits 10,000 rows, a background thread performs a matrix transpose. It takes the row-based data and slices it into columnar Span<T> buffers.
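The pivot itself is conceptually simple. The sketch below assumes a hypothetical `SampleRow` struct and caller-supplied column buffers; the real compactor is generated per schema and may work quite differently.

```csharp
using System;

// Hypothetical row type used only for this illustration.
public struct SampleRow
{
    public long Time;
    public float CpuTemp;
    public int ServerId;
}

public static class PivotSketch
{
    // Copies each field of the row-major chunk into its own contiguous column.
    public static void Pivot(
        ReadOnlySpan<SampleRow> rows,
        Span<long> times,
        Span<float> temps,
        Span<int> serverIds)
    {
        if (times.Length < rows.Length || temps.Length < rows.Length || serverIds.Length < rows.Length)
            throw new ArgumentException("Column buffers are smaller than the chunk.");

        for (int i = 0; i < rows.Length; i++)
        {
            ref readonly SampleRow row = ref rows[i];   // no struct copy
            times[i] = row.Time;
            temps[i] = row.CpuTemp;
            serverIds[i] = row.ServerId;
        }
    }
}
```

Because the column buffers are passed in as spans, they can be reused from chunk to chunk, so the pivot allocates nothing.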
3. The Compression Engine
This is where the magic happens. We apply data-type-specific algorithms directly to the Span<T> buffers:
- Timestamps: Delta-of-Delta (DoD). If you log exactly once per second, every DoD is 0, and we pack thousands of timestamps into a handful of bits.
- Floats: Facebook Gorilla XOR. We XOR the bit patterns of consecutive floats and store only the meaningful bits between the leading and trailing zeros, compressing slowly changing metrics like CPU usage by 90%+.
- State / Enums: Run-Length Encoding (RLE). 5,000 consecutive "Running" states become a tiny [Value: 1, Count: 5000] tuple.
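The three transforms above can be sketched in a few lines each. This covers only the transform stage: the bit-packing is left out, and unlike the real engine the sketch allocates its results for readability. All names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class EncodingSketch
{
    // Delta-of-delta: [0] = first timestamp, [1] = first delta,
    // [i] = (t[i] - t[i-1]) - (t[i-1] - t[i-2]) for i >= 2.
    public static long[] DeltaOfDelta(ReadOnlySpan<long> t)
    {
        var result = new long[t.Length];
        for (int i = 0; i < t.Length; i++)
        {
            if (i == 0) result[i] = t[0];
            else if (i == 1) result[i] = t[1] - t[0];
            else result[i] = (t[i] - t[i - 1]) - (t[i - 1] - t[i - 2]);
        }
        return result;
    }

    // Gorilla-style XOR: only the bits between the leading and trailing
    // zeros of the XOR need to be stored. Identical values give (32, 32).
    public static (int LeadingZeros, int TrailingZeros) XorWindow(float previous, float current)
    {
        uint xor = (uint)BitConverter.SingleToInt32Bits(previous)
                 ^ (uint)BitConverter.SingleToInt32Bits(current);
        return (BitOperations.LeadingZeroCount(xor), BitOperations.TrailingZeroCount(xor));
    }

    // Run-length encoding: collapse each run of equal values into (Value, Count).
    public static List<(int Value, int Count)> RunLength(ReadOnlySpan<int> values)
    {
        var runs = new List<(int Value, int Count)>();
        int i = 0;
        while (i < values.Length)
        {
            int value = values[i];
            int count = 1;
            while (i + count < values.Length && values[i + count] == value) count++;
            runs.Add((value, count));
            i += count;
        }
        return runs;
    }
}
```

For example, the timestamps 1000, 1001, 1002, 1003, 1005 become 1000, 1, 0, 0, 1 after the delta-of-delta step, which is why regular sampling compresses so well.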
The 64-Bit Accumulator Trick
To get these algorithms to scream, I couldn't use standard bit-by-bit while loops.
Instead, Glacier.Chrono uses a 64-bit accumulator trick. We accumulate bits in a ulong that the JIT can keep in a CPU register, and only write bytes out to the managed memory array when that register is full. This single mechanical-sympathy optimization provided a 19x speedup over standard bit-packing logic.
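Here is a minimal sketch of that accumulator pattern. It assumes MSB-first packing and a big-endian flush, and the `BitWriterSketch` type is hypothetical; Glacier.Chrono's real writer may differ in both layout and API.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical sketch of a 64-bit accumulator bit writer.
public ref struct BitWriterSketch
{
    private readonly Span<byte> _output;
    private ulong _acc;      // pending bits, packed from the most significant end
    private int _bitsUsed;   // valid bits currently in _acc (0..63)
    private int _bytePos;    // next free byte in _output

    public BitWriterSketch(Span<byte> output)
    {
        _output = output;
        _acc = 0;
        _bitsUsed = 0;
        _bytePos = 0;
    }

    // Appends the low `count` bits of `value` (1 <= count <= 64).
    public void Write(ulong value, int count)
    {
        if (count < 64) value &= (1UL << count) - 1;
        int free = 64 - _bitsUsed;
        if (count < free)
        {
            // Common case: pure register arithmetic, no memory write.
            _acc |= value << (free - count);
            _bitsUsed += count;
            return;
        }

        // Register is full: top it up, flush 8 bytes at once, keep the spill.
        int spill = count - free;
        _acc |= value >> spill;
        BinaryPrimitives.WriteUInt64BigEndian(_output.Slice(_bytePos), _acc);
        _bytePos += 8;
        _acc = spill == 0 ? 0 : value << (64 - spill);
        _bitsUsed = spill;
    }

    // Flushes any partial register and returns the total bytes written.
    public int Finish()
    {
        int bytes = (_bitsUsed + 7) / 8;
        for (int i = 0; i < bytes; i++)
            _output[_bytePos++] = (byte)(_acc >> (56 - 8 * i));
        _acc = 0;
        _bitsUsed = 0;
        return _bytePos;
    }
}
```

The point of the design is that most `Write` calls touch only the register. Memory is written once per 64 bits instead of once per bit.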
The Raw Numbers
I ran BenchmarkDotNet on a modern CPU with AVX-512 extensions using .NET 10. The results are frankly absurd for a managed language.
Notice the "Allocated Memory" column.
| Phase | Mean Execution | Allocated Memory | Throughput |
|---|---|---|---|
| Hot Ingest (10k rows) | 138.83 μs | 0 B (Steady State) | ~72.0M writes/sec |
| Gorilla Float Compress | 33.27 μs | 0 B | 300.5M values/sec |
| Delta-of-Delta Compress | 4.95 μs | 0 B | 2.01B values/sec |
| SIMD Query Engine | 325.96 μs | 577 B (OS Handles) | 30.7M records/sec |
The Delta-of-Delta row works out to 10,000 timestamps in 4.95 μs, or just over 2 billion timestamps per second, without touching the garbage collector once.
The "No-ORM" Query Engine (Powered by Source Generators)
In a traditional database, you'd send a dynamic query string like SELECT AVG(CpuTemp) FROM metrics WHERE ServerId = 1. But parsing a SQL string, building an Abstract Syntax Tree, and dynamically matching columns at runtime requires memory allocations and branching logic—which completely violates our zero-allocation philosophy.
So instead of a dynamic query engine, Glacier.Chrono uses C# Source Generators.
You define your custom telemetry schema, mark it with [ChronoTable], and specify how you want each field compressed:
```csharp
using System;
using System.Runtime.InteropServices;

[ChronoTable]
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SystemMetric : IComparable<SystemMetric>
{
    [Timestamp] // Delta-of-Delta
    public long Time;

    [Metric] // Gorilla XOR
    public float CpuTemp;

    [Category] // Run-Length Encoding
    public int ServerId;

    public int CompareTo(SystemMetric other) => Time.CompareTo(other.Time);
}
```
At compile time, Roslyn analyzes your struct and automatically emits a dedicated, zero-allocation compactor and a custom SIMD query engine directly into your project.
When you want to query the data, you use the strongly-typed, auto-generated methods. Glacier.Chrono maps only the specific columns you need directly into the OS virtual address space via MemoryMappedFile projection, ignoring the timestamp and unrelated columns completely.
```csharp
// The Source Generator automatically created 'GetAverageCpuTempForServerId'.
// queryBuffers is assumed to be declared earlier by the caller.
double avgCpuTemp = SystemMetricQueryEngine.GetAverageCpuTempForServerId(
    chunkFilePath: "./data/chunk_0.glacier",
    targetServerId: 1,
    queryBuffers
);
```
You get a beautiful, strongly-typed developer experience with autocomplete, but under the hood, the compiler has emitted the exact, bare-metal AVX-512 vector loops for those specific columns.
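To give a feel for what such a generated loop might look like, here is a hand-written sketch of a filtered average over two already-decompressed columns. It uses the portable System.Numerics.Vector<T> API rather than explicit AVX-512 intrinsics, and the `FilteredAverageSketch` name and signature are assumptions, not the generator's actual output.

```csharp
using System;
using System.Numerics;

public static class FilteredAverageSketch
{
    // Average of values[i] over all i where keys[i] == targetKey.
    public static double AverageWhere(
        ReadOnlySpan<float> values, ReadOnlySpan<int> keys, int targetKey)
    {
        if (values.Length != keys.Length)
            throw new ArgumentException("Columns must have equal length.");

        int width = Vector<float>.Count;   // same lane count as Vector<int>
        var target = new Vector<int>(targetKey);
        var sums = Vector<float>.Zero;
        var hits = Vector<int>.Zero;

        int i = 0;
        for (; i <= values.Length - width; i += width)
        {
            // Lanes where the key matches are all ones (-1), others are zero.
            Vector<int> mask = Vector.Equals(new Vector<int>(keys.Slice(i, width)), target);
            sums += Vector.ConditionalSelect(
                mask, new Vector<float>(values.Slice(i, width)), Vector<float>.Zero);
            hits -= mask;   // subtracting -1 counts one match per lane
        }

        double sum = Vector.Sum(sums);
        long count = Vector.Sum(hits);

        for (; i < values.Length; i++)   // scalar tail
        {
            if (keys[i] == targetKey) { sum += values[i]; count++; }
        }

        return count == 0 ? double.NaN : sum / count;
    }
}
```

There is no branch inside the vector loop: the comparison produces a mask, and the mask both selects the values to add and counts the matches. The float lanes accumulate in single precision, so a production version would need to think about rounding on large chunks.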
The Takeaway
The data analysis and AI telemetry space has been entirely dominated by heavy database servers, Python wrappers, and massive C++ frameworks.
But .NET 10, combined with C# Source Generators, unmanaged memory, and hardware intrinsics, proves that C# is an absolute heavyweight contender. You don't need a heavy Docker-bound stack to get world-class, column-compressed time-series analytics. You just need good, native, mechanically sympathetic engineering.
Glacier.Chrono is open-source and part of the Glacier high-performance storage suite.
👉 Check out the repo here: github.com/ian-cowley/Glacier.Chrono