In a columnar time-series database, one of the most effective compression tricks is
deceptively simple: if a float value is actually an integer, store it as one.
Why Integers Compress Better Than Floats
Integer compression algorithms like Delta-of-Delta, ZigZag, and Simple8b work by
exploiting predictable bit patterns — small deltas between adjacent values, values
that fit in fewer than 64 bits, and so on. They can pack multiple values into a
single 64-bit word.
Floats don't cooperate with these schemes. Even 1.0 and 2.0 have completely
different IEEE 754 bit representations (0x3FF0000000000000 and 0x4000000000000000).
Their XOR is large, their delta is meaningless as an integer, and bit-packing is useless.
So when a column is declared as FLOAT but actually contains values like 12.0,
18.0, 25.0 — which happens more often than you'd expect, either because the schema
was designed generically or because the upstream system always emits .0 values — you're
leaving significant compression headroom on the table.
The fix: detect these integer-valued floats at encode time, convert them losslessly
to integers, and route them through the integer compression path.
A temperature sensor that reports 21.0, 21.5, 22.0 is a good example. Multiply
by 10 and you get 210, 215, 220 — plain integers with small, predictable deltas.
Delta-of-Delta or Simple8b will compress these far more efficiently than any
float-specific scheme.
The challenge: before converting, you need to check whether the scaled value can be
losslessly represented as an integer. The naive check — std::isnan + range comparison —
works but it's slower than it needs to be on the hot encoding path.
Here's the faster approach I implemented, using nothing but bit manipulation.
The Setup: Scaling Floats to Integers
The encoding scheme works in two steps:
- Scale: multiply the float by 10^scale (configurable per column)
- Convert: cast the scaled value to integer using std::lround
For example, with scale = 2:
- 1.23 → 1.23 * 100 = 123.0 → 123
- 45.678 → 45.678 * 100 = 4567.8 → overflow risk or precision loss
Step 2 only makes sense if the scaled value actually fits in the target integer type.
That's the overflow check.
The Overflow Check
The function takes a pointer to the raw float bytes and the target integer width in bytes.
It returns non-zero if the value would overflow.
Called before every conversion — if it fires, skip the integer path and fall back to
float encoding.
The key insight: you can determine whether a float overflows a given integer type
purely from the float's exponent bits, without doing any arithmetic.
Here's why.
IEEE 754 in One Paragraph
A double-precision float is stored as 64 bits:
[ sign: 1 bit ][ exponent: 11 bits ][ fraction: 52 bits ]
For normal values, the number encoded is: 1.fraction × 2^(exponent − 1023). (All-zero and all-ones exponent fields are reserved: zero and subnormals on one end, infinity and NaN on the other.)
The 1023 is the bias — it allows the 11-bit exponent field to represent negative
exponents. The real exponent is stored_exponent − 1023.
For 32-bit floats: 8 exponent bits, bias 127, fraction 23 bits.
Extracting the Exponent
For a double:
uint64_t bits;
memcpy(&bits, src, 8); /* safe type-pun, no UB */
int16_t real_exp = (int16_t)((bits >> 52) & 0x07ff) - 1023;
Step by step:
- memcpy into a uint64_t — reinterpret the 8 bytes as a 64-bit integer (no arithmetic, just bits)
- >> 52 — shift right past the 52 fraction bits, bringing the exponent to the low end
- & 0x07ff — mask off the sign bit, keep only the 11 exponent bits
- - 1023 — subtract the bias to get the real exponent
For a float:
uint32_t bits;
memcpy(&bits, src, 4);
int16_t real_exp = (int16_t)((bits >> 23) & 0xff) - 127;
Same logic: shift past 23 fraction bits, mask 8 exponent bits, subtract bias 127.
The Overflow Condition
Once you have the real exponent, the overflow check is one comparison:
is_overflow = real_exp > int_bytes * 8 - 2;
Where does - 2 come from?
- −1 for the sign bit: a signed integer of N bits can hold values up to 2^(N-1) - 1
- −1 for the implicit leading 1: in IEEE 754, the fraction is 1.fraction, not 0.fraction
So a float with real exponent E has a magnitude in [2^E, 2^(E+1)), whose integer part takes E + 1 bits (the implicit leading 1 followed by E more bits). For it to fit in a signed N-bit integer, you need E + 1 ≤ N - 1, which simplifies to E ≤ N - 2.
Full implementation in C:
#include <stdint.h>
#include <string.h>
/* Returns 1 if the double at src overflows a signed integer of int_bytes bytes. */
static inline int double_overflow_check(const char *src, int int_bytes)
{
uint64_t bits;
memcpy(&bits, src, 8);
int16_t real_exp = (int16_t)((bits >> 52) & 0x07ff) - 1023;
return real_exp > int_bytes * 8 - 2;
}
/* Returns 1 if the float at src overflows a signed integer of int_bytes bytes. */
static inline int float_overflow_check(const char *src, int int_bytes)
{
uint32_t bits;
memcpy(&bits, src, 4);
int16_t real_exp = (int16_t)((bits >> 23) & 0xff) - 127;
return real_exp > int_bytes * 8 - 2;
}
Total cost: one memcpy, one shift, one AND, one subtract, one compare.
No floating-point arithmetic, no branches on the value itself. As a bonus, NaN and infinity need no special handling: their exponent field is all ones, so real_exp comes out as 1024 (128 for floats), which always fails the comparison and routes them to the float fallback.
How It Fits into the Encoder
The encoder scales the value first, then calls the overflow check on the scaled result:
double scaled = orig * scaler; /* scale: e.g. orig * 100.0 */
if (double_overflow_check((char *)&scaled, sizeof(int64_t)))
return ENCODE_OVERFLOW; /* fall back to float encoding */
int64_t result = llround(scaled); /* safe: overflow already ruled out */
The scale factor is stored in the column header so the decoder can reverse the
operation: decoded = (double)stored_integer / pow(10, scale).
Why Not Just Use std::isnan + Range Check?
The conventional approach:
if (std::isnan(value)) return false;
if (value > INT64_MAX || value < INT64_MIN) return false;
return true;
This involves floating-point comparisons, which on many architectures require the
value to be loaded into a float register before comparison. On a hot encoding path
processing millions of values, the difference adds up.
The bit manipulation approach operates entirely on integer registers. The float's
bytes are reinterpreted as an integer — no floating-point unit involved until the
final std::lround conversion, which only happens when you've already confirmed
no overflow.
What This Enables
This check is the entry gate for the full encoding chain:
float column
↓
check_float_overflow ← this article
↓ (passes)
float → integer cast
↓
Delta+ZigZag encoding
↓
Simple8b bit-packing
Without a cheap overflow gate, the chain can't run on untrusted float data. With it,
each value costs one check before entering the integer compression path — which can
achieve far better compression ratios than float-specific schemes on "integer-like"
time-series data.
What's Next
This article is part of a series on compression engineering in time-series databases:
- Part 1: Runtime adaptive compression — how the system selects the best algorithm without scanning all data (published)
- Part 3: Chained encoding — the full float-to-integer → Delta+ZigZag → Simple8b pipeline
- Part 4: An improved floating-point compression algorithm based on ELF
I'm currently available for freelance work on backend systems, storage engineering,
and systems integration. Feel free to reach out.
Top comments
This is a genuinely clever optimization, and the temperature sensor example makes it immediately tangible. I've hit the exact same issue in a different context -- embedded time-series storage for robotics, where sensor readings are almost always "floats that are secretly integers."
A few thoughts from the trenches:
On Delta-of-Delta compression after float-to-int conversion: the real win isn't just the compression ratio -- it's the branch prediction friendliness. Delta-of-Delta encoding produces a distribution heavily skewed toward small positive values (since sensor readings change gradually). Modern CPUs predict this pattern extremely well, making decompression surprisingly cache-friendly too. We saw roughly a 3x throughput improvement on deserialization just from the better access patterns, independent of the compression itself.
On the NaN/infinity bit manipulation check: you mentioned the fast path for "is this float safely convertible to int." Worth noting that this check is also the bottleneck in the reverse direction -- when you decompress and need to cast back to float. If you're processing millions of rows, std::lround on the decode path can be surprisingly expensive compared to a simple reinterpret_cast. We ended up caching the original scale factor and using a multiply + reinterpret approach that was measurably faster on ARM Cortex-A chips, where lround has variable latency depending on rounding mode.
On the "schema was designed generically" problem: this resonates deeply. We actually made this configurable per-column in our storage engine -- each sensor column stores its detected scale factor, so the encoder/decoder can automatically pick the optimal bit-width. The downside is metadata overhead, but for workloads with millions of rows per column, it pays for itself within the first compression window.
Have you benchmarked how the integer compression path (Delta-of-Delta + Simple8b) compares against purpose-built float compression schemes like Gorilla compression? In my experience, Gorilla wins for pure float streams, but the moment you can detect integer-valued floats, the integer path is consistently 20-40% better. Curious if your numbers align.
Good point. Gorilla's XOR-based encoding assumes temporal locality — adjacent float values differ only in
low-order bits. That holds for metrics like CPU usage or temperature, but breaks down quickly for floats with
high entropy (e.g. raw sensor readings without smoothing).
In our case, testing on real electric vehicle telemetry data, Gorilla underperformed significantly compared
to what the Facebook paper suggested. The XOR delta distribution was nearly uniform — no clustering near zero
— so the leading/trailing zero compression gave almost no gain, and the encoding overhead made it net
negative vs simpler schemes.
We did experiment with modifications to the XOR step to handle this case better. Happy to go into the details
if you're interested.