loadingalias

Posted on • Originally published at github.com

rscrypto v0.4.0: Verifying Constant-Time Behavior Instead of Assuming It

Every cryptography library says it's secure and performant.

Very few can explain how that security is validated and how that performance is proven after every change.

One of the easiest mistakes in cryptographic engineering is assuming code is constant-time because it looks constant-time. The source looks branchless. The review looks clean. The helper uses the right equality function. Then an optimization, a target-specific lowering decision, a tiny refactor, or a new fast path changes the binary that actually runs. The maxim "Don't roll your own crypto" exists for this reason, among many others.

That matters a great deal because perf work and side-channel resistance are not separate worlds. If you want a crypto crate to compete w/ native libs, you end up very close to the compiler, close to the CPU, and on top of target-specific behavior. That is exactly where "looks constant-time" stops being enough, and where early Reddit/HN feedback pushed me to make the evidence story explicit.

In rscrypto v0.4.0, I focused on turning constant-time behavior into release evidence instead of a style assertion or general promise, and on adding Wycheproof test vectors for as many primitives as possible.

Anyway, this release is not "trust me, I wrote careful Rust." It's not "I read every line the LLM spit out". It's much more than that, and I hope it's the beginning of this crate earning its place in the community, in the industry.

The release is: here is the manifest, here are the harnesses, here are the artifacts, here are the timing checks, here are the binary checks where the tooling supports/allows them, and here are the places where the claim does not apply yet.

That is the story behind v0.4.0.

Why Does Constant-Time Matter?

Timing side channels are not theoretical. If secret data can influence control flow, memory addresses, table indexes, variable-time ops, alloc behavior, or failure shape, a caller may be able to learn something from timing... that something is a risk.

The classic examples:

  • a secret-dependent branch
  • a secret-dependent table lookup
  • an early return when a tag mismatch is found
  • a richer error path for one kind of verification failure

Cryptographic code has to be incredibly disciplined about those shapes. MAC verification, AEAD open, password verification, private-key ops, scalar multiplication, signing, and padding checks all need stricter treatment than ordinary app code.
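
To make the early-return shape concrete, here is a minimal sketch of a leaky comparison next to a branch-free one. This is not rscrypto's code; `leaky_eq` and `ct_eq_sketch` are hypothetical names, and lengths are assumed to be public. The source-level shape is only a starting point, since the compiler can still change what actually runs.

```rust
use std::hint::black_box;

/// Leaky comparison: returns at the first mismatching byte, so the running
/// time depends on how long the matching prefix is.
pub fn leaky_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    for i in 0..a.len() {
        if a[i] != b[i] {
            return false; // early exit reveals the mismatch position
        }
    }
    true
}

/// Branch-free on the byte contents: every byte is visited and all
/// differences are OR-accumulated. Lengths are treated as public here.
pub fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    // black_box is a best-effort optimization barrier, not a guarantee.
    black_box(diff) == 0
}
```

Both functions return the same answers. Only the second avoids a data-dependent exit in the source, and even that has to be checked against the compiled artifact.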

I am going to assume, if you are reading this, that you likely already know all of this. The harder part is what happens after the source looks right.

What's Wrong With "It Looks Constant-Time"?

Constant-time is not a property I can declare in any meaningful way. It is a property I am going to continuously verify because this library is now core to the systems I am building.

That sounds pedantic until you're shipping accelerated crypto across multiple architectures.

rscrypto is pure Rust, but it's not only scalar Rust. It currently ships portable fallbacks, SIMD, ASM, hardware-instruction paths, runtime dispatch, no_std builds, WASM builds, and native Linux/macOS evidence lanes. I will add Windows to the target matrix as soon as I can for all of you needing Windows validation. The current bench matrix covers Intel, AMD, AWS Graviton, IBM Z, IBM POWER10, RISC-V, and Apple Silicon.

That platform reach is valuable, but it makes constant-time validation much harder. An early Reddit commenter called this out, and when I sat down to sketch the actual validation plan, I nearly choked. I'd forgotten how finicky constant-time claims can actually be to validate.

A source review can miss what LLVM emits for one target. A timing test on x86-64 does not prove the same behavior on aarch64. A check that passes on Apple Silicon does not tell me what happens on IBM Z. A binary-level tool that works on Linux ELF does not automatically work on Mach-O, PE/COFF, WASM, bare-metal, s390x, or little-endian POWER. The Rust backend matters too.

The maintenance problem is just as real. Crypto code does not stay frozen, and this lib is way too young to pretend that's realistic. Perf tuning moves thresholds. Dispatch changes. A helper gets reused. A public API grows a new error case. A refactor that is harmless for ordinary code can become a leak if it changes secret-dependent control flow or failure timing.

Code review helps, but code review is not a release gate by itself. LLMs help find suspicious shapes, but they are even less of a release gate, IMO.

So I built v0.4.0 to treat constant-time as an evidence pipeline:

  • define exactly what is secret and what is public
  • define exactly which target/config is being claimed
  • build stable harnesses for CT-critical entrypoints
  • generate the compiled artifacts
  • scan them
  • run empirical timing checks
  • run binary symbolic checks where the toolchain supports the target
  • fail closed when required evidence is missing

That is more work than saying "constant-time." It is also the only version of the claim I trust, and the only version I expect serious users to accept while I figure out what a FIPS-ready path could realistically look like here.

The Three CT Gates

The machine-readable source of truth is ct.toml. The policy lives in docs/constant-time.md. The tooling lives under tools/ and scripts/ct/.

The hard local gate is:

just ct-full

That builds CT artifacts, validates manifest coverage, runs DudeCT cases, runs BINSEC where the target policy requires it, and emits release-style reports.

Gate 1: Artifact Review + Heuristic Scans

The first gate is deliberately simple.

It builds stable CT harnesses from tools/ct-harness, captures provenance, emits LLVM IR, assembly, object disassembly, symbol maps, artifact hashes, and evidence indexes, then validates the result against the manifest.

The scripts involved include:

This catches boring but dangerous problems:

  • missing artifacts
  • missing manifest coverage
  • unexpected calls
  • suspicious branches
  • suspicious indexed loads
  • panic or allocation paths in CT-critical leaves
  • changed symbols that need review

This gate does not prove constant-time behavior. It gives me and reviewers the compiled evidence and blocks clear regressions.
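
As an illustration of what a heuristic scan of this kind can look like, here is a small sketch that flags conditional jumps and calls in an x86-64 assembly listing. It is not the scanner in rscrypto's scripts; the `Finding` type and `scan_x86_64_asm` function are hypothetical, and the mnemonic matching is deliberately crude.

```rust
/// One flagged line in an assembly listing (hypothetical report shape).
#[derive(Debug, PartialEq, Eq)]
pub enum Finding {
    ConditionalBranch { line: usize, text: String },
    Call { line: usize, text: String },
}

/// Flags conditional jumps and calls in an x86-64 listing for human review.
/// A flagged branch is not necessarily secret-dependent (loop counters are
/// public), so the output is a review queue, not a verdict.
pub fn scan_x86_64_asm(asm: &str) -> Vec<Finding> {
    let mut findings = Vec::new();
    for (idx, raw) in asm.lines().enumerate() {
        let text = raw.trim();
        // Skip blank lines, comments, assembler directives, and labels.
        if text.is_empty()
            || text.starts_with('#')
            || text.starts_with('.')
            || text.ends_with(':')
        {
            continue;
        }
        let mnemonic = text.split_whitespace().next().unwrap_or("");
        if mnemonic.starts_with("call") {
            findings.push(Finding::Call { line: idx + 1, text: text.to_string() });
        } else if mnemonic.starts_with('j') && !mnemonic.starts_with("jmp") {
            // jne, je, jb, ... are conditional; plain jmp is unconditional.
            findings.push(Finding::ConditionalBranch {
                line: idx + 1,
                text: text.to_string(),
            });
        }
    }
    findings
}
```

A scan like this only narrows where a reviewer looks. It cannot tell whether the condition depends on a secret, which is why the later gates exist.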

That matters because the source is obviously not the artifact that runs in prod. The binary is.

Gate 2: DudeCT Validation

The second gate is empirical timing evidence.

tools/ct-dudect runs fixed-vs-random or valid-vs-invalid timing tests for manifest-declared cases. scripts/ct/dudect.sh runs the cases, and scripts/ct/dudect_report.py normalizes the output.

DudeCT is useful because it catches a different class of failure than static review. If the compiled code or the platform creates a measurable timing distinction between the two classes, the release gate should fail.

The wording here matters:

No leakage detected for this configuration.

That is not the same as "proved constant-time forever."

DudeCT is statistical evidence for a specific configuration. The exact compiler, LLVM backend, target triple, CPU/features, profile, panic mode, enabled features, dependency lockfile, and physical runner matter here.

If a required DudeCT case is missing, times out, crashes, or crosses the configured leakage threshold, ct-full fails.
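
The statistical core of a dudect-style test is Welch's t-test over two classes of timing samples. The sketch below shows only that core, under stated assumptions: the names are hypothetical, the threshold is caller-supplied rather than rscrypto's configured value, and the measurement loop and sample preprocessing a real harness needs are left out.

```rust
/// Running mean and variance via Welford's online algorithm.
#[derive(Default, Clone, Copy)]
pub struct Stats {
    n: f64,
    mean: f64,
    m2: f64,
}

impl Stats {
    pub fn push(&mut self, x: f64) {
        self.n += 1.0;
        let delta = x - self.mean;
        self.mean += delta / self.n;
        self.m2 += delta * (x - self.mean);
    }

    /// Unbiased sample variance; 0.0 when fewer than two samples exist.
    pub fn variance(&self) -> f64 {
        if self.n < 2.0 { 0.0 } else { self.m2 / (self.n - 1.0) }
    }
}

/// Welch's t statistic for two sample sets with possibly unequal variances.
pub fn welch_t(a: &Stats, b: &Stats) -> f64 {
    let se = (a.variance() / a.n + b.variance() / b.n).sqrt();
    (a.mean - b.mean) / se
}

/// Returns true when the two timing classes look distinguishable.
/// Degenerate input (no samples, zero variance) yields NaN or infinity and
/// is reported as suspected rather than silently passed.
pub fn leak_suspected(class_a: &[f64], class_b: &[f64], threshold: f64) -> bool {
    let mut a = Stats::default();
    let mut b = Stats::default();
    class_a.iter().for_each(|&x| a.push(x));
    class_b.iter().for_each(|&x| b.push(x));
    let t = welch_t(&a, &b);
    t.is_nan() || t.abs() >= threshold
}
```

A result below the threshold means no leakage was detected for these samples, which is a weaker statement than the code being constant-time.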

Gate 3: BINSEC

The third gate is binary-level symbolic checking with BINSEC's checkct mode.

tools/ct-binsec-harness exposes small CT-critical kernels. scripts/ct/binsec.py runs BINSEC directly rather than delegating the policy to a generic wrapper.

This is a direct integration. The rscrypto manifest owns the kernels, target policies, assumptions, required status, and evidence artifacts. A wrapper cargo check-ct was the wrong abstraction here; the release claim has to be owned by the crate manifest.

BINSEC is aimed at small, analyzable leaf kernels: constant-time equality, secret selection, Poly1305/GHASH/POLYVAL style leaves, AES round leaves, Ascon tag paths, Curve25519 helpers, Ed25519 digit helpers, and bounded RSA private-operation helpers.
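
For a sense of scale, a "secret selection" leaf can be as small as the sketch below. This is an illustrative mask-based select and swap, not rscrypto's kernels; the function names are hypothetical, and whether the compiled output stays branch-free is exactly what a binary-level check is for.

```rust
/// Returns `a` when the low bit of `choice` is 1 and `b` when it is 0,
/// without branching on `choice`.
pub fn ct_select_u64(choice: u8, a: u64, b: u64) -> u64 {
    // 0 -> 0x0000_0000_0000_0000, 1 -> 0xFFFF_FFFF_FFFF_FFFF
    let mask = ((choice & 1) as u64).wrapping_neg();
    b ^ (mask & (a ^ b))
}

/// Swaps `a` and `b` in place when the low bit of `choice` is 1,
/// leaving them untouched when it is 0.
pub fn ct_swap_u64(choice: u8, a: &mut u64, b: &mut u64) {
    let mask = ((choice & 1) as u64).wrapping_neg();
    let t = mask & (*a ^ *b);
    *a ^= t;
    *b ^= t;
}
```

Kernels of this size have no allocation, parsing, or error paths, which is what makes them tractable for symbolic execution.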

It is not used as a "whole public API" verifier. A full API often includes parsing, allocation, dispatch, conversion, and error handling. Those paths still need review, artifacts, heuristics, tests, fuzzing, and timing evidence, but they are not always tractable symbolic execution targets.

For Linux x86-64 and Linux aarch64 lanes, BINSEC is part of the current CT evidence where the workflow supports the object/ISA path.

For some other lanes, it is not possible today. RISC-V, s390x, little-endian POWER, macOS Mach-O, Windows PE/COFF, WASM, and bare-metal all need separate support, reduced kernels, different object handling, or engine-specific evidence before I will make the same binary-check claim. BINSEC alone is not enough there, and in some cases it is not compatible with the object format or ISA path yet. I will continue to explore and work on it.

Why Multiple Independent Checks Exist

Security validation should fail closed.

No single gate is enough.

Artifact review catches missing coverage and suspicious compiled shapes. DudeCT catches timing behavior the static pass may not understand. BINSEC gives stronger binary evidence for small kernels where symbolic analysis is practical.

The overlap is intentional. I do not want one green check to create false confidence.
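
"Fail closed" can be stated in a few lines of code. The sketch below is an assumption about the general shape, not the logic in rscrypto's tooling: `Outcome`, `Check`, and `release_allowed` are hypothetical names, and the rule encoded is that only an explicit pass counts, with missing evidence tolerated solely for checks a target does not require.

```rust
/// Result of one evidence check (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Pass,
    Fail,
    TimedOut,
    Crashed,
}

pub struct Check<'a> {
    pub name: &'a str,
    /// Whether the target policy requires this check.
    pub required: bool,
    /// `None` means no evidence was produced at all.
    pub outcome: Option<Outcome>,
}

/// Fail-closed release decision: an empty report, any non-pass outcome, or
/// missing evidence for a required check all block the release.
pub fn release_allowed(checks: &[Check<'_>]) -> bool {
    if checks.is_empty() {
        return false;
    }
    checks.iter().all(|c| match c.outcome {
        Some(Outcome::Pass) => true,
        None => !c.required,
        Some(_) => false,
    })
}
```

The design choice is that the default answer is "no": a check that silently did not run looks the same as a check that failed.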

I also do not have a FIPS lab budget. rscrypto is not a FIPS 140-3 validated module; it's not been audited by a third party. Pretending otherwise would be bullshit, and worse, technically useless. I am obviously really interested in changing this, but this is what I can do in the meantime:

Build the kind of evidence pipeline a serious review needs:

  • exact claims
  • exact targets
  • exact binaries
  • reproducible artifacts
  • current benchmark data
  • public limitations
  • release gates that fail when evidence is missing

If a formal validation path or third-party audit becomes available, this work makes that conversation cleaner... but it obviously doesn't replace it.

What v0.4.0 Means

rscrypto v0.4.0 is the first release where I am comfortable leading with the constant-time evidence story instead of treating it as a paragraph in the README.

This release adds or hardens:

  • ct.toml as the source of truth for CT-critical surfaces, target policy, DudeCT cases, and BINSEC kernels
  • just ct-full as the local full evidence gate
  • .github/workflows/ct.yaml for hosted CT evidence lanes
  • artifact/provenance capture, LLVM IR, assembly, object disassembly, symbol maps, and heuristic scans
  • manifest-driven DudeCT cases
  • manifest-driven BINSEC kernel rows where the target/tooling supports them
  • a dedicated RSA gate through .github/workflows/rsa.yaml, including Miri and leakage regression artifacts
  • weekly validation covering the broader quality stack: normal CI, feature matrix, Miri, fuzzing, fuzz corpus replay, CT, and coverage

The current public CT boundary is intentionally narrow:

  • Linux x86-64: artifact/provenance review, LLVM IR/ASM/object heuristics, DudeCT, and BINSEC on AMD Zen4, AMD Zen5, Intel Ice Lake, and Intel Sapphire Rapids
  • Linux aarch64: artifact/provenance review, LLVM IR/ASM/object heuristics, DudeCT, and BINSEC on AWS Graviton3 and Graviton4
  • Linux riscv64gc: artifact/provenance review, LLVM IR/ASM/object heuristics, and DudeCT on the RISE RISC-V lane; BINSEC is not claimed today
  • Linux s390x: artifact/provenance review, LLVM IR/ASM/object heuristics, and DudeCT on the IBM Z lane; BINSEC is not claimed today
  • Linux powerpc64le: artifact/provenance review, LLVM IR/ASM/object heuristics, and DudeCT on the IBM Power10 lane; BINSEC is not claimed today
  • macOS aarch64: local artifact/provenance review, LLVM IR/ASM/object heuristics, and DudeCT through just ct-full; BINSEC is not claimed for Mach-O today

WASM, no_std bare-metal, Windows, Linux MUSL, macOS x86_64, GCC codegen, and Cranelift are not part of the current native CT release claim. Some build today. Some have artifact-only or intended coverage. They still need their own evidence before I will claim constant-time behavior for them. I DO plan to add GCC and Cranelift coverage as time allows.

Perf!

Constant-time validation is not an excuse to be slow. In fact, it is motivation not to be.

The current bench overview is generated from raw results (an LLM generates this document programmatically) and includes the losses next to the wins:

https://github.com/loadingalias/rscrypto/blob/main/benchmark_results/OVERVIEW.md

For the 06/09/2026 v0.4.0 bench evidence:

  • Linux, fastest-external: 3,958 wins out of 6,525 matched comparisons
  • Linux, wins-or-ties: 5,878 out of 6,525 matched comparisons
  • Linux, fastest-external geomean: 1.60x
  • Apple Silicon, fastest-external: 373 wins out of 725 matched comparisons
  • Apple Silicon, wins-or-ties: 693 out of 725 matched comparisons
  • Apple Silicon, fastest-external geomean: 1.41x
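
For readers unfamiliar with geomean summaries, here is a minimal sketch of the arithmetic. It assumes each matched comparison yields a speedup ratio (external time divided by rscrypto time, so values above 1.0 are wins); the post does not spell out the exact formula, and `geomean` is a hypothetical helper.

```rust
/// Geometric mean of positive, finite ratios, computed in log space.
/// Returns `None` for an empty slice or any non-positive or non-finite entry.
pub fn geomean(ratios: &[f64]) -> Option<f64> {
    if ratios.is_empty() || ratios.iter().any(|&r| !(r > 0.0) || !r.is_finite()) {
        return None;
    }
    let log_sum: f64 = ratios.iter().map(|r| r.ln()).sum();
    Some((log_sum / ratios.len() as f64).exp())
}
```

The geometric mean treats a 2x win and a 2x loss as cancelling out, which an arithmetic mean of ratios would not.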

Some category numbers:

  • Checksums: 2.52x Linux fastest-external geomean
  • Hashes/MACs/XOFs: 1.41x
  • Auth/KDF: 1.24x
  • RSA: 1.55x
  • AEAD: 1.56x

"Fastest external" is the fastest competing implementation in each matched comparison, drawn from aws-lc-rs, ring, dryoc, dalek, crc-fast, RustCrypto, SHA2/3, BLAKE3, and others.

And the weak spots are still public too: Linux scrypt/password hashing, Linux ChaCha20-Poly1305 rows against ring and aws-lc-rs, near-parity Ed25519/X25519 ops, RISC-V checksum point losses, and Apple Silicon SHA3/XXH3 plus localized BLAKE3/CRC rows.

I will continue to push the envelope here. My current work relies on this crate being small, fast, and safe... that work is my life.

What rscrypto Is Today

rscrypto is a pure-Rust primitive stack.

It includes hashes, MACs, KDFs, password hashing, RSA, Ed25519, X25519, AEADs, CRCs, fast non-cryptographic hashes, no_std, WASM builds, portable fallbacks, and hardware accel in one crate.

Use one primitive:

[dependencies]
rscrypto = { version = "0.4.0", default-features = false, features = ["sha2"] }

Or the full primitive stack:

[dependencies]
rscrypto = { version = "0.4.0", features = ["full", "getrandom"] }

The primitive stack does not depend on OpenSSL, C FFI, RustCrypto, dalek, the blake3 crate, or the crc-* crates. Optional integrations such as getrandom, serde, and rayon are opt-in.

In my own codebase, replacing the pile of primitive deps w/ rscrypto removed fifteen dependencies and C-libs. It brought my compile times down drastically in conjunction w/ cargo-rail usage. That is not a universal promise, but it is the kind of consolidation this crate is designed to make possible: one feature matrix, one dispatch model, one security policy, one benchmark story. This is also my ethos as of late. I am not willing to play supply-chain roulette.

Remember, this is not a protocol stack. It is not TLS, PKI, key management, or a FIPS module.

What Comes Next

The roadmap is focused on building trust for the time being:

  • broader algo coverage where the primitive belongs in this crate
  • reviewed artifact hashes and artifact-diff review for CT-critical symbols
  • more formal and semi-formal verification experiments
  • third-party audit work before v1, which is really important to me
  • FIPS-ready structure w/o pretending to be FIPS validated
  • better CT evidence for WASM and no_std, w/ engine/hardware-specifics
  • GCC and Cranelift CT backends once the LLVM evidence has stayed stable for a while
  • platform expansion where real hardware and evidence can back the claim
  • post-quantum primitives only with portable constant-time baselines, official vectors, malformed-input tests, and benchmark coverage

The rules will remain the same: If a feature cannot be validated, benched, documented, and maintained, it does not belong in the public library yet.

Feedback Wanted

If you are building systems where cryptographic correctness matters, or where the performance of the primitives gives you a meaningful edge, I would like feedback on the approach, tooling, and validation methodology.

Especially:

  • the CT manifest shape
  • the DudeCT case model
  • the BINSEC kernel boundary
  • the target/platform claims
  • the benchmark methodology
  • the API shape before v1
  • migration blockers from RustCrypto, blake3, crc-fast, crc32fast, ring, aws-lc-rs, or other primitive crates

Links:

Discussions, issues, benchmarks, and hard technical criticism are welcome. Do not be shy. I have thick skin.
