UltrafastSecp256k1 v3.3.0

#algorithms #blockchain #performance #security

Highlights
Batch operations 17-67x faster — all-affine fast path with Pippenger touched-bucket + window tuning (#169)
OpenCL generator mul ~10% faster — precomputed affine table with mixed J+A adds eliminates per-thread table construction
CUDA BIP352 precomputed tweak tables, plus BENCH_CLOCK_WARMUP and a simplified warmup path
Schnorr batch verify optimized — cached x-only pubkeys, reused scratch buffers, retuned crossover, fast path through N=64
463+ code-scanning alerts resolved — braces, const, widening, dead-stores, init-vars, argumentSize
Complete audit infrastructure — P0+P1+P2 audit TODO completed (#148)
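
For readers unfamiliar with the bucket method behind the batch speedup, here is a minimal Python sketch of a Pippenger-style multi-scalar multiplication on secp256k1. It is an illustration only and is not taken from the library: the names `add`, `mul`, and `pippenger` are hypothetical, the arithmetic is plain affine with a modular inversion per addition, and keeping buckets in a dict so that only touched buckets hold a point is just one plausible reading of "touched-bucket". Python integers are not constant-time, so this is unsuitable for secret scalars.

```python
# Toy Pippenger (bucket) multi-scalar multiplication on secp256k1.
# All names are illustrative assumptions, not the library's API.
P = 2**256 - 2**32 - 977  # secp256k1 field prime
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Affine point addition; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2:
        if (y1 + y2) % P == 0:
            return None  # a == -b
        lam = 3 * x1 * x1 * pow(2 * y1, P - 2, P) % P  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P  # chord slope
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Reference double-and-add, used only to check the result."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def pippenger(scalars, points, w=4):
    """Compute sum(k_i * P_i) with w-bit windows and per-window buckets."""
    result = None
    windows = (256 + w - 1) // w
    for win in reversed(range(windows)):
        for _ in range(w):
            result = add(result, result)  # shift accumulator by one window
        buckets = {}  # digit -> sum of points with that digit (touched only)
        for k, pt in zip(scalars, points):
            digit = (k >> (win * w)) & ((1 << w) - 1)
            if digit:
                buckets[digit] = add(buckets.get(digit), pt)
        running = acc = None
        for digit in range((1 << w) - 1, 0, -1):
            running = add(running, buckets.get(digit))
            acc = add(acc, running)  # accumulates sum(digit * bucket[digit])
        result = add(result, acc)
    return result
```

The window width `w` trades the number of buckets against the number of windows, which is the kind of parameter the window tuning above would adjust; the default of 4 here is arbitrary.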
Performance
Batch ops 17-67x faster via all-affine fast path; Pippenger touched-bucket + window tuning (#169)
OpenCL generator mul — hardcode precomputed affine table for scalar_mul_generator + force __NV_CL_C_VERSION
CUDA BIP352 — precomputed tweak tables, BENCH_CLOCK_WARMUP, simplified warmup
CUDA BIP352 benchmark optimization and enriched project graph
OpenCL GLV generator phi table optimization
OpenCL generator nibble lookup optimization
Silent payment scan invariants optimization
Coin HD fixed-path derivation optimization
Schnorr batch verify — cache repeated x-only pubkeys in large batches, reuse scratch buffers, retune crossover, reduce setup passes, keep fast path through N=64, tune cutoff for N=128, trim seed serialization overhead, cache x-only lifts in parse path, reuse SHA256 base for batch weights
Field batch inversion — trim scratch overhead
OpenCL batch-inversion kernels added
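
As background for the batch-inversion items, the usual technique is Montgomery's trick, which inverts n field elements with one modular inversion and roughly 3n multiplications. The Python sketch below assumes that technique and works over the secp256k1 field prime; the function name `batch_inverse` is hypothetical, and the library's actual scratch layout and kernels are not shown here.

```python
# Montgomery's trick: invert many field elements with a single inversion.
# Illustrative sketch; not the library's implementation.
P = 2**256 - 2**32 - 977  # secp256k1 field prime

def batch_inverse(values, p=P):
    """Return [v^-1 mod p for v in values]; every v must be nonzero mod p."""
    n = len(values)
    if n == 0:
        return []
    prefix = [0] * n  # prefix[i] = product of values[0..i-1]
    acc = 1
    for i, v in enumerate(values):
        if v % p == 0:
            raise ZeroDivisionError("cannot invert zero")
        prefix[i] = acc
        acc = acc * v % p
    inv = pow(acc, p - 2, p)  # the only inversion: (v0 * v1 * ... * v_{n-1})^-1
    out = [0] * n
    for i in range(n - 1, -1, -1):
        out[i] = inv * prefix[i] % p  # strips all factors except values[i]
        inv = inv * values[i] % p     # now the inverse of values[0..i-1]
    return out
```

The `prefix` list is the scratch buffer in this sketch; trimming scratch overhead in a real implementation would typically mean reusing or shrinking that storage.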
Added
OpenCL LUT primitives for generator multiplication (#172)
Metal scalar_mul_generator_lut for Metal shaders (#171)
Metal wNAF w=4 for Metal shaders (#158)
Metal scalar_mul_glv for batched scalar multiplications (#155)
Cached schnorr batch path and preflight coverage fixes
Benchmark cached schnorr batch verification
Larger batch verify benchmark sizes
Source graph pipeline command and tooling improvements
Security & Hardening
Wallet seed-to-address cleanup hardened
ABI secret cleanup paths hardened
ECIES zero-ephemeral cleanup hardened
N-03 CT path for message signing (constant-time)
Solinas reduction: replaced the broken Barrett reduction with a correct Solinas implementation (#141)
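
To illustrate what a Solinas-style reduction looks like, the Python sketch below folds a 512-bit product modulo the secp256k1 field prime p = 2^256 - 2^32 - 977, using the identity 2^256 ≡ 2^32 + 977 (mod p). It assumes the reduction in question targets this field prime, which the note above does not state; the name `solinas_reduce` is hypothetical, and the final branch is not constant-time.

```python
# Solinas-style folding for p = 2^256 - C with C = 2^32 + 977.
# Illustrative sketch; not the library's implementation.
P = 2**256 - 2**32 - 977
C = 2**32 + 977          # 2^256 is congruent to C modulo P
MASK = (1 << 256) - 1

def solinas_reduce(x):
    """Reduce 0 <= x < 2^512 modulo P without a general division."""
    # First fold: high half times C, plus low half. Result is below 2^290.
    x = (x >> 256) * C + (x & MASK)
    # Second fold: the high part is now under 2^34, so the result is
    # below 2^256 + 2^67, which is less than 2 * P.
    x = (x >> 256) * C + (x & MASK)
    # One conditional subtraction finishes the reduction.
    if x >= P:
        x -= P
    return x
```

Unlike Barrett reduction, this needs no precomputed reciprocal and no estimated quotient, so there is no quotient-correction step to get wrong.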
Fixed
ARM64 SHA-256: vsha256h2q_u32 was passed the already-modified abcd register instead of the saved original value
MSVC C2026 string literal limit workaround (#173)
precompute_point_multiples stack allocation fix; ASan timeout 300→600s
Metal generator_mul_batch — use scalar_mul_glv correctly (#163)
CI bip39 audit regression (#161)
Clang-tidy code scanning warnings (#170)
463+ code-scanning alerts resolved across 4 PRs (#154, #156, #157, #162)
CI auto-detect compilers + best-effort source graph refresh
SonarCloud — exclude hash_accel.cpp, address.cpp from CPD; exclude cuda/** and platform-specific field_asm/field_simd from coverage (#139, #140)
CI SECP256K1_MARCH respected in cpu/CMakeLists.txt; benchmark regression downgraded to x86-64-v2 (#138)
SonarCloud fork PRs skipped + continue-on-error for Quality Gate (#159)
Audit
Complete audit infrastructure — P0+P1+P2 audit TODO finished (#148)
Test coverage for CT PrivateKey overloads and FE52 conditional_negate (#143)
