TL;DR
Built a PostgreSQL extension that creates UUIDs looking like random v4 but containing hidden timestamps. Result: 50% smaller indexes with perfect privacy protection.
Available now on PGXN: pgxn install pg_uuid_v8
The Problem Every Backend Dev Faces 🤔
You know this pain:
- UUID v4: Random, secure, but terrible for DB indexing (fragmentation nightmare)
- UUID v7: Great for indexing, but reveals creation timing (privacy issues)
For years, we've been stuck choosing between performance and privacy.
The Solution: Steganographic UUIDs 🔮
What if UUIDs could be both random AND fast? Enter steganography - hiding encrypted timestamps inside random-looking UUIDs.
-- Looks like normal UUID v4
SELECT uuid_v8_generate();
-- bf3fcf45-9476-4138-bf48-03933d90dc2d
-- But contains hidden timestamp!
SELECT uuid_stego_extract_timestamp('bf3fcf45-9476-4138-bf48-03933d90dc2d');
-- 1714127712849302 (microseconds since epoch)
Technical Deep Dive 🛠️
UUID Structure
Standard v4: xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
Steganographic: TTTTTTTT-TTTT-4RRR-yRRR-RRRRRRRRRRRR
T = Encrypted timestamp bits (48 bits)
R = Random bits
4 = Version marker (v4 compliant)
Encryption Algorithm
// XOR encryption with SHA-256 key derivation
uint64 encrypt_timestamp(uint64 timestamp, const char* seed) {
uint64 key = sha256_derive_key(seed);
return (timestamp ^ key) & 0xFFFFFFFFFFFFULL;
}
PostgreSQL C Extension Implementation
PG_FUNCTION_INFO_V1(uuid_v8_generate);
Datum uuid_v8_generate(PG_FUNCTION_ARGS) {
pg_uuid_t *uuid = palloc(sizeof(pg_uuid_t));
uint64 timestamp = get_current_timestamp_us();
uint64 key = generate_key_from_seed(stego_seed);
uint64 encrypted = crypt_timestamp(timestamp, key);
// Embed in first 48 bits + random fill
embed_in_uuid(uuid, encrypted);
set_uuid_v4_bits(uuid);
PG_RETURN_UUID_P(uuid);
}
Performance Results 📈
Tested on 500k records:
| Index Type | Size | Performance |
|---|---|---|
| UUID v4 Full | 3.1 MB | Sequential Scan |
| Stego Functional | 1.5 MB | Index Scan |
-- The magic: functional indexing
CREATE INDEX events_time_idx ON events
USING btree (uuid_stego_extract_timestamp(id));
-- Fast time-based queries
EXPLAIN ANALYZE SELECT * FROM events
WHERE uuid_stego_extract_timestamp(id) BETWEEN start_ts AND end_ts;
-- Index Scan using events_time_idx (fast!)
Getting Started (Now Available on PGXN!) 📦
Easy Installation
# Install from PGXN (recommended)
pgxn install pg_uuid_v8
# Or build from source
git clone https://github.com/ineron/pg_uuid_v8
cd pg_uuid_v8
make && sudo make install
Setup
CREATE EXTENSION pg_uuid_v8;
SELECT uuid_v8_set_seed('your_secret_2024');
CREATE TABLE events (
id uuid PRIMARY KEY DEFAULT uuid_v8_generate(),
data jsonb
);
CREATE INDEX ON events (uuid_stego_extract_timestamp(id));
Queries That Actually Work
-- Efficient pagination
SELECT * FROM events
WHERE uuid_stego_extract_timestamp(id) > last_timestamp
ORDER BY uuid_stego_extract_timestamp(id) LIMIT 100;
-- Time range analytics
SELECT count(*) FROM events
WHERE uuid_stego_in_range(id, '2024-01-01', '2024-12-31');
The Secret Sauce: Functional Indexes 🔑
PostgreSQL's functional indexes are the key. Instead of indexing the full 16-byte UUID, we index the extracted 8-byte timestamp:
-- Traditional approach (large index)
CREATE INDEX ON table (uuid_column); -- 16 bytes per entry
-- Steganographic approach (compact index)
CREATE INDEX ON table (uuid_stego_extract_timestamp(uuid_column)); -- 8 bytes per entry
Result: 50% space savings + Index Scan performance!
Security & Encryption Modes 🔒
The extension supports multiple encryption modes:
-- Fast privacy protection (default)
SELECT uuid_v8_set_encryption_mode('XOR');
-- GDPR/compliance ready
SELECT uuid_v8_set_encryption_mode('AES128');
-- Maximum security
SELECT uuid_v8_set_encryption_mode('AES256');
Threat Model
✅ Prevents: Timing analysis attacks, creation pattern discovery
✅ Protects: Privacy-sensitive applications (healthcare, finance)
⚠️ Note: XOR mode provides privacy, AES modes provide cryptographic security
Real-World Use Cases 🎯
Perfect For:
- High-volume APIs (millions of records daily)
- Privacy-sensitive systems (healthcare, finance)
- Multi-tenant platforms (SaaS applications)
- Time-series data with privacy requirements
Migration Example:
-- Existing UUID v4 tables
ALTER TABLE existing_table
ADD COLUMN new_id uuid DEFAULT uuid_v8_generate();
-- Create functional index
CREATE INDEX ON existing_table (uuid_stego_extract_timestamp(new_id));
-- Gradually migrate queries
Performance Deep Dive 📊
Index Size Analysis
-- Compare index sizes
SELECT
indexname,
pg_size_pretty(pg_total_relation_size(indexname::regclass)) as size
FROM pg_indexes
WHERE tablename = 'test_table';
-- indexname | size
-- ------------------------------+--------
-- test_table_uuid_idx | 3.1 MB
-- test_table_stego_func_idx | 1.5 MB
Query Performance
-- Query plan comparison
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM large_table
WHERE uuid_stego_extract_timestamp(id) BETWEEN start_ts AND end_ts;
-- Result: Index Scan using large_table_stego_time_idx
-- Buffers: shared hit=3, Execution Time: 0.040 ms
Lessons Learned 📚
What Worked:
- Functional indexes are PostgreSQL's superpower
- Standard compliance (UUID v4) ensures easy adoption
- PGXN distribution makes installation trivial
- Multiple encryption modes satisfy different security requirements
Challenges Overcome:
-
LLVM bitcode compilation issues (solved with
NO_LLVM=1) -
Memory management in PostgreSQL extensions (
pallocvsmalloc) - PGXN metadata spec compliance (v1.0.0 validation)
Architecture Decisions 🏗️
Why C Extension vs PL/pgSQL?
- Performance: Native speed for cryptographic operations
- Integration: Deep PostgreSQL type system integration
- Security: Compiled code vs interpreted SQL
Why Functional Indexes vs Custom Index Types?
- Compatibility: Works with all PostgreSQL versions
- Maintenance: Leverages existing B-tree infrastructure
- Flexibility: Standard PostgreSQL query optimization
Community & Open Source 🌟
- License: PostgreSQL License (permissive)
- PGXN: https://pgxn.org/dist/pg_uuid_v8/
- GitHub: Full source + comprehensive docs
- Tests: Regression suite across PostgreSQL versions
Try It Yourself! 🚀
# Quick start
pgxn install pg_uuid_v8
psql -c "CREATE EXTENSION pg_uuid_v8; SELECT uuid_v8_generate();"
Benchmark Your Use Case
-- Create test table
CREATE TABLE benchmark (
id uuid DEFAULT uuid_v8_generate(),
data text DEFAULT 'sample data'
);
-- Insert test data
INSERT INTO benchmark (data)
SELECT 'test-' || generate_series(1,100000);
-- Create functional index
CREATE INDEX benchmark_time_idx ON benchmark
(uuid_stego_extract_timestamp(id));
-- Test performance
EXPLAIN ANALYZE
SELECT * FROM benchmark
WHERE uuid_stego_extract_timestamp(id) >
(EXTRACT(EPOCH FROM NOW() - INTERVAL '1 hour') * 1000000)::bigint;
What's Next? 🔮
Working on:
- Version management for seamless upgrades
- Monitoring functions for performance analytics
- ORM integrations for popular frameworks
- Cloud platform deployment guides (AWS RDS, Google Cloud SQL)
Discussion 💬
- Have you hit UUID performance walls in your applications?
- What's your current approach to time-based indexing?
- Tried steganographic UUIDs yet? Drop your benchmark results!
The PostgreSQL ecosystem is amazing for solving real-world problems. What database challenges are you tackling?
pg_uuid_v8
A PostgreSQL extension for steganographic UUIDs with embedded timestamps.
Overview
pg_uuid_v8 addresses the performance vs privacy trade-off in UUID usage by implementing steganographic UUIDs. These UUIDs maintain full compatibility with the UUID v4 format while embedding hidden timestamps that enable efficient indexing and range queries.
Features
- UUID v4 Compatibility: Generated UUIDs pass standard v4 validation (correct version and variant bits)
- Hidden Timestamps: Microsecond-precision timestamps embedded using steganographic techniques
- Configurable Encryption: XOR, AES-128, and AES-256 modes for timestamp obfuscation
- Functional Indexing: Support for PostgreSQL functional indexes on extracted timestamps
- Range Queries: Efficient time-based queries using hidden timestamp data
- Seed Management: Configurable encryption seeds via PostgreSQL GUC variables
Technical Approach
Standard UUID implementations present a trade-off between indexing performance and timestamp privacy:
- UUID v4: Random values provide good privacy but result in poor B-tree index performance due to random insertion patterns
- UUID v7…
Building better databases one UUID at a time 🛠️
Links:
Top comments (0)