You added a JSONB column. You created a GIN index. Your queries are still doing sequential scans.
Sound familiar?
Here's the problem: Most developers don't understand which operators GIN indexes actually support. You're probably using the wrong one.
I spent days of my life debugging slow JSONB queries before I figured this out. Here's what I learned.
The Problem: You're Using the Wrong Operator
Let's say you have a table with JSONB data:
CREATE TABLE products (
id BIGINT PRIMARY KEY,
data JSONB
);
-- Insert some products
INSERT INTO products (id, data) VALUES
(1, '{"brand": "Nike", "category": "Shoes", "price": 99.99}'),
(2, '{"brand": "Adidas", "category": "Shoes", "price": 89.99}'),
(3, '{"brand": "Nike", "category": "Apparel", "price": 49.99}');
You create a GIN index:
CREATE INDEX idx_products_data ON products USING GIN (data);
Then you run a query:
SELECT data->>'brand'
FROM products
WHERE data->>'brand' = 'Nike';
PostgreSQL does a sequential scan. The index is ignored.
Why?
What the Hell is a GIN Index?
GIN = Generalized Inverted Index.
Think of it like the index at the back of a book. Instead of "Page 42 contains this word," GIN says "The key brand with value Nike appears in rows 1, 3, 7, 42."
For JSONB, GIN breaks apart the JSON document and indexes every key-value pair separately:
{"brand": "Nike", "category": "Shoes"}
Gets indexed as:
brand:Nike → [row 1, row 3]
category:Shoes → [row 1, row 5]
This is perfect for "containment" queries: "Find all documents that contain {"brand": "Nike"}".
But here's the catch: GIN doesn't support all JSONB operators.
The Supported Operators (This is Critical)
PostgreSQL has two GIN operator classes for JSONB:
1. jsonb_ops (Default)
Supports these operators:
| Operator | What it does | Example |
|---|---|---|
@> |
Contains | data @> '{"brand": "Nike"}' |
? |
Key exists | data ? 'brand' |
| `? | ` | Any key exists |
?& |
All keys exist | data ?& array['brand', 'price'] |
@? |
JSONPath exists | data @? '$.brand' |
@@ |
JSONPath match | data @@ '$.price > 50' |
Notably missing: ->, ->>, #>, #>>
2. jsonb_path_ops
Supports only these operators:
| Operator | What it does |
|---|---|
@> |
Contains |
@? |
JSONPath exists |
@@ |
JSONPath match |
Smaller index, faster queries, but fewer operators.
Why Your Query Doesn't Use the Index
Remember this query?
SELECT data->>'brand'
FROM products
WHERE data->>'brand' = 'Nike';
The ->> operator is NOT GIN-indexable.
PostgreSQL sees this and says: "I can't use the GIN index for ->>. Sequential scan it is."
The Fix: Use @> (Containment)
SELECT data->>'brand'
FROM products
WHERE data @> '{"brand": "Nike"}';
Now PostgreSQL uses the index:
Bitmap Heap Scan on products
Recheck Cond: (data @> '{"brand": "Nike"}'::jsonb)
-> Bitmap Index Scan on idx_products_data
Index Cond: (data @> '{"brand": "Nike"}'::jsonb)
15-20x faster on large datasets.
Real Benchmark: 100k Documents
Setup:
CREATE TABLE logs (
id BIGSERIAL PRIMARY KEY,
metadata JSONB
);
-- Insert 100k log documents
INSERT INTO logs (metadata)
SELECT jsonb_build_object(
'user_id', 'user_' || (random() * 1000)::int,
'service', (ARRAY['api', 'web', 'worker'])[floor(random() * 3 + 1)],
'duration_ms', (random() * 5000)::int,
'status', (ARRAY['success', 'error'])[floor(random() * 2 + 1)]
)
FROM generate_series(1, 100000);
Test 1: Without Index
EXPLAIN ANALYZE
SELECT metadata->>'user_id'
FROM logs
WHERE metadata->>'user_id' = 'user_42';
Seq Scan on logs
Filter: ((metadata ->> 'user_id') = 'user_42')
Rows Removed by Filter: 99900
Execution Time: 45.2 ms
Test 2: With GIN Index (Wrong Operator)
CREATE INDEX idx_logs_metadata ON logs USING GIN (metadata);
EXPLAIN ANALYZE
SELECT metadata->>'user_id'
FROM logs
WHERE metadata->>'user_id' = 'user_42';
Seq Scan on logs -- STILL sequential scan!
Execution Time: 44.8 ms
No improvement. Index not used.
Test 3: With GIN Index (Correct Operator)
EXPLAIN ANALYZE
SELECT metadata->>'user_id'
FROM logs
WHERE metadata @> '{"user_id": "user_42"}';
Bitmap Index Scan on idx_logs_metadata
Index Cond: (metadata @> '{"user_id": "user_42"}'::jsonb)
Execution Time: 2.8 ms
16x faster. Index used.
jsonb_ops vs jsonb_path_ops: Which to Use?
I tested both on a 100k row table with typical JSONB documents (~500 bytes each).
Index Size
| Operator Class | Index Size | Table Size |
|---|---|---|
| No index | 0 MB | 27 MB |
jsonb_ops |
18 MB | 27 MB |
jsonb_path_ops |
6 MB | 27 MB |
jsonb_path_ops is 3x smaller.
Why? jsonb_ops indexes every key separately. jsonb_path_ops uses a hash-based approach for paths.
Write Performance (INSERT 100k rows)
| Operator Class | Time | vs No Index |
|---|---|---|
| No index | 1.58s | baseline |
jsonb_ops |
2.84s | +79% |
jsonb_path_ops |
1.84s | +16% |
jsonb_path_ops has way less write overhead.
Query Performance (Containment @>)
WHERE metadata @> '{"service": "api"}'
| Operator Class | Time |
|---|---|
| No index | 45ms |
jsonb_ops |
3.2ms |
jsonb_path_ops |
2.4ms |
jsonb_path_ops is 25% faster for containment queries.
Decision Matrix
Use jsonb_ops (default) if:
- You need
?,?|,?&(key existence checks) - You query unpredictable paths (wildcards)
- Index size doesn't matter
Use jsonb_path_ops if:
- You only use
@>(containment) - Write performance matters (high INSERT rate)
- Index size matters (smaller = better)
For 90% of use cases, jsonb_path_ops is better.
Common Gotchas
1. Type Casting Hell
This doesn't work:
WHERE metadata->>'duration_ms' > 1000
Why? ->> returns text, not a number. PostgreSQL can't use the index on a text comparison.
Fix:
WHERE (metadata->>'duration_ms')::int > 1000
Or better, use JSONPath:
WHERE metadata @@ '$.duration_ms > 1000'
2. Nested Field Queries
This doesn't use the index:
WHERE metadata->'user'->>'id' = '42'
Fix with @>:
WHERE metadata @> '{"user": {"id": "42"}}'
3. Partial Matches Don't Work
This doesn't use the index:
WHERE metadata->>'name' LIKE '%john%'
GIN indexes are for exact key-value matches, not substring searches.
Workaround: Use pg_trgm extension for trigram indexes on extracted fields:
CREATE EXTENSION pg_trgm;
CREATE INDEX idx_name_trgm
ON logs USING GIN ((metadata->>'name') gin_trgm_ops);
Expression Indexes for Specific Keys
If you always query the same 1-3 keys, use expression indexes instead:
-- Instead of GIN on entire JSONB column
CREATE INDEX idx_user_id ON logs ((metadata->>'user_id'));
CREATE INDEX idx_service ON logs ((metadata->>'service'));
Why?
- Smaller index (B-tree)
- Faster queries (for those specific keys)
- Lower write overhead
When to use:
- Stable schema (keys don't change)
- Query patterns are predictable
When NOT to use:
- Unpredictable query keys
- Many different keys queried
GIN Fast Update (Write Optimization)
By default, GIN uses "fast update" mode. Inserts go to a pending list, then bulk-updated later.
Pros:
- Faster INSERTs (batched updates)
- Less lock contention
Cons:
- Reads must scan pending list (slower)
- Unpredictable latency spikes (when pending list flushes)
Disable for Consistent Read Performance
ALTER INDEX idx_products_data SET (fastupdate = off);
Use when:
- Read latency predictability > write throughput
- Low INSERT rate
- Real-time dashboards
Production Tips
1. Always EXPLAIN ANALYZE
Don't assume the index is used:
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...
Look for:
Bitmap Index Scan on <index_name>-
Index Cond:(notFilter:)
2. Monitor Index Bloat
GIN indexes can bloat with heavy writes:
SELECT
schemaname,
tablename,
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size
FROM pg_stat_user_indexes
WHERE indexrelname LIKE '%gin%'
ORDER BY pg_relation_size(indexrelid) DESC;
Fix: REINDEX CONCURRENTLY
3. Vacuum Regularly
GIN pending lists need cleanup:
SELECT gin_clean_pending_list('idx_products_data');
Or rely on autovacuum (recommended).
When NOT to Use JSONB + GIN
Use dedicated columns if:
- Schema is stable (keys won't change)
- You need foreign key constraints
- You need strong typing (dates, numbers)
- You're querying 80%+ of the same fields
Use JSONB + GIN if:
- Schema is evolving (startup mode)
- Sparse data (many optional fields)
- Flexible querying across unknown keys
- Storing metadata, logs, events
Real-World Example
I use this pattern in Logtide, an open-source log management platform:
CREATE TABLE logs (
time TIMESTAMPTZ NOT NULL,
level TEXT NOT NULL,
service TEXT NOT NULL,
message TEXT,
metadata JSONB -- User-defined fields
);
CREATE INDEX idx_logs_metadata
ON logs USING GIN (metadata jsonb_path_ops);
Production stats (30M logs):
- Index size: 2.5 GB (table: 15 GB compressed)
- Query time: P95 78ms for
metadata @> '{...}' - Write throughput: 18.5k logs/sec
GIN on JSONB handles production load without breaking a sweat.
Summary
Stop using ->> in WHERE clauses. Use @> (containment) instead.
Quick reference:
-- Wrong (index not used)
WHERE data->>'key' = 'value'
-- Right (index used)
WHERE data @> '{"key": "value"}'
-- Create index
CREATE INDEX idx_data ON table USING GIN (data jsonb_path_ops);
Key takeaways:
- GIN supports
@>,?,@?,@@— NOT->or->> - Use
jsonb_path_opsunless you need key existence operators - Always
EXPLAIN ANALYZEto verify index usage - Consider expression indexes for stable schemas
PostgreSQL JSONB + GIN is fast. You just need to use it correctly.
some benchmarks may vary on your machine/queries
Top comments (0)