Problem setup (How Index Scan Internally works for both DBs)
Table in MySQL and PostgreSQL:
user(
id BIGINT PRIMARY KEY,
name VARCHAR(100),
age INT
)
Index:
CREATE INDEX idx_user_name ON user(name);
Query:
SELECT age FROM user WHERE name = 'Rahim';
Assume:
- Table has 10 million rows
- Index is a B-Tree
- name is not unique
- age is not in the index
1️⃣ MySQL (InnoDB) — Step by step
🔑 Important InnoDB rule
Secondary index stores the PRIMARY KEY, not the row location
So idx_user_name looks like:
(name, primary_key_id)
Step-by-step execution
🧭 Step 1: Traverse secondary index (name)
MySQL searches the B-Tree for 'Rahim'.
Cost: O(log N)
Example:
B-Tree levels:
Root → Internal → Leaf
At the leaf, it finds:
('Rahim', id=73482)
⚠️ It does NOT have age yet
🧭 Step 2: Traverse PRIMARY KEY index (clustered index)
In InnoDB:
Primary key index IS the table
Rows are physically stored in PK order
Now MySQL uses id = 73482 to search the PK B-Tree.
Cost: O(log N)
At the PK leaf node, it finds the full row:
(id=73482, name='Rahim', age=29)
✅ Total cost (MySQL)
🔍 Key MySQL characteristics
- ✅ Clustered index = good locality
- ❌ Two tree traversals
- ❌ PK lookup cost grows with table size
2️⃣ PostgreSQL — Step by step
🔑 Important PostgreSQL rule
Index stores a TID (tuple ID) → pointer to heap location
Index entry looks like:
(name, (block_id, offset))
Step-by-step execution
🧭 Step 1: Traverse B-Tree index (name)
PostgreSQL searches index for 'Rahim'.
Cost: O(log N)
At the leaf, it finds:
('Rahim', TID=(block=102345, offset=7))
🧭 Step 2: Heap fetch (direct pointer)
Postgres now:
Goes directly to heap page 102345
Reads row at offset 7
Cost: O(1) (conceptually)
Row found:
(id=73482, name='Rahim', age=29)
✅ Total cost (PostgreSQL)
🔍 Key PostgreSQL characteristics
- ✅ One B-Tree traversal
- ✅ Heap fetch is constant-time
- ❌ Heap pages may be scattered (less locality)
- ❌ Extra visibility checks (MVCC)
3️⃣ Side-by-side comparison (core difference)
4️⃣ Real-world performance (THIS is the key insight)
⚠️ Big-O is not the full story
When MySQL can be faster
- PK is small
- Data fits in buffer pool
- Rows are accessed sequentially
- Heavy read workloads
➡️ Clustered index wins
When PostgreSQL can be faster
- Very large tables
- Many secondary indexes
- Random access patterns
- Index-only scans possible
➡️ Pointer-based access wins
5️⃣ Index-only scan (Postgres secret weapon 🧠)
If you change index to:
CREATE INDEX idx_user_name_age ON user(name, age);
Then:
SELECT age FROM user WHERE name = 'Rahim';
🔥 PostgreSQL can return result WITHOUT touching heap at all
Cost:
O(log N)
MySQL cannot do true index-only scan in the same way because:
Secondary index does not contain row version visibility info
6️⃣ Final verdict (short answer)
📌 Theoretical efficiency
✅ PostgreSQL is more efficient
`O(logN) + O(1)
vs
O(logN) + O(logN)`
📌 Practical efficiency
`Small / medium datasets → MySQL often feels faster
Large datasets + many indexes → PostgreSQL scales better
Analytics / complex queries → PostgreSQL wins
Simple OLTP workloads → MySQL is excellent`
🧠 One-line intuition
MySQL: “Find index → find PK → find row”
Postgres: “Find index → jump straight to row”



Top comments (0)