Indexing, Hashing & Query I/O in DBMS
Efficient data retrieval is one of the most important goals in any database system.
When we query a table, the DBMS must decide how to find the required data, and scanning the entire table gets slower as the table grows.
That’s where Indexing, Hashing, and Query I/O optimization come into play.
📚 1️⃣ Indexing in DBMS
💡 What is an Index?
An index is a data structure that improves the speed of data retrieval operations on a table — similar to how an index in a book helps you find topics quickly.
Without an index, the DBMS must perform a full table scan, checking every row.
With an index, it can usually locate the matching records by reading only a few index pages instead of every row.
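To make the difference concrete, here is a minimal Python sketch. It is not how a real DBMS stores data: the `rows` list, the `keys` list standing in for an index, and both function names are assumptions made up for illustration.

```python
from bisect import bisect_left

# Hypothetical table: rows of (id, name), stored sorted by id.
rows = [(i, f"emp{i}") for i in range(0, 1000, 2)]  # ids 0, 2, 4, ..., 998

def full_scan(rows, target_id):
    """Check every row in order; return (row, rows_examined)."""
    examined = 0
    for row in rows:
        examined += 1
        if row[0] == target_id:
            return row, examined
    return None, examined

# A toy "index": the sorted list of keys, position-aligned with rows.
keys = [row[0] for row in rows]

def index_lookup(rows, keys, target_id):
    """Binary-search the key list, then fetch the row at that position."""
    pos = bisect_left(keys, target_id)
    if pos < len(keys) and keys[pos] == target_id:
        return rows[pos]
    return None
```

The scan examines rows one by one until it finds a match, while the binary search needs only about log2(n) comparisons on the sorted keys.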
🧱 Types of Indexes
Type | Description
Primary Index | Built on the primary key; records are stored in sorted order of that key.
Secondary Index | Created on non-primary attributes for faster lookup.
Clustered Index | Stores the table's rows in the same order as the index (at most one per table).
Non-Clustered Index | Keeps a separate structure pointing to the actual data.
Dense Index | Every record has an entry.
Sparse Index | Only some records have entries, which requires the data to be sorted on the key (less space, more traversal).
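The dense/sparse distinction can be sketched in a few lines of Python. Everything here is an assumption for illustration: the block size of 4, the `records` data, and the function names are invented, and real systems work with disk pages rather than Python lists.

```python
from bisect import bisect_right

# Hypothetical data file: records sorted by key, grouped into fixed-size blocks.
BLOCK_SIZE = 4
records = [(k, f"row-{k}") for k in range(10, 130, 10)]  # keys 10, 20, ..., 120
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Dense index: one entry per record -> (block number, offset inside the block).
dense = {
    key: (b, off)
    for b, block in enumerate(blocks)
    for off, (key, _) in enumerate(block)
}

# Sparse index: one entry per block, holding the first key of that block.
sparse_keys = [block[0][0] for block in blocks]

def dense_lookup(key):
    loc = dense.get(key)
    if loc is None:
        return None
    b, off = loc
    return blocks[b][off]

def sparse_lookup(key):
    # Find the last block whose first key is <= key, then scan inside that block.
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    for rec in blocks[b]:
        if rec[0] == key:
            return rec
    return None
```

The sparse index holds 3 entries instead of 12, but each lookup has to scan inside one block, which is the "less space, more traversal" trade-off.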
🧠 Example:
CREATE INDEX idx_name
ON Employees (name);
Now, SELECT * FROM Employees WHERE name = 'John';
can use the index to find matching rows without scanning the whole table 🚀
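One way to check whether the index is actually picked up is to ask the database for its query plan. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table columns, the sample rows, and the `plan` helper are assumptions for illustration, and other database systems word their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO Employees (name, dept) VALUES (?, ?)",
    [(f"user{i}", "eng") for i in range(1000)] + [("John", "sales")],
)

QUERY = "SELECT * FROM Employees WHERE name = 'John'"

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(QUERY)   # without an index, SQLite reports a table scan
conn.execute("CREATE INDEX idx_name ON Employees (name)")
plan_after = plan(QUERY)    # with the index, the plan mentions idx_name
result = conn.execute(QUERY).fetchall()
```

Printing `plan_before` and `plan_after` shows the switch from a scan to an index search. The exact wording depends on the SQLite version.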
🧩 2️⃣ Hashing in DBMS
Hashing is another data access method — instead of sorting and searching, it uses a hash function to compute the location of data directly.
⚡ How it Works:
Hash Function → Hash(Key) = Address
Each key is converted into an address (or bucket) where the record is stored.
🧱 Example:
If Hash(101) → 5, record with key 101 will be stored in bucket 5.
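Here is a minimal Python sketch of that idea. The modulo hash function and the bucket count of 8 are assumptions, chosen only because 101 % 8 happens to equal 5 and so reproduces the example above. Real systems use more elaborate hash functions.

```python
NUM_BUCKETS = 8  # assumption: chosen so that key 101 lands in bucket 5

def hash_key(key: int) -> int:
    """Map a key to a bucket number."""
    return key % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(key, record):
    """Store the record in the bucket computed from its key."""
    buckets[hash_key(key)].append((key, record))

def lookup(key):
    """Go straight to the computed bucket and search only there."""
    for k, record in buckets[hash_key(key)]:
        if k == key:
            return record
    return None

insert(101, "record for key 101")
```

A lookup computes the bucket number and inspects only that bucket, so no other part of the structure is touched.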
🔹 Advantages
Very fast access for equality searches (e.g. WHERE id = 101).
No need to traverse a tree-structured index or keep the data sorted.
🔹 Disadvantages
Not efficient for range queries (BETWEEN, <, >, etc.)
May cause collisions (different keys map to the same bucket).
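The range-query weakness follows from how hashing scatters keys. A small Python sketch, with made-up keys and a modulo-8 hash as assumptions, compares a range query over hash buckets with the same query over sorted keys:

```python
from bisect import bisect_left, bisect_right

NUM_BUCKETS = 8  # assumption for illustration
keys = [101, 205, 17, 42, 350, 120, 99, 180]

# Hash layout: a key's position depends on key % NUM_BUCKETS, not on key order.
buckets = [[] for _ in range(NUM_BUCKETS)]
for k in keys:
    buckets[k % NUM_BUCKETS].append(k)

def hash_range(lo, hi):
    """Range query on hashed data: every bucket must be inspected."""
    touched = 0
    found = []
    for bucket in buckets:
        touched += 1
        found.extend(k for k in bucket if lo <= k <= hi)
    return sorted(found), touched

# Sorted layout: matching keys sit next to each other.
sorted_keys = sorted(keys)

def sorted_range(lo, hi):
    """Range query on sorted data: two binary searches, then one slice."""
    start = bisect_left(sorted_keys, lo)
    end = bisect_right(sorted_keys, hi)
    return sorted_keys[start:end]
```

Both functions return the same keys, but the hashed version has to visit all 8 buckets because neighbouring key values can land anywhere.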
⚙️ Collision Handling Techniques
Open Addressing — find another empty slot.
Chaining — use a linked list for multiple keys in the same bucket.
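Both techniques can be sketched in Python. This is a simplified illustration under stated assumptions: the class names are invented, the hash is a plain modulo, open addressing is shown in its linear-probing variant, and deletion is left out.

```python
class ChainedTable:
    """Chaining: each bucket holds a list of (key, value) pairs."""
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def put(self, key, value):
        chain = self.buckets[key % len(self.buckets)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)   # overwrite an existing key
                return
        chain.append((key, value))

    def get(self, key):
        for k, v in self.buckets[key % len(self.buckets)]:
            if k == key:
                return v
        return None


class LinearProbingTable:
    """Open addressing: on a collision, try the next slot in the table."""
    def __init__(self, size=8):
        self.slots = [None] * size

    def put(self, key, value):
        n = len(self.slots)
        for step in range(n):
            i = (key + step) % n
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return i                  # slot where the key ended up
        raise RuntimeError("table is full")

    def get(self, key):
        n = len(self.slots)
        for step in range(n):
            i = (key + step) % n
            if self.slots[i] is None:
                return None               # empty slot: key was never stored
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return None
```

With 8 slots, keys 101 and 109 both hash to 5. The chained table keeps both in bucket 5, while the probing table places 109 in slot 6.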
💾 3️⃣ Query I/O (Input/Output)
When a query runs on data that does not fit in memory, the DBMS often spends much of its time on I/O operations: reading and writing data pages from disk.
Optimizing I/O is key to improving performance.
🔍 Query I/O Workflow
1. Parse and validate the SQL query.
2. Use the optimizer to choose an execution plan (index scan, hash join, etc.).
3. Execute the plan, fetching the required data pages from disk into the buffer cache.
4. Return the result to the user.
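The role of the buffer cache in this workflow can be illustrated with a toy page cache in Python. The LRU eviction policy, the `BufferCache` class, and the dictionary standing in for the disk are all assumptions. Real systems use their own, often more sophisticated, replacement policies.

```python
from collections import OrderedDict

class BufferCache:
    """Toy LRU page cache that counts simulated disk reads."""
    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk              # hypothetical disk: page_id -> page contents
        self.pages = OrderedDict()    # cached pages, least recently used first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # hit: mark as most recently used
            return self.pages[page_id]
        self.disk_reads += 1                  # miss: simulated disk I/O
        page = self.disk[page_id]
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)    # evict the least recently used page
        return page

disk = {i: f"page-{i}" for i in range(10)}
cache = BufferCache(capacity=2, disk=disk)
for page_id in [1, 2, 1, 3, 1, 2]:
    cache.get_page(page_id)
```

Six page requests cause only four simulated disk reads here, because page 1 is served from the cache twice.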
🔧 Ways to Optimize Query I/O
Use appropriate indexes on frequently searched columns.
Avoid SELECT * (fetch only needed columns).
Use joins carefully — prefer indexed joins.
Partition large tables for faster access.
Analyze query plans (EXPLAIN in SQL).
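A couple of these tips can be tried out with Python's built-in sqlite3 module. The sketch below compares the plan for SELECT * with the plan for a query that only needs the indexed column. The table layout, sample data, and the `plan` helper are assumptions, and the plan text is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO Employees (name, dept, salary) VALUES (?, ?, ?)",
    [(f"user{i}", "eng", 1000.0 + i) for i in range(500)],
)
conn.execute("CREATE INDEX idx_name ON Employees (name)")

def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# SELECT * needs columns that are not in the index, so table rows are also read.
wide_plan = plan("SELECT * FROM Employees WHERE name = 'user42'")

# Selecting only the indexed column lets SQLite answer from the index alone.
narrow_plan = plan("SELECT name FROM Employees WHERE name = 'user42'")
```

In SQLite the narrow query is reported as using a covering index, meaning the table itself is never read for that query.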
🧾 Quick Summary
Concept | Description | Use Case
Indexing | Sorted lookup structure for fast search | Range queries
Hashing | Direct address computation | Equality search
Query I/O | Disk operations during query execution | Performance tuning
💡 Takeaway:
Indexes and hashing make searches lightning-fast ⚡, while efficient I/O management keeps your queries scalable and optimized. Together, they’re the core of any high-performance DBMS.