Athreya aka Maneshwar

Posted on Mar 22

Inside SQLite’s Frontend: How the Query Optimizer Makes Your SQL Fast

#webdev #programming #database #architecture

In the previous part, you saw how SQLite converts a parse tree into bytecode. At that point, SQLite knows exactly what to do and how to execute it.

But there is still one critical question left.

Is this the fastest way to do it?

That is where the query optimizer comes in.

It sits between parsing and code generation and decides how your query should actually be executed for the best performance.

Why Optimization Exists at All

Given a single SQL query, there are often multiple ways to execute it.

Take a simple example:

SELECT * FROM users WHERE age = 25;

This query can be executed in different ways:

Scan the entire table and check each row
Use an index on age to directly find matching rows

Both approaches produce the same result, but the performance difference can be massive.

The job of the optimizer is to pick the approach that produces the most efficient bytecode program.

Different parse trees can represent equivalent relational operations, and each one can lead to a different execution strategy.

The optimizer’s role is to select the one that minimizes execution time and resource usage.
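To see the two strategies side by side, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the sample data, and the index name idx_users_age are assumptions made up for this illustration. It runs the same query before and after creating an index and asks SQLite for its plan each time with EXPLAIN QUERY PLAN.

```python
import sqlite3

def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, used only for this illustration.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [(f"user{i}", 20 + i % 10) for i in range(1000)],
)

QUERY = "SELECT * FROM users WHERE age = 25"

# Without an index, the only option is to scan the table.
rows_before = sorted(conn.execute(QUERY).fetchall())
plan_before = plan(conn, QUERY)

conn.execute("CREATE INDEX idx_users_age ON users(age)")

# With an index on age, the optimizer can search instead of scan.
rows_after = sorted(conn.execute(QUERY).fetchall())
plan_after = plan(conn, QUERY)

print(plan_before)
print(plan_after)
print(rows_before == rows_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should describe a scan of users and the second a search that names the index, while the rows returned stay identical.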

Plans, Not Just Queries

Internally, every SQL query is converted into a query plan.

A plan is essentially a strategy that answers:

Which tables to access first
Which indexes to use
How to filter rows
How to handle intermediate results

Each parse tree corresponds to a specific plan. The optimizer evaluates possible alternatives and chooses a plan that is efficient enough.

Finding the absolute best plan is computationally expensive, so SQLite does not try to be perfect.

Instead, it focuses on avoiding bad plans and finding a good enough plan quickly.
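A plan becomes more interesting once two tables are involved, because the optimizer also has to pick a loop order. The sketch below assumes a hypothetical orders table and an index named idx_orders_user, neither of which comes from the article. It simply prints whatever plan your SQLite build chooses, since the chosen order can differ by version and by available statistics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-table schema for illustration only.
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    CREATE INDEX idx_orders_user ON orders(user_id);
""")

sql = """
    SELECT users.name, orders.total
    FROM orders JOIN users ON users.id = orders.user_id
    WHERE users.age = 25
"""

# Each row describes one step of the plan: the outer loop is listed first,
# followed by the inner loop and how each table is accessed.
steps = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
for step in steps:
    print(step)
```

Reading the output top to bottom answers the questions listed above: which table is visited first, which index (if any) is used for each table, and where the filtering happens.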

SQLite’s Philosophy: Frontend Does All the Work

One important design choice in SQLite is that the Virtual Machine does not optimize anything.

It simply executes bytecode instructions exactly as given.

This means all optimization must happen in the frontend, before bytecode is generated.

If the optimizer makes a poor decision, the VM will blindly execute inefficient instructions.

That is why query optimization is one of the most critical responsibilities in SQLite’s architecture.
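You can look at the bytecode the frontend hands to the VM by prefixing a statement with EXPLAIN. This is a rough sketch, with a made-up users table and index name, that compares the opcode list for the same query before and after an index exists. Opcode names are an internal detail and can change between SQLite releases, so treat the printed lists as informational only.

```python
import sqlite3

def opcodes(conn, sql):
    # Column 1 of each EXPLAIN row holds the opcode name.
    return [row[1] for row in conn.execute("EXPLAIN " + sql)]

conn = sqlite3.connect(":memory:")
# Hypothetical schema for illustration only.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

QUERY = "SELECT * FROM users WHERE age = 25"

without_index = opcodes(conn, QUERY)
conn.execute("CREATE INDEX idx_users_age ON users(age)")
with_index = opcodes(conn, QUERY)

print(without_index)
print(with_index)
```

The SQL text is identical in both cases, yet the two programs differ. The VM has no say in that: it only runs whichever program the frontend produced.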

The Real Cost: Accessing Tables

The biggest cost in query execution is not computation. It is accessing data from disk.

Every time SQLite reads rows from a table, it performs I/O operations, which are expensive.

So the optimizer’s main goal is simple:

Reduce the number of rows read from base tables.

The fewer rows accessed, the faster the query runs.
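Python's sqlite3 module does not expose a direct "rows read" counter, so the sketch below uses a proxy: a progress handler that fires roughly once per VM instruction. The schema, data, and index name are assumptions for this example, and instruction counts are only a stand-in for I/O, but they make the difference in work visible.

```python
import sqlite3

class StepCounter:
    """Counts progress-handler callbacks, roughly one per VM instruction."""

    def __init__(self):
        self.steps = 0

    def __call__(self):
        self.steps += 1
        return 0  # 0 tells SQLite to keep running the statement

def run_and_count(conn, counter, sql):
    # Reset the counter, run the query to completion, report rows and steps.
    counter.steps = 0
    rows = conn.execute(sql).fetchall()
    return len(rows), counter.steps

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data for illustration only.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [(f"user{i}", i % 100) for i in range(10_000)],
)
conn.commit()

counter = StepCounter()
conn.set_progress_handler(counter, 1)

QUERY = "SELECT * FROM users WHERE age = 25"
rows_scan, steps_scan = run_and_count(conn, counter, QUERY)

conn.execute("CREATE INDEX idx_users_age ON users(age)")
rows_index, steps_index = run_and_count(conn, counter, QUERY)

print(rows_scan, steps_scan)
print(rows_index, steps_index)
```

Both runs return the same number of rows, but the indexed run should need far fewer VM steps because it never touches the rows that do not match.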

Choosing Between Full Scan and Index Scan

For every table involved in a query, the optimizer must decide how to access it.

There are two main options.

Full Table Scan

SQLite reads every row in the table in rowid order.

This happens when:

No index exists on the column being filtered
The optimizer decides an index is not beneficial

Example:

SELECT * FROM users;

This requires scanning the entire table.

Index Scan

If an index exists, SQLite can use it to narrow down the rows.

Example:

SELECT * FROM users WHERE age = 25;

If there is an index on age, SQLite can jump directly to matching entries instead of scanning everything.

For very specific queries like:

SELECT * FROM users WHERE rowid = 2;

SQLite can directly access a single row using the table’s primary B+ tree, making the query extremely fast.

If no index exists for a condition like:

SELECT * FROM users WHERE age = 25;

SQLite has no choice but to scan the entire table and check each row individually.
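The access paths described in this section can be checked with EXPLAIN QUERY PLAN. The following sketch assumes a users table with an index named idx_users_age (both invented for the example). It also uses SQLite's NOT INDEXED clause to force a full scan even though an index exists, which is handy for comparing the two paths.

```python
import sqlite3

def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
# Hypothetical schema for illustration only.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE INDEX idx_users_age ON users(age)")

# No WHERE clause: every row has to be visited.
full_scan = plan(conn, "SELECT * FROM users")

# Equality on an indexed column: search the secondary index.
index_scan = plan(conn, "SELECT * FROM users WHERE age = 25")

# Equality on rowid: a direct lookup in the table's own tree.
rowid_lookup = plan(conn, "SELECT * FROM users WHERE rowid = 2")

# NOT INDEXED tells SQLite to ignore secondary indexes for this table.
forced_scan = plan(conn, "SELECT * FROM users NOT INDEXED WHERE age = 25")

for name, steps in [
    ("full scan", full_scan),
    ("index scan", index_scan),
    ("rowid lookup", rowid_lookup),
    ("forced scan", forced_scan),
]:
    print(name, "->", steps)
```

In the output, lines starting with SCAN mean every row is visited, and lines starting with SEARCH mean SQLite is seeking into a tree, either the secondary index or the table's primary key.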

How Indexes Actually Work in SQLite

Each ordinary table in SQLite (one not declared WITHOUT ROWID) is stored as a B+ tree, where the key is the rowid. This is called the primary index.

In addition to that, SQLite can have secondary indexes, which are also B-trees built on other columns.

When using a secondary index, SQLite typically performs three steps:

Search the index to find matching entries
Extract the rowid from the index
Use the rowid to fetch the actual row from the table

This means an indexed lookup often involves two tree searches.

However, there is an important optimization.

If all required columns are already present in the index, SQLite does not need to access the base table at all.

This avoids the second lookup and can significantly improve performance, sometimes making queries nearly twice as fast.
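SQLite reports this case in its plan output as a covering index. Here is a small sketch, again with an invented users table and invented index names, showing the same query switching from a regular index lookup to a covering one once the index contains every column the query needs.

```python
import sqlite3

def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
# Hypothetical schema for illustration only.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

QUERY = "SELECT name FROM users WHERE age = 25"

# The index holds only age (plus the rowid), so name must come from the table.
conn.execute("CREATE INDEX idx_users_age ON users(age)")
two_lookups = plan(conn, QUERY)

# Replace it with an index that also carries name.
conn.execute("DROP INDEX idx_users_age")
conn.execute("CREATE INDEX idx_users_age_name ON users(age, name)")
one_lookup = plan(conn, QUERY)

print(two_lookups)
print(one_lookup)
```

The trade-off is that a wider index takes more space and costs more to maintain on writes, so this tends to pay off mainly for queries you run often.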

Two Core Challenges in Optimization

For any query, the optimizer has to solve two main problems:

1. Which Plans Should Be Considered

There are many possible ways to execute a query.

The optimizer cannot explore all of them, so it uses heuristics to narrow down the options.

2. How to Estimate Cost

For each plan, SQLite estimates how expensive it will be.

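By default SQLite keeps no statistics about table contents, so its estimates rely on simple built-in assumptions. Running ANALYZE stores basic statistics in the sqlite_stat1 table for the planner to use, but even then its cost estimation is relatively simple compared to larger database systems.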

Despite this, it performs surprisingly well in practice.
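The inputs to that cost estimate can be inspected directly. SQLite's ANALYZE command gathers basic statistics and stores them in a table named sqlite_stat1. The sketch below uses a made-up users table and index to show that the statistics table is absent until ANALYZE runs, and what it contains afterwards.

```python
import sqlite3

def has_stat_table(conn):
    # sqlite_stat1 is created on demand by ANALYZE.
    row = conn.execute(
        "SELECT count(*) FROM sqlite_master WHERE name = 'sqlite_stat1'"
    ).fetchone()
    return row[0] == 1

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data for illustration only.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [(f"user{i}", 20 + i % 10) for i in range(1000)],
)
conn.execute("CREATE INDEX idx_users_age ON users(age)")

before = has_stat_table(conn)
conn.execute("ANALYZE")
after = has_stat_table(conn)

# Each row: table name, index name, and a space-separated list of integers.
# The first integer is the row count for the index; the following ones
# describe how many rows share the same value in the leading column(s).
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()

print(before, after)
print(stats)
```

With these numbers available, the planner can tell a highly selective index from one where most rows share the same value, which is one reason it may still prefer a full scan even when an index exists.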

Optimization Is Different for Different Queries

Not all queries benefit equally from optimization.

For example:

INSERT statements have limited optimization opportunities
Queries without a WHERE clause usually result in full table scans

Most optimization effort is focused on queries that filter data, especially SELECT statements.

Special Handling for DELETE and UPDATE

DELETE and UPDATE statements follow a slightly different execution model.

They are processed in two phases:

SQLite identifies the rows that match the condition and stores their rowids in a temporary structure (RowSet)
It then performs the actual deletion or update using those rowids

There is also a special optimization.

If you run:

DELETE FROM users;

SQLite uses a special opcode (OP_Clear) to wipe the entire table efficiently.

If you want to prevent this optimization, you can force a condition:

DELETE FROM users WHERE 1;

This forces SQLite to go through the normal row-by-row process.
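One way to check which path a DELETE takes is to look for the Clear opcode in its EXPLAIN output. This is a sketch under a few assumptions: the users table is invented for the example, it has no triggers, and the SQLite build has not disabled this optimization. Opcode names are internal details, so the check is illustrative rather than something to rely on in production code.

```python
import sqlite3

def opcodes(conn, sql):
    # Column 1 of each EXPLAIN row holds the opcode name.
    # EXPLAIN only lists the program; it does not run the DELETE.
    return [row[1] for row in conn.execute("EXPLAIN " + sql)]

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data for illustration only.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [("a", 20), ("b", 25), ("c", 30)],
)

truncate_ops = opcodes(conn, "DELETE FROM users")
row_by_row_ops = opcodes(conn, "DELETE FROM users WHERE 1")

print("Clear" in truncate_ops)
print("Clear" in row_by_row_ops)

# Either form empties the table; only the amount of work differs.
conn.execute("DELETE FROM users")
remaining = conn.execute("SELECT count(*) FROM users").fetchone()[0]
print(remaining)
```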

How SQLite Organizes Optimization Work

SQLite breaks queries into query blocks and optimizes each block independently.

Most of the optimization logic lives in the where.c file, which handles decisions like:

Which indexes to use
How to structure loops
How to filter rows efficiently

This is the same component that works closely with the code generator to produce efficient loops for WHERE clauses.

Top comments (4)

Kai Alder • Mar 22

Really appreciate how you explained the two-step lookup with secondary indexes. That's one of those things that sounds simple but has huge perf implications — especially when you're building indexes thinking "more is better" and then wondering why things are slow.

One thing I'd add: covering indexes are such an underrated optimization. If you know your query patterns upfront, structuring indexes to include all needed columns can completely eliminate table lookups. I learned this the hard way optimizing a read-heavy analytics dashboard.

Quick question — does SQLite's query planner expose anything like Postgres's EXPLAIN ANALYZE for profiling actual vs estimated costs? Would be useful for debugging when your assumptions about index usage are wrong.

Botánica Andina • Mar 29

Super interesting breakdown of SQLite's optimizer! That example with SELECT * FROM users WHERE age = 25; really makes the idea of "multiple ways to execute" click. I'm curious, does the optimizer pretty much always lean on an index if one's available for the WHERE clause column, or are there times when it might actually prefer a full table scan even then?

klement Gunndu • Mar 22

The 'frontend does all the work' design sounds limiting but makes the VM dead simple to reason about. Ran into this same pattern building bytecode interpreters — pushing complexity earlier in the pipeline pays off at execution.
