The problem
I'm building N3MO — an open source code intelligence
engine that maps Python codebases into a knowledge
graph stored in PostgreSQL.
When I tested it on Django (3,021 Python files,
181,000+ function calls), the full index took
23 minutes. Extraction was 8 minutes. Linking
was 15 minutes.
Linking was the bottleneck.
Why linking was slow
N3MO's linking phase resolves which function calls
which. The original query used LIKE:
WHERE c.call_name LIKE '%%.' || s.name
This matches calls like module.function or
self.function by checking if the call name
ends with the function name.
The problem: a leading wildcard LIKE query
cannot use indexes. PostgreSQL has to compare
every call against every symbol.
For Django: 181,000 calls × 43,000 symbols =
7.8 billion comparisons. That's why it took
15 minutes.
The fix
Replace LIKE with SPLIT_PART:
WHERE SPLIT_PART(c.call_name, '.', -1) = s.name
SPLIT_PART splits the call name at '.' and takes
the last part. module.function becomes function.
Then it's an exact match — which CAN use indexes.
The result
Before: 23 minutes
After: 11 minutes
Speedup: 2x faster from one line change
What I learned
When you have a performance problem in SQL:
- Leading wildcard LIKE = no index possible
- SPLIT_PART + exact match = index friendly
- Always check if your WHERE clause can use indexes
N3MO is open source: github.com/RajX-dev/N3MO
Top comments (0)