DEV Community

N3MO
N3MO

Posted on

How I cut Django indexing from 23 minutes to 11 minutes with one SQL change

The problem

I'm building N3MO — an open source code intelligence
engine that maps Python codebases into a knowledge
graph stored in PostgreSQL.

When I tested it on Django (3,021 Python files,
181,000+ function calls), the full index took
23 minutes. Extraction was 8 minutes. Linking
was 15 minutes.

Linking was the bottleneck.

Why linking was slow

N3MO's linking phase resolves which function calls
which. The original query used LIKE:

WHERE c.call_name LIKE '%%.' || s.name
Enter fullscreen mode Exit fullscreen mode

This matches calls like module.function or
self.function by checking if the call name
ends with the function name.

The problem: a leading wildcard LIKE query
cannot use indexes. PostgreSQL has to compare
every call against every symbol.

For Django: 181,000 calls × 43,000 symbols =
7.8 billion comparisons. That's why it took
15 minutes.

The fix

Replace LIKE with SPLIT_PART:

WHERE SPLIT_PART(c.call_name, '.', -1) = s.name
Enter fullscreen mode Exit fullscreen mode

SPLIT_PART splits the call name at '.' and takes
the last part. module.function becomes function.
Then it's an exact match — which CAN use indexes.

The result

Before: 23 minutes
After: 11 minutes
Speedup: 2x faster from one line change

What I learned

When you have a performance problem in SQL:

  • Leading wildcard LIKE = no index possible
  • SPLIT_PART + exact match = index friendly
  • Always check if your WHERE clause can use indexes

N3MO is open source: github.com/RajX-dev/N3MO

Top comments (0)