eyanpen

Posted on May 20

How FalkorDB Stores Edges: Why Neighbor Lookup Is O(degree)

#falkordb #sparsematrix #graphdatabase #graphblas

Many people have a question when they first see FalkorDB's architecture:

It doesn't use traditional adjacency lists but maintains edges with sparse matrices — how does it efficiently find all edges of a given node?

And a follow-up question:

If neighbor data is already stored contiguously, why is the query complexity still O(degree) instead of O(1)?

1. How Traditional Graph Databases Store Edges

Traditional graph databases (like Neo4j) typically use:

Adjacency List

For example:

A -> B
A -> C
A -> D

Internally it looks more like:

A:
  edge1 -> edge2 -> edge3

That is:

Each node maintains its own edge linked list
To find all edges of a node:
- Simply traverse the linked list

Therefore the complexity is:

O(degree)

Where:

degree = number of edges

For example:

out_degree
Number of outgoing edges
in_degree
Number of incoming edges

2. FalkorDB Is Completely Different: Sparse Matrix

FalkorDB's core design is not an adjacency list.

It is based on:

Sparse Matrix
GraphBLAS

to maintain the entire graph.

For example:

A(id=0) -> B(id=1)

Internal representation:

M[0,1] = edge_id

Meaning:

source=0
target=1

An edge exists.

3. One Matrix Per Edge Type

For example:

(:User)-[:FRIEND]->(:User)
(:User)-[:LIKES]->(:Post)

FalkorDB maintains:

FRIEND matrix
LIKES matrix

This way during traversal:

No need to scan the entire graph.

4. How Multi-edges Are Maintained

FalkorDB supports:

A -[:CALL]-> B
A -[:CALL]-> B
A -[:CALL]-> B

Therefore a matrix cell cannot simply be:

M[0,1] = 1

It's more like:

M[0,1] = [3,8,15]

That is:

edge ids

Essentially similar to:

sparse tensor
compressed adjacency structure

5. How to Efficiently Find Edges?

Many people mistakenly think:

0 0 0 1 0 0 1 1 0

Means:

You must scan the entire row to find the 1s.

That's completely wrong.

Because:

Sparse Matrix Doesn't Store Zeros At All

6. What Does a Sparse Matrix Actually Store?

For example:

[0,0,0,1,0,0,1,1,0]

The actual storage looks more like:

[3,6,7]

Meaning:

index 3 has an edge
index 6 has an edge
index 7 has an edge

Zeros don't exist at all.

Therefore:

Finding neighbors of node A:

neighbors(A) = [3,6,7]

Return directly.

7. CSR / CSC: Industrial-Grade Sparse Matrix Structures

Real implementations typically use:

CSR (Compressed Sparse Row)
CSC (Compressed Sparse Column)

For example:

Matrix:

A: 0 0 0 1 0 0 1 1
B: 1 0 0 0 0 0 0 0
C: 0 1 0 0 1 0 0 0

CSR might store it as:

indices = [3,6,7,0,1,4]
row_ptr = [0,3,4,6]

Explanation:

A's data is at indices[0:3]
B's data is at indices[3:4]
C's data is at indices[4:6]

So:

Finding all edges of A:

indices[0:3]

Gives us:

[3,6,7]

8. Why Is the Complexity Still O(degree)?

This is the most commonly misunderstood point.

Many people ask:

Since [3,6,7] is already in contiguous memory,
isn't a direct memcpy just O(1)?

The answer:

Locating the array is O(1)

But:

Traversing the array is still O(k)

Where:

k = degree

9. What Does Algorithmic Complexity Actually Measure?

For example:

MATCH (a)-[e]->()
RETURN e

The database doesn't just return:

an array pointer

It must:

Traverse each edge
Decode the edge object
Construct the result set
Return to the client

Therefore:

for edge in neighbors:
    emit(edge)

Must execute:

degree times

So the overall complexity is:

O(degree)

10. Output-sensitive Complexity

This is a classic concept:

The size of the output itself counts toward complexity

For example:

If:

A has 1 million edges

Even if:

finding the array start

Only takes:

O(1)

But:

Returning 1 million edges:

Cannot be:

O(1)

Because:

You must at least "look at" each element.

11. Why Is FalkorDB Still Fast?

Because:

[3,6,7]

Is:

Contiguous memory
Cache-friendly
SIMD-friendly

The CPU can:

Prefetch
Vector load
Branch prediction

While traditional adjacency lists:

edge1 -> edge2 -> edge3

Involve:

Pointer chasing

Which causes:

Cache misses
Memory stalls
Branch mispredictions

Therefore:

FalkorDB has clear advantages in:

High fan-out traversal
Multi-hop pattern matching
Graph analytics
GraphRAG

scenarios.

12. Neo4j vs FalkorDB: The Essential Difference

Neo4j is more like:

nodes + edge linked lists

Suited for:

OLTP
Single-hop queries
High-frequency edge updates

FalkorDB is more like:

a graph computation engine

Suited for:

Multi-hop traversal
Pattern matching
Graph analytics
Vectorized computation

For example:

(A)-[:F]->(B)-[:F]->(C)

Neo4j:

pointer traversal

FalkorDB:

matrix multiply

That is:

F × F

This is its biggest architectural difference.

13. Final Summary

FalkorDB's core philosophy:

Don't store "empty"
Only store "existing edges"

Therefore:

0 0 0 1 0 0 1

Actually becomes:

[3,6]

Querying all edges of a node:

Locating adjacency data:
- O(1)
Returning all edges:
- O(degree)

Where:

degree = number of edges for the current node

Not:

total number of edges in the entire graph

This is the core performance model of a Sparse Matrix graph database.

14. Does Splitting Edges Into Multiple Types vs. a Single Type Affect Query Speed?

A common question:

Since locating edges is O(1) and returning edges is O(degree),
does categorizing edges into one type vs. multiple types affect query speed?

The answer: It depends on whether the query specifies an edge type.

When the Query Specifies an Edge Type

For example:

MATCH (a)-[:FRIEND]->(b) RETURN b

FalkorDB only scans the FRIEND matrix.

If all edges are categorized as a single type (e.g., :REL), the matrix contains all edges, making the degree larger.

Multiple types = smaller matrices = less traversal = faster.

When the Query Does Not Specify an Edge Type

For example:

MATCH (a)-[]->(b) RETURN b

FalkorDB needs to merge results from multiple matrices.

In this case:

Total traversal volume is the same (total degree)
Multiple types have slight merge overhead
Single type traverses one matrix directly

The difference is minimal, approximately no impact.

Summary

Scenario	Single Type vs. Multiple Types	Impact
Query specifies edge type	Multiple types faster	Only scans the corresponding matrix, smaller degree
Query does not specify edge type	Nearly no difference	Same total degree, slight merge overhead with multiple types

Practical modeling recommendation:

Splitting into multiple types is the better practice.
Most real-world queries specify a relationship type, and splitting types significantly reduces the number of edges that need to be traversed.

DEV Community

How FalkorDB Stores Edges: Why Neighbor Lookup Is O(degree)

1. How Traditional Graph Databases Store Edges

Adjacency List

2. FalkorDB Is Completely Different: Sparse Matrix

3. One Matrix Per Edge Type

4. How Multi-edges Are Maintained

5. How to Efficiently Find Edges?

Sparse Matrix Doesn't Store Zeros At All

6. What Does a Sparse Matrix Actually Store?

7. CSR / CSC: Industrial-Grade Sparse Matrix Structures

8. Why Is the Complexity Still O(degree)?

Locating the array is O(1)

Traversing the array is still O(k)

9. What Does Algorithmic Complexity Actually Measure?

10. Output-sensitive Complexity

The size of the output itself counts toward complexity

11. Why Is FalkorDB Still Fast?

Pointer chasing

12. Neo4j vs FalkorDB: The Essential Difference

13. Final Summary

14. Does Splitting Edges Into Multiple Types vs. a Single Type Affect Query Speed?

When the Query Specifies an Edge Type

When the Query Does Not Specify an Edge Type

Summary

Top comments (0)