Triangle Detection via Clique-Constrained Union-Find and a Matching-First Edge Ordering
Frank Vega
Information Physics Institute, 840 W 67th St, Hialeah, FL 33012, USA
vega.frank@gmail.com
Abstract
We present Aegypti, a hybrid heuristic for detecting a single triangle in an undirected graph with vertices and edges. The algorithm operates in two phases.
In the fast path, a clique-constrained Union-Find structure (FastCliqueUF) streams over the edges and merges components only when the union remains a clique; the moment any component reaches size
, a triangle witness is returned and the phase halts. Because of this halting rule, every component encountered during Phase 1 has size at most 2, so the clique test for the union of two components inspects at most a constant number of vertex pairs. Each Union therefore costs only
amortised, dominated by a constant number of single-bit lookups in the precomputed closed-neighbourhood bitsets. The fast path runs in
time in the worst case, dominated by the initialisation that allocates the per-vertex closed-neighbourhood bitsets. The fast path also uses a matching-first edge ordering: it unions every edge of a minimum maximal matching before processing the extra (non-matching) edges, and an extra edge whose union is rejected is deleted from the bitset structure so it cannot pollute later clique tests.
If the fast path fails to detect a triangle, a classical fallback using adjacency-set intersections solves the problem in time — a strict improvement over the bare bound because no matching edge can be part of any triangle once Phase 1 has terminated empty-handed. The overall worst-case running time is therefore
We prove correctness, analyse each phase, and validate the algorithm on a 15-instance DIMACS benchmark suite (finlay/experiments/) that includes triangle-rich cliques, triangle-free graphs (Petersen, large cycles, bipartite), adversarial decoy-matching constructions, and near-triangle-free random graphs.
1. Introduction
Triangle detection — determining whether an undirected graph contains three mutually adjacent vertices — is one of the most fundamental problems in graph algorithms. It arises as a primitive in network analysis, sparse-matrix computations, database join processing, and as the base case of exact clique solvers.
Known Complexity
Let , . The classical results are:
via adjacency intersections. Chiba and Nishizeki showed that listing all triangles costs , where is the arboricity of ; since , detection costs .
via matrix multiplication. Itai and Rodeh observed that triangle detection reduces to Boolean matrix multiplication; with the current best exponent , this gives . Alon, Yuster, and Zwick sharpened the combination to .
Conditional lower bound. Under the -Clique Conjecture and related hypotheses, triangle detection requires time in the combinatorial model.
Our Contribution
We introduce Aegypti, a hybrid heuristic for triangle detection whose design rests on three ingredients:
- A clique-constrained Union-Find (
FastCliqueUF) that streams over the edges and merges two components only when their union remains a clique. As soon as any component reaches size , a triangle witness is returned. - A matching-first edge ordering with delete-on-reject: Phase 1 unions every edge of a minimum maximal matching
(each yielding a 2-clique), then iterates over the non-matching edges; any extra edge whose union is rejected is removed from the
FastCliqueUFbitset structure. - A matching-edge skip in the fallback: if Phase 1 fails, the classical Chiba–Nishizeki adjacency-intersection algorithm is applied to only, because no matching edge can be in any triangle once Phase 1 has terminated empty-handed.
We do not claim a new asymptotic upper bound for combinatorial triangle detection in the strict sense — the exponent on is unchanged — but the -edge saving is non-trivial in every graph with a positive matching, and the algorithm returns explicit witness triples at no extra cost. The algorithm is named Aegypti after Aedes aegypti, a mosquito whose compound eye detects rapid visual patterns by aggregating local signals — analogous to the way our algorithm aggregates local adjacency information in bitset blocks to detect the local pattern of a triangle.
2. Preliminaries
Graph Notation
All graphs are simple and undirected. We write , , for the open neighbourhood of , and for the closed neighbourhood. The degree of is , and .
Definition (Triangle). A triangle in is a set of three distinct vertices such that .
Definition (Triangle-free). is triangle-free if it contains no triangle.
Definition (Matching). A matching
is a set of edges no two of which share an endpoint. A matching is maximal if no edge of
can be added without violating the disjointness property. We compute
with the linear-time approximation nx.approximation.min_maximal_matching.
Bitset Representation
We index
. A subset
is stored as an array of
unsigned 64-bit integers (numpy.uint64), where bit
of block
indicates membership of
. Standard bitwise operations (OR, AND, NOT, Any) each cost
time on a RAM with
-bit words. Modern CPUs with AVX2/AVX-512 vector units process 256 or 512 bits per instruction, further reducing the constant.
Proposition. For bitsets over a universe of size , union, intersection, complement-intersection, and the subset test each run in time.
3. The Aegypti Algorithm
Core Data Structure: FastCliqueUF
FastCliqueUF augments a standard Union-Find with two bitset arrays per component root:
where is the closed neighbourhood (includes itself).
Clique Invariant. At all times, for every root , the component is a clique in .
Proposition (Bitset clique test). For roots , the set is a clique if and only if
Proof. Let and ParseError: KaTeX parse error: Expected 'EOF', got '&' at position 11: I = A[r]\;&̲\;A[s] . By definition, is the set of vertices adjacent-or-equal to every member of . The union is a clique iff , i.e. , i.e. ParseError: KaTeX parse error: Expected 'EOF', got '&' at position 4: M\;&̲\;\mathord{\sim… .
Lemma (Triangle witness). If FastCliqueUF.union(u, v) causes a component of size
to form, that component contains a triangle.
Proof. By the clique invariant, every component is a clique. A clique of size
contains
edges among any three of its members; any three vertices form a triangle.
import numpy as np
class FastCliqueUF:
"""
A Union-Find structure that only merges components if the union
induces a clique in the underlying graph. All clique checks and
component updates are accelerated using numpy.uint64 SIMD blocks,
giving O(k / w) merge time where w = 64 bits per block.
"""
def __init__(self, graph):
self.graph = graph
self.nodes = list(graph.nodes())
self.index = {u: i for i, u in enumerate(self.nodes)}
self.n = len(self.nodes)
# Number of 64-bit blocks needed to represent n bits
self.blocks = (self.n + 63) // 64
# Standard Union-Find parent + size
self.parent = {u: u for u in self.nodes}
self.size = {u: 1 for u in self.nodes}
# Precompute adjacency bitsets for each node as numpy uint64 arrays
self.adj = {}
for u in self.nodes:
arr = np.zeros(self.blocks, dtype=np.uint64)
for v in graph.neighbors(u):
idx = self.index[v]
arr[idx // 64] |= np.uint64(1) << (idx % 64)
# A node is always adjacent to itself for clique purposes
idx = self.index[u]
arr[idx // 64] |= np.uint64(1) << (idx % 64)
self.adj[u] = arr
# Component bitsets: which nodes are in the component
self.comp_bit = {u: self._singleton_bitset(u) for u in self.nodes}
# Component adjacency intersection bitsets
self.comp_adj = {u: self.adj[u].copy() for u in self.nodes}
def _singleton_bitset(self, u):
arr = np.zeros(self.blocks, dtype=np.uint64)
idx = self.index[u]
arr[idx // 64] |= np.uint64(1) << (idx % 64)
return arr
def find(self, u):
if self.parent[u] != u:
self.parent[u] = self.find(self.parent[u])
return self.parent[u]
def union(self, u, v):
"""
Merge the components of u and v only if the union forms a clique.
Return True iff the resulting clique has size >= 3.
"""
ru, rv = self.find(u), self.find(v)
if ru == rv:
return self.size[ru] >= 3
if self.size[ru] < self.size[rv]:
ru, rv = rv, ru
merged_bit = self.comp_bit[ru] | self.comp_bit[rv]
merged_adj = self.comp_adj[ru] & self.comp_adj[rv]
if np.any(merged_bit & ~merged_adj):
return False
self.parent[rv] = ru
new_size = self.size[ru] + self.size[rv]
self.size[ru] = new_size
self.comp_bit[ru] = merged_bit
self.comp_adj[ru] = merged_adj
return new_size >= 3
def delete_edge(self, u, v):
"""
Remove the edge (u, v) from the in-memory bitset structure (the
underlying NetworkX graph is untouched). Clears the corresponding
bits from adj[u]/adj[v] and from comp_adj[find(u)]/comp_adj[find(v)]
so that future clique tests no longer treat (u, v) as an edge.
Costs O(1) bit flips plus two O(blocks) writes.
"""
if u == v:
return
iu = self.index[u]
iv = self.index[v]
mask_u = np.uint64(1) << np.uint64(iu % 64)
mask_v = np.uint64(1) << np.uint64(iv % 64)
block_u = iu // 64
block_v = iv // 64
# Clear v from u's closed-neighbourhood bitset and vice versa.
self.adj[u][block_v] &= ~mask_v
self.adj[v][block_u] &= ~mask_u
# Reflect the deletion in the component adjacency intersection bitsets.
ru = self.find(u)
rv = self.find(v)
self.comp_adj[ru][block_v] &= ~mask_v
if rv != ru:
self.comp_adj[rv][block_u] &= ~mask_u
def to_sets(self):
groups = {}
for u in self.nodes:
r = self.find(u)
groups.setdefault(r, set()).add(u)
return list(groups.values())
Main Detection Routine
Aegypti is a two-phase hybrid:
-
Phase 1 — Matching-first clique-constrained Union-Find.
- Compute ParseError: KaTeX parse error: Expected 'EOF', got '_' at position 16: M = \texttt{min_̲maximal_matchin… .
- Union every matching edge first (each yields a 2-clique).
- Iterate over the extra (non-matching) edges; on each accepted union of size
, return the triangle; on each rejection,
delete_edgeso it cannot pollute future clique tests.
Phase 2 — Classical Chiba–Nishizeki adjacency-intersection fallback, restricted to . Matching edges are skipped because they cannot be in any triangle once Phase 1 has terminated empty-handed (see Lemma below).
import networkx as nx
from itertools import islice
from .disjoint import FastCliqueUF
def find_triangle_coordinates(graph):
"""
Detect a single triangle (3-clique) in an undirected NetworkX graph.
Phase 1: matching-first clique-constrained Union-Find with
delete-on-reject. Compute M = min_maximal_matching(G), union every
matching edge (each yields a 2-clique), then process the extra
(non-matching) edges; if a union is accepted and produces a size-3+
component, return the triangle; otherwise delete the rejected edge
from the bitset structure.
Phase 2: classical Chiba--Nishizeki adjacency-intersection fallback
restricted to E \\ M. Matching edges cannot be in any triangle once
Phase 1 has terminated empty-handed, so skipping them tightens the
fallback bound to O((m - |M|)^{3/2}).
"""
if not isinstance(graph, nx.Graph) or graph.is_directed():
raise ValueError("Input must be an undirected NetworkX Graph.")
if nx.number_of_selfloops(graph) > 0:
raise ValueError("Graph must not contain self-loops.")
if graph.number_of_nodes() < 3 or graph.number_of_edges() == 0:
return None
disjoint_set = FastCliqueUF(graph)
matching = nx.approximation.min_maximal_matching(graph)
matching_set = {frozenset((u, v)) for u, v in matching}
found_in_uf = False
# Step 1: union all matching edges.
for u, v in matching:
if disjoint_set.union(u, v):
found_in_uf = True
break
# Step 2: process the extra (non-matching) edges with delete-on-reject.
if not found_in_uf:
for u, v in graph.edges():
if frozenset((u, v)) in matching_set:
continue
if disjoint_set.union(u, v):
found_in_uf = True
break
disjoint_set.delete_edge(u, v)
if found_in_uf:
for component in disjoint_set.to_sets():
if len(component) >= 3:
return frozenset(islice(component, 3))
# Phase 2: adjacency-intersection enumeration, skipping matching edges.
adj_sets = {node: set(graph.neighbors(node)) for node in graph.nodes()}
rank = {node: i for i, node in enumerate(graph.nodes())}
for u in graph.nodes():
for v in graph.neighbors(u):
if rank[u] < rank[v]:
if frozenset((u, v)) in matching_set:
continue
common_neighbors = adj_sets[u] & adj_sets[v]
for w in common_neighbors:
return frozenset({u, v, w})
return None
Standalone Combinatorial Baseline
For the experimental section we also implement a standalone Chiba–Nishizeki adjacency-intersection algorithm that uses a degeneracy ordering, Python sets (no NumPy bitsets), and no Union-Find. This is the primary practical combinatorial competitor against which Aegypti should be compared.
import heapq
import networkx as nx
def find_triangle_chiba_nishizeki(graph):
"""
Standalone classical Chiba--Nishizeki O(m^{3/2}) adjacency-
intersection algorithm. Uses a degeneracy ordering (smallest-
degree-last peeling) and Python sets.
"""
if not isinstance(graph, nx.Graph) or graph.is_directed():
raise ValueError("Input must be an undirected NetworkX Graph.")
if nx.number_of_selfloops(graph) > 0:
raise ValueError("Graph must not contain self-loops.")
if graph.number_of_nodes() < 3 or graph.number_of_edges() == 0:
return None
adj = {v: set(graph.neighbors(v)) for v in graph.nodes()}
remaining = {v: set(neigh) for v, neigh in adj.items()}
rank = {}
order = []
heap = [(len(remaining[v]), v) for v in remaining]
heapq.heapify(heap)
removed = set()
while heap:
d, v = heapq.heappop(heap)
if v in removed:
continue
if len(remaining[v]) != d:
heapq.heappush(heap, (len(remaining[v]), v))
continue
rank[v] = len(order)
order.append(v)
removed.add(v)
for w in list(remaining[v]):
remaining[w].discard(v)
heapq.heappush(heap, (len(remaining[w]), w))
remaining[v].clear()
for u, v in graph.edges():
if rank[u] > rank[v]:
u, v = v, u
a_u = adj[u]
a_v = adj[v]
small, large = (a_v, a_u) if len(a_u) > len(a_v) else (a_u, a_v)
for w in small:
if w != u and w != v and w in large:
return frozenset({u, v, w})
return None
4. Correctness and Complexity Analysis
Correctness
Theorem (Correctness of Aegypti). Aegypti(G) returns a triangle
if and only if
contains a triangle.
Proof.
Soundness. Phase 1: by the triangle-witness lemma, any component of size
produced by FastCliqueUF is a clique. Phase 2: the algorithm returns
only when
.
Completeness. Suppose contains a triangle but Phase 1 misses it (e.g. because the matching pairs every triangle vertex with an off-triangle pendant). Then Phase 2's adjacency intersection over all directed edges with is complete: at least one of is in (a matching contains at most one of these three edges), and processing that edge finds the third vertex in the intersection.
Complexity of Phase 1 (Fast Path)
Lemma (Initialisation cost). Constructing FastCliqueUF(G) costs
time and
machine words of space.
Lemma (Per-union cost during Phase 1). Throughout Phase 1, every component has size at most 2. Consequently each call to Union(u, v) that occurs before the halting merge costs
amortised, dominated by the two Find operations and a constant number of single-bit lookups in the precomputed closed-neighbourhood bitsets.
Proof. Phase 1 halts immediately when any union produces a component of size
. Hence, at the moment any earlier Union(u, v) is performed, both components involved have size at most 2, and their union has size at most 4. The merged set is a clique iff every pair of vertices drawn from the two components is joined by an edge — at most
pairs. Each pair
is tested by reading the single bit
in the precomputed bitset
, which is an
word access. The two Find calls contribute the amortised
term.
Note that the dense numpy.uint64 reference implementation performs full-length bitset operations and therefore pays
word-level work per Union. This is a constant-factor SIMD-friendly implementation choice; the algorithmic per-union cost in the RAM model is
.
Theorem (Phase 1 running time). Phase 1 runs in time in the worst case. When a triangle is detected after only edges have been streamed, the cost reduces to .
Complexity of Phase 2 (Fallback)
Lemma (Matching edges are useless for Phase 2). Let be the matching used in Phase 1 and assume Phase 1 terminated without finding a triangle. Then for every , the edge is not contained in any triangle of .
Proof. Suppose for contradiction that and form a triangle. Then , and neither lies in because is a matching and both edges share an endpoint with . After Phase 1 Step 1, the component containing and is exactly (a 2-clique), and is still a singleton. When Phase 1 Step 2 processes the extra edge , the clique test on the union reduces to checking whether — which it is, so the union is accepted, the merged component has size 3, and Phase 1 returns a triangle. Contradiction.
Theorem (Phase 2 running time). Phase 2 runs in time.
Proof. By the lemma above, the matching edges can be safely excluded from Phase 2 without losing any triangle. The remaining edge set has size . Applying the classical Chiba–Nishizeki argument to yields a running time of .
Corollary (Magnitude of the saving). In any graph without isolated vertices, where is the size of a maximum matching. In graphs that admit a perfect or near-perfect matching this gives , a strict improvement over the bare bound.
Combined Running Time
Theorem (Total running time). Let
be an undirected simple graph and let
be the matching computed by min_maximal_matching. Aegypti(G) runs in time
in the worst case. When a triangle is found after only edges have been streamed, this improves to .
Space complexity. Aegypti uses
machine words, equivalently
bits, of working space — dominated by the
bitset arrays of FastCliqueUF. Phase 2 uses
additional words for the adjacency-set hash maps.
5. Experiments
Setup
All experiments were run on an 11th Gen Intel® Core™ i7-1165G7 (2.80 GHz), 32 GB DDR4 RAM, with Python 3.12, Aegypti 0.4.1, NumPy 2.2, and NetworkX 3.4. Each reported time is the median of five repeated trials, with the first (warm-up) trial discarded; the inter-run coefficient of variation was below 5% on all instances with measured time ms.
The benchmark is the finlay/experiments/ DIMACS suite committed to the source repository — 15 hand-crafted instances designed to exercise every regime relevant to the algorithm: tiny / dense cliques, triangle-free graphs (Petersen, the 50-cycle, a 40-vertex complete bipartite graph, a rejection-sampled Erdős–Rényi triangle-free graph), the adversarial decoy construction whose matching approximation dodges every triangle, and several sparse near-triangle-free graphs with one planted triangle.
Baselines
- Aegypti — the hybrid heuristic of this paper (matching-first Phase 1 with delete-on-reject, matching-edge-skip Phase 2).
- CN (Chiba–Nishizeki) — a standalone implementation of the classical adjacency-intersection algorithm using a degeneracy ordering and Python sets. This is the primary practical combinatorial competitor.
- Naive (BF) — the reference Boolean check diagonal computed in NumPy. Omitted for because of its rapid scaling.
Results
Times are medians of five trials in milliseconds; memory is peak resident memory traced by Python during a single run (in MB); "T" indicates whether the graph contains a triangle.
| Instance | T | Aegypti (ms) | CN (ms) | BF (ms) | Mem-Aeg (MB) | Mem-CN (MB) | ||
|---|---|---|---|---|---|---|---|---|
| exp01_k3 | 3 | 3 | Y | 0.10 | 0.01 | 0.14 | 0.00 | 0.00 |
| exp02_k10 | 10 | 45 | Y | 0.14 | 0.04 | 0.09 | 0.01 | 0.02 |
| exp03_k30 | 30 | 435 | Y | 0.73 | 0.24 | 0.29 | 0.03 | 0.10 |
| exp04_petersen | 10 | 15 | N | 0.18 | 0.03 | 0.06 | 0.01 | 0.01 |
| exp05_c50 | 50 | 50 | N | 0.62 | 0.10 | 0.24 | 0.06 | 0.03 |
| exp06_k20_20 | 40 | 400 | N | 3.50 | 0.72 | 0.29 | 0.12 | 0.14 |
| exp07_decoy1 | 6 | 6 | Y | 0.09 | 0.01 | 0.06 | 0.01 | 0.01 |
| exp08_decoy10 | 60 | 60 | Y | 0.72 | 0.10 | 0.36 | 0.06 | 0.04 |
| exp09_sparse_with_tri | 145 | 210 | Y | 2.35 | 0.42 | 3.72 | 0.15 | 0.10 |
| exp10_dense_rand | 200 | 6076 | Y | 9.80 | 4.92 | 11.78 | 0.20 | 1.02 |
| exp11_bipartite_tf | 80 | 252 | N | 2.47 | 0.47 | 1.23 | 0.11 | 0.09 |
| exp12_med_near_tf | 287 | 425 | Y | 5.03 | 0.77 | 27.79 | 0.33 | 0.20 |
| exp13_cycle_chord | 8 | 9 | Y | 0.19 | 0.04 | 0.05 | 0.01 | 0.01 |
| exp14_er_tf | 34 | 36 | N | 0.46 | 0.08 | 0.12 | 0.03 | 0.02 |
| exp15_decoy30x2 | 270 | 270 | Y | 2.89 | 0.40 | 23.30 | 0.27 | 0.15 |
Discussion
Three observations support the theoretical claims of Section 4.
(1) Correctness on adversarial inputs. The exp07_decoy1, exp08_decoy10, and exp15_decoy30x2 instances are explicitly constructed so that nx.approximation.min_maximal_matching pairs every triangle vertex with an off-triangle pendant — the worst case for matching-first Phase 1, since every matched pair is forced to lie outside a triangle. Aegypti nonetheless returns a valid triangle on all three, confirming that Phase 2 (with the matching-edge skip) is doing the work predicted by the matching-edge-skip lemma.
(2) Quantitative benefit of skipping matching edges. Phase 2's cost is
rather than
. The empirical matching ratio
ranges from roughly 0.06 on the densest input (exp10_dense_rand,
,
) to 0.45–0.50 on the sparse near-tree-like inputs (exp05_c50, exp08_decoy10, exp15_decoy30x2), which on the latter shrinks the Phase 2 work by a factor of about
relative to the bare
bound — consistent with the observed times on exp08_decoy10 (0.72 ms) and exp15_decoy30x2 (2.89 ms), which are dominated by Phase 1 streaming rather than Phase 2.
(3) Comparison with the baselines. The classical Chiba–Nishizeki implementation is the fastest on every instance in the table, by a factor of roughly
to
. This is consistent with the analysis: the
initialisation of the bitset arrays is paid unconditionally, and on small graphs (
in this suite) that initialisation cost dominates any constant-factor SIMD gain. The matrix-multiplication baseline is competitive only on the smallest instances and degrades rapidly with
, exactly as predicted by its
scaling. We emphasise that the finlay/experiments suite is intentionally restricted to small instances (
): it is a correctness-and-design suite chosen to exercise the matching-first / delete-on-reject / matching-skip mechanisms, not a scaling study.
6. Conclusion
We introduced Aegypti, a hybrid triangle-detection algorithm combining a SIMD-accelerated clique-constrained Union-Find (Phase 1 with matching-first edge ordering and delete-on-reject) with a classical
adjacency-intersection fallback (Phase 2, which skips matching edges). The algorithm is correct, runs in
time on triangle-containing inputs when a triangle is detected after streaming
edges, and degrades to
in the worst case. Per-union work in Phase 1 is
amortised, because the fast path halts the instant any component reaches size 3 — so every preceding Union acts on components of size at most 2, and the clique test reduces to a constant number of bit lookups in the precomputed closed-neighbourhood bitsets.
Open Problems
Improved initialisation. Can the initialisation cost be reduced to while preserving the bitset-union-find clique structure?
Edge ordering. Beyond the matching-first ordering used here, does a degeneracy-based or degree-based ordering further reduce the average on random graphs?
Triangle counting. Can FastCliqueUF be extended to count all triangles, not just detect one?
Dynamic graphs. Extending the structure to support edge deletions on the underlying graph (not just within the bitset cache) while maintaining the clique invariant remains an open problem.
Production-ready • Zero dependencies beyond NetworkX & NumPy • Battle-tested
pip install aegypti
import networkx as nx
from aegypti.algorithm import find_triangle_coordinates
G = nx.complete_graph(3)
triangle = find_triangle_coordinates(G)
print(triangle) # -> frozenset({0, 1, 2})
Top comments (0)