DEV Community

Cover image for The Aegypti Algorithm
Frank Vega
Frank Vega

Posted on • Edited on

The Aegypti Algorithm

Fast Triangle Detection in Undirected Graphs: The Aegypti Algorithm

Frank Vega

Information Physics Institute, 840 W 67th St, Hialeah, FL 33012, USA

vega.frank@gmail.com


Abstract

We present Aegypti, a hybrid algorithm for detecting a single triangle in an undirected graph G=(V,E)G = (V, E) with n=Vn = |V| vertices and m=Em = |E| edges. The algorithm operates in two phases. In the fast path, a clique-constrained Union-Find structure (FastCliqueUF) streams over the edges and merges components only when the union remains a clique; the moment any component reaches size 3\geq 3 , a triangle witness is returned. Because components remain of size at most 2 until the detecting merge, each Union costs only O(1)\mathcal{O}(1) (bitset operations touch O(k/w)\mathcal{O}(k/w) blocks with k=O(1)k = \mathcal{O}(1) ). The fast path therefore runs in O(n2/w+m)\mathcal{O}(n^2/w + m) time using packed uint64 SIMD bitset operations; on triangle-rich graphs this reduces to O(n2/w)\mathcal{O}(n^2/w) in practice and is O(n2)\mathcal{O}(n^2) in the RAM model. If the fast path finds no triangle, a fallback using adjacency-set intersections solves the problem in O(m3/2)\mathcal{O}(m^{3/2}) time. The overall worst-case running time is

T(G)  =  O!(n2w+m3/2). T(G) \;=\; \mathcal{O}!\left( \frac{n^2}{w} + m^{3/2} \right).

We prove correctness, analyse the complexity of each phase, and validate the algorithm on the full Second DIMACS Implementation Challenge benchmark suite, where Aegypti detects triangles in all tested instances in under 12 s.


1. Introduction

Triangle detection — determining whether an undirected graph GG contains three mutually adjacent vertices — is one of the most fundamental problems in graph algorithms. It arises as a primitive in network analysis, sparse-matrix computations, database join processing, and as the base case of exact clique solvers.

Known Complexity

Let n=Vn = |V| , m=Em = |E| . The classical results are:

O(m3/2)\mathcal{O}(m^{3/2}) via adjacency intersections. Chiba and Nishizeki showed that listing all triangles costs O(a(G)m)\mathcal{O}(a(G) \cdot m) , where a(G)a(G) is the arboricity of GG ; since a(G)=O(m)a(G) = \mathcal{O}(\sqrt{m}) , detection costs O(m3/2)\mathcal{O}(m^{3/2}) .

O(nω)\mathcal{O}(n^\omega) via matrix multiplication. Itai and Rodeh observed that triangle detection reduces to Boolean matrix multiplication; with the current best exponent ω<2.372\omega < 2.372 , this gives O(n2.372)\mathcal{O}(n^{2.372}) . Alon, Yuster, and Zwick sharpened the combination to O(m2ω/(ω+1))\mathcal{O}(m^{2\omega/(\omega+1)}) .

Conditional lower bound. Under the kk -Clique Conjecture and related hypotheses, triangle detection requires Ω(n4/3)\Omega(n^{4/3}) time in the combinatorial model.

Our Contribution

We introduce Aegypti, a practical hybrid algorithm whose design is driven by the following observation:

If a graph is triangle-rich, an edge-streaming clique-constrained Union-Find will encounter a size-3 component early and terminate in O(n2/w)\mathcal{O}(n^2/w) time using SIMD bitset operations — well before exhausting all edges.

When the graph is triangle-free (the hard case), the Union-Find pass certifies this at no extra asymptotic cost, and the fallback adjacency algorithm handles the problem in O(m3/2)\mathcal{O}(m^{3/2}) . The net result is an algorithm that is fast on both triangle-rich and triangle-free inputs.

The algorithm is named Aegypti after Aedes aegypti, a mosquito whose compound eye detects rapid visual patterns by aggregating local signals — analogous to the way our algorithm aggregates local adjacency information in bitset blocks to detect the local pattern of a triangle.


2. Preliminaries

Graph Notation

All graphs G=(V,E)G = (V, E) are simple and undirected. We write n=Vn = |V| , m=Em = |E| , N(v)N(v) for the open neighbourhood of vv , and N[v]=N(v)vN[v] = N(v) \cup {v} for the closed neighbourhood. The degree of vv is d(v)=N(v)d(v) = |N(v)| , and Δ=maxvd(v)\Delta = \max_v d(v) .

Definition (Triangle). A triangle in GG is a set u,v,wV{u, v, w} \subseteq V of three distinct vertices such that u,v,v,w,u,wE{u,v},{v,w},{u,w} \in E .

Definition (Triangle-free). GG is triangle-free if it contains no triangle.

Bitset Representation

We index V=v0,,vn1V = {v_0, \ldots, v_{n-1}} . A subset SVS \subseteq V is stored as an array of b=n/wb = \lceil n/w \rceil unsigned 64-bit integers (numpy.uint64), where bit ii of block i/w\lfloor i/w \rfloor indicates membership of viv_i . Standard bitwise operations (OR, AND, NOT, Any) each cost O(b)\mathcal{O}(b) time on a RAM with ww -bit words. Modern CPUs with AVX2/AVX-512 vector units process 256 or 512 bits per instruction, further reducing the constant.

Proposition. For bitsets over a universe of size nn , union, intersection, complement-intersection, and the subset test each run in O(b)=O(n/w)\mathcal{O}(b) = \mathcal{O}(n/w) time.


3. The Aegypti Algorithm

The Triangle Detection Problem

The triangle detection problem consists of determining whether an undirected simple graph contains at least one triangle — a set of three vertices u,v,w{u, v, w} where each pair is connected by an edge (a 3-cycle).

Triangle detection has wide applications: social network analysis (detecting the smallest clusters), property testing in graphs, theoretical computer science (hardness of approximation, fine-grained complexity), spam/sybil detection, and checking if a graph is bipartite (no odd cycles \Rightarrow no triangles).

Core Data Structure: FastCliqueUF

FastCliqueUF augments a standard Union-Find with two bitset arrays per component root:

B[r]  =  membership bitset: vi set iff Find(vi)=r, B[r] \;=\; \text{membership bitset: } v_i \text{ set iff } \mathrm{Find}(v_i) = r,
A[r]  =  v:Find(v)=rN[v], A[r] \;=\; \bigcap_{v \,:\, \mathrm{Find}(v)=r} N[v],

where N[v]N[v] is the closed neighbourhood (includes vv itself).

Clique Invariant. At all times, for every root rr , the component Cr=v:Find(v)=rC_r = {v : \mathrm{Find}(v) = r} is a clique in GG .

Proposition (Bitset clique test). For roots rsr \neq s , the set CrCsC_r \cup C_s is a clique if and only if

(B[r]B[s])  &  (A[r]&A[s])  =  0. (B[r] \mid B[s])\;\&\;\mathord{\sim}(A[r] \& A[s]) \;=\; \mathbf{0}.

Proof. Let M=B[r]B[s]M = B[r] \mid B[s] (the union's membership) and I=A[r]&A[s]I = A[r] \& A[s] (the intersection of closed neighbourhoods over the entire union). By definition, II is the set of vertices adjacent-or-equal to every member of CrCsC_r \cup C_s . The union is a clique iff CrCsIC_r \cup C_s \subseteq I , i.e. MIM \subseteq I , i.e. M&I=0M \& \mathord{\sim} I = \mathbf{0} .

Lemma (Triangle witness). If FastCliqueUF.Union(u, v) causes a component of size 3\geq 3 to form, that component contains a triangle.

Proof. By the clique invariant, every component is a clique. A clique of size 3\geq 3 contains (32)=3\binom{3}{2} = 3 edges among any three of its members, hence is a triangle (or contains one as a subgraph). Any three vertices from the component form a triangle.

import numpy as np

class FastCliqueUF:
    """
    A Union-Find structure that only merges components if the union
    induces a clique in the underlying graph. All clique checks and
    component updates are accelerated using numpy.uint64 SIMD blocks,
    giving O(k / w) merge time where w = 64 bits per block.
    """

    def __init__(self, graph):
        self.graph = graph
        self.nodes = list(graph.nodes())
        self.index = {u: i for i, u in enumerate(self.nodes)}
        self.n = len(self.nodes)

        # Number of 64-bit blocks needed to represent n bits
        self.blocks = (self.n + 63) // 64

        # Standard Union-Find parent + size
        self.parent = {u: u for u in self.nodes}
        self.size = {u: 1 for u in self.nodes}

        # Precompute adjacency bitsets for each node as numpy uint64 arrays
        self.adj = {}
        for u in self.nodes:
            arr = np.zeros(self.blocks, dtype=np.uint64)
            for v in graph.neighbors(u):
                idx = self.index[v]
                arr[idx // 64] |= np.uint64(1) << (idx % 64)
            # A node is always adjacent to itself for clique purposes
            idx = self.index[u]
            arr[idx // 64] |= np.uint64(1) << (idx % 64)
            self.adj[u] = arr

        # Component bitsets: which nodes are in the component
        self.comp_bit = {
            u: self._singleton_bitset(u) for u in self.nodes
        }

        # Component adjacency intersection bitsets
        self.comp_adj = {
            u: self.adj[u].copy() for u in self.nodes
        }

    def _singleton_bitset(self, u):
        """Return a bitset with only node u set."""
        arr = np.zeros(self.blocks, dtype=np.uint64)
        idx = self.index[u]
        arr[idx // 64] |= np.uint64(1) << (idx % 64)
        return arr

    def find(self, u):
        """Path-compressed find."""
        if self.parent[u] != u:
            self.parent[u] = self.find(self.parent[u])
        return self.parent[u]

    def union(self, u, v):
        """
        Merge the components of u and v only if the union forms a clique.
        Return True iff the resulting clique has size >= 3.
        All clique checks and bitset merges are SIMD-accelerated.
        """
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            # Already in same component → check if size >= 3
            return self.size[ru] >= 3

        # Union-by-size heuristic
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru

        # Proposed merged component bitset
        merged_bit = self.comp_bit[ru] | self.comp_bit[rv]

        # Intersection of adjacency bitsets of both components
        merged_adj = self.comp_adj[ru] & self.comp_adj[rv]

        # Clique condition:
        # merged_bit must be subset of merged_adj
        if np.any(merged_bit & ~merged_adj):
            return False  # Reject merge: not a clique

        # Accept merge
        self.parent[rv] = ru
        new_size = self.size[ru] + self.size[rv]
        self.size[ru] = new_size

        # Update component bitset
        self.comp_bit[ru] = merged_bit

        # Update adjacency intersection
        self.comp_adj[ru] = merged_adj

        # Return True only if the resulting clique has size >= 3
        return new_size >= 3

    def to_sets(self):
        """
        Return all disjoint sets after full path compression.
        This reconstructs components by grouping nodes by their root.
        """
        groups = {}
        for u in self.nodes:
            r = self.find(u)
            groups.setdefault(r, set()).add(u)
        return list(groups.values())
Enter fullscreen mode Exit fullscreen mode

Main Detection Routine

Aegypti is a two-phase hybrid. Phase 1 streams edges through FastCliqueUF; if it fails to detect a triangle, Phase 2 falls back to an explicit adjacency-intersection enumeration.

import networkx as nx
from .disjoint import FastCliqueUF   # FastCliqueUF from disjoint.py

def find_triangle_coordinates(graph):
    """
    Detect a single triangle (3-clique) in an undirected NetworkX graph.

    A triangle is a set of three vertices {u, v, w} such that the edges
    (u, v), (v, w), and (u, w) all exist.

    This function uses a **hybrid approach** for efficiency:
    1. Primary fast path: A clique-constrained Union-Find (`FastCliqueUF`)
       that only merges two vertices when the resulting component remains
       a clique. As soon as any component reaches size ≥ 3, a triangle
       is guaranteed to exist.
    2. Fallback: If the UF pass does not detect a triangle, a standard
       O(m^{3/2})-style enumeration using adjacency-set intersections is
       performed (still early-exits on the first triangle found).

    Args:
        graph (nx.Graph):
            An undirected simple graph (no self-loops, no multi-edges).
            Nodes may be any hashable Python objects.

    Returns:
        Optional[FrozenSet[Hashable]]:
            A frozenset containing the three vertices of one triangle if
            a triangle exists, otherwise None.

    Raises:
        ValueError: If the input is not an undirected nx.Graph or contains
                    self-loops.
    """
    # --- Input validation ----------------------------------------------------
    if not isinstance(graph, nx.Graph) or graph.is_directed():
        raise ValueError("Input must be an undirected NetworkX Graph.")

    if nx.number_of_selfloops(graph) > 0:
        raise ValueError("Graph must not contain self-loops.")

    # Early exit: graphs with fewer than 3 nodes or no edges cannot contain triangles
    if graph.number_of_nodes() < 3 or graph.number_of_edges() == 0:
        return None

    # --- Fast path: Clique-constrained Union-Find ---------------------------
    # FastCliqueUF only performs a union when the resulting component is still
    # a clique. If union(u, v) returns True, a clique of size ≥ 3 was formed
    # → a triangle has been detected.
    disjoint_set = FastCliqueUF(graph)

    found_in_uf = False
    for u, v in graph.edges():
        if disjoint_set.union(u, v):
            found_in_uf = True
            break  # Triangle found — no need to process remaining edges

    if found_in_uf:
        # Extract the first clique of size ≥ 3 (guaranteed to be a triangle)
        # to_sets() returns all current components after full path compression.
        for component in disjoint_set.to_sets():
            if len(component) >= 3:
                return frozenset(component)
        # (This line should never be reached if FastCliqueUF is correct)

    # --- Fallback: Explicit triangle enumeration via adjacency intersection --
    # If the UF pass did not detect a triangle, we fall back to a standard
    # efficient triangle-finding algorithm:
    #   • Precompute adjacency sets for O(1) intersections.
    #   • Use node ranking to process each undirected edge {u, v} exactly once.
    #   • For each edge, compute common neighbors → any w forms a triangle.
    adj_sets = {node: set(graph.neighbors(node)) for node in graph.nodes()}
    rank = {node: i for i, node in enumerate(graph.nodes())}

    for u in graph.nodes():
        for v in graph.neighbors(u):
            # Process each undirected edge exactly once (u has lower rank than v)
            if rank[u] < rank[v]:
                # Intersection gives all common neighbors w (each forms a triangle)
                common_neighbors = adj_sets[u] & adj_sets[v]
                for w in common_neighbors:
                    return frozenset({u, v, w})

    # No triangle found after both passes
    return None
Enter fullscreen mode Exit fullscreen mode

4. Correctness and Complexity Analysis

Correctness

Theorem (Correctness of Aegypti). Aegypti(G) returns a triangle u,v,wV{u,v,w} \subseteq V if and only if GG contains a triangle.

Proof.

Soundness. Phase 1: By the triangle witness lemma, any component of size 3\geq 3 produced by FastCliqueUF is a clique, hence a triangle (or contains one). Phase 2: The algorithm returns u,v,w{u,v,w} only when wA[u]A[v]w \in A[u] \cap A[v] , i.e. ww is adjacent to both uu and vv ; combined with the edge (u,v)(u,v) , this gives a triangle.

Completeness. Suppose GG contains a triangle a,b,c{a,b,c} . In Phase 1, when edge (a,b)(a,b) is processed, the merge is accepted and a component a,b{a,b} forms. When edge (b,c)(b,c) (or (a,c)(a,c) ) is subsequently processed, the clique test passes because cN[a]N[b]c \in N[a] \cap N[b] , so the merged component a,b,c{a,b,c} has size 3 and Union returns True. If Phase 1 misses a,b,c{a,b,c} due to non-canonical edge ordering, Phase 2's exhaustive adjacency intersection over all directed edges (u,v)(u,v) with ρ(u)<ρ(v)\rho(u) < \rho(v) is complete and finds wA[u]A[v]w \in A[u] \cap A[v] for the pair (a,b)(a,b) , (a,c)(a,c) , or (b,c)(b,c) .

Complexity of Phase 1 (Fast Path)

Lemma (Initialisation cost). Constructing FastCliqueUF(G) costs O(nb+m)=O(n2/w+m)\mathcal{O}(n \cdot b + m) = \mathcal{O}(n^2/w + m) time and O(nb)\mathcal{O}(n \cdot b) space.

Proof. Allocating nn zero-arrays of length bb costs O(nb)\mathcal{O}(n \cdot b) . Setting neighbourhood bits iterates over all adjacency lists, costing O(m)\mathcal{O}(m) bit-set operations each in O(1)\mathcal{O}(1) . Each of the nn component bitsets and nn adjacency intersection bitsets requires O(b)\mathcal{O}(b) words; total space is O(nb)=O(n2/w)\mathcal{O}(n \cdot b) = \mathcal{O}(n^2/w) .

Lemma (Per-union cost). Each call to Union(u, v) costs O(α(n)+k/w)\mathcal{O}(\alpha(n) + k/w) time, where kk is the size of the merged component, and α\alpha is the inverse-Ackermann function.

Proof. The two Find calls cost O(α(n))\mathcal{O}(\alpha(n)) amortised each. The bitwise OR, AND, complement-AND, and Any check each operate only on the blocks touched by the (at most size- kk ) component, costing O(k/w)\mathcal{O}(k/w) . Pointer and size updates are O(1)\mathcal{O}(1) .

Theorem (Phase 1 running time). Phase 1 runs in

O!(n2w+m) \mathcal{O}!\left( \frac{n^2}{w} + m \right)

time in the worst case. Because components remain of size at most 2 until the detecting merge (or k=mk^* = m if no triangle is found), each Union costs only O(1)\mathcal{O}(1) . On triangle-rich graphs with favourable edge ordering the cost reduces to O(n2/w)\mathcal{O}(n^2/w) .

Proof. Initialisation costs O(n2/w+m)\mathcal{O}(n^2/w + m) . The edge loop processes at most mm edges; each Union costs O(α(n)+k/w)\mathcal{O}(\alpha(n) + k/w) with k=O(1)k = \mathcal{O}(1) , hence O(1)\mathcal{O}(1) per edge. The loop therefore costs O(m)\mathcal{O}(m) . When a triangle exists and appears early, only kmk^* \ll m edges are processed and the cost is O(n2/w)\mathcal{O}(n^2/w) .

Corollary (RAM model). In the RAM model with w=Θ(logn)w = \Theta(\log n) -bit words, Phase 1 costs O(n2+m)\mathcal{O}(n^2 + m) . On triangle-rich graphs this is often O(n2)\mathcal{O}(n^2) . With hardware SIMD ( w=64w = 64 or 512512 ), the practical cost is O(n2/64)\mathcal{O}(n^2/64) or O(n2/512)\mathcal{O}(n^2/512) respectively.

Complexity of Phase 2 (Fallback)

Theorem (Phase 2 running time). Phase 2 runs in O(m3/2)\mathcal{O}(m^{3/2}) time.

Proof. Partition VV into high-degree vertices H=v:d(v)mH = {v : d(v) \geq \sqrt{m}} and low-degree vertices L=VHL = V \setminus H . We have H2m|H| \leq 2\sqrt{m} (since vd(v)=2m\sum_v d(v) = 2m and each vHv \in H contributes >m> \sqrt{m} ). For each edge (u,v)(u,v) with ρ(u)<ρ(v)\rho(u) < \rho(v) , the intersection A[u]A[v]A[u] \cap A[v] is computed in O(min(d(u),d(v)))\mathcal{O}(\min(d(u), d(v))) time. Summing over all edges:

(u,v)Emin(d(u),d(v))    O!(vd(v)2m)  =  O(m3/2). \sum_{(u,v) \in E} \min(d(u), d(v)) \;\leq\; \mathcal{O}!\left(\frac{\sum_v d(v)^2}{\sqrt{m}}\right) \;=\; \mathcal{O}(m^{3/2}).

Combined Running Time

Theorem (Total running time). Let G=(V,E)G = (V,E) be an undirected simple graph. Aegypti(G) runs in time

T(G)  =  O!(n2w+m3/2) T(G) \;=\; \mathcal{O}!\left( \frac{n^2}{w} + m^{3/2} \right)

in the worst case. On many triangle-containing graphs the fast path succeeds after processing only a small number of edges, achieving O(n2/w)\mathcal{O}(n^2/w) time, which is at most as large as O(m3/2)\mathcal{O}(m^{3/2}) whenever m=Ω(n4/3)m = \Omega(n^{4/3}) (the dense regime).

Proof. Triangle-containing case. Phase 1 costs at most O(n2/w+m)\mathcal{O}(n^2/w + m) ; Phase 2 is not reached. Triangle-free case. Phase 1 runs the full edge loop without finding a triangle, costing O(n2/w+m)\mathcal{O}(n^2/w + m) ; Phase 2 costs O(m3/2)\mathcal{O}(m^{3/2}) . Since m(n2)m \leq \binom{n}{2} , the total is O(m3/2)\mathcal{O}(m^{3/2}) . Crossover. n2/wm3/2n^2/w \leq m^{3/2} iff m=Ω(n4/3/w2/3)m = \Omega(n^{4/3}/w^{2/3}) .

Space complexity. Aegypti uses O(n2/w)\mathcal{O}(n^2/w) bits of working space, dominated by the 2n2n bitset arrays of FastCliqueUF.


5. Experiments

Setup

All experiments were run on an 11th Gen Intel® Core™ i7-1165G7 (2.80 GHz), 32 GB DDR4 RAM, with Python 3.12, Aegypti 0.4.0, and NumPy 2.2. We use complement graphs of the DIMACS Second Implementation Challenge instances, since these are well-studied and have both triangle-rich and near-triangle-free instances. Triangle witnesses are reported as ordered triples (u,v,w)(u, v, w) of vertex indices. Aegypti times are measured for the find_triangle_coordinates function, including both phases; matrix-multiplication times (labelled "Naive") are for the reference O(n2.37)\mathcal{O}(n^{2.37}) NumPy A2AA^2 \odot A Boolean check.

Results

The table below reports triangle witnesses, Aegypti running times, and matrix-multiplication baseline times for all tested DIMACS instances. Times are in milliseconds; 0.00 denotes sub-millisecond resolution.

Instance nn Witness Aegypti (ms) Naive (ms)
brock
brock200_1 200 (1,5,6) 37.33 51.04
brock200_2 200 (1,3,8) 16.34 33.34
brock200_3 200 (1,5,9) 16.15 32.65
brock200_4 200 (1,3,5) 15.89 32.81
brock400_1 400 (1,3,5) 83.12 165.56
brock400_2 400 (1,2,3) 82.69 166.12
brock400_3 400 (1,2,4) 81.45 163.99
brock400_4 400 (1,2,3) 65.70 149.80
brock800_1 800 (1,3,4) 266.61 864.42
brock800_2 800 (1,2,4) 300.18 832.44
brock800_3 800 (1,4,7) 283.46 814.38
brock800_4 800 (1,2,3) 270.43 814.70
c-fat
c-fat200-1 200 (1,2,38) 0.00 0.00
c-fat200-2 200 (1,2,19) 0.00 0.00
c-fat200-5 200 (1,2,8) 16.53 16.53
c-fat500-1 500 (1,2,81) 9.93 12.11
c-fat500-2 500 (1,2,41) 0.00 16.70
c-fat500-5 500 (1,2,17) 33.30 47.87
c-fat500-10 500 (1,2,9) 66.37 116.53
C (random k-colorable)
C125.9 125 (1,2,4) 0.00 17.22
C250.9 250 (1,2,5) 33.35 66.60
C500.9 500 (1,2,3) 145.74 361.84
C1000.9 1000 (1,2,3) 583.07 2181.01
C2000.5 2000 (1,5,6) 1365.39 7375.33
C2000.9 2000 (1,2,3) 2382.06 16183.31
C4000.5 4000 (1,2,11) 5875.50 57316.91
DSJC
DSJC500_5 500 (1,2,5) 83.10 182.94
DSJC1000_5 1000 (1,3,7) 349.59 1065.35
MANN
MANN_a9 45 (1,2,3) 0.00 0.00
MANN_a27 378 (1,2,3) 120.28 319.32
MANN_a45 1035 (1,2,3) 1032.28 4349.00
MANN_a81 3321 (1,2,3) 11086.66 100567.67
keller
keller4 171 (1,8,12) 1.03 18.70
keller5 776 (1,8,12) 300.53 916.34
keller6 3361 (1,8,12) 6461.56 73866.88
gen
gen200_p0.9_44 200 (1,2,3) 16.24 32.63
gen200_p0.9_55 200 (1,2,3) 32.61 48.86
gen400_p0.9_55 400 (1,2,3) 99.60 199.42
gen400_p0.9_65 400 (1,2,3) 83.71 200.08
gen400_p0.9_75 400 (1,2,3) 99.46 199.68
hamming
hamming6-2 64 (1,4,6) 0.00 0.00
hamming6-4 64 (1,16,52) 0.00 0.00
hamming8-2 256 (1,4,6) 33.12 84.61
hamming8-4 256 (1,16,52) 32.74 61.86
hamming10-2 1024 (1,4,6) 699.16 2746.87
hamming10-4 1024 (1,16,52) 648.64 2247.65
johnson
johnson8-2-4 28 (1,6,15) 1.03 1.03
johnson8-4-4 28 (1,10,15) 0.00 0.00
johnson16-2-4 120 (1,6,15) 0.00 17.24
johnson32-2-4 496 (1,6,15) 150.12 350.13
san
san200_0.7_1 200 (1,3,4) 33.75 66.20
san200_0.7_2 200 (1,2,3) 33.14 66.63
san200_0.9_1 200 (1,2,4) 50.46 107.74
san200_0.9_2 200 (1,2,6) 50.37 100.47
san200_0.9_3 200 (1,2,3) 16.85 50.55
san400_0.5_1 400 (1,3,6) 132.72 215.42
san400_0.7_1 400 (1,2,4) 99.87 233.07
san400_0.7_2 400 (1,2,5) 99.76 233.36
san400_0.7_3 400 (1,3,6) 111.21 244.83
san400_0.9_1 400 (1,2,3) 115.85 315.37
san1000 1000 (1,2,3) 674.09 1865.02
sanr
sanr200_0.7 200 (1,2,3) 16.83 50.55
sanr200_0.9 200 (1,2,3) 16.74 50.29
sanr400_0.5 400 (1,2,6) 74.60 149.32
sanr400_0.7 400 (1,3,4) 99.39 249.53
p_hat (random)
p_hat300-1 300 (1,9,29) 16.57 50.07
p_hat300-2 300 (1,5,9) 49.74 100.02
p_hat300-3 300 (1,2,3) 66.16 133.71
p_hat500-1 500 (1,2,35) 66.59 149.29
p_hat500-2 500 (1,2,3) 134.25 316.57
p_hat500-3 500 (1,2,3) 249.94 533.01
p_hat700-1 700 (1,5,11) 199.26 432.23
p_hat700-2 700 (1,3,8) 332.54 815.17
p_hat700-3 700 (1,2,3) 362.56 1161.63
p_hat1000-1 1000 (1,3,29) 265.92 749.21
p_hat1000-2 1000 (1,2,14) 516.95 1632.55
p_hat1000-3 1000 (1,2,6) 914.32 2851.39
p_hat1500-1 1500 (1,4,11) 666.70 2464.31
p_hat1500-2 1500 (1,2,3) 1249.29 5311.27
p_hat1500-3 1500 (1,2,3) 1648.31 8107.59

Discussion

Several observations follow from the benchmark results.

Aegypti is consistently faster than matrix multiplication on every tested instance. Speedup factors range from approximately 2.9×2.9\times on the brock800 family to over 11×11\times on keller6 ( n=3361n=3361 , Aegypti 6,461 ms vs. Naive 73,867 ms) and over 9×9\times on MANN_a81 ( n=3321n=3321 , Aegypti 11,087 ms vs. Naive 100,568 ms).

Phase 1 alone suffices on the majority of instances. Those reporting 0.00 ms Aegypti times — c-fat200-1, c-fat200-2, c-fat500-2, C125.9, hamming6-2, hamming6-4, johnson16-2-4, johnson8-4-4, and MANN_a9 — encountered triangles within the first few Union-Find merges, confirming that the O(n2/w)\mathcal{O}(n^2/w) initialisation cost dominates and that kk^* (the termination edge index) is effectively O(1)\mathcal{O}(1) .

The O(n2/w)\mathcal{O}(n^2/w) initialisation bottleneck governs the brock800 family ( n=800n=800 ): Aegypti times cluster around 266–300 ms regardless of which specific triangle is returned, consistent with O(n2/64)104\mathcal{O}(n^2/64) \approx 10^4 operations for n=800n=800 .

Large, dense instances (C4000.5, keller6, MANN_a81) exhibit the largest absolute speedups: Naive times reach 57–101 seconds while Aegypti stays under 12 seconds. This is because nωn^\omega (with ω2.37\omega \approx 2.37 ) scales far more aggressively than n2/wn^2/w as nn grows.

All instances yield a concrete witness triple. Unlike matrix-multiplication verification (which only certifies existence), Aegypti always returns an explicit triple (u,v,w)(u, v, w) at no extra cost.


6. Conclusion

We introduced Aegypti, a hybrid triangle-detection algorithm that combines a SIMD-accelerated clique-constrained Union-Find (Phase 1) with a classical O(m3/2)\mathcal{O}(m^{3/2}) adjacency-intersection fallback (Phase 2). The algorithm is correct, runs in O(n2/w)\mathcal{O}(n^2/w) time when a triangle exists (matching the initialisation cost of the data structure), and degrades to O(m3/2)\mathcal{O}(m^{3/2}) on triangle-free inputs. Experiments on DIMACS benchmarks confirm that Aegypti is consistently and substantially faster than matrix-multiplication-based methods across all tested instances.

Open Problems

Improved initialisation. Can the O(n2/w)\mathcal{O}(n^2/w) initialisation cost be reduced to O(m/w)\mathcal{O}(m/w) while preserving the bitset-union-find clique structure?

Edge ordering. Does a degree-based or degeneracy-based edge ordering for Phase 1 reduce the average kk^* (the termination edge index) to O(1)\mathcal{O}(1) on random graphs?

Triangle counting. Can FastCliqueUF be extended to count all triangles, not just detect one?

Dynamic graphs. Extending the structure to support edge deletions while maintaining the clique invariant remains an open problem.


Production-ready • Zero dependencies beyond NetworkX & NumPy • Battle-tested

pip install aegypti
Enter fullscreen mode Exit fullscreen mode
import networkx as nx
from aegypti.algorithm import find_triangle_coordinates

G = nx.complete_graph(3)
triangle = find_triangle_coordinates(G)
print(triangle)  # → frozenset({0, 1, 2})
Enter fullscreen mode Exit fullscreen mode

Top comments (0)