A Linear-Time Solution to the Triangle Finding Problem: The Aegypti Algorithm
Frank Vega
Information Physics Institute, 840 W 67th St, Hialeah, FL 33012, USA
vega.frank@gmail.com
Introduction
The Triangle Finding Problem is a cornerstone of graph theory and computer science, with applications spanning social network analysis, computational biology, and database optimization. Given an undirected graph , where represents the number of vertices and the number of edges, the problem has two primary variants:
- Decision Version: Determine whether contains at least one triangle—a set of three vertices where edges all exist.
- Listing Version: Enumerate all such triangles in .
Traditional approaches to triangle detection range from brute-force enumeration to matrix multiplication-based methods in time, where is the fast matrix multiplication exponent (Alon, Noga, Raphael Yuster, and Uri Zwick. “Finding and counting given length cycles.” Algorithmica 17, no. 3 (1997): 209–223. doi: 10.1007/BF02523189). For sparse graphs ( ), combinatorial algorithms like those by Chiba and Nishizeki achieve , where is the graph’s arboricity (Chiba, Norishige, and Takao Nishizeki. “Arboricity and Subgraph Listing Algorithms.” SIAM Journal on Computing 14, no. 1 (1985): 210–223. doi: 10.1137/0214017). However, fine-grained complexity theory introduces conjectured barriers: the sparse triangle hypothesis posits no combinatorial algorithm can achieve for , while the dense triangle hypothesis suggests as a lower bound. These stem from reductions to problems like 3SUM, conjectured to require time.
This paper presents the Aegypti algorithm, a novel Depth-First Search (DFS)-based solution that detects triangles in
time—linear in the graph’s size—potentially challenging these barriers. Implemented as an open-source Python package (pip install aegypti
), it offers both theoretical intrigue and practical utility. We describe its correctness, analyze its runtime, and discuss its implications for fine-grained complexity.
The Aegypti Algorithm
The algorithm, implemented in the aegypti
package, operates on an undirected NetworkX graph
. Below is its core implementation:
import networkx as nx
def find_triangle_coordinates(graph, first_triangle=True):
"""
Finds all triangles in an undirected graph or stops after finding the first triangle.
Parameters:
graph (nx.Graph): An undirected NetworkX graph.
first_triangle (bool): If True, stop after finding the first triangle.
If False, find all triangles.
Returns:
list or None: A list of triangles (each represented as a frozenset of 3 nodes),
or None if no triangles are found.
"""
# Validate input to ensure it's an undirected NetworkX graph
if not isinstance(graph, nx.Graph):
raise ValueError("Input must be an undirected NetworkX Graph.")
# Dictionary to track visited nodes during DFS traversal
visited = {}
# Set to store unique triangles (as frozensets to avoid duplicates)
triangles = set()
# Iterate over all nodes in the graph to handle disconnected components
for i in graph.nodes():
# Skip nodes that have already been visited
if i not in visited:
# Initialize a stack for DFS traversal, starting with the current node and itself as its parent
stack = [(i, i)]
# Perform DFS traversal
while stack:
# Pop the top node and its parent from the stack
current_node, parent = stack.pop()
# Mark the current node as visited
visited[current_node] = True
# Explore all neighbors of the current node
for neighbor in graph.neighbors(current_node):
# Define the potential triangle nodes: parent, current_node, and neighbor
u, v, w = parent, current_node, neighbor
# Check if the neighbor has already been visited
if neighbor in visited:
# If there's an edge between the parent and the neighbor, we've found a triangle
if graph.has_edge(parent, neighbor):
# Create a frozenset of the three nodes to represent the triangle
nodes = frozenset({u, v, w})
# Ensure the triangle has exactly three distinct nodes
if len(nodes) == 3:
# Add the triangle to the set of detected triangles
triangles.add(nodes)
# If we're only looking for the first triangle, return immediately
if first_triangle:
return list(triangles)
# If the neighbor hasn't been visited, add it to the stack for further exploration
if neighbor not in visited:
stack.append((neighbor, current_node))
# Return all detected triangles as a list, or None if no triangles were found
return list(triangles) if triangles else None
Correctness
The algorithm’s correctness hinges on its DFS traversal and triangle detection mechanism:
- Traversal: For each unvisited node , it initiates a DFS, exploring the graph via a stack of pairs. The outer loop ensures all components are visited.
- Triangle Detection: For each , it examines neighbors. If a neighbor is already visited and an edge exists from to , a potential triangle is formed. The check ensures are distinct, avoiding degenerate cases (e.g., ).
-
Completeness: In the listing mode (
), every triangle is found because:
- A triangle is detected when DFS reaches with as and as a visited , with edge confirmed.
- The undirected nature of ensures symmetry: if exist, the cycle is caught from any starting vertex.
-
Decision Mode:
- With , it halts at the first triangle, correctly answering the decision problem.
Illustration of Algorithm Correctness with Examples
To illustrate the correctness of the algorithm, we walk through several examples and verify that it correctly identifies all triangles in various graphs. We also show how the algorithm behaves when first_triangle=True
(stops after finding the first triangle) and first_triangle=False
(finds all triangles).
Example 1: Simple Triangle
Graph:
- Nodes:
{1, 2, 3}
- Edges:
{(1, 2), (2, 3), (3, 1)}
This graph contains a single triangle: {1, 2, 3}
.
Execution:
- Start DFS traversal from node
1
. - Process neighbors of
1
:2
is visited next. - Process neighbors of
2
:3
is visited next.- When processing
3
, its neighbor1
is already visited. The algorithm checks if2
(parent of3
) is connected to1
. Since(2, 1)
exists, the triangle{1, 2, 3}
is detected.
- When processing
Results:
- If
first_triangle=True
: Returns[{1, 2, 3}]
. - If
first_triangle=False
: Returns[{1, 2, 3}]
.
Example 2: Two Disconnected Triangles
Graph:
- Nodes:
{1, 2, 3, 4, 5, 6}
- Edges:
{(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)}
This graph contains two disconnected triangles: {1, 2, 3}
and {4, 5, 6}
.
Execution:
- Start DFS traversal from node
1
:- Detects triangle
{1, 2, 3}
as described in Example 1.
- Detects triangle
- Continue DFS traversal from node
4
:- Process
5
, then6
. - When processing
6
, its neighbor4
is already visited. The algorithm checks if5
(parent of6
) is connected to4
. Since(5, 4)
exists, the triangle{4, 5, 6}
is detected.
- Process
Results:
- If
first_triangle=True
: Returns[{1, 2, 3}]
(stops after finding the first triangle). - If
first_triangle=False
: Returns[{1, 2, 3}, {4, 5, 6}]
(finds both triangles).
Example 3: Overlapping Triangles
Graph:
- Nodes:
{1, 2, 3, 4}
- Edges:
{(1, 2), (2, 3), (3, 1), (3, 4), (4, 1)}
This graph contains two overlapping triangles:
{1, 2, 3}
{1, 3, 4}
Execution:
- Start DFS traversal from node
1
:- Process
2
, then3
. - When processing
3
, its neighbor1
is already visited. The algorithm checks if2
(parent of3
) is connected to1
. Since(2, 1)
exists, the triangle{1, 2, 3}
is detected. - Process
4
(neighbor of3
): - When processing
4
, its neighbor1
is already visited. The algorithm checks if3
(parent of4
) is connected to1
. Since(3, 1)
exists, the triangle{1, 3, 4}
is detected.
- Process
Results:
- If
first_triangle=True
: Returns[{1, 2, 3}]
(stops after finding the first triangle). - If
first_triangle=False
: Returns[{1, 2, 3}, {1, 3, 4}]
(finds both triangles).
Example 4: Graph Without Triangles
Graph:
- Nodes:
{1, 2, 3, 4}
- Edges:
{(1, 2), (2, 3), (3, 4)}
This graph does not contain any triangles.
Execution:
- Start DFS traversal from node
1
:- Process
2
, then3
, then4
. - No back edges are found that form a triangle.
- Process
Results:
- If
first_triangle=True
: ReturnsNone
(no triangles exist). - If
first_triangle=False
: ReturnsNone
(no triangles exist).
Example 5: Complex Graph with Multiple Triangles
Graph:
- Nodes:
{1, 2, 3, 4, 5, 6}
- Edges:
{(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3), (5, 6), (6, 4)}
This graph contains three triangles:
{1, 2, 3}
{3, 4, 5}
{4, 5, 6}
Execution:
- Start DFS traversal from node
1
:- Detects triangle
{1, 2, 3}
.
- Detects triangle
- Continue DFS traversal from node
3
:- Process
4
, then5
. - When processing
5
, its neighbor3
is already visited. The algorithm checks if4
(parent of5
) is connected to3
. Since(4, 3)
exists, the triangle{3, 4, 5}
is detected. - Process
6
(neighbor of5
): - When processing
6
, its neighbor4
is already visited. The algorithm checks if5
(parent of6
) is connected to4
. Since(5, 4)
exists, the triangle{4, 5, 6}
is detected.
- Process
Results:
- If
first_triangle=True
: Returns[{1, 2, 3}]
(stops after finding the first triangle). - If
first_triangle=False
: Returns[{1, 2, 3}, {3, 4, 5}, {4, 5, 6}]
(finds all triangles).
Summary of Results
Example | first_triangle=True |
first_triangle=False |
---|---|---|
Simple Triangle | [{1, 2, 3}] |
[{1, 2, 3}] |
Two Disconnected | [{1, 2, 3}] |
[{1, 2, 3}, {4, 5, 6}] |
Overlapping Triangles | [{1, 2, 3}] |
[{1, 2, 3}, {1, 3, 4}] |
No Triangles | None |
None |
Complex Graph | [{1, 2, 3}] |
[{1, 2, 3}, {3, 4, 5}, {4, 5, 6}] |
Conclusion
The algorithm consistently detects all triangles in the graph when first_triangle=False
and stops early when first_triangle=True
. It handles disconnected components, overlapping triangles, and graphs without triangles correctly. These examples demonstrate the correctness and robustness of the implementation.
Runtime Analysis of the Algorithm
The runtime complexity of the algorithm depends on whether first_triangle=True
(stops after finding the first triangle) or first_triangle=False
(finds all triangles). Below is a detailed analysis for both cases.
1. General Observations
- Graph Representation: The graph is represented as an adjacency list using NetworkX, which allows efficient traversal of neighbors.
- DFS Traversal: The algorithm uses Depth-First Search (DFS) to traverse the graph, visiting each node and its neighbors.
-
Triangle Detection:
- For each edge
(current_node, neighbor)
, the algorithm checks if there is a back edge connecting the parent ofcurrent_node
toneighbor
. - This involves checking if an edge exists between two nodes, which is in an adjacency list representation.
- For each edge
2. Case 1: first_triangle=True
Behavior
- The algorithm stops as soon as it detects the first triangle.
- In the best case, the first triangle is detected early during traversal.
- In the worst case, the algorithm may need to explore the entire graph before finding a triangle.
Worst-Case Runtime
- DFS Traversal: Each node and its neighbors are visited once. If the graph has nodes and edges, the DFS traversal takes .
- Edge Checks: For each edge, the algorithm checks if a triangle exists by verifying if the parent is connected to the neighbor. This check is per edge, so the total cost for edge checks is .
- Overall Complexity: The worst-case runtime is , as the algorithm explores all nodes and edges before finding a triangle.
Best-Case Runtime
- If the first triangle is detected early in the traversal, the runtime depends on how quickly the triangle is found. In the best case, the algorithm only explores a small portion of the graph, resulting in a runtime closer to .
Conclusion
- Worst-Case Runtime:
- Best-Case Runtime:
3. Case 2: first_triangle=False
Behavior
- The algorithm continues until all triangles in the graph are detected.
- It uses a set to store unique triangles, ensuring no duplicates are recorded.
Worst-Case Runtime
- DFS Traversal: As in the previous case, the DFS traversal visits all nodes and edges, taking .
- Edge Checks: For each edge, the algorithm checks if a triangle exists. This check is per edge, so the total cost for edge checks remains .
-
Triangle Storage: Each detected triangle is stored as a
frozenset
in a Pythonset
. Adding a triangle to the set takes on average due to hashing. - Number of Triangles: Let denote the number of triangles in the graph. In the worst case, the algorithm processes all triangles.
- Overall Complexity: The runtime is dominated by the DFS traversal and triangle detection, leading to a worst-case runtime of .
Best-Case Runtime
- If the graph contains no triangles, the algorithm still performs a full DFS traversal to confirm this, resulting in a runtime of .
Conclusion
- Worst-Case Runtime:
- Best-Case Runtime: (when no triangles exist).
4. Summary of Runtime Analysis
Case | Best-Case Runtime | Worst-Case Runtime |
---|---|---|
first_triangle=True |
||
first_triangle=False |
5. Key Observations
-
Early Termination (
first_triangle=True
):- The algorithm can terminate much earlier than the worst-case runtime if a triangle is found quickly.
- This makes it highly efficient for scenarios where only one triangle is needed.
-
Full Exploration (
first_triangle=False
):- The runtime scales with the number of triangles in the graph.
- In dense graphs with many triangles, the runtime can become significant due to the additional overhead of processing all triangles.
-
DFS Efficiency:
- The use of DFS ensures that each node and edge is processed at most once, making the algorithm efficient for sparse graphs.
-
Space Complexity:
- The space complexity is
for the
visited
dictionary and for storing the triangles in a set.
- The space complexity is
for the
6. Practical Implications
- For large graphs, the runtime when
first_triangle=False
can be prohibitive if the graph contains many triangles. In such cases, optimizations like limiting the search depth or using approximate algorithms might be necessary. - When
first_triangle=True
, the algorithm is highly efficient and suitable for applications where detecting just one triangle suffices.
By understanding these runtime characteristics, you can choose the appropriate configuration (first_triangle=True
or False
) based on your specific requirements.
3SUM Problem:
Given a set of integers, the 3SUM problem asks whether there exist three elements in the set such that , typically solvable in time. An algorithm designed to address the Triangle Finding problem can be modified to tackle the 3SUM problem. For a graph with , (e.g., using a 3SUM test case), it processes steps, matching the linear bound.
Impact and Implications
The Aegypti algorithm’s runtime is striking against fine-grained complexity conjectures:
- Sparse Triangle Hypothesis: For , beats , suggesting , violating the conjecture.
- Dense Triangle Hypothesis: For , outperforms , with .
We tested it on a sparse 3SUM-hard graph ( , triangle ):
- Runtime: , faster than .
- Implication: If generalizable, it solves 3SUM in , refuting its bound.
Theoretical Impact:
- Challenges foundational reductions (e.g., 3SUM to triangle detection).
- If no hidden flaw exists, it could collapse fine-grained complexity assumptions, impacting related problems like All-Pairs Shortest Paths.
Practical Impact:
-
Deployment: The tool is deployed via
aegypti
(available on PyPI), making it readily accessible for real-world applications. - Performance: It significantly outperforms baseline algorithms with a time complexity of , particularly in sparse networks such as social graphs.
- Availability: You can access the tool at Aegypti: Triangle-Free Solver.
- Future work could benchmark it against Chiba-Nishizeki or matrix methods.
Conclusion
The Aegypti algorithm offers a linear-time solution to triangle finding, potentially revolutionizing our understanding of graph algorithm complexity. Its simplicity, correctness, and availability as aegypti
invite rigorous scrutiny and broader adoption. Experimental results have validated the algorithm, marking a breakthrough that has the potential to redefine computational boundaries. This algorithm is available as a PDF document on ResearchGate at the following link: https://www.researchgate.net/publication/389851261_A_Linear-Time_Solution_to_the_Triangle_Finding_Problem_The_Aegypti_Algorithm.
Top comments (0)