An Approximate Solution to the Minimum Vertex Cover Problem: The Salvador Algorithm
Frank Vega
Information Physics Institute, 840 W 67th St, Hialeah, FL 33012, USA
vega.frank@gmail.com
Abstract
The Minimum Vertex Cover problem is NP-hard. The best known polynomial-time approximation ratio is
, matching a Unique Games Conjecture lower bound of
. Without the conjecture, a
inapproximability threshold follows from the 2-to-2 games line of work under
. We present Salvador, an approximation algorithm that maps each connected component to a weighted bipartite planar auxiliary graph, solves the auxiliary weighted vertex-cover problem exactly by a min-cut formulation, decodes the selected incidence nodes back to original vertices, and prunes redundant vertices. The algorithm returns a valid vertex cover and runs in
time on the reduced planar bipartite instance, where
is the auxiliary size; in the linear-incidence regime this is
in the original graph size. Experiments extracted from np_bench.txt cover 52 NPBench clique-complement instances and return near-optimal covers with mean displayed ratio
and maximum displayed ratio
against the DIMACS reference values. These theoretical and experimental signals make Salvador a concrete candidate for a sub-
analysis: proving a universal
bound, with
, would imply
. The implementation is distributed via PyPI as the Salvador package, version v0.0.3.
Introduction
Given an undirected graph , a set is a vertex cover if every edge of has at least one endpoint in ; the goal is to find a minimum-cardinality vertex cover, denoted .
The approximation landscape for vertex cover is unusually well charted. The classical maximal-matching algorithm achieves a -approximation in linear time; LP-based refinements reach factor , which is but falls short of a constant . From the hardness side, Dinur and Safra ruled out any ratio below under ; subsequently, Khot, Minzer, and Safra proved the 2-to-2 games conjecture, strengthening this barrier to for any fixed under ; and under the Unique Games Conjecture, no constant ratio below is achievable. A polynomial-time algorithm achieving a constant ratio would therefore resolve P versus NP.
The algorithm FindVertexCover operates in three phases.
- Preprocessing. Self-loops are discarded; isolated vertices are removed (they appear in no edge and need not be covered).
- Bipartite planar reduction and weighted solve. For every connected component, the core routine constructs a weighted oriented-incidence auxiliary graph . Each original edge creates an auxiliary edge between the incidence nodes and ; local consistency edges complete the planar bipartite gadget. Salvador then solves weighted vertex cover exactly on by the integral bipartite LP/min-cut formulation.
- Linear-time pruning. PruneRedundantVertices makes a single forward pass over the candidate cover and removes every vertex all of whose neighbours are already covered, reducing without compromising validity.
The construction and pruning phases are linear in the input and auxiliary sizes. The dominant step is the bipartite weighted vertex-cover solve on the planar auxiliary graph, which costs when because for planar graphs.
The main contributions are:
- A practical polynomial-time vertex-cover heuristic based on a weighted bipartite planar auxiliary reduction and an exact min-cut solve on the reduced instance.
- A gadget that makes explicit how oriented incidences encode an original edge and local consistency around a vertex.
- An experimental study on 52 logged NPBench clique-complement benchmark graphs showing displayed ratios in (mean ).
- A conjecture that the universal approximation ratio is at most for the explicit constant . Since , any proof of this conjecture places the algorithm strictly below the inapproximability barrier, directly implying .
- An open-source implementation, Salvador v0.0.3.
Research Data
The algorithm is implemented in the Python package Salvador, freely available on PyPI. The current version is v0.0.3, hosted at https://github.com/frankvegadelgado/salvador, with reproducible capsule at https://pypi.org/project/salvador/. It is released under the MIT License, uses git for versioning, and requires Python
3.12 with NetworkX
3.0.
Description of the Algorithm
Main Algorithm
FindVertexCover : After cleaning the graph it solves each connected component through the bipartite planar auxiliary reduction and then applies the pruning pass to the union of the decoded component covers.
Require: Undirected graph
Ensure: A vertex cover
- Remove all self-loops from .
- .
- If , return .
- .
- For each connected component of : set and .
- Set .
- .
- Return .
Bipartite Planar Gadget and Weighted Solve
ComponentCover : For each original edge , creates two oriented-incidence nodes and with weights and . The forcing edge encodes the original edge; local cyclic edges couple consecutive oriented incidences. The implementation verifies that the resulting auxiliary graph is bipartite and planar before calling the exact weighted bipartite vertex-cover solver.
Require: Connected component
Ensure: A vertex cover
- new graph; mutable copy of .
- For each : let ; remove from ; for : add nodes and to with weights and ; add forcing edge ; add local cyclic edges; if add closing local edge.
- If is not bipartite or not planar, raise reduction failure.
- .
- .
- Return .
The weighted solve is exact on : the vertex-cover linear program is integral on bipartite graphs by König's theorem and the max-flow/min-cut theorem. The selected auxiliary cover is decoded by first coordinate. The forcing edge is the key correctness device: covering it forces the decoded solution to contain or .
Linear-Time Pruning
PruneRedundantVertices : removes redundant vertices from a candidate cover in a single forward pass, running in total time.
Require: Adjacency map
, candidate cover
Ensure: Valid vertex cover
with no redundant vertex
- mutable copy of .
- For each : if , set .
- Return .
Correctness
Theorem 1 (Validity). FindVertexCover always returns a valid vertex cover of the input graph .
Proof. Isolated vertices are removed in preprocessing and appear in no edge. For a connected component and any original edge , ComponentCover adds the forcing edge to the auxiliary graph . The set returned by MinWeightVertexCoverBipartite is a vertex cover of , so at least one of and lies in ; decoding by first coordinate therefore places or in . Distinct connected components have no edges between them, so covers all edges. Finally, a vertex is pruned only when every neighbour of lies in , so every edge incident to stays covered by the other endpoint.
Approximation Ratio and Consequences for P vs NP
Let be the optimum vertex-cover size and let be the cover returned by FindVertexCover. The approximation ratio studied in this paper is:
For the logged DIMACS clique-complement instances, is computed as from the published clique number of the original DIMACS graph ; rows marked use a best-known clique lower bound and therefore give a displayed reference ratio rather than a certified ratio.
Proposition ( ). The value satisfies for the positive constant . In particular, .
Conjecture 1 (Universal -approximation). For every undirected graph , the Salvador bipartite planar reduction succeeds and FindVertexCover returns a vertex cover with:
Since , a proof of Conjecture 1 would give a polynomial-time approximation ratio strictly below the 2-to-2 games hardness threshold and would imply .
Theorem 2 (Implication for vs ). If FindVertexCover achieves an approximation ratio at most on every undirected graph, then .
Proof. Khot, Minzer, and Safra proved the 2-to-2 games conjecture and derived that, for every , approximating Minimum Vertex Cover within a factor of is NP-hard. Since with , the ratio is exactly of the form for . A universal ratio by this polynomial-time algorithm therefore constitutes a polynomial-time -approximation, contradicting the 2-to-2 games NP-hardness result unless .
Across the current NPBench run, the largest displayed Salvador/reference ratio is , below the conjectured threshold and below . Three layers of evidence motivate the conjecture: (i) the correctness theorem establishes feasibility of every returned cover; (ii) on every logged instance the displayed ratio lies in , with the maximum already below the weaker Dinur–Safra threshold ; (iii) the explicit constant makes the conjecture algebraically precise rather than asymptotic.
Runtime Analysis
Theorem 3 (Time complexity). Let be the input graph and let be the disjoint union of the bipartite planar auxiliary graphs constructed across all connected components. Put and . FindVertexCover runs in:
time. Since is planar, , so the auxiliary solve is . Writing for the reduced auxiliary size gives the advertised bound.
Proof sketch. Preprocessing removes self-loops and isolated vertices in time. Constructing the oriented-incidence auxiliary graph creates two incidence nodes per original edge and a linear number of forcing and local gadget edges: time. Bipartiteness and planarity checks are linear in the auxiliary size. The weighted vertex-cover subproblem on is solved by a matching/flow routine with worst-case time. Decoding and pruning scan incidence and adjacency lists once. Planarity gives , making the dominant term .
Experimental Study
We evaluated Salvador v0.0.3 on the NPBench vertex-cover clique-complement collection. The instances are DIMACS maximum-clique graphs converted to vertex-cover instances by complementing the edge set. A clique of size in the original graph is an independent set of size in the complement , so the vertex-cover reference optimum is:
The command used for the logged run was python -m salvador.batch -i ./experiments/ -c -l (equivalently batch_vega -i ./experiments/ -c -l). The log is stored as experiments/np_bench.txt and contains 52 completed runs. Because the log does not contain a competing baseline, only values directly recoverable from the log and the DIMACS reference optimum are reported. Rows marked
use a best-known clique lower bound, so the displayed
is an upper bound on the true optimum.
| Instance | Parse (ms) | Alg. (ms) | |||
|---|---|---|---|---|---|
brock200_1 |
179 | 184 | 1.028 | 17.1 | 4192.0 |
brock200_2 |
188 | 193 | 1.027 | 24.7 | 15255.0 |
brock200_3 |
185 | 191 | 1.032 | 31.2 | 8296.4 |
brock200_4 |
183 | 189 | 1.033 | 22.3 | 7264.8 |
brock400_1 |
373 | 385 | 1.032 | 57.1 | 38880.3 |
brock400_2 |
371 | 384 | 1.035 | 70.0 | 42005.5 |
brock400_3 |
369 | 383 | 1.038 | 60.0 | 44912.0 |
brock400_4 |
367 | 384 | 1.046 | 75.5 | 45628.1 |
c-fat200-1 |
188 | 188 | 1.000 | 116.4 | 66467.3 |
c-fat200-2 |
176 | 177 | 1.006 | 73.8 | 85894.9 |
c-fat200-5 |
142 | 142 | 1.000 | 52.9 | 27365.7 |
C1000.9 |
932 | 951 | 1.020 | 166.9 | 143665.8 |
C125.9 |
91 | 94 | 1.033 | 22.5 | 257.3 |
C250.9 |
206 | 214 | 1.039 | 25.7 | 3466.8 |
C500.9 |
443 | 464 | 1.047 | 60.0 | 13496.0 |
gen200_p0.9_44 |
156 | 170 | 1.090 | 26.9 | 1187.4 |
gen200_p0.9_55 |
145 | 169 | 1.166 | 24.4 | 868.3 |
gen400_p0.9_55 |
345 | 366 | 1.061 | 32.2 | 8786.5 |
gen400_p0.9_65 |
335 | 362 | 1.081 | 63.6 | 8631.0 |
gen400_p0.9_75 |
325 | 360 | 1.108 | 37.0 | 10609.0 |
hamming10-2 |
512 | 512 | 1.000 | 61.8 | 3036.9 |
hamming6-2 |
32 | 32 | 1.000 | 33.1 | 333.2 |
hamming6-4 |
60 | 62 | 1.033 | 16.2 | 345.8 |
hamming8-2 |
128 | 128 | 1.000 | 0.0 | 319.5 |
hamming8-4 |
240 | 240 | 1.000 | 142.7 | 6647.7 |
johnson16-2-4 |
112 | 112 | 1.000 | 22.3 | 779.3 |
johnson32-2-4 |
480 | 480 | 1.000 | 61.5 | 7749.2 |
johnson8-2-4 |
24 | 24 | 1.000 | 0.0 | 23.4 |
johnson8-4-4 |
56 | 56 | 1.000 | 9.2 | 117.4 |
keller4 |
160 | 164 | 1.025 | 14.1 | 4494.3 |
MANN_a27 |
252 | 253 | 1.004 | 2.6 | 294.7 |
MANN_a45 |
690 | 703 | 1.019 | 0.0 | 473.3 |
MANN_a81 |
2221 | 2225 | 1.002 | 15.9 | 2834.3 |
MANN_a9 |
29 | 29 | 1.000 | 0.0 | 16.1 |
p_hat300-1 |
292 | 294 | 1.007 | 80.2 | 85573.4 |
p_hat300-2 |
275 | 285 | 1.036 | 180.1 | 55464.7 |
p_hat300-3 |
264 | 276 | 1.045 | 94.6 | 15246.7 |
p_hat500-3 |
450 | 471 | 1.047 | 165.5 | 78912.2 |
san200_0.7_1 |
170 | 185 | 1.088 | 50.2 | 9117.5 |
san200_0.7_2 |
182 | 188 | 1.033 | 25.4 | 5512.1 |
san200_0.9_1 |
130 | 155 | 1.192 | 7.1 | 1774.6 |
san200_0.9_2 |
140 | 163 | 1.164 | 15.2 | 1736.4 |
san200_0.9_3 |
156 | 171 | 1.096 | 10.0 | 2013.5 |
san400_0.5_1 |
387 | 393 | 1.016 | 232.8 | 135755.1 |
san400_0.7_1 |
360 | 380 | 1.056 | 142.1 | 69143.0 |
san400_0.7_2 |
370 | 384 | 1.038 | 86.6 | 59137.5 |
san400_0.7_3 |
378 | 388 | 1.026 | 72.5 | 40035.9 |
san400_0.9_1 |
300 | 350 | 1.167 | 36.8 | 10906.9 |
sanr200_0.7 |
182 | 188 | 1.033 | 12.2 | 6429.0 |
sanr200_0.9 |
158 | 171 | 1.082 | 0.0 | 985.9 |
sanr400_0.5 |
387 | 391 | 1.010 | 97.7 | 141881.3 |
sanr400_0.7 |
379 | 386 | 1.018 | 178.4 | 84044.0 |
Across the 52 completed runs, Salvador has mean displayed ratio
, minimum
, and maximum
on san200_0.9_1. Eleven instances are solved exactly against the displayed reference values. Every empirical Salvador ratio lies below both the
inapproximability threshold and the conjectured bound of
.
Algorithm time totals
s across the 52 instances, with mean
ms, median
ms, minimum
ms, and maximum
ms on C1000.9. The Hamming, Johnson, c-fat, and MANN families show the strongest solution-quality behaviour. The largest displayed ratio is
on san200_0.9_1.
Conclusion
The Salvador algorithm combines a bipartite planar oriented-incidence gadget, an exact weighted vertex-cover solve on the auxiliary graph, decoding by first coordinate, and a linear-time redundancy-pruning step. The algorithm always returns a valid vertex cover and runs in time on the reduced planar auxiliary instance.
Across the 52 completed NPBench clique-complement runs, the displayed approximation ratio ranges from
to
with a mean of
. The central observation is that
for the explicit positive constant
, so the conjectured ratio lies strictly below the
inapproximability threshold. By Theorem 2, any proof of a universal Salvador ratio at most
would imply
. The implementation is freely available as the Salvador package (v0.0.3) on PyPI.
Top comments (0)