DEV Community

Cover image for The Salvador Algorithm
Frank Vega
Frank Vega

Posted on • Edited on

The Salvador Algorithm

An Approximate Solution to the Minimum Vertex Cover Problem: The Salvador Algorithm

Frank Vega

Information Physics Institute, 840 W 67th St, Hialeah, FL 33012, USA

vega.frank@gmail.com


Abstract

The Minimum Vertex Cover problem is NP-hard. The best known polynomial-time approximation ratio is 22 , matching a Unique Games Conjecture lower bound of 2ε2-\varepsilon . Without the conjecture, a 2ε\sqrt{2}-\varepsilon inapproximability threshold follows from the 2-to-2 games line of work under PNP\mathrm{P}\neq\mathrm{NP} . We present Salvador, an approximation algorithm that maps each connected component to a weighted bipartite planar auxiliary graph, solves the auxiliary weighted vertex-cover problem exactly by a min-cut formulation, decodes the selected incidence nodes back to original vertices, and prunes redundant vertices. The algorithm returns a valid vertex cover and runs in O(n^3/2)O(\hat{n}^{3/2}) time on the reduced planar bipartite instance, where n^\hat{n} is the auxiliary size; in the linear-incidence regime this is O(n3/2)O(n^{3/2}) in the original graph size. Experiments extracted from np_bench.txt cover 52 NPBench clique-complement instances and return near-optimal covers with mean displayed ratio 1.0421.042 and maximum displayed ratio 1.1921.192 against the DIMACS reference values. These theoretical and experimental signals make Salvador a concrete candidate for a sub- 2\sqrt{2} analysis: proving a universal 1.4=2ε01.4=\sqrt{2}-\varepsilon_0 bound, with ε0=27/5>0\varepsilon_0=\sqrt{2}-7/5>0 , would imply P=NP\mathrm{P}=\mathrm{NP} . The implementation is distributed via PyPI as the Salvador package, version v0.0.3.


Introduction

Given an undirected graph G=(V,E)G=(V,E) , a set CVC\subseteq V is a vertex cover if every edge of EE has at least one endpoint in CC ; the goal is to find a minimum-cardinality vertex cover, denoted τ(G)\tau(G) .

The approximation landscape for vertex cover is unusually well charted. The classical maximal-matching algorithm achieves a 22 -approximation in linear time; LP-based refinements reach factor 2Θ(1/logn)2-\Theta(1/\sqrt{\log n}) , which is 2o(1)2-o(1) but falls short of a constant 2ε2-\varepsilon . From the hardness side, Dinur and Safra ruled out any ratio below 105211.360610\sqrt{5}-21\approx 1.3606 under PNP\mathrm{P}\neq\mathrm{NP} ; subsequently, Khot, Minzer, and Safra proved the 2-to-2 games conjecture, strengthening this barrier to 2ε\sqrt{2}-\varepsilon for any fixed ε>0\varepsilon>0 under PNP\mathrm{P}\neq\mathrm{NP} ; and under the Unique Games Conjecture, no constant ratio below 2ε2-\varepsilon is achievable. A polynomial-time algorithm achieving a constant ratio ρ<2\rho<\sqrt{2} would therefore resolve P versus NP.

The algorithm FindVertexCover operates in three phases.

  1. Preprocessing. Self-loops are discarded; isolated vertices are removed (they appear in no edge and need not be covered).
  2. Bipartite planar reduction and weighted solve. For every connected component, the core routine constructs a weighted oriented-incidence auxiliary graph BB . Each original edge u,v{u,v} creates an auxiliary edge between the incidence nodes xuvx_{uv} and xvux_{vu} ; local consistency edges complete the planar bipartite gadget. Salvador then solves weighted vertex cover exactly on BB by the integral bipartite LP/min-cut formulation.
  3. Linear-time pruning. PruneRedundantVertices makes a single forward pass over the candidate cover CC and removes every vertex all of whose neighbours are already covered, reducing C|C| without compromising validity.

The construction and pruning phases are linear in the input and auxiliary sizes. The dominant step is the bipartite weighted vertex-cover solve on the planar auxiliary graph, which costs O(n^3/2)O(\hat{n}^{3/2}) when n^=V(B)\hat{n}=|V(B)| because E(B)=O(n^)|E(B)|=O(\hat{n}) for planar graphs.

The main contributions are:

  • A practical polynomial-time vertex-cover heuristic based on a weighted bipartite planar auxiliary reduction and an exact min-cut solve on the reduced instance.
  • A gadget that makes explicit how oriented incidences encode an original edge and local consistency around a vertex.
  • An experimental study on 52 logged NPBench clique-complement benchmark graphs showing displayed ratios in [1.000,1.192][1.000,\,1.192] (mean 1.0421.042 ).
  • A conjecture that the universal approximation ratio is at most 1.4=2ε01.4=\sqrt{2}-\varepsilon_0 for the explicit constant ε0=2750.0142>0\varepsilon_0=\sqrt{2}-\tfrac{7}{5}\approx 0.0142>0 . Since 1.4<21.4<\sqrt{2} , any proof of this conjecture places the algorithm strictly below the 2\sqrt{2} inapproximability barrier, directly implying P=NP\mathrm{P}=\mathrm{NP} .
  • An open-source implementation, Salvador v0.0.3.

Research Data

The algorithm is implemented in the Python package Salvador, freely available on PyPI. The current version is v0.0.3, hosted at https://github.com/frankvegadelgado/salvador, with reproducible capsule at https://pypi.org/project/salvador/. It is released under the MIT License, uses git for versioning, and requires Python \geq 3.12 with NetworkX \geq 3.0.


Description of the Algorithm

Main Algorithm

FindVertexCover (G)(G) : After cleaning the graph it solves each connected component through the bipartite planar auxiliary reduction and then applies the pruning pass to the union of the decoded component covers.

Require: Undirected graph G=(V,E)G=(V,E)

Ensure: A vertex cover CVC\subseteq V

  1. Remove all self-loops from GG .
  2. GGvV:degG(v)=0G \gets G \setminus {v\in V : \deg_G(v)=0} .
  3. If E=E=\emptyset , return \emptyset .
  4. CC\gets\emptyset .
  5. For each connected component GiG_i of GG : set CiComponentCover(Gi)C_i\gets\textit{ComponentCover}(G_i) and CCCiC\gets C\cup C_i .
  6. Set adjvNG(v):vV\textit{adj}\gets{v\mapsto N_G(v):v\in V} .
  7. CPruneRedundantVertices(adj,C)C\gets\textit{PruneRedundantVertices}(\textit{adj},C) .
  8. Return CC .

Bipartite Planar Gadget and Weighted Solve

ComponentCover (Gi)(G_i) : For each original edge u,v{u,v} , creates two oriented-incidence nodes xuvx_{uv} and xvux_{vu} with weights 1/degG(u)1/\deg_G(u) and 1/degG(v)1/\deg_G(v) . The forcing edge xuv,xvu{x_{uv},x_{vu}} encodes the original edge; local cyclic edges couple consecutive oriented incidences. The implementation verifies that the resulting auxiliary graph is bipartite and planar before calling the exact weighted bipartite vertex-cover solver.

Require: Connected component Gi=(Vi,Ei)G_i=(V_i,E_i)

Ensure: A vertex cover CiViC_i\subseteq V_i

  1. BB\gets new graph; WW\gets mutable copy of GiG_i .
  2. For each uViu\in V_i : let (v1,,vd)=NW(u)(v_1,\ldots,v_d)=N_W(u) ; remove uu from WW ; for j=1,,dj=1,\ldots,d : add nodes xuvjx_{uv_j} and xvjux_{v_j u} to BB with weights w(xuvj)=1/degGi(u)w(x_{uv_j})=1/\deg_{G_i}(u) and w(xvju)=1/degGi(vj)w(x_{v_j u})=1/\deg_{G_i}(v_j) ; add forcing edge xuvj,xvju{x_{uv_j},x_{v_j u}} ; add local cyclic edges; if d>1d>1 add closing local edge.
  3. If BB is not bipartite or not planar, raise reduction failure.
  4. SMinWeightVertexCoverBipartite(B)S\gets\textit{MinWeightVertexCoverBipartite}(B) .
  5. CiuVi:xuvS for some auxiliary node xuvC_i\gets{u\in V_i: x_{uv}\in S\text{ for some auxiliary node }x_{uv}} .
  6. Return CiC_i .

The weighted solve is exact on BB : the vertex-cover linear program is integral on bipartite graphs by König's theorem and the max-flow/min-cut theorem. The selected auxiliary cover is decoded by first coordinate. The forcing edge xuv,xvu{x_{uv},x_{vu}} is the key correctness device: covering it forces the decoded solution to contain uu or vv .

Linear-Time Pruning

PruneRedundantVertices (adj,C)(\textit{adj},C) : removes redundant vertices from a candidate cover in a single forward pass, running in O(n+m)O(n+m) total time.

Require: Adjacency map adj\textit{adj} , candidate cover CVC\subseteq V

Ensure: Valid vertex cover CCC'\subseteq C with no redundant vertex

  1. CC\gets mutable copy of CC .
  2. For each vCv\in C : if NG(v)CN_G(v)\subseteq C , set CCvC\gets C\setminus{v} .
  3. Return CC .

Correctness

Theorem 1 (Validity). FindVertexCover always returns a valid vertex cover of the input graph GG .

Proof. Isolated vertices are removed in preprocessing and appear in no edge. For a connected component GiG_i and any original edge u,vEi{u,v}\in E_i , ComponentCover adds the forcing edge xuv,xvu{x_{uv},x_{vu}} to the auxiliary graph BB . The set SS returned by MinWeightVertexCoverBipartite is a vertex cover of BB , so at least one of xuvx_{uv} and xvux_{vu} lies in SS ; decoding by first coordinate therefore places uu or vv in CiC_i . Distinct connected components have no edges between them, so C=iCiC=\bigcup_i C_i covers all edges. Finally, a vertex vv is pruned only when every neighbour of vv lies in CC , so every edge incident to vv stays covered by the other endpoint. \square


Approximation Ratio and Consequences for P vs NP

Let τ(G)\tau(G) be the optimum vertex-cover size and let CS(G)C_S(G) be the cover returned by FindVertexCover. The approximation ratio studied in this paper is:

RS(G)=CS(G)τ(G). R_S(G)=\frac{|C_S(G)|}{\tau(G)}.

For the logged DIMACS clique-complement instances, τ(G)\tau(G) is computed as τ=nω(G0)\tau^*=n-\omega(G_0) from the published clique number of the original DIMACS graph G0G_0 ; rows marked \dagger use a best-known clique lower bound and therefore give a displayed reference ratio rather than a certified ratio.

Proposition ( 1.4=2ε01.4=\sqrt{2}-\varepsilon_0 ). The value 75=1.4\tfrac{7}{5}=1.4 satisfies 75=2ε0\tfrac{7}{5}=\sqrt{2}-\varepsilon_0 for the positive constant ε0=2750.0142>0\varepsilon_0=\sqrt{2}-\tfrac{7}{5}\approx 0.0142>0 . In particular, 1.4<21.4<\sqrt{2} .

Conjecture 1 (Universal 1.41.4 -approximation). For every undirected graph GG , the Salvador bipartite planar reduction succeeds and FindVertexCover returns a vertex cover CS(G)C_S(G) with:

CS(G)75τ(G)=(2ε0)τ(G),ε0=275>0. |C_S(G)|\leq\tfrac{7}{5}\,\tau(G)=(\sqrt{2}-\varepsilon_0)\,\tau(G),\qquad \varepsilon_0=\sqrt{2}-\tfrac{7}{5}>0.

Since 1.4<21.4<\sqrt{2} , a proof of Conjecture 1 would give a polynomial-time approximation ratio strictly below the 2-to-2 games hardness threshold and would imply P=NP\mathrm{P}=\mathrm{NP} .

Theorem 2 (Implication for P\mathrm{P} vs NP\mathrm{NP} ). If FindVertexCover achieves an approximation ratio at most 1.4=2ε01.4=\sqrt{2}-\varepsilon_0 on every undirected graph, then P=NP\mathrm{P}=\mathrm{NP} .

Proof. Khot, Minzer, and Safra proved the 2-to-2 games conjecture and derived that, for every δ>0\delta>0 , approximating Minimum Vertex Cover within a factor of (2δ)(\sqrt{2}-\delta) is NP-hard. Since 1.4=2ε01.4=\sqrt{2}-\varepsilon_0 with ε0=275>0\varepsilon_0=\sqrt{2}-\tfrac{7}{5}>0 , the ratio 1.41.4 is exactly of the form 2δ\sqrt{2}-\delta for δ=ε0\delta=\varepsilon_0 . A universal ratio 1.4\leq 1.4 by this polynomial-time algorithm therefore constitutes a polynomial-time (2ε0)(\sqrt{2}-\varepsilon_0) -approximation, contradicting the 2-to-2 games NP-hardness result unless P=NP\mathrm{P}=\mathrm{NP} . \square

Across the current NPBench run, the largest displayed Salvador/reference ratio is 1.1921.192 , below the conjectured 1.41.4 threshold and below 21.414\sqrt{2}\approx1.414 . Three layers of evidence motivate the conjecture: (i) the correctness theorem establishes feasibility of every returned cover; (ii) on every logged instance the displayed ratio lies in [1.000,1.192][1,1.4)[1.000,\,1.192]\subset[1,\,1.4) , with the maximum already below the weaker Dinur–Safra threshold 105211.360610\sqrt{5}-21\approx 1.3606 ; (iii) the explicit constant ε00.0142\varepsilon_0\approx 0.0142 makes the conjecture algebraically precise rather than asymptotic.


Runtime Analysis

Theorem 3 (Time complexity). Let G=(V,E)G=(V,E) be the input graph and let BB be the disjoint union of the bipartite planar auxiliary graphs constructed across all connected components. Put n^=V(B)\hat{n}=|V(B)| and m^=E(B)\hat{m}=|E(B)| . FindVertexCover runs in:

O(V+E+m^n^) O\bigl(|V|+|E|+\hat{m}\sqrt{\hat{n}}\bigr)

time. Since BB is planar, m^=O(n^)\hat{m}=O(\hat{n}) , so the auxiliary solve is O(n^3/2)O(\hat{n}^{3/2}) . Writing nn for the reduced auxiliary size gives the advertised O(n3/2)O(n^{3/2}) bound.

Proof sketch. Preprocessing removes self-loops and isolated vertices in O(V+E)O(|V|+|E|) time. Constructing the oriented-incidence auxiliary graph creates two incidence nodes per original edge and a linear number of forcing and local gadget edges: O(V+E+n^+m^)O(|V|+|E|+\hat{n}+\hat{m}) time. Bipartiteness and planarity checks are linear in the auxiliary size. The weighted vertex-cover subproblem on BB is solved by a matching/flow routine with O(m^n^)O(\hat{m}\sqrt{\hat{n}}) worst-case time. Decoding and pruning scan incidence and adjacency lists once. Planarity gives m^=O(n^)\hat{m}=O(\hat{n}) , making the dominant term O(n^3/2)O(\hat{n}^{3/2}) . \square


Experimental Study

We evaluated Salvador v0.0.3 on the NPBench vertex-cover clique-complement collection. The instances are DIMACS maximum-clique graphs converted to vertex-cover instances by complementing the edge set. A clique of size ω\omega in the original graph G0G_0 is an independent set of size ω\omega in the complement G=G0G=\overline{G_0} , so the vertex-cover reference optimum is:

τ=nω. \tau^* = n - \omega.

The command used for the logged run was python -m salvador.batch -i ./experiments/ -c -l (equivalently batch_vega -i ./experiments/ -c -l). The log is stored as experiments/np_bench.txt and contains 52 completed runs. Because the log does not contain a competing baseline, only values directly recoverable from the log and the DIMACS reference optimum are reported. Rows marked \dagger use a best-known clique lower bound, so the displayed τ\tau^* is an upper bound on the true optimum.

Instance τ\tau^* CS\lvert C_S\rvert CS/τ\lvert C_S\rvert/\tau^* Parse (ms) Alg. (ms)
brock200_1 179 184 1.028 17.1 4192.0
brock200_2 188 193 1.027 24.7 15255.0
brock200_3 185 191 1.032 31.2 8296.4
brock200_4 183 189 1.033 22.3 7264.8
brock400_1 373 385 1.032 57.1 38880.3
brock400_2 371 384 1.035 70.0 42005.5
brock400_3 369 383 1.038 60.0 44912.0
brock400_4 367 384 1.046 75.5 45628.1
c-fat200-1 188 188 1.000 116.4 66467.3
c-fat200-2 176 177 1.006 73.8 85894.9
c-fat200-5 142 142 1.000 52.9 27365.7
C1000.9 \leq 932 ^\dagger 951 1.020 166.9 143665.8
C125.9 91 94 1.033 22.5 257.3
C250.9 206 214 1.039 25.7 3466.8
C500.9 \leq 443 ^\dagger 464 1.047 60.0 13496.0
gen200_p0.9_44 156 170 1.090 26.9 1187.4
gen200_p0.9_55 145 169 1.166 24.4 868.3
gen400_p0.9_55 345 366 1.061 32.2 8786.5
gen400_p0.9_65 335 362 1.081 63.6 8631.0
gen400_p0.9_75 325 360 1.108 37.0 10609.0
hamming10-2 512 512 1.000 61.8 3036.9
hamming6-2 32 32 1.000 33.1 333.2
hamming6-4 60 62 1.033 16.2 345.8
hamming8-2 128 128 1.000 0.0 319.5
hamming8-4 240 240 1.000 142.7 6647.7
johnson16-2-4 112 112 1.000 22.3 779.3
johnson32-2-4 480 480 1.000 61.5 7749.2
johnson8-2-4 24 24 1.000 0.0 23.4
johnson8-4-4 56 56 1.000 9.2 117.4
keller4 160 164 1.025 14.1 4494.3
MANN_a27 252 253 1.004 2.6 294.7
MANN_a45 690 703 1.019 0.0 473.3
MANN_a81 2221 2225 1.002 15.9 2834.3
MANN_a9 29 29 1.000 0.0 16.1
p_hat300-1 292 294 1.007 80.2 85573.4
p_hat300-2 275 285 1.036 180.1 55464.7
p_hat300-3 264 276 1.045 94.6 15246.7
p_hat500-3 \leq 450 ^\dagger 471 1.047 165.5 78912.2
san200_0.7_1 170 185 1.088 50.2 9117.5
san200_0.7_2 182 188 1.033 25.4 5512.1
san200_0.9_1 130 155 1.192 7.1 1774.6
san200_0.9_2 140 163 1.164 15.2 1736.4
san200_0.9_3 156 171 1.096 10.0 2013.5
san400_0.5_1 387 393 1.016 232.8 135755.1
san400_0.7_1 360 380 1.056 142.1 69143.0
san400_0.7_2 370 384 1.038 86.6 59137.5
san400_0.7_3 378 388 1.026 72.5 40035.9
san400_0.9_1 300 350 1.167 36.8 10906.9
sanr200_0.7 182 188 1.033 12.2 6429.0
sanr200_0.9 158 171 1.082 0.0 985.9
sanr400_0.5 387 391 1.010 97.7 141881.3
sanr400_0.7 379 386 1.018 178.4 84044.0

Across the 52 completed runs, Salvador has mean displayed ratio CS/τ=1.042|C_S|/\tau^*=1.042 , minimum 1.0001.000 , and maximum 1.1921.192 on san200_0.9_1. Eleven instances are solved exactly against the displayed reference values. Every empirical Salvador ratio lies below both the 2\sqrt{2} inapproximability threshold and the conjectured bound of 1.41.4 .

Algorithm time totals 1408.2651408.265 s across the 52 instances, with mean 27082.027082.0 ms, median 8022.88022.8 ms, minimum 16.116.1 ms, and maximum 143665.8143665.8 ms on C1000.9. The Hamming, Johnson, c-fat, and MANN families show the strongest solution-quality behaviour. The largest displayed ratio is 1.1921.192 on san200_0.9_1.


Conclusion

The Salvador algorithm combines a bipartite planar oriented-incidence gadget, an exact weighted vertex-cover solve on the auxiliary graph, decoding by first coordinate, and a linear-time redundancy-pruning step. The algorithm always returns a valid vertex cover and runs in O(n^3/2)O(\hat{n}^{3/2}) time on the reduced planar auxiliary instance.

Across the 52 completed NPBench clique-complement runs, the displayed approximation ratio ranges from 1.0001.000 to 1.1921.192 with a mean of 1.0421.042 . The central observation is that 1.4=2ε01.4=\sqrt{2}-\varepsilon_0 for the explicit positive constant ε0=2750.0142\varepsilon_0=\sqrt{2}-\tfrac{7}{5}\approx 0.0142 , so the conjectured ratio lies strictly below the 2\sqrt{2} inapproximability threshold. By Theorem 2, any proof of a universal Salvador ratio at most 1.41.4 would imply P=NP\mathrm{P}=\mathrm{NP} . The implementation is freely available as the Salvador package (v0.0.3) on PyPI.

Top comments (0)