An Approximate Solution to the Minimum Vertex Cover Problem: The Salvador Algorithm
Frank Vega
Information Physics Institute, 840 W 67th St, Hialeah, FL 33012, USA
vega.frank@gmail.com
Abstract
The Minimum Vertex Cover problem is NP-hard. The best known polynomial-time approximation ratio is , matching the Unique Games Conjecture lower bound of ; without that conjecture, 2-to-2 games hardness gives inapproximability within under . We present Salvador v0.0.2, an approximate vertex-cover solver whose current implementation builds a linear-size spanning-forest planar core, reduces it to a weighted Minimum Independent Dominating Set gadget, solves that gadget by a linear weighted independent-dominating-set pass, repairs all non-core edges, and prunes redundant vertices. The algorithm runs in time and always returns a valid vertex cover. On 68 NPBench clique-complement instances, Salvador's empirical ratio ranges from to with mean ; using the NetworkX 2-approximation baseline gives an a posteriori certified upper bound no larger than . We conjecture that the approximation ratio is universally bounded by . Since for , proving the conjecture would place the algorithm strictly below the inapproximability threshold.
Introduction
Given an undirected graph , a set is a vertex cover if every edge has at least one endpoint in . The goal is to minimize ; the optimum is denoted .
The current Salvador implementation differs from earlier drafts that used repeated planarity tests and maximal-planar-subgraph extraction. The public routine find_vertex_cover now runs through four linear phases:
- Preprocessing. Remove self-loops and isolated vertices.
- Linear planar core. Build a spanning forest with Union-Find in one pass over . The rejected cycle-closing edges are stored for repair.
- Weighted MIDS gadget and linear IDS pass. Build a planar weighted MIDS gadget on , solve it by a linear cheap-first maximal independent set pass, decode candidate cover vertices, and repair residual core edges.
- Original-edge repair and pruning. Scan all removed edges once, add the higher-degree endpoint if still uncovered, and prune redundant cover vertices in a single adjacency pass.
The main contributions are:
- A practical vertex-cover algorithm based on a linear spanning-forest planar core and one repair scan.
- A weighted MIDS gadget for the planar core, preserving the original vertex-cover encoding while avoiding repeated planarity augmentation.
- A current-code experimental study on 68 NPBench clique-complement instances with empirical ratios in and mean .
- An a posteriori certified upper-bound column computed from the NetworkX 2-approximation baseline.
- An open-source implementation, Salvador v0.0.2.
Current Python Implementation
Main Entry Point (algorithm.py)
import itertools
from . import utils
import networkx as nx
from . import vc_reduction
def prune_redundant_vertices(adj, C):
"""
Linear-time single-pass removal of redundant vertices.
For every v in C we check whether all its neighbors are still in the
current cover. If yes, we safely remove it immediately.
"""
C = set(C)
for v in list(C):
all_neighbors_covered = True
for u in adj.get(v, []):
if u not in C:
all_neighbors_covered = False
break
if all_neighbors_covered:
C.remove(v)
return C
def find_vertex_cover(graph, epsilon: float = 0.1):
G = graph.copy()
G.remove_edges_from(nx.selfloop_edges(G))
G.remove_nodes_from(list(nx.isolates(G)))
if G.number_of_edges() == 0:
return set()
# Linear forest-core MIDS reduction with one repair scan.
cover, _ = vc_reduction.solve_vc(G, epsilon)
adj = {v: set(G[v]) for v in G}
return prune_redundant_vertices(adj, cover)
Linear Planar Core and MIDS Reduction (vc_reduction.py)
For a core graph with vertices, the gadget has a penalty . For every vertex , it adds of weight and of weight ; for every core edge , it adds a hub of weight adjacent to and .
def reduce_vc_to_mids(G: nx.Graph, epsilon: float = 0.1,
assume_planar: bool = False):
if not assume_planar and not nx.is_planar(G):
raise ValueError("reduce_vc_to_mids requires a planar graph.")
N = G.number_of_nodes()
P = N + 1
H = nx.Graph(); weights = {}
for v in G.nodes():
H.add_edge((v, 0), (v, 1))
weights[(v, 0)] = 1
weights[(v, 1)] = 0
for u, v in G.edges():
hub = ('h', u, v); weights[hub] = P
H.add_edge((u, 0), hub); H.add_edge((v, 0), hub)
return H, weights, P
def _spanning_forest_planar_core(G: nx.Graph):
H = nx.Graph()
H.add_nodes_from(G.nodes())
parent = {v: v for v in G.nodes()}
rank = {v: 0 for v in G.nodes()}
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(u, v):
ru, rv = find(u), find(v)
if ru == rv:
return False
if rank[ru] < rank[rv]:
ru, rv = rv, ru
parent[rv] = ru
if rank[ru] == rank[rv]:
rank[ru] += 1
return True
removed = []
for u, v in G.edges():
if union(u, v):
H.add_edge(u, v)
else:
removed.append((u, v))
return H, removed
def solve_vc(G: nx.Graph, epsilon: float = 0.1):
if G.number_of_edges() == 0:
return frozenset(), 0.0
G_p, removed = _spanning_forest_planar_core(G)
cover = _solve_planar(G_p, G, epsilon)
for u, v in removed:
if u not in cover and v not in cover:
cover.add(u if G.degree(u) >= G.degree(v) else v)
return frozenset(cover), float(len(cover))
Linear Weighted IDS Pass (baker_ptas.py)
The current default weighted IDS solver is a linear cheap-first maximal independent set pass. A maximal independent set is also an independent dominating set, and the Salvador gadget uses only the weight classes
,
, and
, so bucket ordering avoids sorting.
def greedy_maximal_is_weighted(adj, vertices, weights):
zeros, ones, heavy = [], [], []
for v in vertices:
w = weights.get(v, 1)
if w == 0:
zeros.append(v)
elif w == 1:
ones.append(v)
else:
heavy.append(v)
vlist = zeros + ones + heavy
selected = set(); excluded = set()
for v in vlist:
if v not in excluded:
selected.add(v); excluded.add(v)
excluded.update(adj.get(v, ()))
return frozenset(selected)
def baker_ptas_ids_weighted(adj, weights=None, epsilon=0.5):
vertices = set(adj.keys())
if not vertices:
return frozenset()
if len(vertices) == 1:
return frozenset(vertices)
if weights is None:
weights = {}
return greedy_maximal_is_weighted(adj, vertices, weights)
The older tree-decomposition Baker PTAS code remains in the module after this early return for experimentation and backward comparison, but it is not the default path used by find_vertex_cover.
Correctness
Theorem 1 (Validity). find_vertex_cover always returns a valid vertex cover of the input graph
.
Proof sketch. Preprocessing removes only self-loops and isolated vertices. The spanning forest contains all vertices and a subset of original edges. The MIDS decode plus residual core repair covers every edge in . Every removed edge is then scanned: if neither endpoint is already in the cover, one endpoint is added. Therefore all original edges are covered. Finally, pruning removes a vertex only when every neighbor of remains in the cover, so every edge incident to stays covered by the other endpoint. ?
Approximation Ratio and Consequences for P vs NP
Let be the optimum vertex-cover size. The present implementation has a direct validity guarantee and an empirical approximation profile. For each NPBench clique-complement graph we compute the reference optimum as , where is the DIMACS maximum-clique value for the original graph.
The central conjecture remains:
Conjecture 1 (Universal
-approximation). For every undirected graph
, find_vertex_cover returns a cover
satisfying
Since , a proof of Conjecture 1 would give a polynomial-time approximation ratio strictly below the 2-to-2 games hardness threshold and would imply .
Across the current NPBench run, the largest displayed Salvador/reference ratio is , below the conjectured threshold and below .
Runtime Analysis
Theorem 2 (Time complexity). The current implementation of find_vertex_cover runs in
time, where
and
.
Proof sketch.
- Preprocessing scans vertices and edges: .
-
_spanning_forest_planar_coreperforms one Union-Find pass over : , absorbing the inverse-Ackermann factor into the linear bound. - The gadget is built on a forest, so it has vertices and edges.
- The weighted IDS pass buckets vertices by weight and scans adjacency once through the maximal independent set construction: on the forest gadget.
- Repair scans each removed edge once: .
- Pruning scans adjacency around candidate cover vertices: .
Therefore the total running time is .
Experimental Study
We evaluated Salvador v0.0.2 on the NPBench vertex-cover clique-complement collection. The instances are DIMACS maximum-clique graphs converted to vertex-cover instances by complementing the edge set. A clique of size in the original graph becomes an independent set of size in the complement, so the vertex-cover reference optimum is:
The logged run is stored as experiments/np_bench.txt and was produced with:
python -m salvador.batch -i ./experiments/ -c -a -l
or equivalently:
batch_vega -i ./experiments/ -c -a -l
The baseline is NetworkX min_weighted_vertex_cover, a Bar-Yehuda--Even local-ratio 2-approximation. We report both empirical ratios against the DIMACS reference optimum and the certified a posteriori Salvador bound:
where is Salvador's cover and is NetworkX's 2-approximation cover.
For example, on brock200_2.clq-compliment.txt, NetworkX computes a cover of size
in
ms, Salvador computes a cover of size
in
ms, and the certified upper bound is:
Rows marked dagger use a best-known clique lower bound, so the displayed vertex-cover optimum is an upper bound on the true optimum.
| Instance | Opt VC | NetworkX VC | NetworkX/Opt | Salvador VC | Salvador/Opt | Certified upper bound | NetworkX ms | Salvador ms |
|---|---|---|---|---|---|---|---|---|
brock200_1 |
179 | 197 | 1.101 | 184 | 1.028 | 1.868 | 0.000 | 21.115 |
brock200_2 |
188 | 199 | 1.059 | 192 | 1.021 | 1.930 | 15.690 | 21.458 |
brock200_3 |
185 | 197 | 1.065 | 190 | 1.027 | 1.929 | 0.998 | 29.073 |
brock200_4 |
183 | 198 | 1.082 | 187 | 1.022 | 1.889 | 3.405 | 37.961 |
brock400_1 |
373 | 397 | 1.064 | 382 | 1.024 | 1.924 | 0.000 | 66.091 |
brock400_2 |
371 | 396 | 1.067 | 381 | 1.027 | 1.924 | 3.232 | 56.858 |
brock400_3 |
369 | 396 | 1.073 | 381 | 1.033 | 1.924 | 0.000 | 69.879 |
brock400_4 |
367 | 398 | 1.084 | 383 | 1.044 | 1.925 | 15.741 | 69.259 |
brock800_1 |
777 | 798 | 1.027 | 785 | 1.010 | 1.967 | 17.816 | 390.844 |
brock800_2 |
776 | 798 | 1.028 | 785 | 1.012 | 1.967 | 10.028 | 306.896 |
brock800_3 |
775 | 796 | 1.027 | 785 | 1.013 | 1.972 | 11.772 | 303.142 |
brock800_4 |
774 | 798 | 1.031 | 784 | 1.013 | 1.965 | 15.647 | 309.083 |
c-fat200-1 |
188 | 198 | 1.053 | 188 | 1.000 | 1.899 | 2.913 | 43.041 |
c-fat200-2 |
176 | 198 | 1.125 | 176 | 1.000 | 1.778 | 1.978 | 38.133 |
c-fat200-5 |
142 | 199 | 1.401 | 142 | 1.000 | 1.427 | 2.356 | 27.579 |
c-fat500-1 |
486 | 498 | 1.025 | 486 | 1.000 | 1.952 | 11.976 | 274.073 |
c-fat500-10 |
374 | 499 | 1.334 | 374 | 1.000 | 1.499 | 0.000 | 198.928 |
c-fat500-2 |
474 | 498 | 1.051 | 474 | 1.000 | 1.904 | 15.514 | 362.902 |
c-fat500-5 |
436 | 499 | 1.144 | 436 | 1.000 | 1.747 | 5.511 | 255.697 |
C1000.9 |
<= 932 dagger | 987 | 1.059 | 953 | 1.023 | 1.931 | 0.000 | 150.017 |
C125.9 |
91 | 115 | 1.264 | 94 | 1.033 | 1.635 | 0.000 | 6.151 |
C250.9 |
206 | 241 | 1.170 | 214 | 1.039 | 1.776 | 0.998 | 4.281 |
C500.9 |
<= 443 dagger | 489 | 1.104 | 457 | 1.032 | 1.869 | 0.000 | 46.692 |
gen200_p0.9_44 |
156 | 192 | 1.231 | 167 | 1.071 | 1.740 | 0.000 | 15.628 |
gen200_p0.9_55 |
145 | 187 | 1.290 | 165 | 1.138 | 1.765 | 0.000 | 0.000 |
gen400_p0.9_55 |
345 | 388 | 1.125 | 359 | 1.041 | 1.851 | 3.732 | 19.340 |
gen400_p0.9_65 |
335 | 387 | 1.155 | 361 | 1.078 | 1.866 | 2.246 | 26.162 |
gen400_p0.9_75 |
325 | 385 | 1.185 | 356 | 1.095 | 1.849 | 3.169 | 23.254 |
hamming10-2 |
512 | 1023 | 1.998 | 512 | 1.000 | 1.001 | 1.073 | 46.221 |
hamming10-4 |
984 | 1023 | 1.040 | 996 | 1.012 | 1.947 | 11.849 | 258.114 |
hamming6-2 |
32 | 63 | 1.969 | 32 | 1.000 | 1.016 | 0.000 | 1.006 |
hamming6-4 |
60 | 63 | 1.050 | 62 | 1.033 | 1.968 | 0.000 | 2.023 |
hamming8-2 |
128 | 255 | 1.992 | 128 | 1.000 | 1.004 | 1.006 | 8.173 |
hamming8-4 |
240 | 255 | 1.062 | 240 | 1.000 | 1.882 | 2.248 | 28.899 |
johnson16-2-4 |
112 | 119 | 1.062 | 112 | 1.000 | 1.882 | 1.128 | 4.361 |
johnson32-2-4 |
480 | 495 | 1.031 | 480 | 1.000 | 1.939 | 0.000 | 49.459 |
johnson8-2-4 |
24 | 27 | 1.125 | 24 | 1.000 | 1.778 | 0.000 | 0.000 |
johnson8-4-4 |
56 | 69 | 1.232 | 56 | 1.000 | 1.623 | 0.999 | 5.097 |
keller4 |
160 | 170 | 1.062 | 162 | 1.012 | 1.906 | 1.743 | 59.447 |
keller5 |
749 | 775 | 1.035 | 758 | 1.012 | 1.956 | 16.164 | 183.929 |
MANN_a27 |
252 | 270 | 1.071 | 253 | 1.004 | 1.874 | 0.000 | 8.712 |
MANN_a45 |
690 | 735 | 1.065 | 695 | 1.007 | 1.891 | 0.986 | 23.164 |
MANN_a81 |
2221 | 2268 | 1.021 | 2225 | 1.002 | 1.962 | 3.769 | 79.834 |
MANN_a9 |
29 | 36 | 1.241 | 29 | 1.000 | 1.611 | 0.000 | 0.000 |
p_hat1000-3 |
<= 932 dagger | 991 | 1.063 | 943 | 1.012 | 1.903 | 19.721 | 351.907 |
p_hat300-1 |
292 | 296 | 1.014 | 294 | 1.007 | 1.986 | 0.000 | 83.977 |
p_hat300-2 |
275 | 296 | 1.076 | 278 | 1.011 | 1.878 | 0.000 | 64.064 |
p_hat300-3 |
264 | 293 | 1.110 | 269 | 1.019 | 1.836 | 2.261 | 31.998 |
p_hat500-1 |
491 | 499 | 1.016 | 494 | 1.006 | 1.980 | 16.023 | 222.256 |
p_hat500-2 |
464 | 495 | 1.067 | 471 | 1.015 | 1.903 | 0.000 | 167.353 |
p_hat500-3 |
<= 450 dagger | 489 | 1.087 | 459 | 1.020 | 1.877 | 5.095 | 89.239 |
p_hat700-1 |
689 | 699 | 1.015 | 693 | 1.006 | 1.983 | 31.294 | 453.505 |
p_hat700-2 |
<= 656 dagger | 698 | 1.064 | 662 | 1.009 | 1.897 | 15.850 | 321.422 |
p_hat700-3 |
638 | 692 | 1.085 | 645 | 1.011 | 1.864 | 17.240 | 173.352 |
san200_0.7_1 |
170 | 194 | 1.141 | 184 | 1.082 | 1.897 | 0.000 | 22.830 |
san200_0.7_2 |
182 | 190 | 1.044 | 188 | 1.033 | 1.979 | 0.000 | 21.728 |
san200_0.9_1 |
130 | 176 | 1.354 | 155 | 1.192 | 1.761 | 0.000 | 11.270 |
san200_0.9_2 |
140 | 182 | 1.300 | 164 | 1.171 | 1.802 | 0.000 | 10.641 |
san200_0.9_3 |
156 | 190 | 1.218 | 171 | 1.096 | 1.800 | 0.000 | 0.000 |
san400_0.5_1 |
387 | 396 | 1.023 | 393 | 1.016 | 1.985 | 0.000 | 100.968 |
san400_0.7_1 |
360 | 394 | 1.094 | 379 | 1.053 | 1.924 | 0.000 | 72.765 |
san400_0.7_2 |
370 | 393 | 1.062 | 385 | 1.041 | 1.959 | 0.000 | 73.016 |
san400_0.7_3 |
378 | 396 | 1.048 | 388 | 1.026 | 1.960 | 0.000 | 75.971 |
san400_0.9_1 |
300 | 381 | 1.270 | 359 | 1.197 | 1.885 | 1.976 | 15.149 |
sanr200_0.7 |
182 | 197 | 1.082 | 186 | 1.022 | 1.888 | 0.000 | 28.249 |
sanr200_0.9 |
158 | 187 | 1.184 | 166 | 1.051 | 1.775 | 0.000 | 14.531 |
sanr400_0.5 |
387 | 399 | 1.031 | 389 | 1.005 | 1.950 | 0.000 | 99.775 |
sanr400_0.7 |
379 | 396 | 1.045 | 385 | 1.016 | 1.944 | 20.387 | 67.288 |
Summary over the 68 logged NPBench instances. Salvador's mean Salvador/Opt ratio is
, with minimum
and maximum
. The mean NetworkX baseline ratio is
. The mean certified Salvador upper bound
is
, with maximum
.
References
- Bar-Yehuda, R., and Even, S. (1985). "A local-ratio theorem for approximating the weighted vertex cover problem." Annals of Discrete Mathematics, 25, 27-46. http://www.cs.technion.ac.il/~reuven/PDF/vc_lr.pdf
- Baker, B. S. (1994). "Approximation algorithms for NP-complete problems on planar graphs." Journal of the ACM, 41(1), 153-180.
- Mascia, F. (2015). The Maximum Clique Problem -- DIMACS Benchmark Set. https://iridia.ulb.ac.be/~fmascia/maximum_clique/DIMACS-benchmark
- Nguyen, T. V., and Bui, T. NP-Complete Benchmark Instances. https://roars.dev/npbench/
- NetworkX documentation for
min_weighted_vertex_cover: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.approximation.vertex_cover.min_weighted_vertex_cover.html - Salvador package on PyPI: https://pypi.org/project/salvador/
Conclusion
The current Salvador v0.0.2 implementation is a linear-time approximate vertex-cover solver built around a spanning-forest planar core, a weighted MIDS gadget, a linear IDS pass, one repair scan, and one pruning scan. It always returns a valid vertex cover, runs in time, and on the logged NPBench experiment improves the NetworkX 2-approximation cover size on every instance while producing mean empirical ratio against the DIMACS reference optima.
Top comments (0)