How I Built a Route Optimizer Within 0.13% of LKH-3 (Rust + API)
I spent months trying to match the best TSP solver in the world. Here's the story of how I got within 0.13%.
The Challenge
LKH-3, written by Keld Helsgaun, is the solver everyone measures against. It's been the gold standard for the Travelling Salesman Problem for over two decades. Researchers have spent entire careers trying to beat it.
I didn't set out to beat it. I wanted to see how close a clean, modern Rust implementation could get — and then make it accessible as a simple API call.
Where I Ended Up
| Instance | Cities | LKH-3 | Lambda-G | Gap |
|---|---|---|---|---|
| kroA100 | 100 | 21,907 | 21,908 | 0.005% |
| pr1002 | 1,002 | 259,045 | 261,921 | 1.11% |
| shared_benchmark | 1,000 | 23,342 | 23,373 | 0.13% |
On kroA100, Lambda-G finds essentially the exact same tour as LKH-3. On 1,000 cities, we're within 0.13%. And it beats Google OR-Tools on the same benchmark.
All tests: 60 second time limit, same hardware.
Try it live — click to place cities and watch the optimization happen →
The Wall I Hit
My first 8 versions were mediocre. I was getting 3-5% gaps on 1,000-city problems — respectable, but nowhere near LKH-3. The algorithm was correct. The code was clean. But something was off.
I profiled the code. The bottleneck wasn't the search strategy — it was the inner loop overhead. The solver was spending too much time on bookkeeping and not enough on actual search.
I was running out of iterations long before I was running out of good moves to find.
The Breakthrough
The fix came from careful profiling and micro-optimizations. No single silver bullet — just relentless attention to what the CPU was actually doing.
Same time budget, but significantly more iterations. And more iterations means finding better solutions.
This dropped my gap from 3-5% to under 1%. The rest of the improvement came from tuning the perturbation strategy.
The Algorithm
Lambda-G uses Iterated Local Search — conceptually simple, but the details matter:
1. Build initial tour (nearest-neighbour + greedy patching)
2. Local search: 2-opt + Or-opt until no improvement
3. Perturb: double-bridge (4-opt, non-sequential reconnection)
4. Accept or reject perturbation
5. Repeat until time limit
Why Double-Bridge Matters
2-opt and Or-opt are greedy — they always improve the tour. But they get stuck in local minima. You need a way to escape.
Double-bridge cuts the tour in 4 places and reconnects the segments in a way that no sequence of 2-opt moves can reach. It's like teleporting to a different region of the solution space.
After each perturbation, the local search kicks in again and optimizes from the new starting point. Sometimes it finds something better. Sometimes it doesn't. But over thousands of iterations, it converges toward the global optimum.
Candidate Lists
Checking every possible 2-opt swap is O(n²). For 10,000 cities, that's 100 million comparisons per iteration — way too slow.
The trick: precompute a set of nearest neighbours for each city. Only consider swaps involving these neighbours. This prunes the search space dramatically with almost no loss in solution quality, because good moves almost always involve nearby cities.
Parallelism
Multiple independent workers run with different random seeds. Best tour wins. Simple, embarrassingly parallel, and scales linearly with cores.
Beyond Delivery Trucks
Here's the thing most people don't realize: TSP isn't just about delivery routes. It's about optimal ordering of anything.
The algorithm doesn't care what your "cities" are:
- Logistics & Delivery → minimize fuel and driver hours
- Chip Design / VLSI → minimize total wire length between components
- DNA Sequencing → order fragments for genome assembly
- Warehouse Picking → minimize picker travel time
- CNC / Laser Cutting → minimize toolhead movement
- Telescope Scheduling → minimize slew time between targets
Send coordinates for Euclidean problems. Send a distance matrix for everything else — road networks, genomic distances, whatever.
Making It an API
I wanted this to be useful, not just a benchmark trophy. So I wrapped it in a REST API:
import httpx
result = httpx.post("https://api.bitsabhi.com/optimize",
headers={"Authorization": "Bearer YOUR_KEY"},
json={
"points": [[0,0], [100,0], [100,100], [0,100]],
"time_limit": 25.0
}
).json()
print(result["tour"]) # [0, 3, 1, 2]
print(result["length"]) # 341.42
print(result["improvement"]) # 17.6% vs nearest-neighbour
Response times:
- 100 cities → ~0.5 seconds
- 1,000 cities → ~3 seconds
- 10,000 cities → ~25 seconds
It also accepts CSV uploads and distance matrices. One endpoint, any format.
Distance Matrix (for non-Euclidean problems):
result = httpx.post("https://api.bitsabhi.com/optimize",
headers={"Authorization": "Bearer YOUR_KEY"},
json={
"matrix": [
[0, 29, 82, 46],
[29, 0, 55, 46],
[82, 55, 0, 68],
[46, 46, 68, 0]
],
"time_limit": 25.0
}
).json()
What I Learned
1. Profiling beats intuition. The bottleneck was never where I thought it was. The profiler doesn't lie.
2. Candidate lists are essential. Pruning the search space is what separates toy solvers from production solvers.
3. Distance matrices unlock everything. The moment you accept an arbitrary matrix instead of just coordinates, you go from "route optimizer" to "universal ordering optimizer." DNA sequencing, road networks, financial portfolio rebalancing — same API call.
4. Rust is worth the rewrite. Porting from Python to Rust took about a week. The speedup is permanent and compounds with every iteration of the solver. For compute-bound problems, the language matters.
5. Perturbation is everything. A perfect local search that gets stuck in local minima will lose to a mediocre local search with good perturbation. Double-bridge was the key to escaping local optima.
Pricing
No subscriptions. Pay once, use forever.
- Free: 100 cities, $0 forever
- Research: 500 cities, $0 (.edu email required)
- Builder: 1,000 cities, $49 lifetime
- Pro: 10,000 cities, $149 lifetime
Try it free → lambda-g-optimizer.netlify.app
What's Next
I'm exploring vehicle routing (multiple trucks with capacity constraints), time windows, and asymmetric TSP. If you have a real-world problem that needs optimal ordering, I'd love to hear about it — especially edge cases and weird domains I haven't thought of.
What are you solving that needs optimal ordering?
Abhishek Srivastava — ORCID — bitsabhi@gmail.com
Top comments (0)