My Ngrok URLs Got So Ugly, I Built My Own Tunneling Platform Instead

#architecture #django #webdev #discuss

Three days. That's how long I spent polishing a web app until it looked perfect on my own 15-inch laptop screen. Then I opened it on my phone to test camera permissions, and everything fell apart.

TL;DR: Ngrok's ugly URLs annoyed me enough that I built my own tunnel + URL shortener, with AI-generated slugs, a GraphQL backend, and real SSRF protections baked in. The backend's stable, the frontend's half-done, and I just opened the architecture up for review while I deal with exams. Feedback is welcome.

Local dev gets messy fast once a second device is involved.

My first fix was just sharing my local IP. Cute idea, terrible in practice — both devices need the same Wi-Fi, the firewall has opinions, and local HTTPS certs for camera/mic access break constantly.

So I moved to Ngrok. It worked, mostly, but the free tier kept dropping connections mid-test, and every link looked like garbage — a1b2-34-56.ngrok-free.app isn't something you want to paste into a Slack message or a client email. I didn't just want a tunnel. I wanted something I'd actually be okay sharing.

Around the same time, a classmate was grinding through a basic URL shortener for an assignment. Watching him build it, something clicked — why not combine a custom tunnel with a real URL platform, and throw some AI in to make the slugs less ugly too?

So that's what I did.

The Blueprint: Going Past Tutorial-Level Code

I didn't want another weekend clone of a YouTube project. I wanted something closer to how real systems get built, which meant splitting the control plane (the brains) from the data plane (the actual proxy traffic) inside one modular monolith.

| Layer | Technologies |
| --- | --- |
| Tunnel / Proxy | Django Channels, Daphne (ASGI), WebSockets, CLI Agent (Python & Node.js) |
| Frontend | Next.js, TailwindCSS, Shadcn/UI (in progress) |
| Backend | Django 5, Graphene GraphQL |
| Database | PostgreSQL (Neon) |
| Auth | JWT (PyJWT), bcrypt, token rotation |
| AI | Google Gemini |
| Analytics & Security | MaxMind GeoLite2, Google Safe Browsing API |

The Core Request Lifecycle

When someone requests a short URL or hits a proxied endpoint, the request moves through a chain of modular apps before it's done.

Engineering Highlights

Bi-directional WebSocket tunneling. The tunnel is a custom reverse proxy. A local CLI agent (there's a Python version and a Node.js version) opens a persistent WebSocket connection to a Django Channels server running on Daphne. HTTP requests flow down to the agent and responses flow back up over that same connection, and the agent reconnects automatically with exponential backoff when things drop.
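To make the reconnect behaviour concrete, here is a minimal sketch of capped exponential backoff. It is not the project's agent code: `backoff_delays`, `run_with_reconnect`, and the default values are hypothetical, and `connect` stands in for whatever opens the WebSocket.

```python
import time


def backoff_delays(base=1.0, factor=2.0, cap=30.0):
    """Yield reconnect delays forever: base, base*factor, ... never above cap."""
    delay = base
    while True:
        yield delay
        delay = min(cap, delay * factor)


def run_with_reconnect(connect, max_attempts=5, sleep=time.sleep, **backoff):
    """Call connect() until it succeeds, sleeping longer after each failure.

    `connect` is a placeholder for the code that opens the tunnel and is
    assumed to raise ConnectionError when the link cannot be established.
    """
    delays = backoff_delays(**backoff)
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(next(delays))
```

A real agent would likely add random jitter to each delay so that many agents don't all reconnect at the same instant after a server restart.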

Race-condition protection. Click counting uses Django's F() expressions for atomic updates, so analytics don't get skewed when redirects spike.
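In Django, an update like `Link.objects.filter(slug=slug).update(clicks=F("clicks") + 1)` pushes the increment into a single SQL `UPDATE`, so there's no read-modify-write gap in Python. The sketch below shows the shape of that statement using the standard-library `sqlite3` module so it runs without a Django project. The `Link` model, the `links` table, and the helper names are all hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory table with one demo link (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links (slug TEXT PRIMARY KEY, clicks INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO links (slug) VALUES ('demo')")
    conn.commit()
    return conn


def record_click(conn, slug):
    """Increment inside the database, the way an F() expression does.

    The new value is computed by the database from the current row, so two
    concurrent clicks cannot both read the same old count and overwrite
    each other.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute("UPDATE links SET clicks = clicks + 1 WHERE slug = ?", (slug,))


def click_count(conn, slug):
    row = conn.execute("SELECT clicks FROM links WHERE slug = ?", (slug,)).fetchone()
    return row[0]
```

The racy alternative is to load the row, add one in Python, and save it back; under a redirect spike that pattern silently loses clicks.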

Non-blocking telemetry. Browser, OS, and country lookups (via MaxMind GeoLite2) run on background daemon threads instead of holding up the redirect itself. Keeps latency where it should be.
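A rough sketch of that pattern, under the assumption that the redirect handler just hands an event to a daemon thread and returns. `handle_redirect`, `record_telemetry`, and the `sink` list are hypothetical stand-ins; the real user-agent parsing and GeoLite2 lookup would happen inside the worker.

```python
import threading


def record_telemetry(event, sink):
    """Placeholder for the slow part (UA parsing, GeoLite2 country lookup).

    Here it only stores the raw event so the example stays self-contained.
    """
    sink.append({"slug": event["slug"], "ip": event["ip"]})


def handle_redirect(slug, ip, target, sink):
    """Start telemetry in the background and return the redirect target.

    The worker is returned only so callers (or tests) can join it; the
    redirect itself never waits on it.
    """
    worker = threading.Thread(
        target=record_telemetry,
        args=({"slug": slug, "ip": ip}, sink),
        daemon=True,  # does not keep the process alive at shutdown
    )
    worker.start()
    return target, worker
```

One trade-off worth knowing: daemon threads are stopped abruptly when the interpreter exits, so an event that is still in flight during shutdown can be lost.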

Content-aware AI slugs. When a short link is created, Gemini reads the target URL and returns a slug that actually means something, instead of a random string.

Proactive ingestion security. SSRF mitigation, private IP range blocking (RFC 1918, link-local, IPv6 ULA), and Google Safe Browsing checks all happen before a link ever hits the database.
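The IP range check can be sketched with the standard-library `ipaddress` module. This is an illustration of the idea, not the project's validator: `BLOCKED_NETWORKS` and `is_blocked_address` are hypothetical names, and the loopback ranges are an extra assumption on top of the ranges listed above.

```python
import ipaddress

# Ranges named in the post, plus loopback (an assumption added here).
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",      # RFC 1918
        "172.16.0.0/12",   # RFC 1918
        "192.168.0.0/16",  # RFC 1918
        "169.254.0.0/16",  # IPv4 link-local (includes cloud metadata address)
        "fe80::/10",       # IPv6 link-local
        "fc00::/7",        # IPv6 unique local addresses (ULA)
        "127.0.0.0/8",     # IPv4 loopback
        "::1/128",         # IPv6 loopback
    )
]


def is_blocked_address(ip_text: str) -> bool:
    """Return True if the resolved IP falls inside a blocked range."""
    addr = ipaddress.ip_address(ip_text)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:10.0.0.1) so it can't slip past.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return any(
        addr.version == net.version and addr in net for net in BLOCKED_NETWORKS
    )
```

For example, `is_blocked_address("169.254.169.254")` is true while `is_blocked_address("8.8.8.8")` is false. A check like this only helps if it runs on the IP the hostname actually resolves to, and the later fetch connects to that same IP; otherwise a DNS change between check and request can bypass it.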

Where Things Stand

The backend, GraphQL API, tunneling logic, and both CLI agents work and are stable. The frontend dashboard is still half-built — the buttons exist, and most of them even do things.

Exams are coming up at IITM, so feature work is paused for a bit. Rather than let the repo sit untouched, I opened it up as an RFC — basically asking people who know more than me to find the holes in the architecture before I build more on top of it.

Where I Actually Need Help

I laid all of this out properly in a GitHub Discussion thread. That's the real RFC, with the full architecture breakdown. The short version is below.

Latency and performance. Past standard PostgreSQL composite indexes, what's a realistic way to get redirects under 10ms at low-thousands QPS? I'm trying to figure out Redis TTL strategies versus sharded KV stores, and whether CDN caching even makes sense here given that I need to preserve origin control for active tunnels.

Security, SSRF mitigation, and reputation. Beyond basic VPC egress rules, what does solid server-side SSRF hardening actually look like — destination IP range validation, DoH validation, egress proxies with network ACLs? And are Bloom filters realistic for fast malicious-URL reputation checks at higher QPS, or is that overkill for where this project is at?

Scalability and connection management. If this ever moves past local testing, what's the right pattern for handling a lot of concurrent long-lived tunnels — WebSocket or TCP? I'm also unsure where the line should sit between the control plane and the data plane once the proxy needs to handle real load, and whether something like Envoy, Traefik, or NGINX belongs at the forwarding layer instead of my current setup.

Observability and graceful degradation. What should I actually be tracking — P50/P95/P99 latency, open connections, downstream error rates, SSRF hits? And if the control plane goes down, should the data plane fall back to cached redirect rules with a grace period, served from read-only edge nodes?

Cost and deployment. What's a sane way to run this as a proof of concept versus a real production setup? Single VPS plus Redis, or is it worth thinking about Kubernetes and autoscaling this early? Also open to ideas on load-testing tunnels and long-lived connections properly.

If you've got concrete critiques, reference architectures, related OSS projects, or even small things like a Redis key layout or a proxy worker pool pattern — that's exactly the kind of feedback that'd help most right now.

Let's Talk

If you're a student who wants to dig into a real, slightly messy codebase, or you've built something like this before and want to roast my indexing strategy — either way, I want to hear it.

Full system design breakdown & discussion: Join the thread here
Codebase (clone it, fork it, break it): Check out the GitHub Repository

How do you usually handle testing camera/mic permissions or responsive layouts during local dev? Third-party tunnels, or something custom? Drop it in the comments.