Most web developers never look under the hood of their framework. Flask, Django, Express — they all abstract away the same thing: raw TCP bytes being transformed into structured HTTP messages. I wanted to understand that transformation, so I built ServeKit — a complete HTTP/1.1 server framework from nothing but Python's socket module.
No Flask. No http.server. No dependencies. Just socket.socket(AF_INET, SOCK_STREAM) and hand-written protocol parsing.
The TCP Foundation
Every HTTP server starts the same way: bind a socket, listen for connections, accept them, read bytes.
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


class TCPServer:
    def __init__(self, host="0.0.0.0", port=8080, workers=4):
        self.host = host
        self.port = port
        self._running = False
        self._selector = selectors.DefaultSelector()
        self._executor = ThreadPoolExecutor(max_workers=workers)

    def start(self):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((self.host, self.port))
        sock.listen(128)
        sock.setblocking(False)
        # Registered with data=None, which is how the loop spots the listener
        self._selector.register(sock, selectors.EVENT_READ)
        self._running = True
        while self._running:
            events = self._selector.select(timeout=1)
            for key, mask in events:
                if key.data is None:
                    # Listening socket: a new client is waiting
                    self._accept(key.fileobj)
                else:
                    # Client socket: bytes are ready, parse on a worker thread
                    self._executor.submit(self._handle, key.fileobj)
The selectors module gives us I/O multiplexing over non-blocking sockets: a single loop watches thousands of connections, so an idle client never stalls the rest. When a connection is ready to read, we hand it to a thread pool worker.
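To make the readiness model concrete, here is a minimal sketch that is separate from ServeKit's code. It uses socket.socketpair() as a stand-in for a real client connection, and the helper names (ready_sockets, readiness_demo) are made up for illustration. The selector reports nothing until bytes actually arrive:

```python
import selectors
import socket


def ready_sockets(selector, timeout=0.5):
    """Return the sockets the selector currently reports as readable."""
    return [key.fileobj for key, _mask in selector.select(timeout=timeout)]


def readiness_demo():
    selector = selectors.DefaultSelector()
    # A connected pair of sockets stands in for server side and client side
    server_side, client_side = socket.socketpair()
    server_side.setblocking(False)
    selector.register(server_side, selectors.EVENT_READ, data="client")

    before = ready_sockets(selector, timeout=0)  # nothing sent yet
    client_side.sendall(b"ping")
    after = ready_sockets(selector)              # now readable
    data = server_side.recv(1024)

    selector.unregister(server_side)
    server_side.close()
    client_side.close()
    selector.close()
    return len(before), len(after), data
```

Because select() only returns sockets that already have data, the recv() call afterwards does not block.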
Parsing HTTP by Hand
When raw bytes arrive on a TCP connection, they look like this:
GET /users/42?format=json HTTP/1.1\r\n
Host: localhost:8080\r\n
Accept: application/json\r\n
Connection: keep-alive\r\n
\r\n
That's it. Text separated by \r\n, with a blank line marking the end of headers. My parser finds that blank line, splits the head from the body, and then parses the head line by line:
class HTTPParser:
    def parse(self, data: bytes, client_addr=("0.0.0.0", 0)):
        # Find the header/body boundary (the blank line after the headers)
        head_end = data.find(b"\r\n\r\n")
        if head_end == -1:
            raise ValueError("incomplete request: end of headers not found")
        # latin-1 maps every byte to a code point, so decoding cannot fail
        head_str = data[:head_end].decode("latin-1")
        body_bytes = data[head_end + 4:]
        lines = head_str.split("\r\n")
        method, path, query, version = self._parse_request_line(lines[0])
        headers = self._parse_headers(lines[1:])
        body = self._read_body(headers, body_bytes, data, head_end + 4)
        return Request(method=method, path=path, headers=headers,
                       body=body, query_string=query)
The parser handles everything from HTTP/1.0 to chunked transfer encoding. It validates methods, enforces size limits (no 10GB request lines), and builds a clean Request object.
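The two helpers called above are not shown in this post, so here is a rough sketch of what they might look like as free functions. The names mirror the calls in parse(), but the bodies are my assumption for illustration, not ServeKit's actual code. Header names are lowercased so lookups do not depend on how the client capitalized them:

```python
def parse_request_line(line):
    """Split 'GET /users/42?format=json HTTP/1.1' into its four parts."""
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, target, version = parts
    # Everything after the first '?' is the query string
    path, _, query = target.partition("?")
    return method, path, query, version


def parse_headers(lines):
    """Turn 'Name: value' lines into a dict keyed by lowercase name."""
    headers = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        headers[name.strip().lower()] = value.strip()
    return headers
```

Splitting on the first colon only matters for values like localhost:8080, which contain a colon themselves.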
URL Routing with Priority
Route matching seems simple until you have overlapping patterns. ServeKit supports three match types with clear priority:
- Exact match: /users/me
- Parameterized: /users/{id}
- Wildcard: /files/*path
class Router:
    def resolve(self, method, path):
        # Exact match first
        for route in self._routes:
            if route.pattern == path and route.method in (method, "ANY"):
                return route.handler, {}
        # Then parameterized (the method must match here too)
        for route in self._routes:
            if route.has_params and route.method in (method, "ANY"):
                params = route.match(path)
                if params is not None:
                    return route.handler, params
        # Finally wildcards
        for route in self._routes:
            if route.is_wildcard and route.method in (method, "ANY"):
                params = route.match(path)
                if params is not None:
                    return route.handler, params
        raise NotFound(f"No route for {method} {path}")
This means /users/me always wins over /users/{id}, which always wins over /users/*path. Route groups add prefix support for API versioning:
api = app.group("/api/v1")

@api.get("/users/{id}")
def get_user(req, res):
    res.json({"id": req.params["id"]})

# Matches: GET /api/v1/users/42
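The route.match(path) call in the router is where parameters get extracted. One common way to do that is to compile each pattern to a regular expression; the sketch below takes that approach. It is an assumption about how such matching could work, not ServeKit's implementation, and compile_pattern and match_pattern are hypothetical names:

```python
import re


def compile_pattern(pattern):
    """Translate '/users/{id}' or '/files/*path' into a compiled regex."""
    parts = []
    for segment in pattern.strip("/").split("/"):
        if segment.startswith("{") and segment.endswith("}"):
            # {name} captures exactly one path segment
            parts.append(f"(?P<{segment[1:-1]}>[^/]+)")
        elif segment.startswith("*"):
            # *name captures the rest of the path, slashes included
            parts.append(f"(?P<{segment[1:]}>.+)")
        else:
            parts.append(re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "$")


def match_pattern(pattern, path):
    """Return the captured params as a dict, or None if the path does not match."""
    found = compile_pattern(pattern).match(path)
    return found.groupdict() if found else None
```

A real router would compile each pattern once at registration time rather than on every request.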
Middleware: The Onion Model
Middleware wraps handlers in layers. Each middleware calls next_handler to continue the chain, or skips it to short-circuit:
class MiddlewareChain:
    def execute(self, req, res, handler):
        def build_chain(index):
            # Past the last middleware: the innermost layer is the handler
            if index >= len(self._middleware):
                return handler
            mw = self._middleware[index]

            def next_fn(req, res):
                mw(req, res, build_chain(index + 1))
            return next_fn

        build_chain(0)(req, res)
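The onion shape is easiest to see by logging what runs before and after the handler. Below is a standalone sketch of the same chaining idea as a plain function; run_chain, make_tracer, and deny_all are illustrative names, and plain dicts stand in for the request and response objects:

```python
def run_chain(middleware, handler, req, res):
    """Call each middleware in order, with the handler as the innermost layer."""
    def build(index):
        if index >= len(middleware):
            return handler
        mw = middleware[index]
        return lambda req, res: mw(req, res, build(index + 1))
    build(0)(req, res)


def make_tracer(name, log):
    """Middleware that records when it runs on the way in and on the way out."""
    def tracer(req, res, next_handler):
        log.append(f"{name}:before")
        next_handler(req, res)
        log.append(f"{name}:after")
    return tracer


def deny_all(req, res, next_handler):
    """Middleware that short-circuits: it never calls next_handler."""
    res["status"] = 403
```

With two tracers, the log reads outer:before, inner:before, handler, inner:after, outer:after. Swapping the inner tracer for deny_all means the handler never runs, yet the outer layer still gets its "after" step.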
ServeKit ships with five built-in middleware:
- CORS — handles preflight OPTIONS and sets Access-Control headers
- Compression — gzip responses above a size threshold
- Rate limiting — sliding-window limiter with proper 429 responses
- Basic auth — HTTP Basic with challenge headers
- Logger — colored request/response timing
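As an illustration of the sliding-window idea behind the rate limiter, here is a small sketch. It is not ServeKit's middleware: the class name, the injectable clock parameter, and the per-client deque of timestamps are all assumptions chosen to keep the example short and testable:

```python
import time
from collections import deque


class SlidingWindowLimiter:
    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock          # injectable so tests can control time
        self._hits = {}              # client id -> deque of request timestamps

    def allow(self, client_id):
        """Return True if the request fits in the window, False if it should get a 429."""
        now = self._clock()
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have slid out of the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A middleware built on this would call allow() with the client address and respond with 429 Too Many Requests when it returns False.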
WebSocket: Upgrading the Protocol
The most interesting part was WebSocket support. The handshake is a standard HTTP request with magic headers:
import base64
import hashlib


def build_upgrade_response(request):
    ws_key = request.headers["Sec-WebSocket-Key"]
    # Concatenate with the magic GUID from RFC 6455
    combined = ws_key + "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"
    # SHA-1 the result, then base64-encode the raw digest
    accept = base64.b64encode(hashlib.sha1(combined.encode()).digest())
    return (
        b"HTTP/1.1 101 Switching Protocols\r\n"
        b"Upgrade: websocket\r\n"
        b"Connection: Upgrade\r\n"
        b"Sec-WebSocket-Accept: " + accept + b"\r\n\r\n"
    )
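A quick way to sanity-check the accept computation is the worked example in RFC 6455 itself, which gives a sample key and the accept value it must produce. The helper name compute_accept below is mine; the arithmetic is the same as in the handshake code:

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(ws_key):
    """Derive the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((ws_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3
print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```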
After the handshake, the connection switches from HTTP to WebSocket frames: a binary format with opcode, length, optional mask, and payload. Parsing these frames is surprisingly tricky because the length is variable-width. The header carries a 7-bit length field: values 0-125 are the payload length itself, 126 means the real length follows in the next 2 bytes (up to 65,535), and 127 means it follows in the next 8 bytes.
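Here is a sketch of decoding just the frame header, following the layout in RFC 6455. The function name and the tuple it returns are my own choices for illustration, and it deliberately ignores the reserved bits and extensions:

```python
import struct


def parse_frame_header(data):
    """Return (fin, opcode, masked, payload_len, header_len), or None if more bytes are needed."""
    if len(data) < 2:
        return None
    fin = bool(data[0] & 0x80)       # top bit of byte 0
    opcode = data[0] & 0x0F          # low 4 bits of byte 0
    masked = bool(data[1] & 0x80)    # top bit of byte 1
    payload_len = data[1] & 0x7F     # low 7 bits of byte 1
    header_len = 2
    if payload_len == 126:
        # Real length is the next 2 bytes, big-endian
        if len(data) < header_len + 2:
            return None
        (payload_len,) = struct.unpack("!H", data[2:4])
        header_len += 2
    elif payload_len == 127:
        # Real length is the next 8 bytes, big-endian
        if len(data) < header_len + 8:
            return None
        (payload_len,) = struct.unpack("!Q", data[2:10])
        header_len += 8
    if masked:
        # The 4-byte masking key sits between the length and the payload
        header_len += 4
        if len(data) < header_len:
            return None
    return fin, opcode, masked, payload_len, header_len
```

Returning None for a short buffer lets the caller keep reading from the socket and try again, which matters because a frame header can be split across TCP reads.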
Response Serialization
The response builder mirrors the parser — turning a structured object back into HTTP bytes:
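from http import HTTPStatus


class Response:
    def serialize(self):
        # Reason phrase looked up from the stdlib enum here (an assumption;
        # a hand-rolled status table would work just as well)
        phrase = HTTPStatus(self.status_code).phrase
        status_line = f"HTTP/1.1 {self.status_code} {phrase}\r\n"
        header_lines = "".join(f"{k}: {v}\r\n" for k, v in self.headers.items())
        return (status_line + header_lines + "\r\n").encode() + self.body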
The chainable API makes handlers clean:
@app.post("/api/items")
def create_item(req, res):
    data = req.json()
    res.status(201).header("X-Custom", "value").json({"created": data})
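Chaining works because every setter returns the response object itself. The sketch below shows that pattern end to end; SketchResponse is a hypothetical stand-in for ServeKit's Response class, and setting Content-Type and Content-Length inside json() is my assumption about sensible behavior:

```python
import json
from http import HTTPStatus


class SketchResponse:
    def __init__(self):
        self.status_code = 200
        self.headers = {}
        self.body = b""

    def status(self, code):
        self.status_code = code
        return self                      # returning self enables chaining

    def header(self, name, value):
        self.headers[name] = value
        return self

    def json(self, payload):
        self.body = json.dumps(payload).encode("utf-8")
        self.headers["Content-Type"] = "application/json"
        self.headers["Content-Length"] = str(len(self.body))
        return self

    def serialize(self):
        phrase = HTTPStatus(self.status_code).phrase
        head = f"HTTP/1.1 {self.status_code} {phrase}\r\n"
        head += "".join(f"{k}: {v}\r\n" for k, v in self.headers.items())
        return (head + "\r\n").encode("latin-1") + self.body
```

Calling SketchResponse().status(201).header("X-Custom", "value").json({"created": True}).serialize() yields bytes that start with the status line HTTP/1.1 201 Created and end with the JSON body.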
What I Learned
Building an HTTP server from scratch taught me things I never would have learned from using frameworks:
HTTP is just text over TCP. The entire protocol is human-readable until you get to WebSocket frames.
Keep-alive is why HTTP/1.1 matters. The difference between opening a new TCP connection per request vs. reusing one is massive for performance.
Headers are case-insensitive. The spec says so, but most developers assume lowercase. A CaseInsensitiveDict solves this cleanly.
Chunked transfer encoding exists because servers don't always know Content-Length upfront. Streaming responses need it.
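For reference, the chunked format is small enough to sketch in a few lines: each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-size chunk ends the body. The function names below are illustrative, and the decoder skips trailer headers for brevity:

```python
def encode_chunk(data):
    """Frame one piece of a streaming body as a single chunk."""
    return f"{len(data):X}\r\n".encode("ascii") + data + b"\r\n"


def decode_chunked(body):
    """Reassemble a complete chunked body into the original bytes."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("incomplete chunk size line")
        # Ignore any chunk extensions after ';'
        size = int(body[pos:line_end].split(b";", 1)[0].strip(), 16)
        pos = line_end + 2
        if size == 0:
            break                        # last chunk
        chunk = body[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("truncated chunk")
        out += chunk
        pos += size + 2                  # skip the data and its trailing CRLF
    return bytes(out)
```

The sender never needs the total length: it emits encode_chunk(piece) for each piece as it becomes available, then a final 0\r\n\r\n.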
WebSocket masking is XOR. Client-to-server frames must be masked; server-to-client frames must not. The mask is 4 random bytes, and unmasking is just payload[i] ^= mask[i % 4].
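The masking step fits in one line of Python. The sketch below checks it against the masked "Hello" frame given as an example in RFC 6455; the function name unmask is my own:

```python
def unmask(payload, mask):
    """XOR each payload byte with the mask byte at the same position modulo 4."""
    return bytes(b ^ mask[i % 4] for i, b in enumerate(payload))


# Mask key and masked payload from the "Hello" example in RFC 6455, section 5.7
mask = bytes([0x37, 0xFA, 0x21, 0x3D])
masked = bytes([0x7F, 0x9F, 0x4D, 0x51, 0x58])
print(unmask(masked, mask))
# → b'Hello'
```

Because XOR is its own inverse, the same function also masks a payload: applying it twice with the same key returns the original bytes.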
The Result
ServeKit is ~4,500 lines of Python with 222 passing tests. It handles:
- HTTP/1.1 with keep-alive
- Routing with params and wildcards
- 5 built-in middleware
- Static file serving with caching
- WebSocket support
- Graceful shutdown
All from raw TCP sockets.
# Try it
git clone https://github.com/hajirufai/servekit
cd servekit
python examples/hello.py
# → http://localhost:8080
GitHub: github.com/hajirufai/servekit
Live page: hajirufai.github.io/servekit