Dmytro Huz

Posted on Jul 4 • Originally published at open.substack.com

Adding Homemade TLS to a Homemade Web Server

#networking #security #tutorial #webdev

Original series: Building Your Own Web Server + Rebuilding TLS from Scratch

This article connects two earlier series. In the web-server series, I rebuilt the HTTP side from raw sockets up to routing and file serving. In the TLS series, I rebuilt the secure-channel side: key exchange, certificates, key derivation, and encrypted records.

This piece is where the two meet: what has to change when a web server stops receiving plaintext HTTP bytes and starts receiving encrypted TCP bytes?

I expected adding TLS to my web server to change almost everything.

It did not.

The HTTP parser stayed boring. The router stayed boring. The file-serving code stayed boring.

The real change happened one layer below HTTP.

I had two separate learning projects before this:

a small web server built from scratch: sockets, HTTP parsing, location matching, file serving, and eventually a single-threaded event-loop server;
a small TLS-like secure channel built from scratch: ephemeral key exchange, certificate chains, server authentication, HKDF, AES-GCM records, and the difference between identity keys and session keys.

Putting them together led to one question:

What changes in the web server when the bytes coming from TCP are encrypted?

The useful answer is smaller than I expected:

TCP socket
   ↓
TLS handshake / record layer
   ↓ decrypted plaintext bytes
HTTP parser / router / file serving
   ↓ plaintext HTTP response
TLS record protection
   ↓ encrypted bytes
TCP socket

That is the whole architecture of the companion project.

This is not production HTTPS. It does not implement the real TLS 1.3 wire format, and browsers will not connect to it with https://localhost:8443.

It is a learning project. The goal is to make the integration boundary visible.

Once that boundary becomes clear, HTTPS feels much less magical.

The plain web server had one assumption

Before TLS, the web server had a simple mental model.

A client connects over TCP. The server reads bytes from the socket. Those bytes are HTTP bytes. The server appends them to a per-connection buffer, tries to parse an HTTP request, matches the path, reads a file, and writes an HTTP response.

The flow looked like this:

accept TCP connection
   ↓
read bytes from socket
   ↓
append bytes to HTTP input buffer
   ↓
parse HTTP request
   ↓
match URL against location config
   ↓
read file from root
   ↓
write HTTP response

The buffer is important because TCP is a stream.

One recv() call can give you half a request. Or exactly one request. Or multiple requests joined together. The parser cannot assume that one socket read equals one HTTP request.

In my server, the parser consumes complete HTTP messages from a buffer. If the request is incomplete, the server waits for more bytes. If there are extra bytes after a complete request, they stay in the buffer for the next parse.
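To make the buffering rule concrete, here is a minimal sketch of a parser that only consumes complete messages. It is not the companion project's parser: the names try_parse_request and drain are hypothetical, and it only understands the header block plus Content-Length.

```python
from typing import List, Optional, Tuple

HEADER_END = b"\r\n\r\n"


def try_parse_request(buffer: bytes) -> Optional[Tuple[bytes, int]]:
    """Return (raw_request, consumed) for one complete request, or None.

    Simplification: only the header block and Content-Length are handled.
    """
    header_end = buffer.find(HEADER_END)
    if header_end == -1:
        return None  # headers are not complete yet: wait for more bytes

    head = buffer[:header_end]
    body_length = 0
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            body_length = int(value.strip())

    total = header_end + len(HEADER_END) + body_length
    if len(buffer) < total:
        return None  # body is not complete yet
    return buffer[:total], total


def drain(buffer: bytearray) -> List[bytes]:
    """Consume every complete request; leftover bytes stay in the buffer."""
    requests = []
    while True:
        parsed = try_parse_request(bytes(buffer))
        if parsed is None:
            return requests
        raw, consumed = parsed
        requests.append(raw)
        del buffer[:consumed]
```

Calling drain after every read covers all three cases: half a request yields nothing, one request yields one item, and joined requests yield several while any trailing partial request waits in the buffer.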

For plain HTTP, socket bytes and HTTP bytes are the same thing:

context.http_input.append(data)
self._handle_http_messages(context)

Those two lines work only because the bytes from the socket are already plaintext HTTP.

TLS breaks that assumption.

After TLS, socket bytes are no longer HTTP bytes

Once TLS is added, the TCP socket no longer carries a readable HTTP request.

The client still wants to send this:

GET / HTTP/1.1
Host: localhost
Connection: close

But that request is not placed directly on the wire.

It is wrapped inside TLS records. After the handshake, those records are encrypted and authenticated. The bytes arriving at the server socket are now protocol bytes for the TLS layer, not text for the HTTP parser.

That is the shift:

plain listener:
TCP bytes are HTTP bytes

TLS listener:
TCP bytes are TLS records
TLS records decrypt into HTTP bytes

If the HTTP parser starts caring about certificates, keys, record sequence numbers, or AES-GCM tags, the abstraction has leaked.

The HTTP parser should parse HTTP.

The TLS layer should turn unsafe network bytes into authenticated plaintext bytes.

The integration point belongs below HTTP

Production servers use the same high-level boundary.

NGINX does not ask the HTTP parser to understand encrypted TLS records. A TLS implementation such as OpenSSL handles the connection security first. After that, the HTTP processing layer receives plaintext HTTP bytes.

My project copies that architecture in a smaller educational form:

Production-ish mental model:
TCP → OpenSSL/TLS state → plaintext HTTP → NGINX HTTP processing

This project:
TCP → ToyTLSServerConnection → plaintext HTTP → HTTPParser / route matcher

The analogy is architectural, not protocol-compatible.

The project uses a small TLS-like protocol so that all the moving pieces can be read in one sitting. The handshake messages are simplified. The record framing is simplified. Many real TLS 1.3 details are missing on purpose.

But the boundary is the point:

socket bytes in
   ↓
connection layer decides whether they are plain HTTP or TLS records
   ↓
HTTP layer receives plaintext HTTP bytes either way

That boundary lets the same HTTP code serve both ports.

The config makes the boundary visible

The server uses an NGINX-style config because that was part of the original web-server project.

A single server block can listen on a plain port and on a TLS port:

http {
    server {
        listen 8080;
        listen 8443 ssl;
        server_name localhost;

        ssl_certificate certs/server_cert.pem;
        ssl_certificate_key certs/server_key.pem;
        ssl_certificate_chain certs/intermediate_cert.pem;

        location / {
            root html;
        }
    }
}

listen 8080; creates a normal HTTP listener.

listen 8443 ssl; creates a listener that attaches TLS state to every accepted connection.

Both listeners still share the same routing and file-serving code. The difference is what happens between recv() and HTTPParser.parse_message(...).
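One way to picture the difference is a per-connection context whose TLS slot is either filled or empty, depending on which listener accepted the socket. The sketch below is an assumption about the shape, not the project's code: Listener, ConnectionContext, FakeTLSState, parse_listen, and on_accept are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Listener:
    port: int
    ssl: bool


class FakeTLSState:
    """Placeholder for the per-connection TLS object (hypothetical)."""


@dataclass
class ConnectionContext:
    listener: Listener
    tls: Optional[FakeTLSState] = None  # None means a plain HTTP connection


def parse_listen(directive: str) -> Listener:
    """Parse 'listen 8443 ssl;' or 'listen 8080;' into a Listener."""
    parts = directive.strip().rstrip(";").split()
    if len(parts) < 2 or parts[0] != "listen":
        raise ValueError(f"not a listen directive: {directive!r}")
    return Listener(port=int(parts[1]), ssl="ssl" in parts[2:])


def on_accept(listener: Listener) -> ConnectionContext:
    """Attach TLS state only for connections accepted on an ssl listener."""
    tls = FakeTLSState() if listener.ssl else None
    return ConnectionContext(listener=listener, tls=tls)
```

With this shape, the rest of the server only has to ask whether the TLS slot of the context is None. Nothing above the connection layer needs to look at the listener again.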

That is the part I wanted the project to expose.

The blocking TLS demo had to become a state machine

The original TLS demo was linear and blocking.

That is perfect for teaching the handshake:

receive ClientHello
send ServerHello
send ServerAuth
receive encrypted request
send encrypted response

But the web server I wanted to integrate with was not a one-client-at-a-time demo. It was a selectors-based server.

One event loop handles listening sockets and client sockets. A slow client should not freeze the whole process. That means the TLS implementation cannot sit inside a function waiting for the next record to arrive.
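For readers who have not seen the selectors style, a stripped-down iteration of such a loop might look like this. It is a sketch built on Python's standard selectors module. The helper names run_once and make_reader are made up for illustration and are not taken from the project.

```python
import selectors
import socket


def run_once(selector: selectors.BaseSelector, timeout: float = 0.1) -> int:
    """Run one loop iteration: service every socket that is ready right now."""
    events = selector.select(timeout)
    for key, _mask in events:
        handler = key.data  # the callback registered for this socket
        handler(key.fileobj)
    return len(events)


def make_reader(buffers: dict):
    """Build a read callback that appends whatever arrived to a per-socket buffer."""

    def on_readable(sock: socket.socket) -> None:
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            return  # nothing to read after all; the loop will call us again
        buffers.setdefault(sock, bytearray()).extend(data)

    return on_readable
```

The callback never waits for "the rest" of anything. It stores what arrived and returns, which is exactly the property a blocking handshake function does not have.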

So the TLS code had to become buffer-oriented.

Instead of blocking on the socket, the TLS connection exposes three operations:

tls.feed_wire_data(socket_bytes)
pending = tls.pop_pending_wire_data()
plaintext = tls.read_plaintext()

feed_wire_data(...) accepts whatever bytes TCP happened to deliver.

pop_pending_wire_data() returns handshake bytes or encrypted application data that need to be sent back to the client.

read_plaintext() returns decrypted HTTP bytes after the TLS layer has enough complete records to process.

The web server does not need to know whether TCP split a TLS record into three reads or merged several records together. The TLS record buffer owns that problem.
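The record buffer idea can be sketched in a few lines. The framing here is an assumption made for the example (a 4-byte big-endian length prefix), and RecordBuffer and frame are hypothetical names; the project's real record format may differ.

```python
import struct
from typing import List

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian payload length


class RecordBuffer:
    """Accumulates raw TCP bytes and yields only complete records."""

    def __init__(self) -> None:
        self._data = bytearray()

    def feed(self, chunk: bytes) -> None:
        """Accept whatever TCP delivered, regardless of record boundaries."""
        self._data.extend(chunk)

    def pop_records(self) -> List[bytes]:
        """Return every complete record payload and keep any partial tail."""
        records = []
        while len(self._data) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._data)
            end = HEADER.size + length
            if len(self._data) < end:
                break  # record is still incomplete: wait for more bytes
            records.append(bytes(self._data[HEADER.size:end]))
            del self._data[:end]
        return records


def frame(payload: bytes) -> bytes:
    """Wrap a payload in the assumed length-prefixed framing."""
    return HEADER.pack(len(payload)) + payload
```

Feeding the same two framed records in three arbitrary slices or in one merged chunk produces the same list of payloads, which is why the caller never has to think about how TCP split the stream.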

That small interface is the bridge between the two projects.

Plain and TLS reads now differ by one layer

Here is the core read path in the integrated server.

For plain HTTP, the socket bytes go directly into the HTTP input buffer:

# Plain path: socket bytes are already HTTP bytes.
context.http_input.append(data)
self._handle_http_messages(context)

For the TLS listener, the same socket bytes first pass through the TLS connection state:

# TLS path: socket bytes are handshake/application records, not HTTP.
context.tls.feed_wire_data(data)
self._queue_output(context, context.tls.pop_pending_wire_data())

plaintext = context.tls.read_plaintext()
if plaintext:
    context.http_input.append(plaintext)
    self._handle_http_messages(context)

The HTTP handler never receives encrypted bytes. It receives plaintext after the TLS layer has done its job.

After that, request handling is the same:

request, consumed = HTTPParser.parse_message(context.http_input.data)
context.http_input.reduce_data(consumed)
response, should_close = build_http_response(request, context.server_block, self.base_dir)
self._send_application_bytes(context, response)

The response path mirrors the read path.

The HTTP layer builds a normal plaintext HTTP response. If the connection is plain HTTP, those bytes are written directly to the socket. If the connection is TLS, the server wraps the response in an encrypted application record first.

if context.tls is not None:
    wire = context.tls.protect_application_data(plaintext_response)
else:
    wire = plaintext_response

This is the part I like most about the final design.

The web server still feels like a web server.

TLS becomes a connection concern, not an HTTP concern.

What the toy TLS layer does

The TLS-like layer in this project is intentionally small, but it keeps the important shape of the original TLS series.

The handshake starts with ephemeral X25519 keys.

The client sends a ClientHello containing its ephemeral public key. The server generates its own ephemeral key for this connection and sends back a ServerHello.

Then the server sends authentication material:

the server certificate,
intermediate certificates,
a signature over the two ephemeral public keys.

That signature is the simplified version of CertificateVerify. It proves that the peer presenting the certificate also controls the matching private key for this specific key exchange.

After that, both sides compute the same X25519 shared secret and derive directional AES-GCM keys through HKDF.

Directional keys matter. The client-to-server key protects client records. The server-to-client key protects server records.

client_write_key: records sent by the client
server_write_key: records sent by the server
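As a rough illustration of the derivation step, the sketch below implements HKDF with SHA-256 from RFC 5869 using only the standard library and expands one shared secret into two directional keys. The info labels and the function derive_directional_keys are assumptions for this example, and the shared secret is passed in as plain bytes rather than computed with X25519.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869) with SHA-256."""
    return hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) with SHA-256."""
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def derive_directional_keys(shared_secret: bytes, salt: bytes) -> dict:
    """Derive one 32-byte AES key per direction from the same shared secret.

    The info labels are hypothetical; they only need to differ per direction
    and be identical on both sides of the connection.
    """
    prk = hkdf_extract(salt, shared_secret)
    return {
        "client_write_key": hkdf_expand(prk, b"toy-tls client write key", 32),
        "server_write_key": hkdf_expand(prk, b"toy-tls server write key", 32),
    }
```

Both peers run the same function on the same inputs and get the same pair of keys. The client encrypts with client_write_key and decrypts with server_write_key, and the server does the reverse.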

Each application data record also binds its sequence number into the authenticated encryption. That lets the receiver detect tampering and sequence mismatches, such as a replayed or reordered record.
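One common way to tie a sequence number to an AES-GCM record is the construction real TLS 1.3 uses: XOR the sequence number into a fixed per-direction IV to get the nonce. Whether the toy layer does exactly this is an assumption; record_nonce is a hypothetical helper shown only to make the idea tangible.

```python
def record_nonce(write_iv: bytes, sequence_number: int) -> bytes:
    """XOR a 64-bit sequence number, left-padded with zeros, into a 12-byte IV."""
    if len(write_iv) != 12:
        raise ValueError("this sketch expects a 12-byte AES-GCM IV")
    padded = sequence_number.to_bytes(8, "big").rjust(12, b"\x00")
    return bytes(a ^ b for a, b in zip(write_iv, padded))


# With an all-zero IV the nonce is just the padded sequence number.
assert record_nonce(bytes(12), 1) == b"\x00" * 11 + b"\x01"
```

Because the receiver computes the nonce from the sequence number it expects, a record that arrives out of order is decrypted with the wrong nonce and fails the AES-GCM tag check.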

Again, this is not real TLS 1.3. The real protocol has much more structure: transcript binding, alerts, Finished messages, SNI, ALPN, session resumption, real record headers, and many other details.

For learning the web-server boundary, the smaller version is enough:

handshake creates keys
records protect HTTP bytes
HTTP parser sees only decrypted bytes

The useful lesson was not only cryptography

I started the project thinking the interesting part would be TLS.

The cryptographic pieces do matter. It is useful to understand why ephemeral keys exist, why certificates are about identity rather than encryption, and why authenticated encryption needs sequence numbers.

But the integration lesson was more general.

When you add a lower-level protocol to an existing system, the hard part is often choosing the seam.

If the seam is wrong, everything above it starts learning details it should not know.

If the seam is right, most of the application code stays boring.

In this case, the seam is between TCP and HTTP.

The HTTP parser should not know about X25519. The route matcher should not know about certificates. The file-serving code should not care whether the client used the plain port or the TLS port.

Those parts operate on HTTP.

The connection layer is responsible for turning the outside world into HTTP bytes.

That pattern shows up in a lot of infrastructure code:

network bytes become parsed messages,
encrypted records become plaintext streams,
unreliable external systems become retried operations,
hardware signals become typed events,
logs become structured facts.

A design question I keep coming back to is:

What should the next layer be allowed to assume?

For this project, the HTTP layer is allowed to assume one thing:

I receive plaintext HTTP bytes.

Everything below that belongs to the connection layer.

Why build this if it is not real HTTPS?

A fair objection is: if the project is not browser-compatible, why build it?

For me, browser-compatible TLS is too large a target when the goal is to understand the integration boundary.

If I used OpenSSL directly, the project would be more practical, but the interesting internals would disappear behind a library call.

If I tried to implement full TLS 1.3, the project would become mostly about protocol correctness. That is valuable, but it would bury the web-server lesson under too much detail.

The educational middle ground is a TLS-like secure channel with the same high-level shape:

handshake
certificate verification
key derivation
encrypted records
application data

Then I can integrate that channel into the web server as if it were a real TLS stack.

The result is small enough to read, but realistic enough to expose the architecture.

That is the kind of learning project I like: simplified around one specific question.

Here the question is:

What does a web server need from TLS?

My answer after building it:

A secure byte stream that produces plaintext HTTP for the existing HTTP layer.

How to run the companion project

The repository is here:

https://github.com/DmytroHuzz/tls_web_server

The visual walkthrough is here:

https://dmytrohuzz.github.io/tls_web_server/

From a fresh clone:

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install -e ".[dev]"

Then run the one-command demo:

python3 scripts/demo.py

It starts the server in a background thread, sends one plain HTTP request, then sends one HTTP request through the self-written TLS-like layer.

Both paths should return:

HTTP/1.1 200 OK

There are also separate server_side.py and client_side.py files if you want to see the workflow from each side, closer to the style of the original projects.

The tests cover the integration points that are easy to get wrong:

fragmented TLS handshake records,
fragmented encrypted application records,
rejection of a certificate issued for the wrong DNS name,
tampered encrypted record rejection,
sequence-number mismatch rejection,
HTTP requests split across multiple encrypted TLS records,
the same web server serving plain HTTP and HTTP-over-self-written-TLS.

The mental model I keep now

Before building this, I would have described HTTPS as “HTTP with encryption.”

That is fine as a user-level description, but it is not precise enough when writing the server.

The server-side model that helped me is this:

TCP gives bytes.
TLS turns encrypted bytes into authenticated plaintext bytes.
HTTP parses the plaintext bytes.

That sounds simple, but it changes where the code belongs.

You do not sprinkle TLS checks through the HTTP parser.

You do not make the route matcher aware of certificates.

You do not duplicate the file server for secure and insecure connections.

You put a protocol layer below HTTP and keep the rest of the server honest.

That is what I wanted from this project.

Not a production HTTPS server. Not a replacement for OpenSSL.

Just a clearer picture of where HTTPS belongs.

If you want to follow the full path from encrypted TCP bytes to a parsed HTTP request and back again, I put the code and visual walkthrough here:

Companion repo: https://github.com/DmytroHuzz/tls_web_server
Visual walkthrough: https://dmytrohuzz.github.io/tls_web_server/
