DEV Community

Ethan

Building an HTTP Server from Scratch in C: A Journey into Network Programming

When you type a URL into your browser and hit enter, a complex dance of network protocols springs into action. But how many developers actually understand what's happening under the hood? I decided to find out by building a bare-metal HTTP/1.1 server in C—no frameworks, no libraries, just sockets and the HTTP specification.

Why Build This?

I had three main motivations for this project:

Learning by doing. Reading about TCP sockets and HTTP is one thing; implementing them yourself is entirely different. You discover edge cases the documentation never mentions and develop an intuition for how these fundamental protocols actually work.

Preparing for security work. My ultimate goal is to add TLS encryption and build security testing tools (scanner, fuzzer, analyzer) on top of this server. You can't secure what you don't understand, and understanding starts at the lowest level.

The satisfaction of first principles. There's something deeply satisfying about writing socket(), bind(), and listen() yourself and watching your browser successfully connect to your creation.

What I Built

The server currently supports:

  • HTTP/1.1 protocol with proper socket programming
  • GET requests for static file serving
  • POST requests with form data parsing
  • Multiple route support (/, /about, /submit)
  • Proper HTTP status codes (200, 404, 400, 405, 500)

The server listens on port 8080 and handles one connection at a time (yes, I know—more on that later).

The Technical Deep Dive

Setting Up the Socket

The foundation of any network server is the socket. Here's the basic flow I implemented:

  1. Create address info structure using getaddrinfo() to handle both IPv4 and IPv6
  2. Create the socket with socket()
  3. Set socket options with SO_REUSEADDR to avoid "Address already in use" errors during development
  4. Bind to a port so the OS knows which port belongs to our server
  5. Listen for connections with a backlog of 1, which keeps the queue of pending connections short (the accept loop itself handles one connection at a time)

The beauty of using getaddrinfo() is that it makes the code protocol-agnostic. The same code works for IPv4 and IPv6 without modification.
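
As a rough illustration of those five steps, here is a minimal sketch. The helper name open_listener and its parameters are hypothetical and not taken from the project source; it tries each address getaddrinfo() returns until one binds.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original project):
 * returns a listening TCP socket on `port`, or -1 on failure. */
int open_listener(const char *port, int backlog)
{
    struct addrinfo hints, *res, *p;
    int fd = -1;
    int yes = 1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 results */
    hints.ai_socktype = SOCK_STREAM; /* TCP */
    hints.ai_flags = AI_PASSIVE;     /* wildcard address, suitable for bind() */

    int rc = getaddrinfo(NULL, port, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }

    /* Try each candidate address until socket/setsockopt/bind/listen all succeed. */
    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd == -1)
            continue;
        if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) == 0 &&
            bind(fd, p->ai_addr, p->ai_addrlen) == 0 &&
            listen(fd, backlog) == 0)
            break;                   /* success: keep this fd */
        close(fd);
        fd = -1;
    }

    freeaddrinfo(res);
    return fd;
}
```

A caller would use something like open_listener("8080", 1) and then loop on accept(). Note that the socket ends up bound to whichever address family succeeds first, so whether one socket serves both IPv4 and IPv6 clients depends on the platform's dual-stack settings.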

Parsing HTTP Requests

Once a connection arrives via accept(), I receive the raw HTTP request into a buffer:

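bytes_received = recv(new_fd, buffer, sizeof(buffer) - 1, 0);
if (bytes_received <= 0) { close(new_fd); continue; }  // -1 = error, 0 = peer closed; assumes we are inside the accept loop
buffer[bytes_received] = '\0';  // safe now: index is between 1 and sizeof(buffer) - 1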

The request comes in as plain text that looks like this:

GET /about HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0...

I parse the request line using sscanf() to extract the method, path, and HTTP version. Simple, but effective for this learning project.
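
For readers who want to try this, a sketch of sscanf()-based parsing could look like the following. The struct, the buffer sizes, and the function name are assumptions made for illustration. The important detail is the field width in each conversion, which stops sscanf() from writing past the end of a buffer.

```c
#include <stdio.h>

/* Buffer sizes are illustrative assumptions, not values from the project. */
#define METHOD_MAX   8
#define PATH_MAX_LEN 256
#define VERSION_MAX  16

struct request_line {
    char method[METHOD_MAX];
    char path[PATH_MAX_LEN];
    char version[VERSION_MAX];
};

/* Hypothetical helper: returns 0 on success, -1 if fewer than
 * three whitespace-separated tokens were found. */
int parse_request_line(const char *request, struct request_line *out)
{
    /* Each width is one less than its buffer size, leaving room for '\0'. */
    int n = sscanf(request, "%7s %255s %15s",
                   out->method, out->path, out->version);
    return n == 3 ? 0 : -1;
}
```

A return value of -1 is a natural place to send a 400 Bad Request. An over-long token is silently split at the width limit rather than rejected, so a stricter parser would also check token lengths.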

Serving Static Files

For GET requests, the server maps URLs to files and serves them. The routing logic is straightforward—special paths like / map to index.html, while other paths are treated as literal filenames.
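
The mapping described above can be sketched in a few lines. The function name route_to_file is hypothetical, and the ".." check is an added assumption rather than something the project is known to do: treating paths as literal filenames invites directory traversal unless such paths are refused.

```c
#include <string.h>

/* Hypothetical helper: returns the file to serve for a URL path,
 * or NULL if the path is rejected. */
const char *route_to_file(const char *path)
{
    if (strcmp(path, "/") == 0)
        return "index.html";      /* special case: site root */
    if (path[0] != '/')
        return NULL;              /* request targets must start with '/' */
    if (strstr(path, "..") != NULL)
        return NULL;              /* assumption: refuse directory traversal */
    return path + 1;              /* "/about.html" -> "about.html" */
}
```

A NULL result maps naturally onto a 400 or 404 response, depending on how strict the server wants to be.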

The file serving function handles several scenarios:

Success case (200 OK): Read the file, calculate its size, send proper headers with Content-Length, then send the file content.

File not found (404): Return an HTML error page explaining the file doesn't exist.

Memory allocation failure (500): Return an internal server error if we can't allocate memory for the file content.
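
All three cases share the same response shape: a status line, a few headers, a blank line, then the body. One way to build the header portion is sketched below. The function names, the Content-Type parameter, and the Connection: close header are assumptions for illustration; the reason phrases are the standard ones for the status codes this server uses.

```c
#include <stdio.h>

/* Standard reason phrases for the status codes the server returns. */
const char *reason_phrase(int status)
{
    switch (status) {
    case 200: return "OK";
    case 400: return "Bad Request";
    case 404: return "Not Found";
    case 405: return "Method Not Allowed";
    case 500: return "Internal Server Error";
    default:  return "Unknown";
    }
}

/* Hypothetical helper: writes the response header into `out`.
 * Returns the header length, or -1 if it did not fit in `cap` bytes. */
int build_header(char *out, size_t cap, int status,
                 const char *content_type, long body_len)
{
    int n = snprintf(out, cap,
                     "HTTP/1.1 %d %s\r\n"
                     "Content-Type: %s\r\n"
                     "Content-Length: %ld\r\n"
                     "Connection: close\r\n" /* assumption: one request per connection */
                     "\r\n",
                     status, reason_phrase(status), content_type, body_len);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

The returned length is what gets passed to send() for the header, followed by a second send() for the body.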

One interesting challenge here was getting the Content-Length header right. Browsers need to know how many bytes to expect, so I use ftell() after seeking to the end of the file to get the exact size.
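
A sketch of the fseek()/ftell() technique follows. The function name read_whole_file is hypothetical. Opening the file in binary mode ("rb") is an assumption worth making, since text mode on some platforms translates line endings and the byte count would then disagree with what is sent.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd, NUL-terminated copy of the file
 * and stores its size in *size_out, or returns NULL on any failure.
 * The caller must free() the result. */
char *read_whole_file(const char *path, long *size_out)
{
    FILE *fp = fopen(path, "rb");      /* binary mode keeps byte counts exact */
    if (fp == NULL)
        return NULL;

    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);              /* offset at end of file == size in bytes */
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }

    char *data = malloc((size_t)size + 1);   /* +1 for the terminator */
    if (data != NULL) {
        if (fread(data, 1, (size_t)size, fp) == (size_t)size) {
            data[size] = '\0';
            *size_out = size;
        } else {
            free(data);                /* short read: treat as failure */
            data = NULL;
        }
    }
    fclose(fp);
    return data;
}
```

This sketch collapses every failure into NULL. A server that wants to distinguish 404 from 500, as described above, would check fopen() and malloc() separately.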

Handling POST Requests

POST request handling is significantly more complex than GET. Here's why:

The HTTP request body might not arrive all at once. The headers come first, terminated by \r\n\r\n, and then the body follows. The body might be small enough to arrive in the same TCP packet, or it might require multiple recv() calls.

My approach:

  1. Extract Content-Length from the headers to know how many body bytes to expect
  2. Find the body start by locating the \r\n\r\n delimiter
  3. Calculate partial body bytes already received in the initial buffer
  4. Loop and receive additional data until we have the full content length
  5. Parse the form data (key=value pairs separated by &)
  6. Write to output.txt for persistence
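
Steps 2, 3, and 5 can be sketched as two small helpers. The names body_offset and form_value are hypothetical, and the form parser deliberately skips URL decoding (%20, +), which a complete implementation would need.

```c
#include <string.h>

/* Hypothetical helper: returns the offset of the first body byte,
 * or -1 if the "\r\n\r\n" delimiter is absent. */
long body_offset(const char *request)
{
    const char *end = strstr(request, "\r\n\r\n");
    return end ? (long)(end - request) + 4 : -1;
}

/* Hypothetical helper: copies the value for `key` out of "a=1&b=2" style data.
 * Returns 0 on success, -1 if the key is missing or the value does not fit.
 * No URL decoding is performed. */
int form_value(const char *body, const char *key, char *out, size_t cap)
{
    size_t klen = strlen(key);
    const char *p = body;

    while (*p != '\0') {
        const char *amp = strchr(p, '&');
        size_t pair_len = amp ? (size_t)(amp - p) : strlen(p);

        /* A match needs the key, then '=', all within this pair. */
        if (pair_len > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
            size_t vlen = pair_len - klen - 1;
            if (vlen >= cap)
                return -1;
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p += pair_len;
        if (*p == '&')
            p++;                      /* skip the separator */
    }
    return -1;
}
```

The number of body bytes already received (step 3) is then the total bytes read so far minus the value returned by body_offset().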

Here's the loop that ensures we get the complete body:

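while (body_bytes_received < content_length) {
    ssize_t new_bytes = recv(new_fd, full_body + body_bytes_received,
                             content_length - body_bytes_received, 0);
    if (new_bytes <= 0) {
        break;  // 0 = peer closed early, -1 = error; without this the loop never ends
    }
    body_bytes_received += new_bytes;
}
// If body_bytes_received < content_length here, the body is incomplete: respond with 400.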

This was crucial to learn: TCP is a byte stream with no message boundaries, so a single recv() may return only part of what the client sent, and you can't assume data arrives all at once.

Challenges and Lessons Learned

Memory Management in C is Unforgiving

Coming from higher-level languages, C's manual memory management was a wake-up call. Every malloc() needs a corresponding free(), and forgetting one creates a memory leak. I use Valgrind during development to catch these issues.

For the file content and POST body, I dynamically allocate based on the size:

char *file_content = malloc(size + 1);  // +1 for null terminator
if (file_content == NULL) { /* send 500 Internal Server Error and return */ }
// ... use it ...
free(file_content);  // must not forget!

Buffer Overflows Are Real

With a fixed BUFFER_SIZE of 8192 bytes, I had to be careful about buffer overflows. Using sizeof(buffer) - 1 when receiving data and properly null-terminating strings became second nature. A single unbounded strcpy() into a fixed buffer can become a security vulnerability, and even strncpy() needs care because it does not null-terminate the destination when the source fills it.
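
One common way to get a bounded copy that always terminates is snprintf(). The sketch below uses a hypothetical helper name, copy_bounded, and reports truncation so the caller can reject oversized input instead of silently cutting it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, always NUL-terminating.
 * Returns 0 if the whole string fit, -1 if it was truncated. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    if (cap == 0)
        return -1;                     /* no room even for the terminator */
    int n = snprintf(dst, cap, "%s", src);
    /* snprintf returns the length it wanted to write, so n >= cap means truncation. */
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

For request paths or header values, a -1 result is a reasonable trigger for a 400 response.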

HTTP Parsing Edge Cases

The HTTP specification has many edge cases I didn't initially consider:

  • What if there's no Content-Length header in a POST request?
  • What if the request is malformed with no \r\n\r\n delimiter?
  • What if the Content-Length value isn't a valid integer?

Each of these required adding defensive checks and returning proper 400 Bad Request responses.
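
As an illustration of the first and third checks, here is a sketch of a strict Content-Length parser. The function name is hypothetical. It matches the header name case-insensitively (HTTP field names are case-insensitive), stops at the blank line so body text is never mistaken for a header, and uses strtol() with an end pointer to reject values such as "abc" or "12x".

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical helper: returns the Content-Length value,
 * or -1 if the header is missing or not a valid non-negative integer. */
long parse_content_length(const char *headers)
{
    static const char name[] = "Content-Length:";
    const char *line = headers;

    while (line != NULL && *line != '\0') {
        if (line[0] == '\r' && line[1] == '\n')
            break;                       /* blank line: end of headers */
        if (strncasecmp(line, name, sizeof name - 1) == 0) {
            const char *v = line + sizeof name - 1;
            while (*v == ' ' || *v == '\t')
                v++;                     /* optional whitespace after the colon */
            if (!isdigit((unsigned char)*v))
                return -1;               /* empty, negative, or non-numeric */
            errno = 0;
            char *end;
            long n = strtol(v, &end, 10);
            while (*end == ' ' || *end == '\t')
                end++;                   /* optional trailing whitespace */
            if (errno == ERANGE || (*end != '\r' && *end != '\0'))
                return -1;               /* overflow or trailing garbage */
            return n;
        }
        line = strstr(line, "\r\n");     /* advance to the next header line */
        if (line != NULL)
            line += 2;
    }
    return -1;
}
```

A real server would likely also cap the accepted value so a client cannot request an enormous allocation.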

The Debug Printf Dilemma

You'll notice the code has many printf() statements with "DEBUG:" prefixes. During development, these were invaluable for understanding the flow of data. In a production server, these would be replaced with a proper logging framework, but for learning purposes, seeing the raw data flow in the terminal was incredibly educational.

Concurrency (or Lack Thereof)

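The biggest limitation of my current implementation is that it handles only one connection at a time: the accept loop serves each client to completion before calling accept() again, and BACKLOG 1 keeps the queue of waiting connections very short. While one client is being served, others must wait. The traditional solutions are: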

  • Fork a new process for each connection (close to Apache's prefork model, which keeps a pool of pre-forked workers)
  • Use threads with pthread
  • Implement non-blocking I/O with select/poll/epoll (the nginx model)

I plan to explore these approaches as the project evolves.

What's Next: Security Focus

The natural next step is adding TLS/SSL encryption, but I'm taking this further by building a security testing suite:

TLS 1.2/1.3 Implementation: Understanding encryption protocols from the ground up using OpenSSL or mbedTLS.

Security Scanner: A tool to scan web servers for common vulnerabilities (missing headers, outdated protocols, weak ciphers).

HTTP Fuzzer: Automated testing that sends malformed requests to find parsing bugs and potential crashes.

Vulnerability Analysis: Documentation of common web server vulnerabilities and how to prevent them.

The goal is to understand not just how to build a server, but how attackers might exploit one and how to defend against those attacks.

Key Takeaways

Building this HTTP server taught me more about networking in a week than years of using high-level frameworks:

Abstractions hide complexity. Express.js and Flask make web development easy, but they obscure the fundamental protocols. Understanding the foundation makes you a better developer at every level.

C demands precision. Every byte matters. Every pointer must be valid. Every buffer must be sized correctly. This discipline translates to better code in any language.

Network data arrives in pieces. TCP delivers bytes in order, but a single recv() may return only part of a request, so you can't assume data arrives all at once.

Security starts at the design level. Every parsing decision, every buffer allocation, every user input is a potential vulnerability if not handled carefully.

Try It Yourself

If you're interested in learning network programming, I highly recommend building your own server. Start simple with a hello-world HTTP response, then gradually add features. Beej's Guide to Network Programming is an excellent resource for learning socket programming in C.

The complete source code for this project is available on my GitHub, and I'll be documenting the security additions in future posts.

What low-level project are you working on? I'd love to hear about your experience in the comments below.


This is part of my journey into network security. Follow along as I add TLS encryption and build security testing tools on top of this foundation.
