Intro
Welcome. We're gonna talk about how web servers actually work, in C++, without magic.
If you want to skip the conceptual lore dump, go right ahead to the code sections.
On the topic of theory vs. practice, here's a phenomenal quote by Donald Knuth:
If you find that you're spending almost all your time on theory, start turning some attention to practical things; it will improve your theories.
If you find that you're spending almost all your time on practice, start turning some attention to theoretical things; it will improve your practice. (Donald Knuth)
This article walks through building a minimal HTTP/1.1 server directly on top of TCP using Boost.Asio.
The focus is not production readiness, but understanding how modern asynchronous C++ maps to real web protocols.
You don't need to be super comfortable with C++ or networking,
but it would certainly help. If you're trying to transition to C++ from a higher-level language, props.
If you want to skip straight to a concurrent HTTP server with templating, read part 2.
HTTP in C++... why?
The first question anyone with sense would pose, upon hearing of someone writing an HTTP server in C++, is why?
Perhaps with a few added exclamation points.
Now, there are a few reasons one would get started with such a feat. Primarily: learning!
For learning purposes, there is no better way to understand HTTP and how the web truly works than by writing an HTTP server in a language where the web isn't a first-class citizen.
(Or at least, where the web wasn't meant to be first-class...)
Moreover, there truly do come times when a very high-performance backend is necessary. It doesn't necessarily have to be for a website, though that is a common use-case.
If you need that kind of performance, JavaScript and other high-level languages often can't deliver it. For such an endeavour, I would honestly recommend using a language that has more support for HTTP in its standard library or core ecosystem.
For example, Go or Rust. Now, C++ is my bread and butter, but even the people on the standards committee can agree that networking is one place where C++ is lacking.
If you still want to use C++ for a production-grade application, check out libraries like Drogon or Boost.Beast.
They provide higher-level abstractions for HTTP (so you don't actually have to write an HTTP server from TCP sockets).
Where Does that Leave Us?
We're gonna use TCP sockets. There are a few ways to do that in C++; most people will point you to 3 different methods:
- OS-specific socket libraries (e.g. WinSock, POSIX/Berkeley sockets)
- libuv (what Node.js uses)
- Boost.Asio
We're going to use Boost.Asio. From now on, I'm going to refer to Boost.Asio as Asio, but don't get confused with the older, standalone Asio library (whenever I refer to Asio, I mean Boost.Asio).
Asio is distributed under the generous Boost Software License, which is quite similar to other popular licenses like the MIT License, Apache, BSD, etc. Check it out here.
Why did I choose Boost.Asio over OS-specific options? Well, it has a nicer API and is cross-platform.
And why over libuv? For no particular reason other than I was already familiar with Asio and wasn't with libuv. Both are great libraries, and I would suggest you look at libuv for a wider perspective afterward.
What other dependencies will we need? In the future, we're going to need Inja and a JSON library. That's for a templating engine, like Jinja2, Go/Templ, etc.
The Theory
We all love whiteboard sessions, right? No... just me? Whiteboarding is both good and bad. I won't get into that here, but it is necessary to lay down some fundamentals of HTTP before we start coding the thing.
Firstly, let's define our client. Who will actually be sending the requests? We could write our own client in C++ or a higher-level language, but if you read the title, you already know that our client will be HTMX.
HTMX is a great library; if you're unfamiliar with it, it is essentially raw HTML, but you can attach HTTP requests to any element, not just <a> and <form>.
If we were to write an HTTP client in C++, that would be super boring. All we would see would be console logs like "REQUEST SENT!" and "RESPONSE SENT!" and "RESPONSE RECEIVED!"
Instead, we can have a nice frontend in our browser! At the end of the day, the browser is doing the heavy lifting for the client-side application. But that means all we will have to write is some HTMX code.
HTTP
Now, let's define HTTP itself. An HTTP request is text-based and consists of a few different parts. (Well, technically bodies can be binary and headers may carry encoded data, but the overall framing is plain text.)
That means that we don't need any fancy serialization or deserialization logic. To be clear, we're going to write a server for HTTP/1.1, not HTTP/2.
That just makes life a little easier for us. To understand HTTP/2, you need to first understand HTTP/1.1.
HTTP is just a protocol, like TCP and UDP. It stands for Hypertext Transfer Protocol, so you can probably infer it has something to do with HTML.
That intuition is correct, as HTTP is most commonly used in websites (though it's not limited to that).
HTTP Requests have a few main parts:
- Method
- Path
- Version
- Headers
- Body
HTTP Responses are slightly different:
- Version
- Status Code
- Status Message
- Headers
- Body
The body is optional in both requests and responses, and so are most headers (HTTP/1.1 does require a Host header on requests). An example request is this:
GET / HTTP/1.1
Host: developer.mozilla.org
Accept-Language: fr
And a sample response:
HTTP/1.1 200 OK
Date: Sat, 09 Oct 2010 14:28:02 GMT
Server: Apache
Last-Modified: Tue, 01 Dec 2009 20:18:22 GMT
ETag: "51142bc1-7449-479b075b2891b"
Accept-Ranges: bytes
Content-Length: 29769
Content-Type: text/html

<!doctype html>… (here come the 29769 bytes of the requested web page)
In this scenario, the client sends a GET request to the path "/", essentially requesting the index.html file, which is the primary HTML document the browser is meant to show. The server then sends back raw HTML within the response body.
The headers simply contain additional details regarding the request or response; these are sometimes needed, such as Content-Type and Content-Length, especially when a body is present.
Yes, I got both of these from the phenomenal HTTP docs by Mozilla. Definitely give them a read for more in-depth information about the HTTP protocol.
HTTP isn't magic, and if you still have some questions, they will certainly get dispelled once we reach the coding section. Or ask them at the bottom of this page!
TCP
So, Boost.Asio has no clue about HTTP. All it knows about is TCP (or UDP, but HTTP requests need to be delivered reliably, hence we use TCP).
Boost.Asio will handle all the low-level details about TCP. It delegates even lower-level TCP details like the 3-way handshake to the OS.
Boost.Asio exposes to us a TCP socket, from which we can read data (i.e., on the server we read the request) and write data (we write back to the client with our HTTP Response).
This dynamic enables us to worry solely about the HTTP data itself and not about the low-level TCP connections that actually underpin HTTP.
If you want to learn more about TCP and fundamental networking in general, I recommend the book TCP/IP Illustrated.
You only really need Volume 1. Whether you get the first or second edition doesn't matter.
From Humble Beginnings
We'll write our HTMX client now. Just so that we can get started. If you want an idea about the project structure, here's a link to the final project.
That is an old commit, but it's probably as far as we'll get to in this post. Check the main branch's HEAD for some more features.
Let's start with some HTML.
test_client/client.html
<!DOCTYPE html>
<html>
<head>
<title>HTMX Client</title>
</head>
<body>
<button>
Get Time
</button>
<div id="time">Waiting for button click</div>
</body>
</html>
Neat, right? No? Oh... you're wondering where the HTMX is?
Wait, wait, I see it!
test_client/client.html
...
<title>HTMX Client</title>
<script src="https://cdn.jsdelivr.net/npm/htmx.org@2.0.8/dist/htmx.min.js" integrity="sha384-/TgkGk7p307TH7EXJDuUlgG3Ce1UVolAOFopFekQkkXihi5u/6OCvVKyz1W+idaz" crossorigin="anonymous"></script>
...
<button hx-get="/time" hx-target="#time" hx-trigger="click" hx-swap="innerHTML">
Get Time
</button>
...
There we go. Now, we load HTMX via a CDN, and we hook up the button with a few HTMX attributes. We're basically saying that when the button is clicked, send a GET request to the path '/time' and swap the response into the innerHTML of the element with id time.
That element is our div. The innerHTML of the div is just the text inside it.
Straightforward, I know, but enough to get us started!
Onto C++
Now, make sure your C++ build system is linking to Boost.Asio and is set to C++23 (you may be able to get by with older versions, but I haven't tested them). Again, if you want more info on the build system, see the GitHub repo I linked earlier.
Let's just start in the main function to see how we can actually use Asio.
src/main.cpp
#include <boost/asio.hpp>
#include <string>
#include <print>
static constexpr auto port = 3000u;
using boost::asio::ip::tcp;
int main() {
try {
boost::asio::io_context io_context{};
tcp::acceptor acceptor(io_context, { tcp::v4(), port }); // the acceptor binds to the port and listens for incoming connections
for (;;) {
tcp::socket socket{io_context}; // the socket itself
acceptor.accept(socket); // blocking function call that waits for a connection
// this is intentionally blocking, we'll get to async code in part 2
const std::string message = "hi from server";
boost::system::error_code ignored_error{}; // we ignore the error of the write
boost::asio::write(socket, boost::asio::buffer(message), ignored_error); // this writes a buffer to the socket, in this case just a string
}
} catch (const std::exception& e) {
std::println("Error: {}", e.what());
}
}
Now, this is no HTTP, but interestingly enough, we'll have a very similar code flow for the actual HTTP server. We wait for a client to connect to our designated port, then read from the socket, then write to it!
Currently, we're not doing any reading from the socket, but we will soon enough.
Now that we have a basic understanding of how we can actually do networking stuff, let's consider what our HTTP server will look like in code.
Project Design
We start at our HTMX client, the browser. That sends an HttpRequest to our server.
The server itself is composed of 4 layers.
- TCP Accept/Connect Layer (acceptor.accept(socket);)
- HTTP Request Parser
- Router/Dispatcher
- Handler functions
We'll go into more detail as we get to coding each part, but these are the 4 major parts of our project.
At layer 1, it's the same code we already covered: we wait for a TCP socket connection.
At layer 2, we parse the HTTP Request into 3 distinct fields for now: method, path, version. We don't necessarily need to keep track of the version, as we know it's going to be HTTP/1.1, but it doesn't hurt.
At layer 3, we dispatch to the proper handler based on the path in the HTTP request.
At layer 4, the handler functions will do some logic, build an HTTP Response, and eventually write it back to the socket.
Pretty simple, right?
Layer 1: Connections
Let's make a few new files for the sake of organization. First, though, let's modify main.cpp.
src/main.cpp
#include <print>
#include "tcp_server.h"
// rest of the code...
try {
http::tcp_server server(port); // we will make this class soon
for (;;) {
server.run(); // can you imagine what server.run() will do?
// again, intentionally blocking, we'll get to concurrency in part 2
}
}
// rest of the code...
Now, a new file:
src/tcp_server.h
#pragma once
#include <boost/asio.hpp>
#include <cstdint>
#include <string>
namespace http
{
using boost::asio::ip::tcp;
// this struct represents all the data we'll parse out of the HttpRequest coming from the client
struct HttpRequest
{
std::string method; // yes, these aren't string_views; these need to own the strings
std::string path;
std::string version;
};
class tcp_server
{
public:
tcp_server(uint32_t port);
// call this to actually run the server for 1 single TCP connection, hence the loop in main.cpp
void run();
private:
// look familiar?
boost::asio::io_context m_IoContext;
tcp::acceptor m_Acceptor;
};
}
So far, pretty self-explanatory. The tcp_server will handle layers 1 and 2: connections and parsing.
Let's start with the run method, which we will define in a new file:
src/tcp_server.cpp
#include "tcp_server.h"
#include <print>
using namespace std::literals; // just some niceties
using boost::asio::ip::tcp;
namespace http
{
tcp_server::tcp_server(uint32_t port)
:
// m_IoContext is default constructed before m_Acceptor automatically
m_Acceptor(m_IoContext, tcp::endpoint(tcp::v4(), static_cast<boost::asio::ip::port_type>(port))) // the constructor for acceptor technically doesn't take in a uint32_t
{}
void tcp_server::run()
{
tcp::socket socket(m_IoContext);
m_Acceptor.accept(socket);
parse(std::move(socket)); // we'll get to this soon
}
}
Nothing extraordinary yet; we've just reorganized the code we had before. But trust me, it's finally about to get interesting.
Layer 2: Parsing
Parsing HTTP, for our case so far, is surprisingly easy. We don't need to worry about headers or a body for our purposes today.
All we care about is the stuff defined in HttpRequest in tcp_server.h.
In case you're wondering what parse(std::move(socket)); was doing earlier, here we are.
src/tcp_server.cpp
#include "response_handler.h" // we'll get here later
// existing code...
namespace http
{
namespace // a C++ anonymous namespace
{
// why do we take socket by value you may be wondering?
// well, once we get to concurrency, the parser needs to take ownership of the socket for lifetime's sake
// for now, we could just pass it by reference
void parse(tcp::socket socket)
{
HttpRequest request;
// this is the buffer we will read into
boost::asio::streambuf request_buffer;
// this is the magic
boost::asio::read_until(socket, request_buffer, "\r\n"); // why is this our delimiter? read on
// this stream just makes it easier to extract strings from our request_buffer
std::istream request_stream(&request_buffer);
// why can we just do this? it's because the method, path, and version are delimited by spaces, so we can reliably extract them using the C++ std::istream
request_stream >> request.method >> request.path >> request.version;
http::response_handler::dispatch(socket, request); // again, we'll get here later. make sure to add the include at the top of the file
// we finally shutdown this socket
// we don't have logic for reusing a socket, though it wouldn't be too difficult to add if you wanted to.
boost::system::error_code ignored_error{};
socket.shutdown(tcp::socket::shutdown_both, ignored_error);
socket.close(ignored_error);
}
}
// existing code...
}
Now, all the code comments should explain what's going on.
However, you may be interested in why "\r\n" is our delimiter for reading the socket. That's because in HTTP, every line ends with "\r\n", like CRLF line endings on Windows.
The method, path, and version are all on one line.
Then, each header has its own line.
Then, before the body, there is an extra blank line.
Then, there is the body.
So, if we wanted to read all the headers as well, we could just change the delimiter in read_until to be "\r\n\r\n",
as the last header ends with "\r\n", then another "\r\n" after the last header for an extra blank line.
All of this is guaranteed by the HTTP/1.1 spec.
This is sufficient for our simplified parser, but production servers must handle partial reads and malformed requests.
Layer 3: Router/Dispatcher
If you were wondering what the http::response_handler::dispatch(socket, request); was doing, here we are.
Now that we have parsed the information we need out of the HTTP request, we can finally get to executing our custom logic based on each route.
First, let's dispatch to those routes (routes are just paths).
Add a new file:
src/response_handler.h
#pragma once
#include <boost/asio.hpp>
namespace http
{
// forward declaration. Would be cleaner to separate out HttpRequest into a separate file
struct HttpRequest;
using boost::asio::ip::tcp;
// we don't need a whole class for this, so I just put it into a namespace
namespace response_handler
{
void dispatch(tcp::socket& socket, const HttpRequest& request);
}
}
Quite a short file. The implementation file will be a little more lengthy, though.
Let's get on to that.
Add a new file:
src/response_handler.cpp
#include "response_handler.h"
#include "tcp_server.h"
namespace http::response_handler
{
void dispatch(tcp::socket& socket, const HttpRequest& request)
{
if (!path_handlers.contains(request.path))
{
const auto response = HttpResponse{}
.set_version(request.version)
.set_status(404, "Not Found")
.set_body("Not Found");
response.write(socket);
return;
}
auto response = path_handlers.at(request.path)(socket, request);
response.write(socket);
}
}
This may seem a little foreign, for good reason, considering I didn't put in any code comments.
However, I hope you see the idea here.
We have path_handlers, essentially a jump table. It's an unordered_map that maps a std::string_view key (the path) to a function that does some logic.
That will be part of layer 4: handler functions.
We just execute the function that corresponds to the given path. If it doesn't exist, we respond with a 404 error.
Speaking of 404, in HTTP there are certain status codes that have given meanings. 200 means OK. 404 means NOT FOUND. If you want to see more, see the MDN docs for HTTP.
Status codes are only in HTTP responses, not requests.
Layer 4: Handler Functions
We're finally at the last layer. Let's wrap this up with a nice little bow.
Edit the file:
src/response_handler.cpp
// existing code
#include <unordered_map>
#include <map>
#include <string>
#include <string_view>
#include <chrono>
#include <format>
#include <fstream>
using namespace std::literals;
namespace http
{
namespace // again, a C++ anonymous namespace
{
// this gives a nice API to build and write HTTP Responses
struct HttpResponse
{
private:
std::string version{}; // i.e. HTTP/1.1
int status{0}; // i.e. 404
std::string status_line{}; // i.e. "NOT FOUND"
std::string body{};
// some default headers we want
std::map<std::string, std::string> headers{
{"Content-Type", "text/plain; charset=utf-8"},
{"Access-Control-Allow-Origin", "*"},
{"Connection", "close"}
};
public:
HttpResponse() = default;
HttpResponse& set_version(std::string new_version)
{
version = std::move(new_version);
return *this;
}
HttpResponse& set_status(int new_status, std::string new_status_line)
{
status = new_status;
status_line = std::move(new_status_line);
return *this;
}
HttpResponse& set_header(std::string header, std::string value)
{
headers[std::move(header)] = std::move(value);
return *this;
}
HttpResponse& set_body(std::string new_body)
{
body = std::move(new_body);
// Content-Length expects bytes
// std::string::length() counts bytes, not characters, so this is correct for UTF-8 too
headers["Content-Length"] = std::to_string(body.length());
return *this;
}
void write(tcp::socket& socket) const
{
std::string message = version + ' ' + std::to_string(status) + ' ' + status_line + "\r\n";
for (const auto& [key, value] : headers)
{
message += key + ": "s.append(value).append("\r\n");
}
message += "\r\n" + body;
boost::asio::write(socket, boost::asio::buffer(message));
}
};
// alias for a function pointer type. We'll use lambdas, which convert to function pointers as long as they capture nothing
using path_handler_fn = HttpResponse (*)(tcp::socket& socket, const HttpRequest&);
const std::unordered_map<std::string_view, path_handler_fn> path_handlers = {
// the handler for the '/' path. The server will serve the whole HTML client file
{
"/", [](tcp::socket& socket, const HttpRequest& request) -> HttpResponse {
// only the GET method is allowed at this path
if (request.method != "GET")
{
const auto response = HttpResponse{}
.set_version(request.version)
.set_status(405, "Method Not Allowed")
.set_body("Method Not Allowed");
return response;
}
// binary mode is useful, and we start at the end of the file so that we can easily tell its length
std::ifstream file("test_client/client.html", std::ios::binary | std::ios::ate);
if (!file.is_open() || !file.good())
{
const auto response = HttpResponse{}
.set_version(request.version)
.set_status(500, "Internal Server Error")
.set_body("Failed to open client.html");
return response;
}
const auto size = file.tellg();
std::string buffer(size, '\0');
file.seekg(0);
file.read(buffer.data(), size);
const auto response = HttpResponse{}
.set_version(request.version)
.set_status(200, "OK")
.set_header("Content-Type", "text/html; charset=utf-8")
.set_body(buffer);
return response;
}
},
// the handler for the '/time' path (if you remember, it's for when the button is clicked)
{
"/time", [](tcp::socket& socket, const HttpRequest& request) {
const auto currentTime = std::chrono::system_clock::now();
// nicely format the current time
const auto timeString = std::format("{:%Y-%m-%d %H:%M:%S}", currentTime);
const auto response = HttpResponse{}
.set_version(request.version)
.set_status(200, "OK")
.set_body(timeString);
return response;
}
}
};
}
// existing code
}
Wow, that was a doozy. But with that, we've added handler functions. We also added the API for building HttpResponses and writing them back to the client via the TCP socket.
You've made the transition from a rookie understanding of HTTP to a solid, mid-level to senior-level understanding of HTTP.
We started from raw Boost.Asio TCP sockets, and made it to serving an HTMX frontend.
If you want to see your website, run the server and open http://localhost:3000 in your browser.
Now, there are a few things left to implement, namely concurrency and a templating engine.
We also may have to handle CORS in the future. If you are encountering issues with CORS, see my up-to-date GitHub repo.
The templating engine is really quite straightforward.
Also, the Asio in Boost.Asio stands for "Async IO," so concurrency should be native to Asio.
We'll find out soon.
Read part 2
Notes On the Build System
Personally, I chose to use CMake for my build system.
CMake is the de-facto standard for C++. Once you get familiar with CMake, it's not too bad.
Build systems are the type of things you write once and copy paste into every new project you make (until it doesn't work).
Regardless, to install Boost.Asio, I used vcpkg, a C++ package manager made by Microsoft.
In my experience, it's been the most reliable and easy to use C++ package manager.
I would write out the build scripts here, but it's easier if you just look at the repo.
Look at the CMakeLists.txt file, the CMakePresets.json file, and the vcpkg.json file.