Introduction
The way we run programs is broken, and in this post I will explain why in detail, as well as what the solution looks like. I will be talking primarily about server programs running on Linux, but the same applies to any other program running on a modern OS.
The UNIX process model
Any program can be thought of as part of a client-server pair: a server provides a machine-friendly interface to a client program, which in turn exists to provide an interface to a person or to another program. A server gets a request (a command or a query, to be precise), performs computations, and returns a response. So far so good.
Modern computers run more than one program at a time, and those programs need to share the resources of the computer. These range from conventional resources, like CPU time and memory, to less conventional ones, like TCP and UDP ports, of which a computer has only a limited number as a consequence of how the networking protocols are defined. There is a special program tasked with managing shared access to these resources - the operating system. It provides an abstraction that lets each program run as if it were the only one on the computer - features such as virtual memory help with that.
Another advantage is that the program needs to communicate only with the OS to perform its work. Need more memory? Call the "mmap" system call. Need to process a request in parallel? Call "clone" or "fork".
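As a minimal illustration (not from the post - the mapping size and the child's work are placeholders), here is what "talking only to the OS" looks like in C:

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* Need more memory? Ask the kernel for an anonymous mapping. */
    size_t len = 1 << 20; /* 1 MiB, an arbitrary example size */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    /* Need to do work in parallel? Ask the kernel for a child process. */
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: do the parallel work here, then exit. */
        _exit(0);
    }
    waitpid(pid, NULL, 0); /* Parent: wait for the child to finish. */
    munmap(buf, len);      /* Return the memory to the kernel. */
    return 0;
}
```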
But the resources of a computer are not infinite. How does the Unix model of a running program - the process - deal with resource exhaustion?
A process owns most of the OS resources that it uses (aside from the ones the kernel holds on its behalf). When a process uses more memory than is available - it gets killed by the kernel. If it uses more CPU time than its limit allows - it also gets killed. The same goes for any other limit imposed on a process with the "setrlimit" syscall.
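A small sketch of imposing such limits on the current process with "setrlimit" - the specific numbers (256 MiB of address space, a handful of CPU seconds) are arbitrary and chosen just for illustration:

```c
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    /* Cap the address space: allocations beyond this start to fail. */
    struct rlimit mem = { .rlim_cur = 256 * 1024 * 1024,
                          .rlim_max = 256 * 1024 * 1024 };
    if (setrlimit(RLIMIT_AS, &mem) != 0) perror("setrlimit(RLIMIT_AS)");

    /* Cap CPU time: SIGXCPU at the soft limit, SIGKILL at the hard one. */
    struct rlimit cpu = { .rlim_cur = 5, .rlim_max = 10 }; /* seconds */
    if (setrlimit(RLIMIT_CPU, &cpu) != 0) perror("setrlimit(RLIMIT_CPU)");

    /* ... run the actual workload under these limits ... */
    return 0;
}
```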
With this model, a process can only safely run one computation at any moment. Imagine a process running multiple computations in parallel (in threads, for example, or asynchronously) - what happens if one computation uses up more resources than are available? The entire process gets killed, and with it, every computation inside it. But why should the other computations be terminated along with the offending one? By that logic, why shouldn't the computer reboot when a user opens a file that is too large in a text editor? Logically, only the text editor should be terminated - all other computations should keep running.
In fact, this is how programs used to work on Unix. A request comes into the server, and the daemon forks off a child process (or hands the socket to a worker process via UNIX domain socket file descriptor passing) that handles the request. If the request uses up too many resources, it gets killed and the user is returned an error - but the other requests are unaffected. One can argue for spawning a new process per request: this way, memory is zeroed and CPU time is accounted for correctly.
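A rough sketch of such a fork-per-request daemon - the port, the backlog, and the trivial handler are placeholders, not anything from the post:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static void handle_request(int client) {
    /* Placeholder: read the request, compute, write the response. */
    const char *msg = "hello\n";
    write(client, msg, strlen(msg));
}

int main(void) {
    int server = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port = htons(8080),
                                .sin_addr.s_addr = htonl(INADDR_ANY) };
    bind(server, (struct sockaddr *)&addr, sizeof addr);
    listen(server, 128);
    signal(SIGCHLD, SIG_IGN); /* let the kernel reap finished children */

    for (;;) {
        int client = accept(server, NULL, NULL);
        if (client < 0) continue;
        if (fork() == 0) {          /* child: owns this one request */
            close(server);
            handle_request(client); /* if it blows its limits, only it dies */
            close(client);
            _exit(0);
        }
        close(client);              /* parent: keep accepting */
    }
}
```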
Non-blocking I/O
There are two issues with this: context switching between threads (and processes) is more expensive than necessary, and thread (and process) stacks reserve more memory than needed by default. Which is why the industry moved to non-blocking I/O - as you can see for yourself in Ryan Dahl's presentation on Node.js at JSConf 2009.
But using non-blocking I/O with a thread per CPU core breaks the entire model of resource ownership. To be safe, you have to set arbitrary limits on the number of concurrent requests - and hope that you left enough of a safety buffer for outlier requests. If you didn't, the entire process gets OOM-killed, resetting the session for every user. And you can't measure memory consumption for the "current request" and have it terminate itself when it runs out - you simply have no such ability in languages with automatic memory management. Only a language like Zig lets you do that - and even then it depends on whether the web framework adopts this mindset of keeping track of resources. And even in this case, running an event loop and manually keeping track of resources in userspace just duplicates work that is already being done in the kernel. This is just bad architecture.
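For concreteness, here is roughly what that hand-rolled userspace event loop looks like with epoll - a bare-bones sketch with error handling and per-request state management omitted:

```c
#include <sys/epoll.h>
#include <unistd.h>

void run_event_loop(int listen_fd) {
    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[64];
    for (;;) {
        /* Block until some descriptor is ready - a scheduler in userspace. */
        int n = epoll_wait(ep, events, 64, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                /* Accept a new connection and add it to the same epoll set;
                   every request now shares one process and one memory budget. */
            } else {
                /* Resume whatever request state machine owns this fd. */
            }
        }
    }
}
```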
The wrong abstraction
Besides, why do we need to roll our own scheduler - our own event loop - when there already is one in the kernel? Well, because the kernel scheduler is not optimal: it operates on time slices rather than on the points where operations themselves pause and resume, so its multitasking is preemptive, not cooperative. But setting the scheduling policy of a process to SCHED_FIFO makes it effectively cooperative - such threads run until they block, call "sched_yield", or are preempted by a higher-priority thread. So why can't asynchronous programs rely on the kernel's event loop?
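As a sketch of that idea, here is how a process might switch itself to SCHED_FIFO and yield cooperatively - the priority value is arbitrary, and the call usually requires elevated privileges (CAP_SYS_NICE or root):

```c
#include <sched.h>
#include <stdio.h>

int main(void) {
    struct sched_param p = { .sched_priority = 10 }; /* arbitrary priority */
    if (sched_setscheduler(0, SCHED_FIFO, &p) != 0) {
        perror("sched_setscheduler"); /* typically needs elevated privileges */
        return 1;
    }

    for (int i = 0; i < 1000; i++) {
        /* ... do one unit of work ... */
        sched_yield(); /* cooperatively hand the CPU to the next runnable task */
    }
    return 0;
}
```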
Because they are written as if running on "metal", not on a "kernel" - they can't call "sched_yield" and remain cross-platform! But in reality they are running on an OS, in a virtualised environment - they need to cooperate with the kernel. The kernel provides an API for programs to clearly express their intentions and fit nicely into the process execution model - only for them to disregard all of it, half-committing to running on a kernel and half to running on "metal". It's trying to make use of two incompatible abstractions.
But the kernel (the Linux kernel, in this case) also has work to do. Firstly, make its interfaces more performant: introduce an incompatible "v2" interface, drop support for antiquated features, and leave only what is most necessary. If the reason for poor performance is the userspace-kernelspace boundary, allow programs to be loaded as kernel modules with a more convenient interface. Maybe the kernel could get a WASM interpreter, with performance on par with machine code, to get around having to recompile programs for every possible kernel they can run on. I imagine there are better options, but this is enough to start a discussion.
Conclusion
In conclusion - modern programs are running half-broken, because they abandon the process model for the sake of performance and roll their own event loops instead of committing to an appropriate abstraction. They pick a lower-level abstraction and reinvent the wheel (the event loop), while doing a poor job of it (as evidenced by the debugging experience of event loops).