
Meet Gandhi


Writing an event loop in C is easy, Until...

I wrote an event loop using epoll in C this week and I was surprised at how easy it was! But... TCP did not like the idea 🫣

I am 2 weeks late, but this time I have written an event loop in C. NodeJS and Nginx both have similar event loops running inside them. In fact Nginx's event loop is much much faster since it avoids system calls as much as possible. And I wrote the event loop before I came to know of this.

So if you are a new reader, I along with some more friends am building CodeArena, a place for students to learn programming and complete assignments given by the professors and the Teaching Assistants. And my job is to design a system which can run the student's code on a remote server because apparently it is hard to run it on their browser 😂

So my initial (and naive) approach was to use pthreads for everything. Literally, you want to send a message? A pthread, you want to hold elections, boom two more pthreads at your service. But... (Yeah there's always a "But...")

This created a huge problem - mutexes! I love what mutexes do, but once a lot of threads fight over the same locks, the losers get put to sleep and woken up through system calls, making the thread approach very very inefficient 😓

So to solve that, I brainstormed with Gemini, and after hours of nice brainstorming I came to the event loop. I googled what it was, how it fitted my use case, what the tradeoffs were and whether someone as naive as me should use it. To my surprise, this is the first time something actually happened to fit right into my use case!

An event loop can be explained as simply a waiter in a restaurant who only takes your order when you raise your hand. It juggles multiple "programs" on its own, inside a single thread, without asking the OS to context switch between them. These "programs" are mostly functions, hence the usage is super duper simple.

But is the implementation also that simple? Surprisingly yes! You just have to read the man pages of 4 C functions and you are done (those 4 will lead you to 4 more, but I focus on the positive part, so let's stick to the original 4 right now 😂).

Now once the loop was implemented I started rewriting my C code to use the event loop. I created tons of new functions, each doing its own job. For a minute it felt like using NodeJS with a C syntax.

That was until I hit TCP 🫣. TCP has its own rabbit holes and you might wanna be careful when trying it. I mean TCP can be non-blocking, but for that you have to do accept and connect in a non-blocking fashion too, and that requires more work. I did try it, but I am going too slow here due to a superpower. When I face hard things, I want to leave them 😵
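A rough sketch of what a non-blocking connect could look like is below. This is not the code from the repository: the names startConnect and finishConnect are made up for this example, and it assumes IPv4 on Linux. The idea is that connect returns right away with EINPROGRESS, the socket gets registered for EPOLLOUT, and once that event fires SO_ERROR tells you whether the handshake worked.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: begin a TCP connect without blocking.
 * Returns the socket fd, or -1 on an immediate failure.
 * *in_progress is set to 1 when the handshake is still running. */
int startConnect(const struct sockaddr_in *addr, int *in_progress) {
    assert(addr != NULL && in_progress != NULL);
    *in_progress = 0;

    /* SOCK_NONBLOCK (Linux) creates the socket already non-blocking */
    int fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (fd == -1) {
        return -1;
    }
    if (connect(fd, (const struct sockaddr *)addr, sizeof *addr) == 0) {
        return fd; /* connected immediately */
    }
    if (errno == EINPROGRESS) {
        *in_progress = 1; /* wait for EPOLLOUT, then call finishConnect */
        return fd;
    }
    close(fd);
    return -1;
}

/* Hypothetical helper: call once epoll reports EPOLLOUT on fd.
 * Returns 0 if the connection is established, otherwise an errno value. */
int finishConnect(int fd) {
    int err = 0;
    socklen_t len = sizeof err;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1) {
        return errno;
    }
    return err;
}
```

The accept side follows the same pattern: make the listening socket non-blocking, wait for EPOLLIN on it, and call accept in a loop until it fails with EAGAIN.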

A lot of times I thought, why not just give this to an LLM and just understand the concept? I mean that should be valid right? But then I force myself to do it, not because AI usage is not good or I won't learn anything when using AI but because whatever I do myself will become a part of my memory and when I actually require it sometime in the future I would know where to find things or where to start looking instead of straight up opening the LLM and delegating the task to it.

I have completed writing the monitor node (for now 😂) and I am writing the assigner node, but TCP is in the way 🫥 and I am trying to tackle it...

Can I get some CODE?

So this is the part where I share the code for the event loop. You can also find it on GitHub once my repository becomes public.

To write this code, you might need to go through these man commands (at least I went through these):

man 7 epoll
man epoll_create1
man epoll_event
man epoll_wait
man epoll_ctl
man 2 fcntl
man 2 time
man 2 gettimeofday
man ctime
man clock_gettime
man timerfd_settime
man timer_create
man timerfd_create

And the code is this:

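#include <stdarg.h>    // va_list, va_start, va_arg, va_end
#include <sys/epoll.h> // epoll_ctl, epoll_wait, struct epoll_event

void startLoop(int epollfd, int max_events, int nfds, ...) {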
    struct epoll_event ev, events[max_events];

    ev.events = EPOLLIN | EPOLLRDHUP | EPOLLET;

    va_list args;
    va_start(args, nfds);

    for (int i = 0; i < nfds; i++) {
        struct socketDetails *sd = va_arg(args, struct socketDetails *);
        ev.data.ptr = sd;
        if (epoll_ctl(epollfd, EPOLL_CTL_ADD, sd->fd, &ev) != 0) {
            printc(RED, "monitor", "Failed to add fd to interest list\n");
            va_end(args); // every va_start needs a va_end before returning
            return;
        }
    }

    va_end(args);

    while (1) {
        // Sleep until at least one registered fd has an event.
        // At most max_events events are returned per call; any extra
        // ones are delivered by the next epoll_wait call.
        nfds = epoll_wait(epollfd, events, max_events, -1);
        if (nfds <= 0) {
            printc(RED, "async loop",
                   "epoll_wait returned invalid number of ready sockets\n");
            return;
        }

        for (int i = 0; i < nfds; i++) {
            struct socketDetails *sd =
                (struct socketDetails *)events[i].data.ptr;
            sd->handler(sd);
        }
    }
}

Yep... This is the whole code. But honestly, if you were to write this on your own, it would take some time and some man pages. There are a lot of things to take care of in here and I'll explain them.

Firstly, look at what socketDetails is:

struct socketDetails {
    int fd;
    void *data;
    void *(*handler)(struct socketDetails *sd);
};

socketDetails is a struct containing the file descriptor of the socket, any additional data you might want to pass in to the handler function (for context, you'll know what I mean later) and a handler function to call when the file descriptor has an event.
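To make the "context" part concrete, here is a minimal sketch of a handler. The struct is the one from above; countEvent and dispatch are hypothetical names invented for this example, and the counter only stands in for whatever state a real handler would need.

```c
#include <assert.h>
#include <stddef.h>

struct socketDetails {
    int fd;
    void *data;
    void *(*handler)(struct socketDetails *sd);
};

/* Hypothetical handler: treats sd->data as an int counter and bumps it.
 * A real handler would read from sd->fd instead. */
void *countEvent(struct socketDetails *sd) {
    assert(sd != NULL && sd->data != NULL);
    int *counter = sd->data;
    (*counter)++;
    return NULL;
}

/* Hypothetical dispatcher: what the event loop does for each ready fd */
void *dispatch(struct socketDetails *sd) {
    return sd->handler(sd);
}
```

Because the handler gets the whole struct back, it never has to guess which socket fired or where its state lives.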

Explaining the Code:

Let's start slow

void startLoop(int epollfd, int max_events, int nfds, ...) {

The ... are just variadic function arguments in C. I also got to know them due to this project. This syntax allows you to pass any number of arguments to your C function, but nothing tells you when you have reached the end of the argument list. Reading more arguments than were actually passed is undefined behavior (a segmentation fault if you are lucky), so you as a programmer have to take care of it, for example by passing an explicit number to indicate how many arguments are coming in (here nfds is doing that).
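If variadic functions are new to you, a tiny standalone example may help before reading the real thing. sum_ints is a made-up function for illustration only; it uses the same "pass the count first" convention as startLoop.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: adds up `count` ints passed after the count. */
int sum_ints(int count, ...) {
    assert(count >= 0);

    va_list args;
    va_start(args, count); /* variadic arguments begin after `count` */

    int total = 0;
    for (int i = 0; i < count; i++) {
        /* The caller must really have passed `count` ints;
         * asking for more is undefined behavior. */
        total += va_arg(args, int);
    }

    va_end(args); /* always pair va_start with va_end */
    return total;
}
```

Calling sum_ints(3, 1, 2, 3) walks the three extra arguments one by one, exactly like the loop over socketDetails pointers does.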

epollfd is just a file descriptor created like this:

int epollfd = epoll_create1(0);

but this is outside the function so that other functions can access it when their file descriptor receives an event.
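Here is a small sketch of why keeping epollfd outside is handy: any function that can see it is able to add a new file descriptor while the loop is already running. createEpoll and registerSocket are hypothetical helper names (they are not in the original code), and the event flags are simply copied from startLoop.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/epoll.h>

struct socketDetails {
    int fd;
    void *data;
    void *(*handler)(struct socketDetails *sd);
};

/* Hypothetical wrapper: create the epoll instance and report failures. */
int createEpoll(void) {
    int epollfd = epoll_create1(0);
    if (epollfd == -1) {
        perror("epoll_create1");
    }
    return epollfd;
}

/* Hypothetical helper: add one more fd to an existing epoll instance.
 * Returns 0 on success, -1 on failure. */
int registerSocket(int epollfd, struct socketDetails *sd) {
    assert(sd != NULL);

    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLRDHUP | EPOLLET;
    ev.data.ptr = sd; /* sd must stay alive as long as it is registered */

    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, sd->fd, &ev) != 0) {
        perror("epoll_ctl");
        return -1;
    }
    return 0;
}
```

Adding the same fd twice fails (epoll_ctl reports EEXIST), so registerSocket returns -1 in that case.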

max_events is the maximum number of events epoll_wait can hand you in a single call. If there are more events, you get the remaining ones in the next epoll_wait call. It is just a small limiting system to help the programmer and the kernel together.

nfds is the count of how many arguments are passed to the function

The next is:

    struct epoll_event ev, events[max_events];

    ev.events = EPOLLIN | EPOLLRDHUP | EPOLLET;

So this code snippet does two things: first it declares a single epoll event, and then it declares an array of events. The variable ev is used to add a file descriptor, along with the events to monitor on it, to the interest list of the epoll instance (the OS keeps this list internally).

So ev represents a single event. And events is an array which is filled by the OS when an event occurs on any file descriptor in that interest list.

The second line is simply setting what events we want to monitor on our file descriptors. Since all the sockets I pass it initially are UDP sockets, I tell epoll to monitor EPOLLIN event which tells me if the socket is ready to read. I also look for the EPOLLRDHUP event which simply tells me if the connection is broken from the other side (not useful for UDP but I still kept it if I change my mind to even add TCP sockets during the initialization 😅).

The EPOLLET flag is an interesting one. It is not really an event but a mode switch: it stands for "Edge Triggered" and it changes when epoll_wait reports a file descriptor.

For an easier explanation, picture this:

epoll_wait wakes up to find 1024 bytes of data in socket A. Currently it is in its default mode "Level Triggered" (it does not have an explicit flag like EPOLLET). So a function is called which reads 1000 bytes of data from socket A and again calls epoll_wait. But since 24 bytes are still in the socket buffer, epoll_wait instantly returns control to the function. It won't actually sleep until all the data in socket A is read.

Now let's see EPOLLET. Initially socket A was empty and it receives some data. As expected, epoll_wait wakes up and passes control to the function. Now comes the important part: if the function just reads 1000 bytes and calls epoll_wait again, it will not be called again for the leftover data until something new arrives on socket A. Edge Triggered epoll has a strict rule: it only wakes up when something changes on the socket, for example when new data arrives. Unread data that is just sitting in the buffer does not count. This means the socket has to be non-blocking and you have to read it until you hit the EAGAIN error, and if you don't, you are doomed (not literally though 😂)
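The "read until EAGAIN" rule can be sketched like this. setNonBlocking and drainFd are hypothetical helpers written for this post, not part of the original loop, and the sketch uses read on a byte stream; with UDP you would call recv or recvfrom once per datagram, and a return value of 0 would mean an empty datagram rather than end of stream.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode.
 * Returns 0 on success, -1 on failure. */
int setNonBlocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: keep reading until the kernel buffer is empty.
 * Returns the total number of bytes read, or -1 on a real error. */
long drainFd(int fd) {
    assert(fd >= 0);
    char buf[1000];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n; /* a real handler would process buf here */
            continue;
        }
        if (n == 0) {
            return total; /* end of stream: the other side closed */
        }
        if (errno == EINTR) {
            continue; /* interrupted by a signal, try again */
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return total; /* buffer is empty, safe to call epoll_wait */
        }
        return -1;
    }
}
```

Without O_NONBLOCK the last read would simply block forever once the buffer is empty, which is why the two helpers go together.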

Now the next snippet is:

    va_list args;
    va_start(args, nfds);

    for (int i = 0; i < nfds; i++) {
        struct socketDetails *sd = va_arg(args, struct socketDetails *);
        ev.data.ptr = sd;
        if (epoll_ctl(epollfd, EPOLL_CTL_ADD, sd->fd, &ev) != 0) {
            printc(RED, "monitor", "Failed to add fd to interest list\n");
            va_end(args); // every va_start needs a va_end before returning
            return;
        }
    }

    va_end(args);

This snippet parses the arguments passed into the variadic function.

va_list is a type which holds the state needed to walk through the arguments passed to the function.

va_start, va_arg and va_end are just C macros. va_start initializes the va_list, using the last named argument (nfds here) to find where the variadic arguments begin. va_arg returns the next argument, interpreted as the type you pass in, and moves on to the one after it (here the result is a pointer that lands in the left hand side variable). va_end is simply the cleanup guy: it releases whatever va_start set up, and it must be called before the function returns.

Next I simply attach the pointer to the data field of the epoll event. This helps me know what to do when a specific socket receives an event (much better than an if-else or switch-case ladder). And after adding the context, I add this event to the interest list of the epoll instance for the OS to refer to.

That's it, this snippet seems big, but it does not do a lot

The next snippet is this:

    while (1) {
        // Sleep until at least one registered fd has an event.
        // At most max_events events are returned per call; any extra
        // ones are delivered by the next epoll_wait call.
        nfds = epoll_wait(epollfd, events, max_events, -1);
        if (nfds <= 0) {
            printc(RED, "async loop",
                   "epoll_wait returned invalid number of ready sockets\n");
            return;
        }

        for (int i = 0; i < nfds; i++) {
            struct socketDetails *sd =
                (struct socketDetails *)events[i].data.ptr;
            sd->handler(sd);
        }
    }

Okay, too big? Yes. Does too much? NO.

So firstly the loop is obvious I guess, the name is "event loop" 😂

epoll_wait is the function that wakes up when some file descriptor receives an event. And nfds contains the number of events to process, so we have to loop over each event.

The rest is trivial I guess, when processing an event in the for loop, just call the handler and pass in the socketDetails pointer.

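And as always, but... epoll_wait returns -1 (with errno set to EINTR) if a signal handler runs while it is waiting, for example one you installed for Ctrl + C. I just wanted the program to end on any signal, so I kept nfds <= 0 and return on every failure. If you don't want the event loop to get interrupted, check for errno == EINTR and call epoll_wait again instead of returning.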

Can you write your own Event Loop?

Of course! Now that you have seen the code and probably referred to the man pages, you can write the event loop faster than just starting from scratch. But don't trust me blindly, I might be wrong in a ton of places which even I might not know (yet 😂) so keep in mind:

Meet is Human and can make mistakes.

(If you did not catch it, that is written at the bottom of every gemini chat, just a small joke 🫠)

So what now?

Now I will push my mind to write the non blocking tcp code and complete the assigner and, you know it, "stay more consistent on blogging" 😂

And this is just one part. This time I have decided to split my blogs into parts so that it is writable for me and readable for you, kinda like a microservices architecture where everything is independently scalable (no need to point out the cons, I know them 😂🫣)
