DEV Community

Gerardo Enrique Arriaga Rendon
Planning an online C compiler for IPC144

This is the first of a trilogy of posts that I will be publishing this week. I will talk about my thought process and progress on a specific challenge I am working on: creating an online C compiler for running code examples.

Motivation

In a previous blog post, I mentioned that I was editing the course notes as part of the migration to the new website. The course, IPC144, is an introduction to programming with C.

C is a language that you either love or hate. You either understand the concepts that are involved, or you struggle to understand them until the end of the course. A lot of people struggle to learn and understand pointers, which are probably one of the most important concepts you need to know to write useful programs. There are also other issues that people tend not to understand even after finishing the course, such as "why don't I have to return the address of an array?" or "why does the C compiler not accept my code? What do all of these errors mean?", along with other questions that a beginner to programming may have.

I was always thinking, "why is this the case?" Why do people still struggle with C, despite studying it constantly for 13 consecutive weeks? One of the answers that I arrived at was: "It is not easy to experiment with C at this level."

What do I mean by this? Well, a good way to ingrain concepts in your brain is experimentation. I believe people learn a concept better when they have to put it into practice: it makes the brain think about what it just learned and how to use it, making connections between ideas to form a clearer picture of the concept, and thus achieve a better understanding.

However, experimenting with C can be tedious. If you are on Windows, you may try practising with Visual Studio, but Visual Studio is slow. Every time you want to practise a new concept, you have to open an old project or create a new one. Either way, it is really slow. This is somewhat understandable, since Visual Studio does a lot more than text editing and compilation, but none of that extra functionality matters to me when I am just practising.

As a side note, I would like to clarify something. You might be asking, "why would I use Visual Studio, when I can use Visual Studio Code?" You are right, but that only makes sense when you are familiar enough with coding to know that alternatives like Visual Studio Code exist, and we are talking about beginners. The IPC144 course recommends Visual Studio to students because that is what is used throughout the course, so it makes sense that the only IDE they know is Visual Studio. Also, on some computers, Visual Studio Code can be as painfully slow as Visual Studio, although this might depend a lot on the project you are dealing with (at least that was my experience when doing webdev for my WEB422 course).

Let's imagine that you are on Linux, because you were told to use it in your ULI101 class (ULI101 is the Introduction to Unix/Linux course). In that case, the only editors you may know are vi, vim, or nano. Remember, you are a beginner in this whole new world of programming, and you are being bombarded with loads of information every week. The fact that you installed Linux on your own Windows machine already shows that you are enthusiastic about what you are learning, so this is not even the average student. The average student would be a Windows user, hesitant to install Linux because they do not know much about it.

In either case, every little thing that you need to go through just to experiment is a distraction. Every little distraction added on top of the pile is one more reason for you not to experiment with C.

So the question is, how can we facilitate this experimentation?

Runnable and Editable code examples

A solution I have been thinking about ever since I finished the course is to make the code examples on the website not only runnable, but also editable.

Why is a runnable example not enough? Why would you need editable examples as well? These two questions have the same answer: they facilitate experimentation.

Be aware that I am not trying to motivate the student to experiment. After all, motivation depends on the student, and everybody has a different level of motivation. In my case, I did not experiment much on my own, because I found it tedious back then.

What I want to do is to make it easier for anybody to run these examples. So, the next question is, how can we allow that?

What do we need to accomplish?

In the repository of the new website, I created an issue requesting this feature. You may read the issue to understand what the next steps might be, but I am going to explain them here too.

In most of the C environments that you can find on the web, a C project you wrote in the browser is compiled by sending a POST request to some web server, which handles the compilation of the program. After the program is compiled, the server also runs it and sends the output of the program back as the response to the POST request.
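
From the browser's side, the request/response flow described above could look roughly like this. It is only a sketch: the `/api/compile` endpoint, the JSON field names, and the shape of the response are all made up for illustration and do not belong to any real service.

```typescript
// Hypothetical request and response shapes; no real service is implied.
interface CompileRequest {
  language: "c";
  source: string;
}

interface CompileResponse {
  stdout: string;
  stderr: string;
  exitCode: number;
}

// Hypothetical endpoint, made up for this sketch.
const COMPILE_ENDPOINT = "/api/compile";

// Builds the options for the POST request that carries the C source code.
function buildCompileRequest(source: string): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  const payload: CompileRequest = { language: "c", source };
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Sends the source to the server, which compiles it, runs it, and replies
// with the output of the program.
async function compileRemotely(source: string): Promise<CompileResponse> {
  const response = await fetch(COMPILE_ENDPOINT, buildCompileRequest(source));
  if (!response.ok) {
    throw new Error(`Compilation request failed with status ${response.status}`);
  }
  return (await response.json()) as CompileResponse;
}
```

Everything interesting (compiling and running) happens on the server in this design, which is exactly why it does not fit a website that only serves static files.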

There are some important implications with this method. If the server is not properly safeguarded, this is literally a remote code execution exploit: the fact that the server is going to run the program I wrote means that I could do whatever I wanted, right? Well, yeah, if the server is not properly configured. And even if you safeguard against these exploits, you still have to worry about the resources used by the compilation or by the running program. As a funny example of this, read this answer to the Code Golf challenge to produce a "Compiler Bomb".

Either way, we are not going to handle something like this. After all, the IPC144 website is a static website! It will not handle any kind of POST request, as most of the interactivity is provided on the client side by React.

So, are we at a stalemate? Not really, there is something else that might save us: WebAssembly.

If you haven't heard about it, WebAssembly is a fairly new technology for running code that is not JavaScript in the browser. Essentially, it is an instruction set for a virtual machine that the browser runs. It is fairly similar to Java's promise of "write once, run anywhere". WebAssembly, however, tends to be more language-agnostic than Java bytecode.
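
To make the "instruction set" idea a bit more concrete, here is a minimal sketch that writes a tiny WebAssembly module out by hand and asks the JavaScript engine to run it. The module and the names `addModuleBytes` and `add` are mine, chosen only for illustration; a real module would come out of a compiler rather than being typed byte by byte.

```typescript
// A complete WebAssembly module, written out byte by byte. It exports one
// function, `add`, that takes two 32-bit integers and returns their sum.
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic number: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code section: one body with no locals
  0x20, 0x00, // local.get 0
  0x20, 0x01, // local.get 1
  0x6a, // i32.add
  0x0b, // end
]);

// Compile the bytes and create an instance of the module.
const addModule = new WebAssembly.Module(addModuleBytes);
const addInstance = new WebAssembly.Instance(addModule);

// The exported function can now be called like any other function.
const add = addInstance.exports.add as (a: number, b: number) => number;

console.log(add(2, 3)); // prints 5
```

The synchronous `WebAssembly.Module` and `WebAssembly.Instance` constructors keep the sketch short; for anything larger than a toy module, a page would normally use the asynchronous `WebAssembly.instantiate` instead.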

So, how can we use WebAssembly to accomplish the task?

WebAssembly to the rescue

Essentially, we want to take a C-to-WebAssembly compiler and compile the compiler itself to WebAssembly, so that it can be provided as a wasm file to the browser client. Then, we need a way to pass the code that the user writes to this compiler. The compiler needs to produce WebAssembly because we also need the browser to run the result, so that the user can see the effects of their code.
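
The two steps (compile the text, then run the result) can be sketched as follows. Nothing here is a real compiler: `CToWasmCompiler` is a hypothetical interface, and `stubCompiler` is a stand-in that ignores its input and returns a hand-assembled module whose exported `main` returns 42, roughly what one might hope to get back for `int main(void) { return 42; }`.

```typescript
// Hypothetical interface: in the real plan this would be backed by a
// C-to-WebAssembly compiler that has itself been compiled to a wasm file.
interface CToWasmCompiler {
  compile(source: string): BufferSource;
}

// Stand-in for illustration only. It ignores the source and returns a
// hand-assembled module whose exported `main` returns 42.
const stubCompiler: CToWasmCompiler = {
  compile(_source: string): BufferSource {
    return new Uint8Array([
      0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // header: magic number + version
      0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type section: () -> i32
      0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
      0x07, 0x08, 0x01, 0x04, 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x00, // export section: "main"
      0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // code section: i32.const 42; end
    ]);
  },
};

// Step 1: turn the text from the editor into WebAssembly bytes.
// Step 2: let the browser run those bytes and report what `main` returned.
function compileAndRun(source: string, compiler: CToWasmCompiler): number {
  const wasmBytes = compiler.compile(source);
  const instance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes));
  const main = instance.exports.main as () => number;
  return main();
}

const exitCode = compileAndRun("int main(void) { return 42; }", stubCompiler);
console.log(exitCode); // prints 42
```

The point of the interface is that the rest of the page would not need to know how the compiler works internally; swapping the stub for a real wasm-hosted compiler is the hard part that remains.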

This, of course, sounds easier than it actually is. However, most of the technical details are going to be for future me to figure out.

Expectations

My major expectation is to have a live demo of the program, but I am aware that I won't be able to achieve that quickly.

Thus, I have other, smaller objectives to aim for. The first is to make the simplest code example, a "Hello World" program, compilable and runnable in the browser. This will require a lot of configuration, since Emscripten, a C-to-WebAssembly compiler, depends on LLVM and clang, which took me over 6 hours to compile a single time so I could set up the environment on my computer. I will have to do some heavy research on the internals of Emscripten, clang, and LLVM so I can make educated guesses about the configuration I would need to compile it into a wasm file.

Wish me the best of luck... I might go insane after this!
