DEV Community

Discussion on: Is "C Programming language" Still Worth Learning in 2021?

#benaryorg

Working in a PHP-ish environment (I'm a sysadmin and I talk to a lot of programmers), I can attest that knowledge of how the underlying operating system works is very sparse among PHP devs, even though PHP is very close to C (to the point of having a very neat FFI and bindings for most libc functions, as well as being written in C and having started out as a templating engine for the C language).

This causes some nerve-wracking tickets when you have to explain to a customer that no, we're not capable of limiting the RAM usage of your cronjob (a PHP script) and running it for an extended time instead, because that is not how memory works.
The background was that the program used a less-than-perfect algorithm for something or other, causing exponential memory usage that simply exceeded what was available, OOM-ing the job every single time.
Yes, you can of course make that job run better, but unless you're planning on adding a terabyte of swap (in which case exponential memory usage still means that it might break eventually), you'll have to remove some references to objects so they can be garbage collected (e.g. if your script collects data into one huge PHP array but never removes elements even when they are no longer needed).

This of course is common knowledge for anyone having ever written software in C.
Furthermore, people who know about malloc() tend to also know about the caveats of its return value: sometimes it returns NULL, telling you there is no more memory; sometimes it returns a perfectly valid-looking pointer and the process crashes later on access because of CoW, lazy allocation, and virtual memory (just throwing out buzzwords here), which on Linux is tied to the sysctl vm.overcommit_memory.
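To make the malloc() caveat a bit more concrete, here is a minimal sketch. The helper name `alloc_and_touch` is made up for illustration, and the comments describe typical Linux behaviour with overcommit enabled, not a guarantee for every system.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate `n` bytes and write to every one of them.
 * Returns NULL if malloc() itself reports failure. */
unsigned char *alloc_and_touch(size_t n)
{
    /* malloc(0) may legitimately return NULL, so rule that case out. */
    assert(n > 0);

    unsigned char *buf = malloc(n);
    if (buf == NULL) {
        /* The easy case: the allocator told us up front. */
        perror("malloc");
        return NULL;
    }

    /* With lazy allocation, pages are typically only backed by real
     * memory once they are written to. Under overcommit it is this
     * memset(), not the malloc() above, where the process would get
     * killed if the memory turns out not to exist. */
    memset(buf, 0xAB, n);
    return buf;
}
```

Calling `alloc_and_touch(4096)` is boring on purpose; the interesting part is that a NULL check alone cannot catch the second failure mode, since there is no return value to inspect when the kernel kills the process.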

It's not that C being omnipresent is a reason people should have a look at it, but rather that the concepts of computing are so hard-wired into the C programming language's style, principles, pitfalls, and whatnot, that knowing C also makes your code better in other languages; for example, knowing C you might think of using sendfile in Go, which allows the kernel to take care of efficiently sending a file from your filesystem directly to your TCP connection.
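For reference, this is roughly what the sendfile idea looks like from C on Linux. It is only a sketch: `send_whole_file` is a made-up helper, `out_fd` is assumed to be an already connected socket (or any writable descriptor on reasonably recent kernels), and retrying on EINTR/EAGAIN is left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send the whole file at `path` to `out_fd`.
 * Returns 0 on success, -1 on error. */
int send_whole_file(int out_fd, const char *path)
{
    assert(path != NULL);

    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    if (fstat(in_fd, &st) < 0) {
        close(in_fd);
        return -1;
    }

    off_t offset = 0;
    while (offset < st.st_size) {
        /* The kernel copies from in_fd to out_fd and advances `offset`;
         * the file data never passes through a userspace buffer. */
        ssize_t sent = sendfile(out_fd, in_fd, &offset,
                                (size_t)(st.st_size - offset));
        if (sent <= 0) {
            close(in_fd);
            return -1;
        }
    }

    close(in_fd);
    return 0;
}
```

Note that both ends of the transfer are plain ints, which is exactly the point: the kernel knows what they refer to, your program just passes the numbers along.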
Similar things apply to file accesses; people often say that on Linux everything is a file, which isn't quite correct (it'd be more correct on Plan9, but still). On Linux everything is an int, because every resource you work with tends to be an integer.
Your shared memory allocation is represented by its int magic, your TCP connection is a bidirectional socket represented as an int, your file is an int, your cwd is an int, and so on.
No wonder it's hard for people who tend to work with other, more abstract languages to comprehend that you can actually delete an open file on Unix-like systems (mostly talking about Linux here): the file is removed from the directory on your filesystem, but your fd is still open and usable, the file is still writable, and so on (Firefox uses that because it's the closest you can get to having otherwise inaccessible, unleakable, disk-backed storage that is automatically deleted upon close()).
The same goes for your shell sitting in a directory which is then moved from another shell, and all your calls suddenly fail very oddly, and you have no clue what the hell is wrong, but as soon as you cd "$PWD" everything is back to normal, because your shell's cwd is still just an int and not a string.
The vast number of issues with mutable strings, the reasons for rope libraries and the problems they bring, and so on: they are all present in other languages, yet you only ever grasp the underlying problem if you know about memory management.
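The deleted-but-still-open file mentioned above is easy to try out yourself. A small sketch, assuming a POSIX system; `scratch_roundtrip` and the message it writes are made up for illustration:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create `path`, unlink it right away, then show
 * that the descriptor still works. Returns the number of bytes read
 * back into `out`, or -1 on error. */
ssize_t scratch_roundtrip(const char *path, char *out, size_t outlen)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* Remove the directory entry. The file's storage lives on for as
     * long as some process still holds a descriptor referring to it. */
    if (unlink(path) < 0) {
        close(fd);
        return -1;
    }

    struct stat st;
    int gone = (stat(path, &st) < 0); /* the name no longer resolves */

    static const char msg[] = "still here";
    ssize_t n = -1;
    if (gone &&
        write(fd, msg, sizeof msg - 1) == (ssize_t)(sizeof msg - 1) &&
        lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, out, outlen);

    close(fd); /* last reference dropped: the storage is released */
    return n;
}
```

The name is gone after unlink(), yet the int keeps working until close(), which is all the "inaccessible scratch file" trick amounts to.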

All of these are examples of why I think every sysadmin and at least one developer per team (dunno what your team structure looks like; what I'm trying to say is that for every pull/merge request, at least one of the people involved) should know the basics of the C programming language, at least to the point of memory management and file descriptors.
This is just a personal notion of course, and if you disagree, then that's fine.
I'm just saying that I'm tired of explaining to people that on Linux fork()'s CoW mechanism is actually a really neat way of ensuring you can write your current process state consistently to disk like Redis does, as long as you know what you're doing (I remember another PHP dev complaining to me how dumb Redis is to fork() for that, because of the RAM requirements).
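The fork() snapshot idea fits in a few lines. This is only a toy sketch of the principle, not how Redis actually implements it; `snapshot_via_fork` is a made-up helper, and the pipe merely stands in for "write the state to disk".

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that reports the value of *counter
 * as it sees it, while the parent keeps mutating its own copy.
 * Returns the value seen by the child, or -1 on error. */
int snapshot_via_fork(int *counter)
{
    assert(counter != NULL);

    int pipefd[2];
    if (pipe(pipefd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: its memory is a frozen view of the parent at fork()
         * time, no matter what the parent does afterwards. */
        close(pipefd[0]);
        int seen = *counter;
        if (write(pipefd[1], &seen, sizeof seen) != (ssize_t)sizeof seen)
            _exit(1);
        _exit(0);
    }

    /* Parent: keeps working. Writing here only makes the kernel copy
     * the touched page; everything else stays shared with the child. */
    close(pipefd[1]);
    *counter += 1000;

    int seen = -1;
    ssize_t n = read(pipefd[0], &seen, sizeof seen);
    close(pipefd[0]);
    waitpid(pid, NULL, 0);

    return n == (ssize_t)sizeof seen ? seen : -1;
}
```

Starting with a counter of 42, the child reports 42 while the parent ends up at 1042. The extra RAM cost is only for pages the parent modifies while the child is alive, which is why the "fork() doubles your memory" complaint is usually off the mark.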

Sorry, this turned out to be half rant, half dumping some common pitfalls, but I hope you (the reader of this comment) now have a vague idea how many issues there are which are incomprehensible without some basic understanding of low-level concepts which the C programming language does incorporate in its very essence.