Discussion on: Is modern C better then Rust?


I have a few gripes with the content of this post:

I bet that almost no Rust programmer can tell me what enums look like in memory

Rust will likely never have a stable ABI. This is why there is #[repr(C)]. With #[repr(C)], an enum gets a defined size and padding so that it is compatible with C's ABI (or, more technically, the ABI of the underlying platform). As far as I know, data-carrying enums are covered too: under #[repr(C)] they are laid out as a tag followed by a union of the variant payloads. There's also #[repr(u8)], #[repr(u16)], #[repr(u32)] and #[repr(u64)], which let you pick the integer type used for the discriminant, which is something I think you couldn't do in standard C before C23.
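To make the #[repr(u8)] point concrete, here is a minimal sketch. The enum names Color and WideColor are made up for illustration; only the repr attributes and std::mem::size_of come from Rust itself.

```rust
use std::mem::size_of;

// Fieldless enum whose discriminant is stored as a single byte.
#[repr(u8)]
#[allow(dead_code)]
enum Color {
    Red = 1,
    Green = 2,
    Blue = 3,
}

// Same variants, but the discriminant is stored as a 32-bit integer.
#[repr(u32)]
#[allow(dead_code)]
enum WideColor {
    Red = 1,
    Green = 2,
    Blue = 3,
}

fn main() {
    // With an explicit integer repr, the size is that of the chosen integer.
    println!("Color: {} byte(s)", size_of::<Color>());
    println!("WideColor: {} byte(s)", size_of::<WideColor>());
    // Casting a fieldless enum yields its discriminant value.
    println!("Color::Green as u8 = {}", Color::Green as u8);
}
```

The only thing that changes between the two enums is the attribute, which is why the size difference can be read straight off the source.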

(Not being able to know what the assembly would look like by just looking at the code)

You can't even do this in C. C compilers have hundreds of incredible tricks up their sleeve to make your code efficient. If you're compiling with any level of optimization, which you are going to end up using in production anyway, it's almost guaranteed the assembly will not be what you expect, and it may differ quite a lot.

Codegen systems are designed not only to make code efficient, but also to minimize instruction size, which feeds directly into binary and library size. That's why you'll see things like xor eax, eax instead of mov eax, 0. It doesn't seem like it, but the former is 3 bytes shorter (2 bytes versus 5). And when you have millions of instructions in a large program or library, those bytes add up.
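For the size arithmetic, here is a small sketch that writes out the usual 32-bit x86 encodings of both instructions as byte arrays. The constant and function names are made up for illustration; the bytes are the standard encodings (xor eax, eax is the shorter one).

```rust
// Standard x86 encodings of two ways to zero the EAX register.
const XOR_EAX_EAX: [u8; 2] = [0x31, 0xC0]; // xor eax, eax
const MOV_EAX_ZERO: [u8; 5] = [0xB8, 0x00, 0x00, 0x00, 0x00]; // mov eax, 0

/// Number of bytes saved by picking the `xor` form over the `mov` form.
fn bytes_saved() -> usize {
    MOV_EAX_ZERO.len() - XOR_EAX_EAX.len()
}

fn main() {
    println!("xor eax, eax: {} bytes", XOR_EAX_EAX.len());
    println!("mov eax, 0:   {} bytes", MOV_EAX_ZERO.len());
    println!("saved:        {} bytes", bytes_saved());
}
```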

But why would knowing what assembly gets generated matter anyway? If you need to code in assembly, code in assembly.

Many say that C is obsolete because it doesn't have templates/generics, private member variables or namespaces, but with a bit of creativity, you can actually achieve all of these, more or less in C.

Sure, that's true. But in Rust these things are first-class. You don't need creativity to achieve them, because they're already there for you. Generics in Rust work on the same principle as C++ templates, and as macro-generated generics in C: monomorphization, that is, generating new code for each type instantiation. This has its downsides (such as bloated binary sizes sometimes), but it's great for optimization and simplifies other parts of the compiler a bit too.
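A minimal sketch of what monomorphization means in practice. The function name largest is made up for illustration; the compiler emits one specialized copy per concrete type it is called with.

```rust
// One generic definition. Panics on an empty slice, which is fine for a sketch.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
    let mut max = items[0];
    for &item in items {
        if item > max {
            max = item;
        }
    }
    max
}

fn main() {
    // Two instantiations: the compiler generates separate machine code
    // for largest::<i32> and largest::<f64>.
    println!("{}", largest(&[3, 7, 2]));
    println!("{}", largest(&[1.5, 0.5]));
}
```

In C you would get the same effect by expanding a macro once per type; here the compiler does the expansion and type-checks the single definition.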

When you're hanging on creativity, it means things will go wrong at some point. Creativity isn't infallible—and I'm not saying Rust is either, it's had its fair share of bugs and certain design decisions that maybe weren't the best—but when you're relying on people pretty much reimplementing something every time they need it (in this case, generics macros), perhaps with new bugs each time, you're bound to run into problems.

Rust provides more safety guarantees here. By having generics, namespaces, private member variables, etc. in the language itself, it ensures that you can't bypass things by accident (or on purpose) and do something dumb. It also offloads the immediate complexity from the user to the compiler. Sure, say what you will about lifetimes in Rust, but it's much easier to deal with the compiler yelling at you than with silent bugs that you never notice until they become a serious problem (read: security vulnerability).
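As a rough illustration of namespaces and private members being enforced by the compiler, here is a small sketch. The module and type names (counter, Counter) are made up for illustration.

```rust
// A module acts as a namespace; items are private unless marked `pub`.
mod counter {
    pub struct Counter {
        count: u32, // private field: not reachable from outside this module
    }

    impl Counter {
        pub fn new() -> Self {
            Counter { count: 0 }
        }

        pub fn increment(&mut self) {
            self.count += 1;
        }

        pub fn get(&self) -> u32 {
            self.count
        }
    }
}

fn main() {
    let mut c = counter::Counter::new();
    c.increment();
    c.increment();
    println!("count = {}", c.get());
    // c.count = 100; // would not compile: field `count` is private
}
```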

I want to talk about references in Rust. While they are very safe, they are the devil if you are working low-level.

What are your sources for this? Borrow checking in Rust happens entirely at compile time. At runtime a reference is just a pointer, so there is no overhead for references, perhaps excluding fat pointers (for slices, dyn Trait and other unsized types), which carry an extra word of metadata. And you have the unsafe escape hatch with raw pointers if you need to implement something that can't be expressed with the borrow rules of Rust, but that you know is safe.
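A quick way to check the "no overhead" claim is to compare sizes. This is only a sketch, and the helper name reference_sizes is made up for illustration.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Sizes in bytes of a thin reference, a slice reference and a
/// trait-object reference, in that order.
fn reference_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u32>(),
        size_of::<&[u32]>(),
        size_of::<&dyn Debug>(),
    )
}

fn main() {
    let word = size_of::<usize>();
    let (thin, slice, dyn_ref) = reference_sizes();

    // A plain reference is exactly as large as a raw pointer.
    assert_eq!(thin, size_of::<*const u32>());
    // Fat pointers carry one extra word: a length or a vtable pointer.
    assert_eq!(slice, 2 * word);
    assert_eq!(dyn_ref, 2 * word);

    println!("thin: {}, slice: {}, dyn: {}", thin, slice, dyn_ref);
}
```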

[Y]ou still see [Rust programmers] on the internet saying that you should rewrite C code in Rust

This is a problem in the Rust community, frankly, though it's getting better. For small projects, sure, a rewrite may be beneficial if they actually see wide use. But I mean small. People should be asking for Rust not so much in old projects as in new ones. There's an opportunity cost here: the cost of porting to Rust versus the safety you'd gain. For many projects, that's just not worth it. And that's why it's better to try to get Rust on the new frontier: projects that would have been written in C, but where someone (kindly) suggested Rust to the developer or the developer found it themselves.

That's why I think Rust in Linux is a great frontier. There's almost no chance Rust will replace any core kernel code in Linux. But it's a great idea to have modules, which are usually third-party code, with fairly high-level access, to use a memory-safe language like Rust. Doing just that can probably mitigate a number of security issues. But yes, it is infeasible to replace C in the core kernel with Rust. It would stick out like a sore thumb, and yes, having more Rust in the kernel would probably increase build time.

But modules are optional. Don't need a module? Don't build it!
Don't want all the code bloating up your kernel image? Just don't include it with the image. Have modprobe deal with it later!

There's still a lot of work that needs to be done with Rust and its community, yes, but I think that there are a lot of misconceptions that need to be cleared up first.

The conclusion to this article I think it is that you should try something before you say it's bad.

I agree with this sentiment. And I think it goes both ways. If you have used Rust and never C, perhaps try C. If you have used C and never Rust, perhaps try Rust. Rust isn't a fit for everyone. Neither is C.

But I think that it's not helpful to discount the idea of Rust with claims that are partially factually incorrect or that just don't support your argument. Don't get me wrong: you're allowed to have your opinions, and I'm allowed mine. But when your opinions are partially based on misconceptions, perhaps it's time to rethink them.

Please let me know if you think anything I said here is factually incorrect, or simply unproductive; I don't intend to be a hypocrite in what I'm saying. Some of these thoughts are off the top of my head, so it's possible I have stated some things wrong.

David • Sep 13 '21

First of all, you still didn't say what Rust enums look like in memory, so I think that point still stands (I didn't say that they are not stable). Second of all, it never happened to me to have such code that doesn't resemble what I wrote, but from what I heard it can happen from time to time. If you have a problem like this then a good solution would be to search what things the compiler might optimize. I strongly disagree with this sentence: "If you need to code in assembly, code in assembly.", writing in assembly takes too much time and is even more error prone than C. C was made to be a portable assembly. I know that having a language feature is very nice, but what I am saying is that you can still do that in C. Many people say that they don't use C for this reason and I wanted to show them that C is not as obsolete as they think. I didn't say that references are slow or bloat, I said that the way they work doesn't really fit in the low level programming. When I did some low-level programming in Rust I felt like I was spending more time on how to use references (not learning, just integrating them in the project) than the project itself, though I can see why they are very useful for making desktop applications and stuff like that.

Wren [Undefined] • Sep 14 '21 • Edited

First of all, you still didn't say what Rust enums look like in memory, so I think that point still stands (I didn't say that they are not stable).

The part about not being stable is the important part, as it's not really possible to know exactly how a (#[repr(Rust)]) enum looks without looking at the compiler. (And even then, layouts can still be randomized, as Rust makes no guarantees about the layout of structures and enums in #[repr(Rust)], though it does guarantee the layout for #[repr(C)].) Generally, for enums without data, it's a byte. For enums with data, it's a tagged union with a byte for the discriminant and the data as an inline union. Note that as padding is not strictly defined, the enum may be larger than that to satisfy platform-specific alignment rules. I don't know exactly what they look like in #[repr(C)], but I'm sure it's somewhat similar and it's probably in the Nomicon.
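Since the default layout is unspecified, the honest way to find out is to ask the compiler. A minimal sketch, with made-up enum names (Direction, Shape, CShape); the printed sizes for the default-repr enums are whatever your compiler picks, not a guarantee.

```rust
use std::mem::size_of;

// Fieldless enum, default (unspecified) layout.
#[allow(dead_code)]
enum Direction {
    North,
    East,
    South,
    West,
}

// Data-carrying enum, default (unspecified) layout.
#[allow(dead_code)]
enum Shape {
    Circle(u32),
    Square(u32),
}

// Same variants, but with the layout pinned down by #[repr(C)].
#[repr(C)]
#[allow(dead_code)]
enum CShape {
    Circle(u32),
    Square(u32),
}

fn main() {
    println!("Direction: {} byte(s)", size_of::<Direction>());
    println!("Shape:     {} byte(s)", size_of::<Shape>());
    println!("CShape:    {} byte(s)", size_of::<CShape>());
    // Whatever the layout, two variants that each hold a full u32 cannot
    // fit in 4 bytes: the discriminant needs room somewhere.
    assert!(size_of::<Shape>() > size_of::<u32>());
}
```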

Leaving the enum layout unspecified is what makes things like Option<&T> really nice: because references are internally pointers (at the machine code level) and can never be null, None can be represented by a null pointer, so Option<&T> only takes up the size of a pointer, not the size of a pointer plus one byte. That single byte can add up, and due to alignment it may be padded to 2, 4, or even more bytes, which affects performance and memory usage.
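The null-pointer trick is easy to check with size_of. A small sketch; the helper names are made up for illustration.

```rust
use std::mem::size_of;

/// Size of an optional reference.
fn option_ref_size() -> usize {
    size_of::<Option<&u64>>()
}

/// Size of an optional plain integer, which has no spare bit pattern for None.
fn option_int_size() -> usize {
    size_of::<Option<u64>>()
}

fn main() {
    // None reuses the null pointer value, so no extra space is needed.
    assert_eq!(option_ref_size(), size_of::<&u64>());
    // Every bit pattern of u64 is a valid value, so Option<u64> must grow.
    assert!(option_int_size() > size_of::<u64>());

    println!("Option<&u64>: {} bytes", option_ref_size());
    println!("Option<u64>:  {} bytes", option_int_size());
}
```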

Second of all, it never happened to me to have such code that doesn't resemble what I wrote, but from what I heard it can happen from time to time. If you have a problem like this then a good solution would be to search what things the compiler might optimize.

Sure, that's fine and probably is a normal experience. I was kind of exaggerating, I guess. But I still don't think it's really necessary to know what your code will look like at the low level, as long as that low-level code is as fast as you would expect. And Rust, for the most part, is. (There's still a lot of improvement to be done here too, though!)

I strongly disagree with this sentence: "If you need to code in assembly, code in assembly.", writing in assembly takes too much time and is even more error prone than C. C was made to be a portable assembly

Sorry, that was phrased badly. What I meant is that if you need the precision of specific assembly instructions, perhaps you should just use assembly.

I know that having a language feature is very nice, but what I am saying is that you can still do that in C. Many people say that they don't use C for this reason and I wanted to show them that C is not as obsolete as they think.

I absolutely get that! C is by no means obsolete, and it probably won't be for a while; but there are still new features that are in other languages like Rust that add on to C and make it (relatively) easier and much safer to write low-level code.

I didn't say that references are slow or bloat, I said that the way they work doesn't really fit in the low level programming. When I did some low-level programming in Rust I felt like I was spending more time on how to use references (not learning, just integrating them in the project) than the project itself, though I can see why they are very useful for making desktop applications and stuff like that.

Sorry, I misunderstood what you meant, as you didn't elaborate on that. I feel that the way they work absolutely suits low-level programming, because it gives you fairly precise control over memory without sacrificing memory safety. There are only a few places where you need to use raw pointers in Rust over references, and most programmers in Rust, even lower-level ones, I'd say, very rarely have to use raw pointers. Especially because there are APIs over pointers like Pin that allow certain features with unsafe but without the error-prone raw pointers.
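As one example of byte-level work done purely with references, here is a sketch that treats the front of a buffer as a key and XORs the rest of the same buffer with it. The function name mask_payload and the buffer layout are made up for illustration; split_at_mut is the standard slice method that hands out two non-overlapping views.

```rust
/// XORs every byte after the first `key_len` bytes with the key stored in
/// those first bytes. Panics if `key_len` is 0 or larger than the buffer.
fn mask_payload(buf: &mut [u8], key_len: usize) {
    assert!(key_len > 0, "key must not be empty");
    // Two disjoint views into one buffer, checked by the compiler:
    // no raw pointers and no `unsafe` in user code.
    let (key, payload) = buf.split_at_mut(key_len);
    for (i, byte) in payload.iter_mut().enumerate() {
        *byte ^= key[i % key.len()];
    }
}

fn main() {
    let mut buf = [0xFF, 0x0F, 0x00, 0xFF, 0xF0];
    mask_payload(&mut buf, 2);
    println!("{:02X?}", buf);
}
```

The borrow checker is what rules out accidentally aliasing the key and the payload, which is the kind of bug the equivalent pointer arithmetic in C would let through silently.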