Mixing C++ with AMD64 (x86_64) assembly

#asm #c #cmake

Lately, I’ve been dabbling in some “closer to the metal” kind of programming.

On most compilers (Visual Studio’s, for instance) it used to be rather easy to mix assembly code and C++ code using a feature called inline assembly: the ASM code is put in a block (decorated with a special keyword like _asm, for instance), and when the compiler sees that, it puts the content of the block “as is” inside the compiled code.

Well, that feature doesn’t exist in Visual Studio’s compiler once you compile in 64-bit mode, and also, mixing programming languages in a single file is a bit gross. Each time I see this, I think it feels rather ugly…

Instead, it would be good to separate that code into its own files, which will be assembled into object code you can link into your program, as you do with C++ compilation units and static libraries…

So, how do you go about doing this, and doing it in a simple, cross-platform manner?

Well, first of all, there are 2 things to consider when writing code in assembly.

1) The code you write obviously depends on the machine instruction set, since you are basically asking a CPU to do work, but you do it *instruction by instruction*. This means that you can only target one family of processors with that code. So when I say cross-platform, I mean different runtimes on compatible machines. For example, Windows vs. Linux vs. macOS running on good old Intel (or compatible) chips.

2) Generally, you don’t just write instructions when writing assembly, you also write directives for the assembler, the program that turns your code from a human-readable thing with letters and spaces and comments into the digital gibberish the CPU actually eats up to work. And there are multiple assemblers, and they aren’t really compatible with each other.

I was on a Windows machine when I started experimenting, and saw that there was obviously the MASM option, the macro assembler from Microsoft that was already installed on my box, as it is distributed with Visual Studio. Great… And I started playing with it, it’s not hard. I even got CMake to automatically generate a build system that works. But sadly these efforts were in vain, as I quickly realized that I would not be able to use Microsoft’s assembler on Linux or a Mac, for quite obvious reasons…

So, as always… Open-Source software wins the day!

I was working on something totally unrelated that involved me having to build the OpenSSL library from source. When looking at the dependencies I needed to install on a bog-standard Windows box to do so, besides their horrendous build system written in Perl, one program stood out: NASM.

NASM is, my friend Google told me, the Netwide Assembler. It’s a BSD-licensed Open-Source macro-assembler for 16, 32 and 64-bit Intel chips! Great news! It is cross-platform and works on Linux and macOS too (and I bet a bit more stuff!)

Also, I started googling around to see if CMake could auto-magically handle it, and indeed it can, and has been able to since a prehistoric version.

So, I got around to testing that, and well, besides a tiny snag on macOS, I got a thing that just worked. The snag is apparently due to the way Mach-O object files differ from ELF files (a sad story of respecting standards about leading underscores), and it is quickly fixed by telling NASM to prefix all global symbols with a _ character.
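To make the underscore story a bit more concrete, here is a minimal sketch of how you could ask the C++ compiler which prefix it puts in front of C symbol names. It leans on `__USER_LABEL_PREFIX__`, a macro predefined by GCC and Clang; the `symbol_prefix` and `linker_symbol_for` names are made up for this illustration, and the fallback to an empty prefix on other compilers is an assumption.

```cpp
#include <string>

// Two-step stringification so the macro is expanded before being quoted.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// GCC and Clang predefine __USER_LABEL_PREFIX__ with the prefix the target
// adds to C symbols: empty for ELF (Linux), an underscore for Mach-O (macOS).
#ifdef __USER_LABEL_PREFIX__
static const char* const symbol_prefix = STRINGIFY(__USER_LABEL_PREFIX__);
#else
// Assumption: compilers without the macro add no prefix.
static const char* const symbol_prefix = "";
#endif

// Hypothetical helper: the name the linker will look for when C++ code
// calls an extern "C" function called `c_name`.
std::string linker_symbol_for(const std::string& c_name) {
    return std::string(symbol_prefix) + c_name;
}
```

So a label that the C++ side calls `foo` has to be exported by the assembly as whatever `linker_symbol_for("foo")` returns on that platform, which is exactly the mismatch the NASM prefix setting papers over.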

So, the result of this silly experiment is a demonstration repository that has nothing impressive to show besides how easily it works. It’s here, there’s C++ code, there’s ASM code, there’s a CMake project file that compiles everything together, the C++ code declares some extern "C" functions with names and interfaces that just happen to match the globally defined exportable symbols in the assembly code, and… voilà!
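For a rough idea of what that matching looks like, here is a small sketch of the C++ side. The `add_numbers` name is hypothetical (it is not taken from the repository), the assembly in the comment assumes the System V AMD64 calling convention used on Linux and macOS, and the C++ definition at the bottom is only a stand-in so the snippet compiles and links without NASM around.

```cpp
#include <cstdint>

// C++ side: declare the routine with C linkage so its symbol name is not
// mangled and can match a `global` label in the assembly file.
// `add_numbers` is a hypothetical name used for illustration only.
extern "C" std::int64_t add_numbers(std::int64_t a, std::int64_t b);

// Assembly side, as it might look in a NASM file (System V AMD64 calling
// convention assumed: arguments in rdi and rsi, result in rax):
//
//     global add_numbers
//     section .text
//     add_numbers:
//         mov rax, rdi
//         add rax, rsi
//         ret
//
// Stand-in definition so this sketch links without NASM. In a real project,
// delete it and link the assembled object file instead.
extern "C" std::int64_t add_numbers(std::int64_t a, std::int64_t b) {
    return a + b;
}
```

From the caller's point of view, `add_numbers(40, 2)` is just a regular function call; the linker is the one that decides whether the body comes from C++ or from the assembled object. On Windows the x64 calling convention passes the first two integer arguments in rcx and rdx instead, so the assembly would need adjusting there.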

https://github.com/Ybalrid/cmake-cpp-nasm

Feel free to steal the literal 4 lines of configuration needed for this thing to work and go do some cool stuff!