DEV Community

Lahari Tenneti

LLVM #8 — Setting Up Infrastructure for a Custom Pass

Welcome, welcome, welcome to the next leg of my LLVM journey!

This is where I get into the "proper" backend of compilation. I know I've said that before, but this time it's for real. At its core, Kaleidoscope was an extended front-end: although I did emit object code and set up JIT execution engines, it mostly took files, parsed them into ASTs, generated LLVM IR, and then handed that over to LLVM's pre-built infrastructure. That's pretty nice, but now I'm going to interact with LLVM's core optimisation engine itself (which is really the middle-end, but humour me).

My goal for this leg is to write a 'Loop Vectorisation Hint' pass, which should be able to look at control flow and inject metadata hints that tell LLVM how to optimise loops by vectorising them through SIMD.

What I built: Commit 9da4d33


The What and Why:

  • Imagine we had to write a simple loop in C++ to add thousands of numbers together.
  • Written naively, that loop makes the CPU add the numbers one pair at a time.
  • Modern CPUs can do the same operation on multiple numbers at once (like 4 or 8 pairs) using special wide SIMD (Single Instruction, Multiple Data) instructions, which greatly help with speed.
  • LLVM has a built-in "auto-vectoriser" that looks at our loops and automatically tries to rewrite them to use these SIMD instructions instead of doing one element at a time.
  • But the auto-vectoriser is also cautious. It only vectorises a loop if it can prove it's safe to do so.
    • Ex: If a loop calculates a value that depends directly on the result of the previous iteration, the operations cannot be run in parallel.
  • Real code is often messy enough that LLVM can't prove safety, even when the loop is actually fine to vectorise. So LLVM backs off, leaving the loop slow.
  • The good news is that LLVM lets us manually tell it what to do using metadata. Normally, a human programmer adds this by writing little notes (such as #pragma hints) directly into their C/C++ source code.
  • So instead of us manually annotating the source code loop by loop, this pass automatically scans the LLVM IR and attaches these "go ahead and vectorise it" tags to loops.
  • A little word of caution. There's a reason the safety-check exists. I'm doing this merely for learning how passes work. Production grade optimisation must absolutely not be like this.
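
To make the two kinds of loop concrete, here is a minimal sketch in plain C++. The function names add_arrays and running_sum are made up for this illustration and are not part of the pass. The first loop has independent iterations and carries the kind of source-level hint a programmer would normally write by hand (Clang's loop pragma, guarded so other compilers skip it). The second has the loop-carried dependency described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each iteration touches only index i, so the iterations are independent
// and a vectoriser may process several elements per instruction.
void add_arrays(const std::vector<float> &a, const std::vector<float> &b,
                std::vector<float> &out) {
  // Precondition for this sketch: both inputs cover every output slot.
  assert(a.size() >= out.size() && b.size() >= out.size());
#if defined(__clang__)
  // Source-level hint asking Clang's loop vectoriser to vectorise this loop.
#pragma clang loop vectorize(enable)
#endif
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = a[i] + b[i];
}

// Iteration i reads the value written by iteration i - 1 (a loop-carried
// dependency), so the iterations cannot simply run side by side.
void running_sum(std::vector<float> &v) {
  for (std::size_t i = 1; i < v.size(); ++i)
    v[i] += v[i - 1];
}
```

Calling add_arrays on {1, 2, 3, 4} and {10, 20, 30, 40} fills the output with {11, 22, 33, 44}, and running_sum turns {1, 2, 3, 4} into {1, 3, 6, 10}. The pragma only changes how the first loop may be compiled, never what it computes.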

What I understood:

  • Because LLVM is a massive library in C++, it has all the core data structures a compiler needs.
  • To write a tool that optimises code, we need to write a small C++ program (a pass) that can work with LLVM.
  • As I'd mentioned here, the Pass Manager determines how the passes must be run on the generated LLVM IR.

1) Setting it up:

  • Like before (when we set up LLVM), we create a CMakeLists.txt to declare our requirements and let CMake automatically write a blueprint telling our computer exactly how to compile our code.
cmake_minimum_required(VERSION 3.15)
project(MyVectorPass)

find_package(LLVM REQUIRED CONFIG)
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")
message(STATUS "Using LLVMConfig.cmake in: ${LLVM_DIR}")

include_directories(${LLVM_INCLUDE_DIRS})
link_directories(${LLVM_LIBRARY_DIRS})
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND "${LLVM_DEFINITIONS}")
add_definitions(${LLVM_DEFINITIONS_LIST})

add_library(MyPass MODULE MyPass.cpp)

if(APPLE)
    target_link_options(MyPass PRIVATE "-undefined" "dynamic_lookup")
endif()

target_compile_features(MyPass PRIVATE cxx_std_17)

  • This time, instead of telling CMake to make a standalone executable (like we did for Kaleidoscope), we tell it to build a dynamic MODULE.
  • Executables have their own main(). When run, the OS starts executing from main(), and our program owns the entire process.
  • Back then, CMake had to find the massive LLVM libraries and link them into our binary so our compiler had its "brains."
  • Now, to connect our pass to the LLVM infrastructure, we compile our code into a plugin (a shared module like a .so or a .dylib), which is basically compiled code without a main(), so it can't run by itself.
  • So we use LLVM's command-line optimisation tool, opt. When run in the terminal, it reads our flag (-load-pass-plugin=./libMyPass.so), reaches into our folder, opens our module, and loads our code (the pass) directly into its own running process.

2) The skeleton (MyPass.cpp):

  • To ensure our plugin can talk to the modern LLVM Pass Manager infrastructure, I wrote a basic C++ boilerplate.
  • It doesn't optimise anything yet and just registers a callback under the name "my-pass".
  • Whenever the pass is handed a function, it fetches its name using F.getName() and prints it out.
#include "llvm/IR/PassManager.h"
#include "llvm/Passes/PassPlugin.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

namespace {
  struct MyPass : public PassInfoMixin<MyPass> {
    PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM) {
      // Print the name of every function this pass is run on.
      outs() << "Visiting function: " << F.getName() << "\n";
      return PreservedAnalyses::all();
    }
  };
}

// Plugin entry point: 'opt' calls this after loading the module, and the
// callback below lets it resolve the name "my-pass" in -passes=...
extern "C" LLVM_ATTRIBUTE_WEAK ::llvm::PassPluginLibraryInfo llvmGetPassPluginInfo() {
  return {
    LLVM_PLUGIN_API_VERSION, "MyPass", LLVM_VERSION_STRING, [](PassBuilder &PB) {
      PB.registerPipelineParsingCallback([](StringRef Name, FunctionPassManager &FPM, ArrayRef<PassBuilder::PipelineElement>) {
        if (Name == "my-pass") {
          FPM.addPass(MyPass());
          return true;
        }
        return false;
      });
    }
  };
}
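
The name-based registration is easier to see in a toy model that needs no LLVM at all. This is only a sketch under heavy assumptions: ToyFunction, ToyPassManager, ToyPassBuilder, parsePassName and registerMyPass are hypothetical stand-ins I made up to mirror the shape of the boilerplate, not LLVM's real classes or API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Function: only a name.
struct ToyFunction {
  std::string name;
};

// A toy "pass" takes a function and returns the line it would print.
using ToyPass = std::function<std::string(const ToyFunction &)>;

// Hypothetical stand-in for FunctionPassManager: runs every pass on every function.
struct ToyPassManager {
  std::vector<ToyPass> passes;

  void addPass(ToyPass p) {
    assert(p && "a pass must be callable");
    passes.push_back(std::move(p));
  }

  std::vector<std::string> run(const std::vector<ToyFunction> &functions) const {
    std::vector<std::string> log;
    for (const ToyFunction &f : functions)
      for (const ToyPass &p : passes)
        log.push_back(p(f));
    return log;
  }
};

// A callback inspects a pass name and, if it recognises it, adds its pass.
using ParsingCallback = std::function<bool(const std::string &, ToyPassManager &)>;

// Hypothetical stand-in for PassBuilder: stores callbacks, asks each in turn.
struct ToyPassBuilder {
  std::vector<ParsingCallback> callbacks;

  void registerPipelineParsingCallback(ParsingCallback cb) {
    callbacks.push_back(std::move(cb));
  }

  // Returns true as soon as one callback claims the name.
  bool parsePassName(const std::string &name, ToyPassManager &pm) const {
    for (const ParsingCallback &cb : callbacks)
      if (cb(name, pm))
        return true;
    return false;
  }
};

// Plays the role of the plugin entry point: claims the name "my-pass".
void registerMyPass(ToyPassBuilder &pb) {
  pb.registerPipelineParsingCallback(
      [](const std::string &name, ToyPassManager &pm) {
        if (name == "my-pass") {
          pm.addPass([](const ToyFunction &f) {
            return "Visiting function: " + f.name;
          });
          return true;
        }
        return false;
      });
}
```

In this model, calling registerMyPass on a ToyPassBuilder and then parsePassName("my-pass", pm) returns true and adds the pass, while any other name returns false. Running the manager over two functions named foo and bar yields one "Visiting function: ..." line for each, in that order. The real plugin follows the same idea: the host asks each registered callback whether it recognises the pass name.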

3) Verifying the plugin:

  • I created an LLVM IR file (test.ll) with two empty functions, @foo and @bar.
define void @foo() {
entry:
    ret void
}

define void @bar() {
entry:
    ret void
}
  • Then I built it:
mkdir build && cd build
cmake -G Ninja -DLLVM_DIR=$(llvm-config --cmakedir) ..
ninja
  • To test it, I fed my dummy IR file into LLVM's opt tool and loaded the new shared module:
opt -load-pass-plugin=./libMyPass.so -passes=my-pass -disable-output ../test.ll


What I didn't understand:

1) Linker problem:

  • The project compiled to 50% when I ran the ninja build command, after which the linker gave me this:
Undefined symbols for architecture arm64:
"llvm::outs()", referenced from: ...
"llvm::Value::getName() const", referenced from: ...
ld: symbol(s) not found for architecture arm64
clang++: error: linker command failed with exit code 1
  • When we build a standard executable program on any OS, the linker's job is to make sure every single function call in our code maps to a concrete definition.
  • If we call llvm::outs(), the linker goes searching through the static libraries and shared libraries (.dylib files on macOS) it was given, finds the compiled code for outs(), and either copies it into our executable or records a dependency on the library that contains it.
  • If it can't find it, it fails the build with an Undefined symbols error.
  • Our pass is a plugin designed to be loaded into a host program (opt), which already contains the code for things like llvm::outs(), llvm::Value::getName(), and the rest of LLVM.
  • If our linker forced our little plugin to statically include those giant LLVM functions, the plugin file would be excessively large and full of duplicate code!
  • By default, the macOS linker ld demands that all symbols be resolved at link time, even for dynamic modules. When it saw outs() in our MyPass.cpp and realised we weren't linking any LLVM libraries into our plugin, it errored out.
  • On Linux, the linker is by default content with leaving symbols unresolved while building a shared library. It assumes that at runtime, whatever loads the library will also provide the missing functions.
  • So we add a flag in our CMakeLists.txt telling the macOS linker to switch to Linux-style behaviour: treat symbols that are unresolved at link time as normal, and wait for opt to provide the definitions at runtime:
if(APPLE)
    target_link_options(MyPass PRIVATE "-undefined" "dynamic_lookup")
endif()

2) .dylib vs .so

  • This didn't work:
opt -load-pass-plugin=./MyPass.dylib -passes=my-pass -disable-output ../test.ll
  • So I had to switch to this:
opt -load-pass-plugin=./libMyPass.so -passes=my-pass -disable-output ../test.ll
  • The first command failed simply because there is no file called MyPass.dylib in the build folder: CMake named the output libMyPass.so.
  • A shared library (.dylib on Mac, .so on Linux) is meant for a program to link against at build time, so it gets loaded automatically when the program starts.
  • But a module (which CMake names .so on both Linux and Mac) is a library that is specifically meant to be loaded on demand at runtime, which is exactly what a plugin is.
  • This naming comes from CMake rather than from LLVM: its defaults for MODULE targets on macOS are the lib prefix and the .so suffix, while .dylib is reserved for SHARED targets.
  • So when CMake processes add_library(MyPass MODULE ...), it follows its rule for plugin modules rather than the native macOS rule for shared libraries.
  • Hence, even on an ARM64 Mac, CMake intentionally outputs libMyPass.so.

What's next: Looking for loops!


Musings:
It's past midnight as I write this. And I'm supremely satisfied I finished the day's task. But I can't for the life of me decide whether I'm an early bird or a night owl. I do well during both times. My productivity is independent of the time of day, because if I sit down to finish something, I finish it. Not like a flex (though it can be considered one, hehe). The only thing I probably need is some way to structure this... persistence? I've tried timetables, but I don't seem to stick to them for too long. I sometimes envy people in institutions like the armed forces because their consistency is absolutely insane. To put it very simply, I just need to finish my work before sunset and sleep on time. But for now, buonanotte (and buongiorno)!
