Most programs are linked at build time. dlopen is for everything else: plugin systems, audio engines that load VST modules, language runtimes that pull in extensions, game loops that reload logic without restarting — and GPU drivers. The mechanism is identical across all of them. Four POSIX functions, one opaque handle, and a contract the compiler cannot enforce for you.
This is that contract.
API vs ABI
These two acronyms get conflated constantly. They describe different things, enforced at different times, by different tools.
An API (Application Programming Interface) is a source-level contract. It defines what you can call, with what argument types, and what you get back. The compiler enforces it. Pass an int where a const char * is expected and you get a type error before a binary exists. Change a function signature in a header and every file including that header fails to compile. The feedback is immediate and impossible to miss.
An ABI (Application Binary Interface) is a binary-level contract. It defines how compiled code actually executes a call: which registers carry which arguments, how structs are laid out in memory, and what a function's name looks like in the symbol table after the compiler has processed it. Nothing enforces this automatically. Two translation units can agree at the source level — identical header, identical types, identical function names — and still disagree at the binary level if compiled with different compilers, different compiler versions, different flags, or different target ABIs.
The gap between them is where dlopen lives.
When you link two .o files together, the linker sees both sides simultaneously. Any ABI inconsistency either resolves cleanly or produces a linker error. When you dlopen a .so at runtime, the two sides were compiled separately — possibly years apart, by different people, with different toolchains. The API contract (the header) might be unchanged. The ABI might not be.
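A concrete example: a library ships struct Config { int version; char *name; }. Everything compiles. Two years later, a field is inserted between version and name. The function signatures are untouched; the API is compatible if you recompile. But a plugin compiled against the old header still expects name at its old offset: byte 4 on a 32-bit target, byte 8 on LP64 once padding is counted. The new library reads name from wherever the inserted field pushed it, for example offset 8 on a 32-bit target after an int is added. (On LP64 an inserted int can land in the existing padding and leave name in place; an inserted long or pointer moves it to offset 16.) No compiler error. No linker error. A pointer read from the wrong address, producing garbage or a crash.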
This is why "we didn't change the API" is not sufficient. The relevant question is whether the ABI changed.
ABI stability is a harder property than API stability. It requires:
- never reordering struct fields
- never inserting fields between existing ones
- never changing the size of any exported type
- never changing a function signature in ways that affect register assignment
- using the same name mangling scheme — which implies the same language and often the same compiler family
Linux distributions maintain ABI stability for core system libraries across release cycles. Most application libraries do not, and signal breaks via SONAME version bumps — libfoo.so.1 → libfoo.so.2 — so the dynamic linker refuses to load the wrong version. dlopen by bare filename bypasses even that check. You own the contract entirely.
why dynamic loading exists
A statically linked binary resolves every symbol at link time. Addresses are baked in before the binary touches disk — predictable, fast, and completely inflexible. dlopen goes further than the normal dynamic linker: not only are the libraries separate from the host binary, the decision of which library to load and when happens at runtime, in your code.
The cost is real:
- No compile-time type checking across the boundary
- No ABI guarantees by default
- The linker cannot dead-strip code it never sees
What you get in exchange depends on what problem you're solving:
Plugin systems — discover and load behavior the host binary never knew about at build time. A text editor loading syntax highlighters, a game engine loading user mods, an audio DAW loading instrument plugins.
Hot reloading — recompile a .so while the host process is running, swap the old handle for the new one, and continue with updated logic without restarting. The key design constraint is that state lives in the host and behavior lives in the .so. The host allocates the game world, the simulation state, the in-memory database — whatever must survive across reloads. The .so contains only the code that operates on it. When the .so is recompiled, the host calls dlclose on the old handle and dlopen on the new one between two iterations of its main loop, re-resolves the function pointers via dlsym, and continues. The state was never in the .so, so nothing is lost. This is a development workflow rather than a production deployment strategy, but it eliminates the restart-and-reproduce cycle for anything with a slow startup — a game engine loading assets, a simulation with expensive initialization, a server with a warm cache. Tsoding has demonstrated this pattern repeatedly on stream: game logic in a .so, main loop polling for mtime changes, swap between frames, state intact.
Language runtimes — PHP, Python, Ruby, and Lua all use dlopen to pull in native extensions. The interpreter is the host; the extension is the plugin; dlsym finds the entry point by a known name convention.
GPU and audio drivers — the OS loads the right driver for the installed hardware without recompiling anything.
the four functions
// <dlfcn.h>
void *dlopen(const char *filename, int flags);
void *dlsym(void *handle, const char *symbol);
char *dlerror(void);
int dlclose(void *handle);
dlopen maps the library into the process's address space and returns an opaque handle. Two flag pairs matter:
- RTLD_LAZY: resolve function symbols only when they are first called. Faster startup; a missing symbol fails at call time, mid-execution.
- RTLD_NOW: resolve everything immediately. A missing symbol makes dlopen itself fail, not some later call.
- RTLD_LOCAL (the default on Linux): keeps the library's symbols private to this handle.
- RTLD_GLOBAL: makes all exported symbols available to every library loaded afterward. PHP uses this; the cost is that two extensions with the same symbol name silently shadow one another based on load order.
dlsym does a hash lookup for a named symbol and returns its address as void *. The correct error-checking pattern is:
dlerror(); // clear any stale error
void *sym = dlsym(handle, "name");
const char *err = dlerror();
if (err) { /* sym is unusable, do not call it */ }
Checking sym != NULL is not enough. A symbol's value can legitimately be NULL, so a NULL return does not by itself distinguish a failed lookup from a successful one. dlerror returns NULL only when no error has occurred since it was last called, which makes the clear-then-check round-trip the reliable path.
minimal C23 example
Two translation units: the plugin compiled to .so, and the host that loads it.
// plugin.c — cc -std=c23 -fPIC -fvisibility=hidden -shared -o plugin.so plugin.c
#include <stdio.h>
__attribute__((visibility("default")))
int greet(const char *name) {
    return printf("hello, %s\n", name);
}
-fPIC (position-independent code) is required for shared libraries. Addresses inside the .so are relative offsets rather than absolute, so the same file maps at different virtual addresses in different processes. -fvisibility=hidden makes hidden the default; visibility("default") opts individual symbols back in. Without the flag, every function and global with external linkage is exported, which is a maintenance hazard and a collision risk in large plugin ecosystems.
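In a plugin with more than one function, the attribute is usually hidden behind a macro. The sketch below assumes GCC or Clang; PLUGIN_EXPORT, greet_len, and the two helpers are hypothetical names used only to show which symbols end up exported.

```c
/* plugin_exports.c
   cc -fPIC -fvisibility=hidden -shared -o plugin.so plugin_exports.c */
#include <stdio.h>

/* Hypothetical macro name; expands to the GCC/Clang visibility attribute. */
#if defined(__GNUC__)
#  define PLUGIN_EXPORT __attribute__((visibility("default")))
#else
#  define PLUGIN_EXPORT
#endif

/* Internal linkage: never appears in the dynamic symbol table. */
static int count_chars(const char *s) {
    int n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* External linkage, but hidden under -fvisibility=hidden: other files in
   the plugin can call it, dlsym in the host cannot find it. */
int plugin_internal_len(const char *s) {
    return count_chars(s);
}

/* Explicitly exported: the only symbol the host is meant to look up. */
PLUGIN_EXPORT int greet_len(const char *name) {
    printf("hello, %s\n", name);
    return plugin_internal_len(name);
}
```

After building with the command in the first comment, nm -D plugin.so should list greet_len among the defined symbols but not plugin_internal_len or count_chars.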
// host.c — cc -std=c23 -o host host.c -ldl
#include <dlfcn.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
typedef int (*greet_fn)(const char *);
int main(void) {
    auto handle = dlopen("./plugin.so", RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return EXIT_FAILURE;
    }
    dlerror();
    greet_fn greet = (greet_fn)(uintptr_t)dlsym(handle, "greet");
    const char *err = dlerror();
    if (err) {
        fprintf(stderr, "dlsym: %s\n", err);
        dlclose(handle);
        return EXIT_FAILURE;
    }
    greet("world");
    dlclose(handle);
    return EXIT_SUCCESS;
}
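auto handle is valid C23: the type is inferred as void * from dlopen's return type. For a plain object definition like this one it behaves like C++11 auto, standardized in C 13 years later; C23 does not adopt the rest of C++ auto (there are no deduced function return types, for instance).
The cast (greet_fn)(uintptr_t)dlsym(...) converts the void * result to a function pointer by way of an integer. ISO C does not define a direct conversion between void * and function pointers and does not guarantee they share a representation. POSIX requires the conversion to work for dlsym results, so a plain (greet_fn) cast is also fine on POSIX systems; the double cast mainly keeps the pedantic diagnostic quiet.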
how PHP loads extensions
PHP is one specific application of this exact pattern. Its loader lives in ext/standard/dl.c. The core of php_load_extension(), simplified but structurally faithful (error handling omitted):
// ext/standard/dl.c (php-src)
void *handle = DL_LOAD(libpath); // DL_LOAD is dlopen() on POSIX
zend_module_entry *(*get_module)(void) = dlsym(handle, "get_module");
zend_module_entry *module_entry = get_module();
zend_register_module_ex(module_entry);
zend_startup_module_ex(module_entry);
get_module is PHP's known-name entry point, the one symbol the host finds via dlsym. It plays the same role as an update function in a hot-reloaded game plugin. The difference is the payload: instead of being the game logic itself, it returns a pointer to zend_module_entry, PHP's struct describing everything the extension provides.
zend_module_entry myext_module_entry = {
    STANDARD_MODULE_HEADER,
    "myext",
    myext_functions,       /* NULL-terminated Zend function table */
    PHP_MINIT(myext),      /* called once at process startup */
    PHP_MSHUTDOWN(myext),
    PHP_RINIT(myext),      /* called per request */
    PHP_RSHUTDOWN(myext),
    PHP_MINFO(myext),
    "1.0.0",
    STANDARD_MODULE_PROPERTIES
};
/* expands to: zend_module_entry *get_module(void) { return &myext_module_entry; } */
ZEND_GET_MODULE(myext)
RINIT/RSHUTDOWN run once per request; MINIT/MSHUTDOWN once per worker process lifetime. PHP passes RTLD_GLOBAL | RTLD_LAZY by default. RTLD_GLOBAL is intentional — some extensions wrap C++ libraries that need their own symbols visible to later-loaded libraries. The consequence is that two extensions exporting the same symbol name silently shadow each other by load order. No warning.
Rust without headaches
libloading wraps dlopen/dlsym with a type-safe API. The unsafe surface shrinks to the unavoidable: loading the library and declaring the expected function signature.
[dependencies]
libloading = "0.8"
// src/main.rs
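use libloading::{Library, Symbol};
use std::ffi::{c_char, c_int};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // SAFETY: plugin.so is a well-formed shared library we control.
    let lib = unsafe { Library::new("./plugin.so")? };
    // SAFETY: "greet" exists, takes *const c_char, returns c_int.
    // c_char is i8 on x86-64 but u8 on some targets (e.g. aarch64 Linux),
    // so spell it c_char rather than i8.
    let greet: Symbol<unsafe extern "C" fn(*const c_char) -> c_int> =
        unsafe { lib.get(b"greet\0")? };
    let name = c"world";
    // SAFETY: name is a valid NUL-terminated string that outlives the call.
    unsafe { greet(name.as_ptr()) };
    Ok(())
}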
Library::new calls dlopen and maps the error to Rust's Error trait. lib.get calls dlsym; b"greet\0" is the null-terminated symbol name exactly as dlsym expects. Symbol<F> carries the function signature you declared, so calling greet(...) type-checks the arguments against that declaration at compile time. Nothing checks the declaration against the library itself, which is why lib.get is unsafe.
Symbol<T> borrows from Library, so the borrow checker prevents calling a symbol after the library is dropped — a use-after-free class of bug that C gives you silently.
For anything real, wrap the FFI behind a typed struct:
pub struct Plugin {
    _lib: Library,
    update: unsafe extern "C" fn(*mut State, f32),
}

impl Plugin {
    pub fn load(path: &str) -> Result<Self, libloading::Error> {
        let lib = unsafe { Library::new(path)? };
        // Dereferencing copies the fn pointer out of Symbol. fn pointers are Copy,
        // so we own the value before Symbol's borrow of lib ends.
        let update = *unsafe {
            lib.get::<unsafe extern "C" fn(*mut State, f32)>(b"update\0")?
        };
        Ok(Self { _lib: lib, update })
    }

    pub fn update(&self, state: &mut State, dt: f32) {
        unsafe { (self.update)(state as *mut State, dt) }
    }
}
_lib: Library keeps the shared library alive for the struct's lifetime. Dropping Plugin calls dlclose. Callers never see unsafe.
the ABI in detail
dlsym returns an address and trusts you've declared the right type to call through. There are three independent ways to get that wrong: symbol name, calling convention, and struct layout. A wrong name at least fails at lookup; a wrong calling convention or struct layout fails silently.
calling convention
A calling convention is the machine-level agreement between caller and callee: which registers carry arguments, who restores the stack, and where the return value lands.
The x86-64 System V ABI (Linux, macOS, BSDs) assigns integer and pointer arguments in this order:
| position | register | 32-bit alias |
|---|---|---|
| 1st | rdi | edi |
| 2nd | rsi | esi |
| 3rd | rdx | edx |
| 4th | rcx | ecx |
| 5th | r8 | r8d |
| 6th | r9 | r9d |
| 7th+ | stack | |
Integer and pointer return values come back in rax; floating-point return values come back in xmm0. Float arguments use xmm0–xmm7. Registers rax, rcx, rdx, rsi, rdi, r8–r11 are caller-saved: the callee may clobber them. rbx, rbp, r12–r15 are callee-saved: the callee must restore them.
// update(&state, 0.016f) on x86-64 System V:
// rdi = &state ← pointer to State struct
// xmm0 = 0.016f ← first float argument
// call <address from dlsym>
Windows x64 differs: rcx, rdx, r8, r9 for the first four, then stack, with 32 bytes of shadow space reserved regardless. A function compiled for one convention, called through the other, reads arguments from wrong registers. No error at any stage — just wrong values.
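extern "C" in Rust selects the platform C calling convention; without it, functions use the Rust ABI, which is unspecified and can change between compiler versions. In C++, extern "C" gives a function C language linkage, which is what lets a C caller, or dlsym plus a C function pointer type, reach it predictably.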
struct layout
C lays out fields in declaration order, each aligned to its own natural size: char to 1, int to 4, long and pointers to 8 on LP64. The struct is padded at the end to a multiple of its largest member's alignment. Gaps are inserted between fields to satisfy alignment.
struct Example {
    char a;   // offset 0, size 1
              // ← 3 bytes padding
    int  b;   // offset 4, size 4
    char c;   // offset 8, size 1
              // ← 7 bytes padding
    long d;   // offset 16, size 8
};
// sizeof == 24, not 14
| field | offset | size | padding after |
|---|---|---|---|
| a | 0 | 1 | 3 |
| b | 4 | 4 | 0 |
| c | 8 | 1 | 7 |
| d | 16 | 8 | 0 |
Reordering to { char a; char c; int b; long d; } produces a 16-byte struct. Same fields, same types, different size. In that layout d sits at offset 8, so a plugin compiled against the 24-byte layout that reads d at offset 16 reads past the end of the struct. No error at any point in the toolchain.
The standard mitigation: put a version or size field first in any struct that crosses the plugin boundary, and check it at load time.
struct PluginAPI {
    uint32_t version;               // must be first, must never move
    void (*update)(State *, float);
};
// At load time:
struct PluginAPI *api = get_api();
if (api->version != PLUGIN_API_VERSION) { /* reject */ }
name mangling
C symbol names are function names, verbatim. dlsym(handle, "greet") finds greet. C++ encodes namespace, class, and parameter types into the symbol to support overloading. Rust mangles every symbol as well, encoding the crate and module path along with a disambiguating hash.
# C: int greet(const char *name)
$ nm -D plugin_c.so | grep greet
00000000000010f0 T greet
# C++: int greet(const char *name)
$ nm -D plugin_cpp.so | grep greet
00000000000010f0 T _Z5greetPKc
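_Z5greetPKc means: function named greet (5 chars), taking PKc (pointer-to-const-char). Decode it with c++filt _Z5greetPKc. The scheme is not part of the C++ standard. GCC and Clang both follow the Itanium C++ ABI and agree on this name, but MSVC uses an entirely different encoding, and any change to the function's namespace or parameter types changes the symbol. dlsym(handle, "_Z5greetPKc") works today and breaks as soon as someone moves greet into a namespace or changes its parameter type.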
extern "C" suppresses C++ mangling:
// Mangled — unstable symbol name
int greet(const char *name) { ... }
// Not mangled — stable, dlsym-able by plain name
extern "C" int greet(const char *name) { ... }
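When one header is shared by C and C++ code, a common idiom wraps the declarations in a guarded extern "C" block so that both compilers agree on the plain symbol name. A minimal sketch, with greet.h as a hypothetical file name; the definition is appended only so the block compiles as a single file.

```c
/* greet.h (hypothetical header shared by host and plugin) */
#ifndef GREET_H
#define GREET_H

#ifdef __cplusplus
extern "C" {              /* seen only by C++ compilers */
#endif

/* Exported under the unmangled name "greet" in both languages. */
int greet(const char *name);

#ifdef __cplusplus
}
#endif

#endif /* GREET_H */

/* greet.c: the definition, compiled as C here. */
#include <stdio.h>

int greet(const char *name) {
    /* printf returns the number of characters written. */
    return printf("hello, %s\n", name);
}
```

A C compiler never sees the extern "C" lines because __cplusplus is undefined, and a C++ compiler including the same header is told not to mangle the name. Either way dlsym(handle, "greet") finds the symbol.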
Rust needs both extern "C" (for calling convention) and #[no_mangle] (to emit the plain name; spelled #[unsafe(no_mangle)] in the 2024 edition):
// Mangled + Rust ABI — dlsym("greet") won't find it
pub fn greet(name: *const i8) -> i32 { ... }
// C ABI, plain symbol name — dlsym("greet") finds it
#[no_mangle]
pub extern "C" fn greet(name: *const i8) -> i32 { ... }
struct layout in Rust
Without #[repr(C)], Rust may reorder struct fields to minimize padding. The layout is unspecified and can change between compiler versions.
// repr(Rust) — layout unspecified, compiler may reorder
struct Foo { a: u8, b: u32, c: u8 }
// Possible layout: b(0), a(4), c(5), pad(6-7) → 8 bytes
// repr(C) — C rules, declaration order preserved
#[repr(C)]
struct Foo { a: u8, b: u32, c: u8 }
// Guaranteed: a(0), pad(1-3), b(4), c(8), pad(9-11) → 12 bytes
Any struct passed by pointer to C, returned from C, or embedded in a C struct must be #[repr(C)]. Without it, C reads fields from wrong offsets.
where it goes wrong
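Versioning mechanisms exist: symbol versioning (.symver directives and linker version scripts, as glibc uses) and library versioning (SONAME in the ELF header). Almost nothing outside of libc and GPU vendors uses symbol versioning correctly, and neither mechanism knows anything about your plugin interface. If your plugin ABI changes, you change the symbol name or the filename. There is no automatic enforcement.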
Crashes inside a loaded plugin take down the whole process. PHP isolates this with shared-nothing worker processes, each handling one request at a time, not with any sandbox inside the loader. If you need fault isolation, you need process boundaries or a WASM sandbox, not dlopen.
what it isn't
- It's not a substitute for static linking when you control both sides and ship them together. Static linking gives the linker visibility to strip dead code, inline across translation units, and catch missing symbols at build time.
- It does not give you ABI stability for free. Reordering struct fields, adding a parameter, or switching compiler versions can break a loaded plugin with zero compile-time signal.
- It is not safe to pass Rust-native types across the boundary. Vec<T>, Box<T>, Arc<T> all depend on allocator identity and internal layout that is not stable across compiler versions. Use *const T / *mut T with #[repr(C)] structs.
- RTLD_GLOBAL is not a default you want. It pollutes the process-wide symbol namespace. Use RTLD_LOCAL unless you have a concrete reason, and document it.
- It does not work on Windows as written. The Windows equivalent is LoadLibrary / GetProcAddress. libloading abstracts over both; the raw POSIX API does not exist on Windows.
where to start
# Plugin
cc -std=c23 -fPIC -fvisibility=hidden -shared -o plugin.so plugin.c
# Host (glibc before 2.34 requires -ldl; newer glibc and macOS accept the flag but do not need it)
cc -std=c23 -o host host.c -ldl
./host
# hello, world
For the hot-reload loop: give the host the mtime polling and handle swap described earlier, run it, then edit the plugin's source (logic.c here), run make, and watch the swap happen without restarting.
For Rust: cargo add libloading, use the typed wrapper pattern above, keep unsafe blocks minimal and commented.
I built the ScyllaDB PHP driver on this foundation — every ZEND_GET_MODULE is the structured version of what's described here.