This blog post originally started off as another one of my Unix command deep dives (remember those?), where I dive into the internals of a common Linux command. I was trying to run
strace to determine the system calls that were invoked by the command that I was exploring when I recalled that I was on macOS and that macOS did not have
strace but instead used a tool called
dtruss to track syscalls invoked during the execution of a program.
Now, I've been relatively ignorant about the distinction between strace and
dtruss before. All I knew was that
dtruss did what I needed it to do and I didn't much bother looking into the details of how it worked or what it was.
But today is the day to shed the cloak of ignorance, friends!
strace is a system call tracer and also one of the few things in tech that has a name that reasonably matches what it does. You might be familiar with
strace from Julia Evans' strace zine. I think Julia's zine is a great way to learn about
strace, but here's my two-point summary of what strace is.
- System calls are an interface that allows a program to request some functionality from the operating system. These system calls do things like changing the current working directory, changing the permissions on files, and so on. You can view a full list of system calls here.
- strace lists out the system calls that a program invokes as it executes.
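To make the first point a bit more concrete, here's a minimal Python sketch. It assumes a Unix-like system, where `os.chdir` and `os.chmod` are thin wrappers around the `chdir` and `chmod` system calls. The function name `demo_syscalls`, the scratch directory, and the file name `example.txt` are all made up for illustration.

```python
import os
import stat
import tempfile


def demo_syscalls():
    """Change the working directory and a file's permissions, then report back."""
    original_dir = os.getcwd()
    with tempfile.TemporaryDirectory() as scratch:
        try:
            # Ask the operating system to change the current working directory.
            os.chdir(scratch)
            in_scratch = os.path.samefile(os.getcwd(), scratch)

            # Create a small file so there is something to change permissions on.
            path = os.path.join(scratch, "example.txt")
            with open(path, "w") as handle:
                handle.write("hello")

            # Ask the operating system to make the file owner read/write only.
            os.chmod(path, 0o600)
            mode = stat.S_IMODE(os.stat(path).st_mode)
        finally:
            # Step back out so the scratch directory can be cleaned up.
            os.chdir(original_dir)
    return in_scratch, oct(mode)


if __name__ == "__main__":
    print(demo_syscalls())
```

If you run a script like this under strace on Linux or dtruss on macOS, you should see calls such as chdir and chmod (or close variants like fchmodat) somewhere in the output, mixed in with the many other system calls the Python interpreter makes on its own.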
One thing that the zine doesn't go into is how
strace works under the hood. I'll dive into that here.
strace relies on ptrace, which stands for process trace, a system call that allows a parent process to watch and control the execution of a child process. It's used in
strace, but it also enables things like the
gdb debugger. The
ptrace system call uses some internal Linux data structures to establish a relationship between the tracer (the parent process) and the traced (the child process). Whenever a system call is invoked in the traced process, the tracer will be notified of the system call and the traced process will be temporarily stopped. At this point in time, whatever program is invoking
ptrace, whether it is strace or
gdb, will process the information about the system call it was notified of and then return control back to the child process. This jumping back and forth between a child process, ptrace, and a higher level program highlights one of the drawbacks of
strace. Because the operating system has to switch contexts between several processes repeatedly,
strace is not that fast.
ptrace acts as a mediator between the running process and a higher level tool such as strace.
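The stop, inspect, resume cycle described above can be sketched with a toy model. To be loud about it: this is not real ptrace and no actual process gets traced. A Python generator stands in for the traced process, each `yield` stands in for "a system call was invoked, so the process is stopped", and the names `traced_program` and `trace` are hypothetical.

```python
def traced_program():
    """A stand-in for the traced process.

    Each `yield` marks a point where the program would enter a system
    call and be stopped until the tracer resumes it.
    """
    yield ("open", "notes.txt")
    yield ("read", 3)
    yield ("close", 3)


def trace(program):
    """Run `program`, logging each simulated system call before resuming it."""
    log = []
    for name, arg in program():
        # The traced program is paused here while the tracer inspects the call.
        log.append(f"{name}({arg!r})")
        # The next loop iteration hands control back to the traced program.
    return log


if __name__ == "__main__":
    for line in trace(traced_program):
        print(line)
```

Even in this toy version you can see control bouncing between the two sides once per call. With real processes each of those bounces involves the operating system switching contexts, which is where the slowdown comes from.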
Now, this is where I had to do a little bit of research. The first definition I found of
dtrace was on Brendan Gregg's website which defined
dtrace, or I guess I can call it DTrace, as "an implementation of dynamic tracing." What is dynamic tracing? I had to do quite a bit of digging to find a resource that explained this well. In the end, I came across this article, which helped me grok what was going on.
While strace relies on
ptrace to introspect processes,
dtrace goes about things a little bit differently. With
dtrace, the programmer writes probes in a language with a C-like syntax called D. These probes define what
dtrace should do when a program invokes a system call, exits a function, or hits whatever other event you'd like to watch. These probes are stored in a script file that looks something like this.
syscall::read:entry { printf("read has been called."); }
This script states that whenever the
read system call is invoked, the tracer should print out the string "read has been called." The script file is then invoked with dtrace like so.
$ dtrace -s my_probe.d
dtrace then runs the logic within the probe whenever it encounters the event outlined in that probe (entering a certain system call or exiting a function and so on). This flexibility lends DTrace its title as a dynamic tracer.
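Here's a rough way to picture that "run this logic when that event happens" idea. This is a toy model in Python, not how DTrace is implemented, and `ToyTracer`, `make_tracer`, and `run_workload` are names I made up. The only piece borrowed from DTrace is the shape of the probe description string, which follows its provider:module:function:name convention.

```python
from collections import defaultdict


class ToyTracer:
    """Maps probe descriptions to the actions that run when they fire."""

    def __init__(self):
        self._actions = defaultdict(list)
        self.output = []

    def probe(self, description):
        """Decorator that attaches an action to a probe description."""
        def register(action):
            self._actions[description].append(action)
            return action
        return register

    def fire(self, description):
        """Run every action registered for this event, if there are any."""
        for action in self._actions.get(description, []):
            self.output.append(action())


def make_tracer():
    """Build a tracer with a single probe on entry to the read system call."""
    tracer = ToyTracer()

    @tracer.probe("syscall::read:entry")
    def on_read():
        return "read has been called."

    return tracer


def run_workload(tracer):
    """Simulate a program hitting a few events; only read has a probe."""
    events = ["syscall::open:entry", "syscall::read:entry", "syscall::read:entry"]
    for event in events:
        tracer.fire(event)
    return tracer.output


if __name__ == "__main__":
    print(run_workload(make_tracer()))
```

Events without a registered probe pass through untouched, so you only pay attention to the things you explicitly asked about.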
The next thing I set out to uncover was what
dtruss was. The first definition I ran into was from the
dtruss manpage which defined
dtruss as "a DTrace version of truss." Well, I guess I better figure out what truss is first then. As it turns out,
truss is a Unix-specific command that allows the user to print out the system calls made by a program. It's essentially a variant of the
strace tool that exists on Linux. Knowing this, I think the best way to describe it would be to use an analogy:
strace is to
truss is to
Now, as it turns out, strace and
dtrace aren't the only tools in our toolkit of tracers. My investigation eventually led me to explore the wider world of tracers. Julia comes to the rescue once again. Brendan Gregg has another blog post with a list of different Linux tracers, how they work, and when you can use them. Brendan seems like quite the authority in this space, having published several books on tracing and written many nice blog posts. If you're interested in diving more into this, I would recommend checking out some of his blog posts.
Well, wasn't that a fun slide down the iceberg. It's always pretty fun when you start by posing a simple question (what is the difference between strace and DTrace) and end up discovering something much bigger (a whole new world of tracers).
What tracer do you use on a regular basis? Is there a particular tracing tool that you prefer over others?