Paul J. Lucas
Why many Unix structs have Prefixes

One thing I’ve long been curious about is why many Unix C struct fields are named such that they are prefixed by a common abbreviation. For example, for the sockaddr_in struct:

struct sockaddr_in {
    short           sin_family;
    unsigned short  sin_port;
    struct in_addr  sin_addr;
};

all fields are prefixed by sin_, an abbreviation for sockaddr_in. There are several other examples as well, e.g., the st_ fields of struct stat.

I originally thought it was perhaps just a style quirk of the original Unix authors. I tried searching for things like “early C style,” but never found anything.

I’ve also thought that perhaps they named the fields that way to allow the use of short pointer names like p. (We are, after all, talking about guys who named commands cp, rm, etc.) For example, given:

p->st_mode   // You know 'p' is a stat*.
q->mode      // You have no idea what this is.

The st_ prefix lets you know the type of p just by looking at the expression, whereas you’d have to find the declaration of q to know what it points to. Non-prefixed fields would require putting a mnemonic in the pointer name itself:

pstat->mode  // Clear, but more keystrokes.

Prefixes also make fields easily greppable. Both rationales seem reasonable, right?

However, I finally stumbled across the real answer: early C compilers kept the field names of all structs in a single, global symbol table, so programmers added prefixes to struct fields to avoid collisions. 😮 Once C compilers improved, this style largely faded away.†


† Though I've since learned that Solaris’ internal style guide still recommends this style to this day, even in new code:

Systematic prefix conventions for ... structure or union member names can be useful.

But it doesn't elaborate as to why, specifically.

Top comments (2)

Nikola Stojaković

This is very interesting. Who knows how many things we take for granted today are actually relics of the past. One of the most popular is the 80-character limit per line, which is in fact still pretty useful today (even though displays are much bigger now): it makes reading the code much easier and you can set two editors side by side.

Paul J. Lucas

I also still limit my lines to 80 characters. Over the years, I've also gradually reduced my spaces-per-indent. Waaaay back, I started out at parity (8 spaces, or a hard tab, per indent); at some point, I reduced that (4 spaces per indent); and finally again maybe a decade ago (2 spaces per indent). The obvious advantage is fewer lines that have to be wrapped so they still fit ≤ 80 characters. Some might think 2 is too few, but it's still quite readable, IMHO.