What do I mean by "headers" in structures? It's the same notion as in other programming contexts. The most common examples are file format headers and HTTP headers.
For example, many image formats like JPEG and TIFF have headers detailing the width, height, bit depth, etc. of the image.
We can definitely do something similar with structs in C.
For the rest of this post, we'll be implementing an imaginary library called CStr. Our goal is a library that is as easy to use as the raw C strings people are already used to.
char *name = c_str_new("Noah");
printf("Your name is %s", name);
A structure will be needed for holding the extra information about the string such as its length and capacity.
typedef struct
{
int length;
int capacity;
} CStrHeader;
But where will this go when the time comes to create it? Just like with image headers, we'll place it before the string in memory.
Because the header will be just before the string, we can use one big allocation.
Our function for the creation of a CStr:
char *c_str_new(char const *init_str);
And the implementation:
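#include <stdlib.h> /* malloc */
#include <string.h> /* strlen, strcpy */
char *c_str_new(char const *init_str)
{
    /* Bytes needed for the string, including the terminating '\0'. */
    size_t const init_str_length = strlen(init_str) + 1;
    /* Reserve twice the string's size, leaving room to grow. */
    size_t const capacity = init_str_length * 2;
    CStrHeader *header = malloc(sizeof(CStrHeader) + capacity);
    if (header == NULL)
        return NULL;
    /* Record the real length (without the '\0') and the reserved bytes. */
    header->length = (int)(init_str_length - 1);
    header->capacity = (int)capacity;
    /* Step over the header, in bytes, to reach the string data. */
    char *ptr_to_string = (char *)((unsigned char *)header + sizeof(CStrHeader));
    strcpy(ptr_to_string, init_str);
    return ptr_to_string;
}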
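The first thing we do is calculate the length of init_str, plus one for the null terminator, to find out how much space the string needs. Next, we "over"-allocate a CStrHeader object with extra space after it for the string (twice what the string needs, which leaves it room to grow).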
The header is now allocated and is filled with the metadata for the string.
Now here is the most interesting part of the function: we do some pointer arithmetic to skip over the header portion and get to the string. The reason we cast it to an unsigned char pointer is so that the arithmetic is done in bytes.
Pointer arithmetic in C is measured in elements, not bytes, so the compiler needs the pointed-to type to know how far to move the byte cursor, if you will.
Every time you index into an array or add to a pointer, the compiler moves the byte cursor by multiplying the amount by the size of the type.
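To make the scaling concrete, here is a small sketch. The function names (byte_distance, int_step_in_bytes, byte_step_in_bytes) are made up for this illustration and are not part of CStr. It measures how many bytes a pointer really moves when you add to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: distance in bytes between two addresses. */
static ptrdiff_t byte_distance(void const *from, void const *to)
{
    return (unsigned char const *)to - (unsigned char const *)from;
}

/* Bytes moved when an int pointer is advanced by `steps` elements. */
ptrdiff_t int_step_in_bytes(int steps)
{
    int values[8] = {0};
    assert(steps >= 0 && steps <= 8);
    int *p = values;
    return byte_distance(p, p + steps); /* steps * sizeof(int) */
}

/* Bytes moved when an unsigned char pointer is advanced by `steps` elements. */
ptrdiff_t byte_step_in_bytes(int steps)
{
    unsigned char bytes[8] = {0};
    assert(steps >= 0 && steps <= 8);
    unsigned char *p = bytes;
    return byte_distance(p, p + steps); /* steps * 1 */
}
```

Calling int_step_in_bytes(3) gives 3 * sizeof(int), while byte_step_in_bytes(3) gives exactly 3. That difference is why the cast matters.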
Now getting back to our CStr library, we cast the header pointer to unsigned char * and move the byte cursor by the size of CStrHeader. Without the cast, header + sizeof(CStrHeader) would skip that many whole headers instead of that many bytes.
Because sizeof(unsigned char) is exactly one by definition, we can treat the header just as if it were an array of bytes.
Lastly, we copy init_str into ptr_to_string and return it.
Freeing the string and its header is just as simple.
We have a function called c_str_delete() that takes a char pointer and frees the header and the string data together.
void c_str_delete(char *str)
{
    /* Rewinding a NULL pointer would be undefined, so only free real strings. */
    if (str != NULL)
        free((unsigned char *)str - sizeof(CStrHeader));
}
First, we rewind back to the beginning and then free.
Closing Remarks
That is how to create a convenient and easy-to-use string library in C. You can augment it by creating functions that return the length and capacity of a string by rewinding to the header and reading its fields. And because the length is stored in the header, the "calculation" takes constant time: just return the stored value.
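Here is a minimal sketch of what those accessors could look like. The names c_str_header, c_str_length and c_str_capacity are my own picks for illustration, and the sketch assumes c_str_new records strlen(init_str) in the header's length field when the string is created.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    int length;   /* characters in the string, not counting the '\0' */
    int capacity; /* bytes reserved for the string data */
} CStrHeader;

/* Hypothetical helper: step back from the string to the header in front of it. */
static CStrHeader const *c_str_header(char const *str)
{
    assert(str != NULL);
    return (CStrHeader const *)((unsigned char const *)str - sizeof(CStrHeader));
}

/* Assumed variant of c_str_new that stores the real length up front. */
char *c_str_new(char const *init_str)
{
    size_t const length = strlen(init_str);
    size_t const capacity = (length + 1) * 2;
    CStrHeader *header = malloc(sizeof(CStrHeader) + capacity);
    if (header == NULL)
        return NULL;
    header->length = (int)length;
    header->capacity = (int)capacity;
    char *str = (char *)((unsigned char *)header + sizeof(CStrHeader));
    memcpy(str, init_str, length + 1); /* copies the '\0' as well */
    return str;
}

/* Constant time: no scan for the terminator, just read the header. */
int c_str_length(char const *str)
{
    return c_str_header(str)->length;
}

int c_str_capacity(char const *str)
{
    return c_str_header(str)->capacity;
}

void c_str_delete(char *str)
{
    if (str != NULL)
        free((unsigned char *)str - sizeof(CStrHeader));
}
```

With this in place, c_str_length(name) for a string created from "Noah" reads 4 straight out of the header, and the same pointer can still be handed to printf or strcmp like any other C string.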
Using headers with structures in C isn't just for string libraries. If you want to, you can use them for stretchy buffers. Search for them if you want to learn more. They're more or less what we did here, except they behave like a general-purpose vector, similar to std::vector in C++.
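To give a rough idea, here is a stripped-down sketch of the same header trick applied to a growable array of ints. It is not taken from any particular stretchy buffer library: IntBufHeader and the int_buf_* functions are hypothetical names, and a real implementation would be generic over the element type and would need to think about alignment for types stricter than int.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct
{
    size_t count;    /* elements in use */
    size_t capacity; /* elements allocated */
} IntBufHeader;

/* Step back from the element pointer to the header in front of it. */
static IntBufHeader *int_buf_header(int *buf)
{
    assert(buf != NULL);
    return (IntBufHeader *)((unsigned char *)buf - sizeof(IntBufHeader));
}

/* Number of elements stored; a NULL buffer counts as empty. */
size_t int_buf_count(int *buf)
{
    return buf != NULL ? int_buf_header(buf)->count : 0;
}

/* Append value, growing the allocation when it is full.
   Returns the (possibly moved) buffer, or NULL if allocation fails,
   in which case the old buffer is left untouched. */
int *int_buf_push(int *buf, int value)
{
    IntBufHeader *header = buf != NULL ? int_buf_header(buf) : NULL;
    size_t count = header != NULL ? header->count : 0;
    size_t capacity = header != NULL ? header->capacity : 0;

    if (count == capacity)
    {
        size_t new_capacity = capacity != 0 ? capacity * 2 : 4;
        /* realloc(NULL, n) behaves like malloc(n) for the first push. */
        header = realloc(header, sizeof(IntBufHeader) + new_capacity * sizeof(int));
        if (header == NULL)
            return NULL;
        header->count = count;
        header->capacity = new_capacity;
    }

    buf = (int *)((unsigned char *)header + sizeof(IntBufHeader));
    buf[count] = value;
    header->count = count + 1;
    return buf;
}

void int_buf_free(int *buf)
{
    if (buf != NULL)
        free(int_buf_header(buf));
}
```

You start with int *buf = NULL, call buf = int_buf_push(buf, value) for each element, index it like a plain array, and finish with int_buf_free(buf). Just like with CStr, the caller only ever sees a pointer to the data.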