The full block of code is listed below.
Introduction
This is a short post on how to write a simple hashmap in C. Hashmaps generally have 3 components
- Hash array
- Hasher function
- A data-structure
The hasher function
The purpose of the hasher is to take in an input and generate a fixed-size number.
In our case we will use the djb2 (Dan J. Bernstein) algorithm because it does 2 things for us,
- It generates a hashed value
- It distributes values well
long hash(char *str) {
unsigned long hash = 5381;
int c;
while ((c = *str++))
hash = ((hash << 5) + hash) + c;
return hash;
};
The hash array (table)
We then need to allocate some memory for the array; this can be done however you want.
I went with
void *bucket_array[1024] = {NULL};
A 'bucket' refers to an element within a hashmap, this because the element is another data-structure.
The Data-Structure
An easy to implement data-structure is a linked-list.
typedef struct Node {
char *key;
char *data;
struct Node *next;
} Node;
The add function
Now, we can implement a system to:
- hash our value
- store it
The code itself is quite self-explanatory. We first get an index value by hashing our string and taking the modulus with the size of our buffer.
Then we check if there is already a value at that index, if no, create a new node, set the data and key and update the index position.
If yes, then scan through the index to see which node matches our key, if we find a match, update that value with our new data.
int add(char *str, char *data, int size, void **buffer) {
int index = (hash(str) % (long)size);
if (buffer[index] == NULL) {
// no bucket exists
Node *node = malloc(sizeof(Node));
node->key = str;
node->data = data;
buffer[index] = node;
} else {
// bucket exists
Node *linked_list = buffer[index];
bool exists = false;
if (strcmp(str, linked_list->key)) {
exists = true;
} else {
while (linked_list->next != NULL) {
if (strcmp(str, linked_list->key)) {
exists = true;
} else {
linked_list = linked_list->next;
}
}
}
if (exists) {
Node *new_node = malloc(sizeof(Node));
new_node->key = str;
new_node->data = data;
linked_list->next = new_node;
} else {
linked_list->data = data;
}
}
return index;
}
Code
As for the other functions, I'll leave it as a exercise to the reader.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct Node {
char *key;
char *data;
struct Node *next;
} Node;
Node *head = NULL;
Node *current = NULL;
long hash(char *str) {
unsigned long hash = 5381;
int c;
while ((c = *str++))
hash = ((hash << 5) + hash) + c;
return hash;
};
int add(char *str, char *data, int size, void **buffer) {
int index = (hash(str) % (long)size);
if (buffer[index] == NULL) {
// no bucket exists
Node *node = malloc(sizeof(Node));
node->key = str;
node->data = data;
buffer[index] = node;
} else {
// bucket exists
Node *linked_list = buffer[index];
bool exists = false;
if (strcmp(str, linked_list->key)) {
exists = true;
} else {
while (linked_list->next != NULL) {
if (strcmp(str, linked_list->key)) {
exists = true;
} else {
linked_list = linked_list->next;
}
}
}
if (exists) {
Node *new_node = malloc(sizeof(Node));
new_node->key = str;
new_node->data = data;
linked_list->next = new_node;
} else {
linked_list->data = data;
}
}
return index;
}
int main(void) {
// create buckets
void *bucket_array[1024] = {NULL};
int x = add("Neo", "bing-bong", 1024, bucket_array);
printf("%s\n", ((Node *)bucket_array[x])->data);
add("Neo", "ching-chong", 1024, bucket_array);
printf("%s\n", ((Node *)bucket_array[x])->data);
add("Neo", "ding-dong", 1024, bucket_array);
printf("%s\n", ((Node *)bucket_array[x])->data);
return 0;
};
Top comments (3)
void*
for the pointers and pass a comparison function for the keys so you can use any type for a key and not just a string.I give a such an implementation in my forthcoming book, "Why Learn C."
I'll definitely check out the book when it's out, thanks for the advice
Some comments may only be visible to logged-in visitors. Sign in to view all comments.