DEV Community

Neo Sahadeo
Neo Sahadeo

Posted on

Hashmap in C

The full block of code is listed below.

Introduction

This is a short post on how to write a simple hashmap in C. Hashmaps generally have 3 components

  • Hash array
  • Hasher function
  • A data-structure

The hasher function

The purpose of the hasher is to take in an input and generate a fixed-size number.

In our case we will use the djb2 (Dan J. Bernstein) algorithm because it does 2 things for us,

  1. It generates a hashed value
  2. It distributes values well
long hash(char *str) {
  unsigned long hash = 5381;
  int c;

  while ((c = *str++))
    hash = ((hash << 5) + hash) + c;

  return hash;
};
Enter fullscreen mode Exit fullscreen mode

The hash array (table)

We then need to allocate some memory for the array; this can be done however you want.

I went with

void *bucket_array[1024] = {NULL};
Enter fullscreen mode Exit fullscreen mode

A 'bucket' refers to an element within a hashmap, this because the element is another data-structure.

The Data-Structure

An easy to implement data-structure is a linked-list.

typedef struct Node {
  char *key;
  char *data;
  struct Node *next;
} Node;
Enter fullscreen mode Exit fullscreen mode

The add function

Now, we can implement a system to:

  • hash our value
  • store it

The code itself is quite self-explanatory. We first get an index value by hashing our string and taking the modulus with the size of our buffer.

Then we check if there is already a value at that index, if no, create a new node, set the data and key and update the index position.
If yes, then scan through the index to see which node matches our key, if we find a match, update that value with our new data.

int add(char *str, char *data, int size, void **buffer) {
  int index = (hash(str) % (long)size);
  if (buffer[index] == NULL) {
    // no bucket exists
    Node *node = malloc(sizeof(Node));
    node->key = str;
    node->data = data;
    buffer[index] = node;
  } else {
    // bucket exists
    Node *linked_list = buffer[index];
    bool exists = false;

    if (strcmp(str, linked_list->key)) {
      exists = true;
    } else {
      while (linked_list->next != NULL) {
        if (strcmp(str, linked_list->key)) {
          exists = true;
        } else {
          linked_list = linked_list->next;
        }
      }
    }
    if (exists) {
      Node *new_node = malloc(sizeof(Node));
      new_node->key = str;
      new_node->data = data;
      linked_list->next = new_node;
    } else {
      linked_list->data = data;
    }
  }
  return index;
}
Enter fullscreen mode Exit fullscreen mode

Code

As for the other functions, I'll leave it as a exercise to the reader.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Node {
  char *key;
  char *data;
  struct Node *next;
} Node;

Node *head = NULL;
Node *current = NULL;

long hash(char *str) {
  unsigned long hash = 5381;
  int c;

  while ((c = *str++))
    hash = ((hash << 5) + hash) + c;

  return hash;
};

int add(char *str, char *data, int size, void **buffer) {
  int index = (hash(str) % (long)size);
  if (buffer[index] == NULL) {
    // no bucket exists
    Node *node = malloc(sizeof(Node));
    node->key = str;
    node->data = data;
    buffer[index] = node;
  } else {
    // bucket exists
    Node *linked_list = buffer[index];
    bool exists = false;

    if (strcmp(str, linked_list->key)) {
      exists = true;
    } else {
      while (linked_list->next != NULL) {
        if (strcmp(str, linked_list->key)) {
          exists = true;
        } else {
          linked_list = linked_list->next;
        }
      }
    }
    if (exists) {
      Node *new_node = malloc(sizeof(Node));
      new_node->key = str;
      new_node->data = data;
      linked_list->next = new_node;
    } else {
      linked_list->data = data;
    }
  }
  return index;
}

int main(void) {
  // create buckets
  void *bucket_array[1024] = {NULL};

  int x = add("Neo", "bing-bong", 1024, bucket_array);
  printf("%s\n", ((Node *)bucket_array[x])->data);
  add("Neo", "ching-chong", 1024, bucket_array);
  printf("%s\n", ((Node *)bucket_array[x])->data);
  add("Neo", "ding-dong", 1024, bucket_array);
  printf("%s\n", ((Node *)bucket_array[x])->data);

  return 0;
};
Enter fullscreen mode Exit fullscreen mode

Top comments (3)

Collapse
 
pauljlucas profile image
Paul J. Lucas
  1. You should use void* for the pointers and pass a comparison function for the keys so you can use any type for a key and not just a string.
  2. The values should be stored as a flexible array member so it gives the user a choice as to whether the node data is stored internally (better memory performance, but requires copying) or externally (worse performance, but no copying).
  3. The table should grow automatically.

I give a such an implementation in my forthcoming book, "Why Learn C."

Collapse
 
neosahadeo profile image
Neo Sahadeo

I'll definitely check out the book when it's out, thanks for the advice

Some comments may only be visible to logged-in visitors. Sign in to view all comments.