Code_Regina

Posted on

# Hash Tables

``````                   -Intro to Hash Tables
-Handling Collisions
-Hash Table Keys and Values
``````

### Intro to Hash Tables

Hash tables are used to store key-value pairs.
They are similar to arrays but there is no required order.
Hash tables have a great speed so they are widely used.
Hash tables have different names in other languages but the functionality is the same.

Python hash tables are called dictionaries.
JavaScript hash tables are called objects and maps.

``````
function hash(key, arrayLen) {
let total = 0;
for (let char of key) {
// map "a" to 1, "b" to 2, "c" to 3, etc.
let value = char.charCodeAt(0) - 96
total = (total + value) % arrayLen;
}
}

``````

### Handling Collisions

There are two ways to handle hash table collisions.

1. Separate Chaining
2. Linear Probing

#### Separate Chaining

is when the hash table has each index in the array has stored values with more data similar to an array or a linked list.

#### Linear Probing

When a collision is found, a search is performed throughout the array to find the next empty slot.

### Hash Table Keys and Values

Keys - Loops through the hash table array and returns an array of keys in the table

Values - Loops through the hash table array and returns an array of values in the table

``````
keys(){
let keysArr = [];
for(let i = 0; i < this.keyMap.length; i++){
if(this.keyMap[i]){
for(let j = 0; j < this.keyMap[i].length; j++){
if(!keysArr.includes(this.keyMap[i][j][0])){
keysArr.push(this.keyMap[i][j][0])
}
}
}
}
return keysArr;
}
values(){
let valuesArr = [];
for(let i = 0; i < this.keyMap.length; i++){
if(this.keyMap[i]){
for(let j = 0; j < this.keyMap[i].length; j++){
if(!valuesArr.includes(this.keyMap[i][j][1])){
valuesArr.push(this.keyMap[i][j][1])
}
}

``````

hidden_dude

I don't think your implementation is correct.
When implementing keys() you shouldn't check for duplicates.

Just iterate linearly add add to keysArr. That's because keysArr.includes() is an O(N) operation, so your keys() implementation is O(N^2) which is really bad.

In values() it's the same. You don't need to check for duplicates, in fact duplicates are possible.

I assume you're using separate chaining. There are also other techniques like coalesced chaining where you would keep a certain amount of space preallocated for collisions, but if that runs out you rehash.

Also probing can be linear, quadratic or you can use double hashing (a second hash function) which can reduce clustering.

From a user's perspective it's important to understand what can cause collisions so that a hash table doesn't become a linked list. To avoid that sort of issue, I generally only use primitives as keys or very carefully crafted custom objects (like UUID). I don't casually use a custom object as a key.

hidden_dude

Probably in a language like C probing techniques might be better because you can get a more compact RAM by allocating all at once (giving you better cache performance) whereas separate chaining would come with a fragmentation cost.

In languages like Java (and probably V8 and .Net as well) separate chaining is not so bad because allocating is very cheap with a copy collector and memory is compacted (for long lived memory) so lookups will be very fast since the next object is probably already in the CPU cache.

That's probably why Java doesn't mind using a list to resolve collisions. Though there's a space penalty for pointers.

Blessing Samuel

I will save this for future purposes. Thank you so much!

Falcão