In the next few posts, I'll be talking about some data structures
======================================
Today, I'll be talking about
and how we can implement it from scratch using Java.
Hash table → data structure lets us construct a mapping from keys to values via a technique called "hashing"
Hashing → is a technique that retrieves the value using the index obtained from a key without performing a search.
Each Hash table contains:
Array to hold data.
Hash function (is a function that maps a key 'x' to an index in a fixed range)
Type of hash functions:
→ There are various types of hash functions in the data structure:
- Division Hash Function (h(k) = k mod m )
k → key, m → hash table size
ex: h(1276) = 1276 mod 10 = 6
- Multiplication Method ( h(k) = floor( n( kA mod 1 ) ) )
k → key and A → can be any constant value between 0 and 1, n → hash table size
ex: n = 4, a = 0.568, key = 56 ,
h(56) = floor( 4 * ((560.568) mod 1)) = floor (4 (31.808 mod 1))
= floor (4 * 0.808) = floor(3.232) = 3
- Mid Square Hash Function (h(K) = h(k x k)) → squaring the value of the key and then extracting the middle r digits as the hash value
ex: k = 50 , k*k = 2500, r = 2
h(50) = 50 , then the hash value obtained is 50
The characteristic of a good hash function:
It should be easy to compute.
-
It reduces the chance of collisions.
- A hash collision is when two objects x, y hash to the same value (i.e. H(x) = H(y)).
It should guarantee an even distribution of the values.
Example:
Consider an example of hashtable of size 4, Item are in (key,value) format.
(1 ,20) => key → 1 , hash = 1 % 4 = 1 , array index → 1
(2 ,70) => key → 2 , hash = 2 % 4 = 2 , array index → 2
(3,36) => key → 3 , hash = 3 % 4 = 3 , array index → 3
(42,80) => key → 42 , hash = 42 % 4 = 2 , array index → 2
(4,25) => key → 4 , hash = 4 % 4 = 0 , array index → 0
Hash collisions happened in
(2 ,70) , (42,80) hashed to index → 2
What do we do if there is a hash collision?
We use one of many hash collision resolution techniques to handle this
Hash collision resolution techniques:
-
Separate chaining (Open Hashing)
- Deals with hash collisions by maintaining a data structure (usually a linked list) to hold all the different values which hashed to a particular value.
-
Open addressing (Closed Hashing)
- Deals with hash collisions by finding another place within the hash table until an empty bucket is found.
- There are various methods to find these empty buckets:
- Linear Probing
- Quadratic Probing
- Double Hashing
In the attached images you'll find a two different implementations of HashTables without handling the hash collisions.
Next post I'll be talking about how to handle the hash collisions.
Github: shorturl.at/fvAOZ
Resources:
shorturl.at/gnvN2
shorturl.at/gvwAV
shorturl.at/djG24
shorturl.at/hvzL5
Top comments (0)