[!Info] A Hash table is defined as a data structure used to insert, look up, and remove key-value pairs quickly.
It operates on the hashing concept, where each key is translated by a hash function into a distinct index in an array. The index functions as a storage location for the matching value. In simple words, it maps the keys with the value.
When to use hash table
When we want to find out if an element appears before or whether an element exists in the set.
Load Factor
- It is determined by:
- number of elements are kept
- size of the hash table.
- When load factor is high: the table may be cluttered and have longer search times and collisions.
Collision resolution techniques:
It happens when two or more keys points to the same array index.
- Open Addressing The hash function is applied to the subsequent slots until one is left empty. There are various ways to use this approach: double hashing, linear probing, and quadratic probing.
- Separate Chaining: In separate chaining, a linked list of objects that hash to each slot in the hash table is present. Two keys are included in the linked list if they hash to the same slot. This method is rather simple to use and can manage several collisions.
- Robin Hood hashing: To reduce the length of the chain, collisions in Robin Hood hashing are addressed by switching off keys. The algorithm compares the distance between the slot and the occupied slot of the two keys if a new key hashes to an already-occupied slot. The existing key gets swapped out with the new one if it is closer to its ideal slot. This brings the existing key closer to its ideal slot. This method has a tendency to cut down on collisions and average chain length.
Hashing in Data structure
- array
- set
- map
set
and map
in C++
set
Underlying Implementation | ordered | repeatable | modifiable | query efficiency | addition and deletions efficieny | |
---|---|---|---|---|---|---|
std::set | Red–black tree | Ordered | No | No | O(log n) | O(log n) |
std::multiset | Red–black tree | Ordered | Yes | No | O(log n) | O(log n) |
std::unordered_set | Hash Table | Unordered | No | No | O(1) | O(1) |
map
Underlying Implementation | ordered | repeatable | modifiable | query efficiency | addition and deletions efficieny | |
---|---|---|---|---|---|---|
std::map | Red–black tree | Ordered | No | No | O(log n) | O(log n) |
std::multimap | Red–black tree | Ordered | Yes | No | O(log n) | O(log n) |
std::unordered_map | Hash Table | Unorde | No | No | O(1) | O(1) |
Array as Hash Table
[!Important] Only using array as hash table when keys' range is limited
242 Valid Anagram
Given two string
s
andt
, return true ift
is an anagram ofs
, andfalse
otherwise.the range of the hashes is just from
a
toz
, which is quite small. So we define an array as a hash table, the value of each key is just the occurrence of each letter
bool isAnagram(string s, string t) {
int record[26] = {0};
// get occurrences of each letter in `s`
for(char i : s)
{
++record[i - 'a'];
}
// remove letters appeared in `t`
for(char i : t)
{
if(--record[i - 'a'] < 0) // `t` use more of a specific letter than
return false; // `s`, so they cann't be an anagram
}
for(int i : record)
{
if(i != 0) // `t` couldn't use all of the letters in `s`
return false;
}
return true;
}
383. Ransom Note
Given two strings ransomNote
and magazine
, return true
if ransomNote
can be constructed by using the letters from magazine
and false
otherwise.
Each letter in magazine
can only be used once in ransomNote
.
[!Tip] please notice the range of key is limited to lowercase letters, so we can achieve the hash table with array to save space and time.
bool canConstruct(string ransomNote, string magazine) {
int record[26] = {0};
for(char c : magazine)
{
++record[c - 'a'];
}
for(char c : ransomNote)
{
if(record[c - 'a'] <= 0)
return false;
--record[c - 'a'];
}
return true;
}
unordered_set
and unordered_map
as Hash Table
[!Tip] If the hashes are relatively small, particularly scattered, and span a very large area, using an array results in a huge waste of space.
We should be considering betweenunordered_set
andunordered_map
349. Intersection of Two Arrays
Given two integer arrays num1
and num2
, return an array of their intersection. Each element in the result must be unique and you may return in any order.
The range of values from the two arrays could be very large and scattered, while it requires the uniqueness, it only left us with
unordered_set
vector<int> intersection(vector<int>& nums1, vector<int>& nums2) {
unordered_set<int> result;
unordered_set<int> nums1Set(nums1.begin(), nums1.end());
for(int i : nums2)
{
if(nums1Set.find(i) != nums1Set.end())
result.emplace(i);
}
return vector<int>(result.begin(), result.end());
}
202 Happy Number
Write an algorithm to determine if a number n
is happy.
A happy number is a number defined by the following process:
- Starting with any positive integer, replace the number by the sum of the squares of its digits.
- Repeat the process until the number equals 1 (where it will stay), or it loops endlessly in a cycle which does not include 1.
- Those numbers for which this process ends in 1 are happy.
Return true
if n
is a happy number, and false
if not.
The question gives the tip that if a number is not happy, it will loop endlessly, so we just need to know if the sum appears before. so
unordered_set
records the sums along the calculations
int getSum(int n)
{
int sum = 0;
while(n)
{
sum += (n % 10) * (n % 10);
n /= 10;
}
return sum;
}
bool isHappy(int n)
{
unordered_set<int> sums;
int sum = getSum(n);
while(sum != 1)
{
if(sums.find(sum) != sums.end())
{
return false;
}
sums.emplace(sum);
sum = getSum(sum);
}
return true;
}
1 Tow Sum
Given an array of integers nums
and an integer target
, return indices of the two numbers such that they add up to target
.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
You can return the answer in any order.
/*
* iterate the array, get the remain number of each element, if that number
* was in the array, return the pair
* Otherwise, record current element with its index for future returning.
*/
vector<int> twoSum(vector<int>& nums, int target) {
unordered_map<int, int> map;
for(int i = 0; i < nums.size(); ++i)
{
if(auto itr = map.find(target - nums[i]); itr != map.end())
{
return {itr->second, i};
}
map.emplace(nums[i], i);
}
return {-1, -1};
}
454 4Sum II
Given four integer arrays nums1
, nums2
, nums3
, and nums4
all of length n
, return the number of tuples (i, j, k, l)
such that:
0 <= i, j, k, l < n
nums1[i] + nums2[j] + nums3[k] + nums4[l] == 0
Steps
- using an
unordered_map
records the sum from an element fromnums1
and another fromnums2
as key, and the occurrence of the sum as value - Iterating
nums3
andnums4
, find if0 - nums3[i] - nums4[j]
exists in the map.
int fourSumCount(vector<int>& nums1, vector<int>& nums2, vector<int>& nums3, vector<int>& nums4) {
unordered_map<int, int> sum12;
for(int i : nums1)
for(int j : nums2)
++sum12[i+j];
int count = 0;
for(int i : nums3)
{
for(int j : nums4)
{
auto iter = sum12.find(0 - i - j);
if(iter != sum12.end())
{
count += iter->second;
}
}
}
return count;
}
Hash alike questions
15 3Sum
Given an integer array nums
, return all the triplets [nums[i], nums[j], nums[k]]
such that i != j
, i != k
, and j != k
, and nums[i] + nums[j] + nums[k] == 0
.
Notice that the solution set must not contain duplicate triplets.
[!caution] It's easy to be mislead to use hash table to record the sum of first two numbers.
But the question requires unique triplets, which could cost checking duplication every time we find a new combination. As a result, we better solve this by dual pointers.
- Since it doesn't require the indexes of the elements, we can
sort
the array firstly - we have i starts from
0
towardsnums.size() - 3
- make sure that
nums[i]
is different fromnums[i-1]
to prevent duplication
- make sure that
- we have
left = i+1; right = nums.size() - 1
- Keeps testing if
nums[i] + nums[left] + nums[right] == 0
- if equals, that means we find a unique result, record it, then moves
left
andright
till unidentical value - if greater than 0,
--right
- if lesser than 0,
++left
- if equals, that means we find a unique result, record it, then moves
vector<vector<int>> threeSum(vector<int>& nums) {
vector<vector<int>> res;
sort(nums.begin(), nums.end());
for(int i = 0; i < nums.size() - 2; ++i)
{
// remove duplication of i
if(i > 0 && nums[i - 1]==nums[i])
continue;
int left = i + 1, right = nums.size() - 1;
while(left < right)
{
int sum = nums[i] + nums[left] + nums[right];
if(sum == 0)
{
res.push_back({nums[i], nums[left], nums[right]});
// remove duplications of left and right
while (right > left && nums[right] == nums[right - 1]) --right;
while (right > left && nums[left] == nums[left + 1]) ++left;
--right;
++left;
}
else if(sum < 0)
++left;
else
--right;
}
}
return res;
}
18 4Sum
Given an array nums
of n
integers, return an array of all the unique quadruplets [nums[a], nums[b], nums[c], nums[d]]
such that:
0 <= a, b, c, d < n
-
a
,b
,c
, andd
are distinct. nums[a] + nums[b] + nums[c] + nums[d] == target
You may return the answer in any order.
[!Tip] Still use dual pointers as above one
vector<vector<int>> fourSum(vector<int>& nums, int target) {
vector<vector<int>> res;
sort(nums.begin(), nums.end());
// be careful size() return uint, should convert to int for minus
int size = nums.size();
for(int i = 0; i < size - 3; ++i)
{
// pruning
if(nums[i] > target && target >= 0)
break;
// avoid duplication
if(i > 0 && nums[i] == nums[i - 1])
continue;
for(int j = i + 1; j < size - 2; ++j)
{
// pruning
if(nums[i] + nums[j] > target && target >= 0)
break;
// avoiding duplication
if(j > i + 1 && nums[j] == nums[j - 1])
continue;
int left = j + 1, right = size - 1;
while(left < right)
{
// sum may be out of int's range
long sum = (long)nums[i] + nums[j] + nums[left] + nums[right];
if(sum == target)
{
res.push_back({nums[i], nums[j], nums[left], nums[right]});
// avoiding duplication
while(left < right && nums[right] == nums[right - 1]) --right;
while(left < right && nums[left] == nums[left + 1]) ++left;
--right;
++left;
}
else if(sum < target)
++left;
else
--right;
}
}
}
return res;
}
Top comments (0)