DEV Community ๐Ÿ‘ฉโ€๐Ÿ’ป๐Ÿ‘จโ€๐Ÿ’ป

DEV Community ๐Ÿ‘ฉโ€๐Ÿ’ป๐Ÿ‘จโ€๐Ÿ’ป is a community of 964,423 amazing developers

We're a place where coders share, stay up-to-date and grow their careers.

Create account Log in
Ayan Banerjee
Ayan Banerjee

Posted on • Originally published at csposts.com on

Interview Problem: Top k Frequent Words

Using Custom Comparator in Priority Queue

A priority queue is a queue where the elements stay in a certain sorted order. We can also provide a Compare type for custom ordering.

Defined in header queue

template<
    class T,
    class Container = std::vector<T>,
    class Compare = std::less<typename Container::value_type>
> class priority_queue;
Enter fullscreen mode Exit fullscreen mode

The compare type should provide strict weak ordering, that is, it should return true if the first argument comes before the second argument.

Since the priority queue outputs the largest elements first, the elements that โ€œcome beforeโ€ are actually output last. That is, the front of the queue contains the โ€œlastโ€ element according to the weak ordering imposed by Compare.

By default in priority queue, the compare is less that is the largest element that appears at the top. Using greater we can make the smallest element appear at the top.

#include <bits/stdc++.h>
using namespace std;

template<typename T>
void printPriorityQueue(T &pq) {
    while(!pq.empty()) {
        cout << pq.top() << " ";
        pq.pop();
    }
    cout << endl;
}


int main() {
    priority_queue<int> pq1;
    for (int x: {8, 0, 1, 2, 3, 7}) {
        pq1.push(x);
    }
    printPriorityQueue(pq1); // prints "8 7 3 2 1 0"
    priority_queue<int, vector<int>, greater<int>> pq2;
    for (int x: {8, 0, 1, 2, 3, 7}) {
        pq2.push(x);
    }
    printPriorityQueue(pq2); // prints "0 1 2 3 7 8"
    return 0;
}
Enter fullscreen mode Exit fullscreen mode

Top k Frequent Words

Given a list of words, return the k most frequent elements. The answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first. k is always valid, that is 1 <= k <= number of unique words.

For example, for the given list and k = 2,

["hello", "world", "hello", "earth", "planet", "world"]
Enter fullscreen mode Exit fullscreen mode

Return ["hello", "world"] as they have the highest frequency and sorted according to alphabetical order.

Approach 1: Sorting

In a hashmap, keep the count of the words and then push the words along with their counts in a list and then sort the list. At last, return the first k elements.

vector<string> topKFrequent(vector<string>& words, int k) {
    unordered_map<string, int> frequency;
    for (string &word: words) {
        ++frequency[word];
    }
    // word, frequency
    vector<pair<string, int>> wordFreq;
    for (const auto &[word, freq]: frequency) {
        wordFreq.push_back({word, freq});
    }
    auto comp = [](const pair<string, int> &a, const pair<string, int> &b) {
        return a.second > b.second || (a.second == b.second && a.first < b.first);  
    };
    sort(wordFreq.begin(), wordFreq.end(), comp);
    vector<string> ret;
    for (int i = 0; i < k; ++i) {
        ret.push_back(wordFreq[i].first);
    }
    return ret;
}
Enter fullscreen mode Exit fullscreen mode

Time complexity: O(NlogN) due to sorting

Space complexity: O(N)

Approach 2: Priority Queue

Another efficient approach is using a priority queue. In the priority queue, we will always maintain the top k frequent words. Note the comp function below. If the priority queue size exceeds k, we will pop the topmost element (as it has the lowest count).

vector<string> topKFrequent(vector<string>& words, int k) {
    unordered_map<string, int> frequency;
    for (string &word: words) {
        ++frequency[word];
    }

    auto comp = [](const pair<string, int> &a, const pair<string, int> &b) {
        return a.second > b.second || (a.second == b.second && a.first < b.first);  
    };

    priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(comp)> wordFreq (comp);

    for (const auto &[word, freq]: frequency) {
        wordFreq.push({word, freq});
        if (wordFreq.size() > k) wordFreq.pop();
    }

    vector<string> ret;
    while(!wordFreq.empty()) {
        ret.push_back(wordFreq.top().first);
        wordFreq.pop();
    }
    reverse(ret.begin(), ret.end());
    return ret;
}
Enter fullscreen mode Exit fullscreen mode

Time complexity O(Nlogk) as the priority queue size never exceeds k.

Space complexity O(N).

Reference

Top comments (0)

Update Your DEV Experience Level:

Settings

Go to your customization settings to nudge your home feed to show content more relevant to your developer experience level. ๐Ÿ›