Skip to main content

Top K Frequent Elements


Top K Frequent Elements: Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

Example 1:
Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Example 2:
Input: nums = [1], k = 1
Output: [1]

Constraints:
  • 1 <= nums.length <= 10^5
  • -10^4 <= nums[i] <= 10^4
  • k is in the range [1, the number of unique elements in the array].
  • It is guaranteed that the answer is unique.

Try this Problem on your own or check similar problems:

  1. Word Frequency
  2. Kth Largest Element in an Array
  3. Sort Characters By Frequency
Solution:
public int[] topKFrequent(int[] nums, int k) {
Map<Integer, Long> freq = Arrays.stream(nums).boxed().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
List<Integer>[] buckets = new List[nums.length + 1];

freq.entrySet().stream()
.forEach(entry -> {
int key = entry.getKey();
int frequency = entry.getValue().intValue();
if (buckets[frequency] == null) buckets[frequency] = new ArrayList<>();
buckets[frequency].add(key);
});

int[] result = new int[k];
for(int i = buckets.length - 1, j = 0; i >= 0 && j < k; --i){
if(buckets[i] != null){
int bucketIterator = 0;
while(j < k && bucketIterator < buckets[i].size()){
result[j] = buckets[i].get(bucketIterator);
++j;
++bucketIterator;
}
}
}

return result;
}

Time/Space Complexity:
  • Time Complexity: O(n)
  • Space Complexity: O(n)

Explanation:

There are several solutions to this problem both with pros and cons. The first (and probably the most straightforward) solution would be to use the same logic we did in the K Closest Points To Origin, but this time since we're searching for most frequent (maximum) elements we can use minheap and discard the candidates when the size of the heap is larger than k (everytime we will discard the least frequent number leaving as with k most frequent candidates after we iterated over all elements). The time complexity would be O(nlogk) and space complexity would be O(k) where k is the size of heap. The solution is given below:

public int[] topKFrequent(int[] nums, int k) {
Map<Integer, Long> freq = Arrays.stream(nums).boxed().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
PriorityQueue<Integer> pq = new PriorityQueue<>((a, b) -> (int) (freq.get(a) - freq.get(b)));

freq
.forEach((key, value) ->
{
pq.add(key);
if (pq.size() > k) {
pq.poll();
}
});

return pq.stream().mapToInt(i -> i).toArray();
}

So, we first build the frequency map (just iterate over all elements and group them together counting their occurrence) using the stream api magic. We create the minheap by comparing the frequencies of each element that will be added to the queue. We iterate over the map and add keys (distinct elements) to the heap, discarding the top element if the size of the heap is above k. Finally, we convert the PriorityQueue (our heap) to array since it's the desired output type.

Next solution uses the techniques from quickSort called quickSelect. It's a randomization algorithm that helps us find kth smallest element in array in O(n) (note that in worst case scenario we could have O(n^2) if we choose the rightmost element as our pivot, but with randomization the algorithm becomes stable meaning the chances of producing O(n^2) are relatively small).

class Solution {
public int[] topKFrequent(int[] nums, int k) {
// create a frequency map
Map<Integer, Long> freq = Arrays.stream(nums).boxed().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

// create an array of unique elements and their frequencies
int n = freq.size();
int[][] elements = new int[n][2];
int i = 0;
for (Map.Entry<Integer, Long> entry : freq.entrySet()) {
elements[i][0] = entry.getKey();
elements[i][1] = entry.getValue().intValue();
i++;
}

// sort the array using the quickselect method, note that we're searching for n-k smallest keeping the range of max k elements with it
quickselect(elements, 0, n - 1, n - k);

// create an array to store the top k frequent elements
int[] result = new int[k];
for (i = 0; i < k; i++) {
result[i] = elements[n - k + i][0];
}

// note that we start from n-k since that leaves us with k most frequent element in (partially) sorted array

return result;
}

public void quickselect(int[][] elements, int left, int right, int kSmallest) {
// base case: the array contains only one element
if (left == right) return;

// select a random pivot_index
Random randomNum = new Random();
int pivot = left + randomNum.nextInt(right - left);

// find the pivot position in a sorted array
pivot = partition(elements, left, right, pivot);

// if the pivot is in its final sorted position
if (kSmallest == pivot) {
return;
} else if (kSmallest < pivot) {
// go left, pivot is to large
quickselect(elements, left, pivot - 1, kSmallest);
} else {
// go right
quickselect(elements, pivot + 1, right, kSmallest);
}
}

public int partition(int[][] elements, int left, int right, int pivot) {
int pivotFrequency = elements[pivot][1];
// 1. move pivot to end
swap(elements, pivot, right);
int tempIndex = left;

// 2. move all less frequent elements to the left
for (int i = left; i <= right; i++) {
if (elements[i][1] < pivotFrequency) {
swap(elements, tempIndex, i);
tempIndex++;
}
}

// you can think of tempIndex as writter, while i is iterator starting from left
// if we have a less frequent number we move it to the left side and increment the writter
// otherwise we just keep on looping over the array

// 3. move pivot to its final place
swap(elements, tempIndex, right);

// every less frequent element than pivot is now on the left side of it
// while every more frequent element in on it's right side

return tempIndex;
}

public void swap(int[][] elements, int a, int b) {
int[] tmp = elements[a];
elements[a] = elements[b];
elements[b] = tmp;
}
}

Our next and final solution is bucket sort. We know that maximum frequency a number can have is at most n where n is the length of array, so we create array of buckets where index of bucket determines the frequency of elements in that bucket (e.g. second bucket will contain elements that only appear once since we use 0-index arrays), finally we return the last k elements by traversing the last buckets until we have collected k elements:

public int[] topKFrequent(int[] nums, int k) {
Map<Integer, Long> freq = Arrays.stream(nums).boxed().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
List<Integer>[] buckets = new List[nums.length + 1];

freq.entrySet().stream()
.forEach(entry -> {
int key = entry.getKey();
int frequency = entry.getValue().intValue();
if (buckets[frequency] == null) buckets[frequency] = new ArrayList<>();
buckets[frequency].add(key);
});

int[] result = new int[k];
for(int i = buckets.length - 1, j = 0; i >= 0 && j < k; --i){
if(buckets[i] != null){
int bucketIterator = 0;
while(j < k && bucketIterator < buckets[i].size()){
result[j] = buckets[i].get(bucketIterator);
++j;
++bucketIterator;
}
}
}

return result;
}

This leads to O(n) space and time complexity.