Find Median from Data Stream
Find Median from Data Stream: The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
- For example, for
arr = [2,3,4]
, the median is3
. - For example, for
arr = [2,3]
, the median is(2 + 3) / 2 = 2.5
.
Implement the MedianFinder class:
MedianFinder()
initializes theMedianFinder
object.void addNum(int num)
adds the integernum
from the data stream to the data structure.double findMedian()
returns the median of all elements so far. Answers within10^-5
of the actual answer will be accepted.
Example 1:
Input
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output
[null, null, null, 1.5, null, 2.0]
Explanation
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1); // arr = [1]
medianFinder.addNum(2); // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3); // arr[1, 2, 3]
medianFinder.findMedian(); // return 2.0
Constraints:
-10^5 <= num <= 10^5
- There will be at least one element in the data structure before calling
findMedian
. - At most
5 * 10^4
calls will be made toaddNum
andfindMedian
.
Try this Problem on your own or check similar problems:
Solution:
- Java
- JavaScript
- Python
- C++
class MedianFinder {
PriorityQueue<Integer> min, max;
public MedianFinder() {
min = new PriorityQueue<>();
max = new PriorityQueue<>(Collections.reverseOrder());
}
public void addNum(int num) {
max.add(num);
if(max.size() > min.size() + 1){
min.add(max.poll());
}
if(!min.isEmpty() && min.peek() < max.peek()){
int temp = min.poll();
min.add(max.poll());
max.add(temp);
}
}
public double findMedian() {
if(max.size() > min.size()) return (double) max.peek();
return (max.peek() + min.peek()) / 2.0;
}
}
/**
* Your MedianFinder object will be instantiated and called as such:
* MedianFinder obj = new MedianFinder();
* obj.addNum(num);
* double param_2 = obj.findMedian();
*/
var MedianFinder = function () {
this.min = new MinPriorityQueue();
this.max = new MaxPriorityQueue();
};
/**
* @param {number} num
* @return {void}
*/
MedianFinder.prototype.addNum = function (num) {
this.max.enqueue(num);
if (this.max.size() > this.min.size() + 1) {
this.min.enqueue(this.max.dequeue().element);
}
if (
!this.min.isEmpty() &&
this.min.front().element < this.max.front().element
) {
const temp = this.min.dequeue().element;
this.min.enqueue(this.max.dequeue().element);
this.max.enqueue(temp);
}
};
/**
* @return {number}
*/
MedianFinder.prototype.findMedian = function () {
if (this.max.size() > this.min.size()) return this.max.front().element;
return (this.max.front().element + this.min.front().element) / 2.0;
};
/**
* Your MedianFinder object will be instantiated and called as such:
* var obj = new MedianFinder()
* obj.addNum(num)
* var param_2 = obj.findMedian()
*/
class MedianFinder:
def __init__(self):
self.min = []
self.max = []
def addNum(self, num: int) -> None:
heapq.heappush(self.max, -num)
if len(self.max) > len(self.min) + 1:
heapq.heappush(self.min, -heapq.heappop(self.max))
if self.min and self.min[0] < -self.max[0]:
temp = heapq.heappop(self.min)
heapq.heappush(self.min, -heapq.heappop(self.max))
heapq.heappush(self.max, -temp)
def findMedian(self) -> float:
if len(self.max) > len(self.min):
return -self.max[0]
return (-self.max[0] + self.min[0]) / 2
# Your MedianFinder object will be instantiated and called as such:
# obj = MedianFinder()
# obj.addNum(num)
# param_2 = obj.findMedian()
class MedianFinder {
public:
MedianFinder() {
}
void addNum(int num) {
maxHeap.push(num);
if (maxHeap.size() > minHeap.size() + 1) {
minHeap.push(maxHeap.top());
maxHeap.pop();
}
if (!minHeap.empty() && minHeap.top() < maxHeap.top()) {
int temp = minHeap.top();
minHeap.pop();
minHeap.push(maxHeap.top());
maxHeap.pop();
maxHeap.push(temp);
}
}
double findMedian() {
if (maxHeap.size() > minHeap.size()) {
return maxHeap.top();
}
return (maxHeap.top() + minHeap.top()) / 2.0;
}
private:
std::priority_queue<int> maxHeap;
std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
};
/**
* Your MedianFinder object will be instantiated and called as such:
* MedianFinder* obj = new MedianFinder();
* obj->addNum(num);
* double param_2 = obj->findMedian();
*/
Time/Space Complexity:
- Time Complexity: O(Wlogn + R) where
W
is the number of insertion/write operations, whileR
is the number of times we callfindMedian
operation - Space Complexity: O(n)
Explanation:
We maintain two heaps min
and max
while respecting the following rules:
min
will always contain half with larger elements coming from the input array/stream with and the minimum of those will be at the top.max
will contain half with smaller elements coming from the array/stream, the maximum of which will be at the top.min
andmax
should be of roughly the same size, withmax
always being the larger heap if there is odd number of elements in the stream.max.size() > min.size() + 1
- The element at top of
max
should always be smaller (or equal) to the element at top ofmin
heap (this way we can break the stream into larger and smaller elements groups)!min.isEmpty() && min.peek() < max.peek())
.
Doing it this way we can get median in O(1)
since we can just pick the largest element of smaller candidates at the top of max
heap if there is odd number of elements in the stream, or average of top elements of both heaps if the number of elements is even. We still need to do heap operations leading to O(logn)
time complexity for each insertion query and constant time for median query.