Skip to main content

Find Median from Data Stream

How to keep the median ready by splitting the lower half and upper half with two heap structures.

Implement a structure that supports:

  • addNum(num)
  • findMedian()

The structure receives numbers over time and needs to answer the median efficiently.

In the original problem, this usually appears as a MedianFinder class.

In the SeniorPath playground, use:

  • operations: list of operations
  • values: arguments for each operation

And return an array with the answers. Operations without a return value should return null.

Example:

Input:
  operations = ["MedianFinder","addNum","addNum","findMedian","addNum","findMedian"]
  values     = [null,[1],[2],null,[3],null]

Output:
  [null,null,null,1.5,null,2]

What to notice before coding

Common mistakes

Step 1 - Understand what the median really looks at

The median does not want "the entire list nicely sorted".

It wants:

  • the middle element
  • or the average of the two middle elements

So the core idea is keeping two halves separate.

Step 2 - Write the most direct baseline

The natural baseline is:

  • store all numbers in an array
  • on every addNum, insert and sort everything
  • on findMedian, read the middle

It works.

But it re-sorts more than necessary.

Step 3 - Ask which state really needs to stay ready

If the median splits the list into two halves, it makes sense to keep:

  • the lower half
  • the upper half

And only guarantee two rules:

  • almost equal sizes
  • everything on the left is less than or equal to everything on the right

Step 4 - Turn the halves into heaps

The strong structure looks like this:

  • max-heap for the lower half
  • min-heap for the upper half

That gives you quick access to:

  • the largest value on the left
  • the smallest value on the right

And the median comes from there.

The interactive editor needs JavaScript. You can still read the prompt and copy the starter code below.

Starter code

export function runMedianFinder(operations, values) {
  // operations example: ["MedianFinder", "addNum", "findMedian"]
  // values example: [null, [3], null]
  // your solution here
  return []
}
solution.ts
Stuck? Hints available.

Haven't solved it yet?

Viewing the solution now may reduce learning. It's worth trying a bit more.

Open the reference solution

Without JavaScript, the reference solution is shown inline instead of in a dialog.

Solution

Final complexity

Time: O(log n)

Space: O(n)

Solution 1 - Sorted array after every insertion

Good baseline because it makes the median goal very visible.

export function runMedianFinder(operations, values) {
  const numbers = []
  const output = []

  for (let index = 0; index < operations.length; index += 1) {
    const operation = operations[index]
    const args = values[index]

    if (operation === "MedianFinder") {
      output.push(null)
      continue
    }

    if (operation === "addNum") {
      numbers.push(args[0])
      numbers.sort((a, b) => a - b)
      output.push(null)
      continue
    }

    const middle = Math.floor(numbers.length / 2)

    if (numbers.length % 2 === 1) {
      output.push(numbers[middle])
    } else {
      output.push((numbers[middle - 1] + numbers[middle]) / 2)
    }
  }

  return output
}

It works, but re-sorting everything after each insertion does not scale well.

Solution 2 - Two balanced heaps with O(log n) insertion

Now the structure keeps the answer ready without global reordering.

class Heap {
  constructor(compare) {
    this.compare = compare
    this.data = []
  }

  peek() {
    return this.data[0] ?? null
  }

  size() {
    return this.data.length
  }

  push(value) {
    this.data.push(value)
    this.bubbleUp(this.data.length - 1)
  }

  pop() {
    if (this.data.length === 0) {
      return null
    }

    const top = this.data[0]
    const last = this.data.pop()

    if (this.data.length > 0) {
      this.data[0] = last
      this.bubbleDown(0)
    }

    return top
  }

  bubbleUp(index) {
    while (index > 0) {
      const parent = Math.floor((index - 1) / 2)

      if (this.compare(this.data[index], this.data[parent]) >= 0) {
        break
      }

      ;[this.data[index], this.data[parent]] = [this.data[parent], this.data[index]]
      index = parent
    }
  }

  bubbleDown(index) {
    const length = this.data.length

    while (true) {
      let best = index
      const left = index * 2 + 1
      const right = index * 2 + 2

      if (left < length && this.compare(this.data[left], this.data[best]) < 0) {
        best = left
      }

      if (right < length && this.compare(this.data[right], this.data[best]) < 0) {
        best = right
      }

      if (best === index) {
        break
      }

      ;[this.data[index], this.data[best]] = [this.data[best], this.data[index]]
      index = best
    }
  }
}

export function runMedianFinder(operations, values) {
  const lower = new Heap((a, b) => b - a)
  const upper = new Heap((a, b) => a - b)
  const output = []

  function rebalance() {
    if (lower.size() > upper.size() + 1) {
      upper.push(lower.pop())
    } else if (upper.size() > lower.size()) {
      lower.push(upper.pop())
    }
  }

  for (let index = 0; index < operations.length; index += 1) {
    const operation = operations[index]
    const args = values[index]

    if (operation === "MedianFinder") {
      output.push(null)
      continue
    }

    if (operation === "addNum") {
      const value = args[0]

      if (lower.peek() === null || value <= lower.peek()) {
        lower.push(value)
      } else {
        upper.push(value)
      }

      rebalance()
      output.push(null)
      continue
    }

    if (lower.size() === upper.size()) {
      output.push((lower.peek() + upper.peek()) / 2)
    } else {
      output.push(lower.peek())
    }
  }

  return output
}

The strong insight here is thinking in balanced halves, not in one global list, while each insertion stays in O(log n).

What to say in the interview

A strong short explanation would be:

The baseline sorts everything again after each insertion. The stronger version splits the numbers into two halves: a max-heap for the lower half and a min-heap for the upper half. I keep the heaps balanced and guarantee that everything on the left stays less than or equal to everything on the right. The median comes from their tops, and each insertion stays in O(log n).

That shows that you:

  • modeled the median as a boundary between halves
  • chose the right structure for each side
  • explained why rebalancing is part of the rule

When this pattern shows up again

This pattern comes back when you need to:

  • maintain an online statistic without recomputing everything
  • split data into a lower half and an upper half
  • use heaps for dynamic boundaries
  • design a structure that preserves invariants after every operation

The point is not memorizing MedianFinder.

The point is learning how to keep the right answer ready while the data keeps arriving.

Next challenge Longest Increasing Subsequence Previous challenge Serialize and Deserialize Binary Tree

Related challenges

Related articles