## Exercise

Recall that a **stream** is, informally, a possibly infinite sequence of elements which can be read from one at a time, and can’t be rewound (without using extra memory). That is, it’s an iterator that only has a `next`

method.

Given a stream of length $n$ (where $n$ is not known), uniformly choose a random element from that stream using $O(1)$ space and $O(n)$ time.

## Solution

The following solution, in Common Lisp, uses a list as the representation for our stream. However, any data type could be used as long as we can tell if we are at the end of a stream (i.e., `null`

) and how to get the next element of the stream (i.e., `pop`

).

(defun random! (seq) (labels ((rec (rand-elt count) (if (null seq) rand-elt (let ((next (pop seq))) (rec (if (zerop (random count)) next rand-elt) (1+ count)))))) (rec (pop seq) 2)))

The general strategy is this: as we pull items out of the stream, increment a counter (thereby “labeling” the item), and select that as our “new” item with a probability inversely proportional to the counter.

More precisely: Suppose we are at the $k$th item, then with a probability of $1/k$, choose that item as our new random item $r$. When we have exhausted the stream or we have iterated through $n$ elements, our random item will be $r$.

We can prove that the $k$th item in the stream of length $n$ has a probability of $P(k,n)=1/n$. We do so inductively.

**Base Case.** For a stream of length $n=1$, it is clear that we have a probability $P(1,1)=1$.

**General Case.** Suppose we have read $n-1$ elements, and suppose $r$ is a uniformly random element from the items enumerated by $1$ and $n-1$. The probability that $r$ will be set to the $n$th element is $1/n$, by definition. The chance that $r$ will not be updated is $(n-1)/n$. In this case, the probability that any particular (suppose the $k$th) previous stream element will remain in $r$ is \begin{equation*}

P(k,n-1)\cdot(1-P(n,n)) = \frac{1}{n-1}\cdot\frac{n-1}{n} = \frac{1}{n}.

\end{equation*}

Therefore, the probability of $r$ being updated is the same as the probability any previous element has a chance of being $r$.

By induction, for a stream of length $n$, $P(k,n) = 1/n$.