One in a thousand

Say that one in every $n$ people has some rare, special quality. (For argument’s sake I’ll use albinism – but it works for anything, visible or not, and it doesn’t even have to be people.) After you meet $n$ people, what is the chance that you’ve met an albino?

You could be tempted to say it’s a near certainty. After all, if one in a thousand people are albinos, you expect to have met one after 1000 people, two after 2000 people, and so on…

But in truth, the answer isn’t 1: it’s around 63%, or $1 - 1/e$.

Surprising? What if you were to then meet a further $n$ people? Then you’re quite a lot more likely to have seen an albino – 86% or $1 - 1/e^2$ – but still not guaranteed. Only after meeting $3n$ people does the likelihood reach 95% – so you can safely say you’ve “probably” met one.

The maths behind this is below; it’s not hard, but not terribly gripping either. But it does (sort of) give the lie to a lot of casual remarks you might make, or hear in advertising campaigns.

For example, you may surf ten websites without ensuring yourself a malware infection (especially if you choose wisely). And if you have three friends, it doesn’t have to be true that one of them will experience cancer.

But in a crowded room, it can still be sobering to choose your favourite statistic and map it on to your fellows.

– – –

The probability of one randomly chosen person being albino is $1/n$, so the probability that they aren’t is $(1 - 1/n)$.

Drawing $n$ people from a world population of 6 billion, we can assume the composition of the pool isn’t changed by each draw – and so we treat the draws as independent, each with the same probability. After $an$ draws (that is, $a$ batches of $n$ people), the probability that none are albino is $(1 - 1/n)^{an}$.

Thus, the likelihood of finding one or more is $P_{n,a} = 1 - (1 - 1/n)^{an}$.
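As a quick sanity check, the formula can be evaluated directly and compared against a small simulation. This is only a sketch: the function names `p_at_least_one` and `simulate`, the choice $n = 1000$ and the number of trials are my own illustrative assumptions, not anything special.

```python
import random


def p_at_least_one(n, a):
    """Exact probability of meeting one or more 'special' people in a*n draws."""
    return 1 - (1 - 1 / n) ** (a * n)


def simulate(n, a, trials, seed=0):
    """Estimate the same probability by repeated random draws (illustrative)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(trials):
        # any() stops at the first "special" person found in this trial
        if any(rng.random() < 1 / n for _ in range(a * n)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n = 1000  # assumed rarity: one in a thousand
    for a in (1, 2, 3):
        exact = p_at_least_one(n, a)
        approx = simulate(n, a, trials=2000)
        print(f"a={a}: formula {exact:.4f}, simulated {approx:.4f}")
```

With these settings the simulated frequencies should sit within a few percentage points of the formula, near 63%, 86% and 95% respectively.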

As $n$ increases, this converges quickly to a limiting value, which we find by taking the logarithm and then letting $n \to \infty$:

$\ln (1 - P_{n,a}) = an \ln (1 - 1/n) \approx an [-1/n] = -a$

where we use the approximation $\ln(1+\delta) \approx \delta$ for small $\delta$, here with $\delta = -1/n$. Hence, for large $n$, $P_{n,a} \approx 1 - e^{-a}$.
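To get a feel for how quickly the exact value settles onto the limit, here is a rough numerical comparison. The helper names `p_exact` and `p_limit` and the particular values of $n$ are just assumptions for illustration.

```python
import math


def p_exact(n, a):
    """Exact probability 1 - (1 - 1/n)^(a*n)."""
    return 1 - (1 - 1 / n) ** (a * n)


def p_limit(a):
    """Large-n limit 1 - e^(-a)."""
    return 1 - math.exp(-a)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10**6):
        row = ", ".join(f"a={a}: {p_exact(n, a):.4f}" for a in (1, 2, 3))
        print(f"n={n:>7}  {row}")
    limits = ", ".join(f"a={a}: {p_limit(a):.4f}" for a in (1, 2, 3))
    print(f"limit      {limits}")
```

The exact value is always slightly above the limit, because $(1 - 1/n)^n$ is a little smaller than $1/e$ for any finite $n$; by $n = 1000$ the two agree to about three decimal places.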