Inspired by Michael Lugo’s post on reconstructing a person from their DOB, zipcode, and gender.
If you, for whatever reason, ever watch the Today show, you’ll notice that one of the recurring features is the hosts listing the names of some men and women who are turning 100. Becoming a centenarian is a reasonably big accomplishment: in the U.S., for example, it nets you a congratulatory letter from the President. But if you look into it, you’ll notice that you can find someone turning 100 on pretty much any given day. Usually not someone particularly well-known, but certainly someone. (I tried to find someone famous and vaguely math-related who just turned or is turning 100 for this post, but couldn’t; however, the fascinating economist Ronald Coase turned 99 last week.) It’s almost certainly true that on any given day, someone somewhere in the world is in fact celebrating their 100th birthday. But go ten years further, and you find almost no one who lives to 110. Actually, I know of only one supercentenarian, living or not, who is interesting for reasons apart from his longevity: the late Leopold Vietoris, the topologist, probably best known as half of the Vietoris-Rips complex and the Mayer-Vietoris sequence. Odds are pretty good that no one alive is turning 110 today, or tomorrow, or (sadly) New Year’s Day.
So… a question is starting to take shape. On every day between December 29, 1909, and today, someone was born who is still living today. But go much earlier than that, and the statement stops being true. So what’s the most recent day on which no one now living was born?
Unfortunately the question seems impossible to answer precisely — even today in some countries there’s no reliable system to record births and demographic data, and 100 years ago the situation was far worse. But we can certainly try to make an educated guess, or at least think about how we’d go about trying to make an educated guess!
So we’ll start off with our simplifying assumptions. First of all, in the analysis we’re going to have to consider the probability that a person who’s lived to N days will live to N+1 days. (Or a coarser version of this statistic.) Obviously this probability is different for each person — a 104-year-old in Bangladesh with a terminal disease and without access to good medical care has a much shorter life expectancy than a healthy person of the same age in, say, New Jersey. But this complicates matters hugely, so we’ll say that for each person the probability is the same.
In addition, we’ll assume that the birth rate was constant between, say, 1900 and 1910, and that someone born in 1902 had the same probability of living to 100 as someone born in 1909. Again, these are simplifying assumptions.
So once you’ve made these assumptions, you end up with a sequence of random variables $X_N$ that describe how many people who have lived exactly $N$ days are still living. Under the above (unrealistic) assumptions, $X_N$ is approximately binomially distributed, with the success probability decreasing exponentially with $N$.
So the first estimate we can make is to see when the expected value $E[X_N]$ drops below 1. But this isn’t all that good, since the first $N$ with $X_N = 0$ probably came way earlier, and so a better thing to do is to estimate $\sum_{j \leq N} P(X_j = 0)$, which is an upper bound on the probability that $X_j = 0$ for some $j \leq N$.
But, when the $X_j$ are binomial, this probability is easy: it’s just $P(X_j = 0) = (1 - p^j)^M$ for some fixed probability $p$ (the chance of surviving one more day) and integer $M$ (the number of births per day). So the sum is $\sum_{j \leq N} (1 - p^j)^M$.
As long as this is small, the probability is high that none of the $X_j$ are 0 for $j \leq N$.
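To get a feel for how this sum behaves, here is a minimal Python sketch. It assumes the binomial model above, reading the chance that nobody born $j$ days ago is still alive as $(1 - p^j)^M$, with $p$ a constant daily survival probability and $M$ the number of births per day. The function names, the 5% threshold, and the sample values of $p$ and $M$ are all made up for illustration and are not real demographic data.

```python
import math


def prob_no_survivors(j, p, M):
    """P(X_j = 0) = (1 - p**j)**M, computed through logs for numerical stability."""
    q = p ** j  # chance that one given person survives j days
    if q >= 1.0:
        return 0.0  # everyone survives, so X_j cannot be 0 (for M > 0)
    return math.exp(M * math.log1p(-q))


def union_bound(N, p, M):
    """Sum of P(X_j = 0) over 1 <= j <= N."""
    return sum(prob_no_survivors(j, p, M) for j in range(1, N + 1))


def largest_safe_age(p, M, threshold=0.05, max_days=200_000):
    """Largest N (in days) for which the sum stays at or below the threshold."""
    total = 0.0
    for j in range(1, max_days + 1):
        total += prob_no_survivors(j, p, M)
        if total > threshold:
            return j - 1
    return max_days


if __name__ == "__main__":
    # Hypothetical inputs: 300,000 births per day, daily survival 0.9997.
    days = largest_safe_age(p=0.9997, M=300_000)
    print(f"sum stays below 5% up to about {days / 365.25:.1f} years")
```

Using `log1p` avoids the rounding you would get from raising a number very close to 1 to a huge power directly. The answer is only as good as the constant-$p$ assumption, which is exactly the weak point of the model.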
Can we do better than this? What if we replaced the above model by something more realistic, where the death rate increases as N increases?
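One way to explore that question is to swap the constant daily survival probability for a survival curve whose hazard grows with age. The sketch below assumes a Gompertz hazard $a e^{bt}$, which is one common choice and not something derived in this post. The parameters `a`, `b`, and `M` (births per day) are hypothetical placeholders, and the function names are made up for illustration.

```python
import math


def gompertz_survival(days, a, b):
    """Survival to age `days` under hazard a*exp(b*t):
    S(t) = exp(-(a/b) * (exp(b*t) - 1)).
    Assumes b * days is modest enough that exp does not overflow."""
    return math.exp(-(a / b) * math.expm1(b * days))


def prob_no_survivors(days, a, b, M):
    """P(X_N = 0) when each of M people born that day survives independently."""
    s = gompertz_survival(days, a, b)
    if s >= 1.0:
        return 0.0
    return math.exp(M * math.log1p(-s))


def union_bound(N, a, b, M):
    """Sum of P(X_j = 0) over 1 <= j <= N, as in the constant-rate model."""
    return sum(prob_no_survivors(j, a, b, M) for j in range(1, N + 1))
```

The structure is the same as before: only the per-person survival probability changes, so any other mortality curve could be dropped in for `gompertz_survival`.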
What’s (approximately) the most recent day no one alive was born?