Posts Tagged ‘probabilistic method’

Bleg: What’s the most recent day no one alive was born

January 5, 2010

Inspired by Michael Lugo’s post on reconstructing a person from their DOB, zipcode, and gender.

If you, for whatever reason, ever watch the Today show, you’ll notice that one of the recurring features is the hosts listing the names of some men and women who are turning 100. Becoming a centenarian is a reasonably big accomplishment — in the U.S., it nets you a congratulatory letter from the President, for example. But if you look into it, you’ll notice that you can find someone turning 100 on pretty much any given day. Usually not someone particularly well-known, but certainly someone. (I tried to find someone famous and vaguely math-related who just turned or is turning 100 for this post, but couldn’t; however, the fascinating economist Ronald Coase turned 99 last week.) It’s almost certainly true that on any given day, someone somewhere in the world is in fact celebrating their 100th birthday. But go ten years further, and you find almost no one who lives to 110. Actually, I know of only one supercentenarian, living or not, who is interesting for reasons apart from his longevity — the late Vietoris, the topologist, probably best known as half of the Vietoris-Rips complex and the Mayer-Vietoris sequence. Odds are pretty good that no one alive is turning 110 today, or tomorrow, or (sadly) New Years’ Day.

So… a question is starting to take shape. On every day between December 29, 1909, and today, someone was born who is still living today. But much earlier than that, and the above statement begins to be false. So what’s the most recent day that no one living was born on?

The coupon collectors’ problem

December 28, 2009

This is a considerably lower-level post than usual, which I’ll (following Terry Tao) also blame on the holidays; there’s another, even less mathematical post in the works which I hope to finish sometime tomorrow.

How many times do you need to flip a coin before you expect to see both heads and tails? How many times do you need to roll a die before you expect to see all the numbers 1-6? These are two instances of the coupon collectors’ problem. Wikipedia gives not one, but two nice solutions to the problem, but there’s an even nicer “back-of-the-envelope” calculation which gives you the correct asymptotics for virtually nothing, and (I like to think) shows the power of thinking “categorically” at even a very low level.

So let’s give a statement of the problem. A company — say Coca-Cola, for concreteness — is holding a contest where everyone who collects one each of n different “coupons” wins some prize. You get a coupon with each purchase of a Coke, and each coupon is equally likely. What’s the expected number of Cokes you have to buy in order to collect all the coupons?

If you do some experimentation (or calculation) with small instances, you’ll see that this number seems to be growing somewhat faster than n. For n = 2, for example, the expected number is 3, and for n = 3 it’s 7. But how much faster? Like $n^2 \text{?}~n \sqrt{n}$? Or just a constant times n?

None of the above, as it happens, and you might have already guessed (or known) that the correct order of growth is $O(n~log~n)$. Here’s how you can figure this out for yourself.

Think of the collection as a function from Coke bottles to (equivalence classes of) coupons. If we’ve collected all the coupons, the function is surjective. So we can rephrase “What is the probability that, after I buy m Coke bottles, I have collected all n coupons” as “What is the probability that a random function from a set with m elements to a set with n elements is surjective?” Actually, we’ll estimate the probability that it’s not surjective.

If the function isn’t surjective, then its image contains at most n-1 elements. Fixing n-1 elements, the probability that our random function takes some element of the domain to this subset is $\frac{n-1}{n}$. Since the appropriate events are all independent, the probability that the random function takes every element to the subset is therefore $(\frac{n-1}{n})^m$.

Now there are n possibilities for the subset of size n-1, so we apply the union bound and say that the probability that our random function is not surjective is at most $n (\frac{n-1}{n})^m$. (Of course, this is an upper bound, and there is an error term; but we’ll return to that in a bit.)

So we want this expression to be smaller than, say, 1/10, which means that $(\frac{n-1}{n})^m = 1/10n$. But when n is large, we have that $(1 - \frac{1}{n})^n$ is about 1/e, so m has to be on the order of $ln~n$!

Now we’ll backtrack a bit. How do we know that the union bound was reasonably tight? After all, we counted functions whose image had size n-2 twice! Well, if you go back through the analysis and do inclusion-exclusion, you’ll see that the probability winds up being close to 1 when $m << n log n$ — but I don’t know of a computation-free way to argue that $O(n log n)$ is asymptotically right! Does anyone else?

So how is this “categorical thinking?” Well, it’s not, really. Category theory only really starts to get mildly interesting when you talk about functors, and doesn’t come into its own right until natural transformations are introduced. But if you’ve learned to think categorically, you see morphisms where other people see objects — in this case, a function where others might see a set — and while this is rarely enough to apply abstract-nonsense tools, it is enough to broaden your intuition and see paths you might have otherwise missed. And this is at least as useful.

GILA 4: The pseudorandom case

September 19, 2009

So in the last post, we defined the discrete Fourier transform and gave some of its basic properties. At the end, we claimed that it gives us a simple notion of pseudorandomness that allows us to make rigorous the intuition that “pseudorandom subsets of $\mathbb{Z}/N\mathbb{Z}$ should have many arithmetic progressions.” Today we’re going to justify this notion of pseudorandomness, and work through this — the easy case — of the proof of Roth’s theorem for $\mathbb{Z}/N\mathbb{Z}$. At the end, we’ll sketch how to modify the method for the regular, finitary version of the theorem.

GILA 2: The probabilistic approach

August 27, 2009

To begin, a historical curiosity. In the last post, I talked about Khinchin’s little book on number theory, which from today’s perspective is one of the earliest books to deal entirely with additive combinatorics and additive number theory. One of Khinchin’s most famous results has to do with the denominators of continued fractions; Khinchin’s original proof of the theorem was quite complicated and involved. Nowadays, the existence of Khinchin’s constant is understood as a simple consequence of the ergodicity of the Gauss-Kuzmin-Wirsing operator.

Where else do we see ergodic theory popping up to simplify complicated proofs? Why, in additive combinatorics! Although we won’t discuss it in this series, ergodic theory has become a central tool for Szemeredi-type theorems, and some results are still only provable by such methods (for instance, until recently, the density Hales-Jewett theorem was in this class.)

To top it all off, note that ergodic theory was originally motivated by problems in statistical physics. Guess who wrote one of the best-known textbooks on mathematical statistical physics? But on to the math!

We’ve seen that traditional arguments based on the pigeonhole principle and such aren’t sufficient to prove Roth’s theorem. So we’ll go to the next tool in our kit — the probabilistic method.

The probabilistic method is one of the most useful weapons in a combinatorialist’s arsenal, even having a whole textbook devoted to it. (The linked page is out of date; a third edition was published last year.) The core of the method is this — if we want to show the existence of an object with a given property, instead of constructing one directly, we instead show that a randomly chosen large enough object has the property with positive probability. Its best-known use in Ramsey theory is for proving lower bounds, where it often gives the best known results, but clever applications can give upper bounds as well, for instance in the proof of Kraft’s inequality. Of particular interest to us is the fact that it provides our first nontrivial proof in the direction of Roth’s theorem.