And if so, what’s just a regular ungraded bowl?
(Hat tip to Vladimir S. for the bad pun; new post coming soon, really, I promise.)
[From Homer the Smithers, Mr. Burns sends Smithers on a forced vacation and tasks him with finding a temporary replacement]
Smithers: I’ve got to find a replacement who won’t outshine me. Perhaps if I search the employee evaluations for the word ‘incompetent…’
[Computer beeps, screen displays '714 Matches Found']
Smithers: 714 names?! Heh. Better be more specific. ‘Lazy,’ ‘clumsy,’ ‘dimwitted,’ ‘monstrously ugly.’
[After a couple seconds, computer beeps, screen displays '714 Matches Found']
Smithers: Ah, nuts to this, I’ll just go get Homer Simpson.
Actually, I tried to find the statistical nomenclature for this kind of thing, but couldn’t. Anyone have any idea what this is? (I want to say selection bias, but that’s not quite it…)
New Extremal Toolbox post should be coming later this weekend.
Apparently it’s used to model some password-protected networks, but I’m still pretty sure that the rainbow connection number of a graph can only have been invented (or at least named) as a joke.
Since it’s not yet on Thomas’ blog, and I need to use the data, I thought I’d make it publicly available. All credit goes to Thomas Sauvaget for the sequence-to-HTML code, and to Polymath for the sequence itself. (Probably someone specific discovered it, but I don’t know who.) Tables below the fold.
I’m starting a new series of posts this semester where I get “back to basics.” One of the few areas of mathematics in which I can claim anything even in the same connected component as “expertise” is extremal combinatorics. Unfortunately for me and my lazy, big-picture brain, though, extremal combinatorics is very much a “problem-solving” subject, with a relatively small number of tools that are used to solve all sorts of different problems. So without some practice solving these problems, or expositing the solutions, it’s easy to get rusty.
Hence, “The Extremal Toolbox.” In each post, I’ll take a (solved!) problem in extremal combinatorics — anything from Sperner’s theorem to Kakeya over finite fields, as long as there’s an extremal flavor — and try to break down a proof into its component parts.
Today I’m going to examine a problem which appeared on MathOverflow some time ago, which I didn’t quite solve (but came within epsilon of!). The relevant post is here; if you don’t care to click through, here’s the problem.
Let $A = (a_{ij})$ be an $n \times n$ matrix with non-negative integer entries. Suppose further that if $a_{ij}$ is 0, then the sum of all the entries in the ith row or the jth column is at least $n$. Then the sum of all the entries in $A$ is at least $n^2/2$.
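For very small cases the statement can be sanity-checked by brute force. The sketch below is only an illustration and rests on assumptions not spelled out above: the matrix is n × n, the threshold is n, the claimed bound is n²/2, and "the ith row or the jth column" means the row sum plus the column sum (the shared entry is 0, so nothing is counted twice). The function names are made up for this example. Entries are capped at n, since lowering a larger entry to n never breaks the condition.

```python
from itertools import product

def satisfies_condition(a, n):
    """True if every zero entry a[i][j] has (row i sum) + (column j sum) >= n."""
    row = [sum(r) for r in a]
    col = [sum(a[i][j] for i in range(n)) for j in range(n)]
    return all(a[i][j] != 0 or row[i] + col[j] >= n
               for i in range(n) for j in range(n))

def min_total(n):
    """Smallest possible sum of entries over all n x n matrices meeting the condition."""
    best = None
    for flat in product(range(n + 1), repeat=n * n):
        total = sum(flat)
        if best is not None and total >= best:
            continue  # cannot improve on the best total found so far
        a = [flat[i * n:(i + 1) * n] for i in range(n)]
        if satisfies_condition(a, n):
            best = total
    return best

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, min_total(n), n * n / 2)
```

This only covers n up to 3 (the search space is (n+1)^(n²)), so it is a check on the reading of the problem, not evidence for the general claim.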
Inspired by Michael Lugo’s post on reconstructing a person from their DOB, zipcode, and gender.
If you, for whatever reason, ever watch the Today show, you’ll notice that one of the recurring features is the hosts listing the names of some men and women who are turning 100. Becoming a centenarian is a reasonably big accomplishment: in the U.S., it nets you a congratulatory letter from the President, for example. But if you look into it, you’ll notice that you can find someone turning 100 on pretty much any given day. Usually not someone particularly well-known, but certainly someone. (I tried to find someone famous and vaguely math-related who just turned or is turning 100 for this post, but couldn’t; however, the fascinating economist Ronald Coase turned 99 last week.) It’s almost certainly true that on any given day, someone somewhere in the world is in fact celebrating their 100th birthday. But go ten years further, and you find almost no one who lives to 110. Actually, I know of only one supercentenarian, living or not, who is interesting for reasons apart from his longevity: the late Leopold Vietoris, the topologist, probably best known as half of the Vietoris-Rips complex and the Mayer-Vietoris sequence. Odds are pretty good that no one alive is turning 110 today, or tomorrow, or (sadly) New Year’s Day.
So… a question is starting to take shape. On every day between December 29, 1909, and today, someone was born who is still living today. But go much earlier than that, and the statement starts to fail. So what’s the most recent day that no one living was born on?
This has nothing to do with the rest of the post, but I’ll put it here so you read it before you get bored. I’d like to thank my readers (all seven of you) for supporting this blog in the first six months or so of its existence, and hope that you’ll stick around (and be joined by hundreds of new readers…) to hear my sporadic ramblings and wild ravings in the next year. Here’s to a happy and successful 2010!
Over at MathOverflow, Gjergji Zaimi asks (in a criminally under-voted-for question): How can we obtain global information from local data in graph theory? This is something that perhaps everyone working in or around graph theory has asked themselves, in some form, at some point — I know I have. So it’s not surprising that Gjergji’s question has received many different answers with many different interesting things to say.
I originally wanted to write a post trying to “answer” Gjergji’s question as best I could, but quickly realized the futility of that goal: it’s such a broad and deep question that I doubt anyone could answer it concisely, and I know I couldn’t! So instead I’ll just talk about one aspect of the question: what does it even mean, “local data”?
Sorry about the lack of a new post; it’s coming. It turned out to be a more interesting problem than I at first thought; look for it around New Year’s.
I’m working through some of the holes in my graph knowledge with my shiny new copy of Bollobás’ Modern Graph Theory. Chapter 1, Exercise 19 is a problem I’ve done before, but the way it’s presented makes me want to do it all over again:
Characterize the degree sequences of forests!
Exercise 17 is about the degree sequences of trees, and 18 extends it to forests with a fixed number of components — so this isn’t totally out of the blue. Still, it makes me wonder why more textbooks don’t end problems with exclamation marks.
This is a considerably lower-level post than usual, which I’ll (following Terry Tao) also blame on the holidays; there’s another, even less mathematical post in the works which I hope to finish sometime tomorrow.
How many times do you need to flip a coin before you expect to see both heads and tails? How many times do you need to roll a die before you expect to see all the numbers 1-6? These are two instances of the coupon collector’s problem. Wikipedia gives not one, but two nice solutions to the problem, but there’s an even nicer “back-of-the-envelope” calculation which gives you the correct asymptotics for virtually nothing, and (I like to think) shows the power of thinking “categorically” at even a very low level.
So let’s give a statement of the problem. A company — say Coca-Cola, for concreteness — is holding a contest where everyone who collects one each of n different “coupons” wins some prize. You get a coupon with each purchase of a Coke, and each coupon is equally likely. What’s the expected number of Cokes you have to buy in order to collect all the coupons?
If you do some experimentation (or calculation) with small instances, you’ll see that this number seems to be growing somewhat faster than n. For n = 2, for example, the expected number is 3, and for n = 3 it’s 5.5. But how much faster? Like $n^2$? Or just a constant times n?
None of the above, as it happens, and you might have already guessed (or known) that the correct order of growth is $n \log n$. Here’s how you can figure this out for yourself.
Think of the collection as a function from Coke bottles to (equivalence classes of) coupons. If we’ve collected all the coupons, the function is surjective. So we can rephrase “What is the probability that, after I buy m Coke bottles, I have collected all n coupons” as “What is the probability that a random function from a set with m elements to a set with n elements is surjective?” Actually, we’ll estimate the probability that it’s not surjective.
If the function isn’t surjective, then its image contains at most n-1 elements. Fixing n-1 elements, the probability that our random function takes a given element of the domain to this subset is $(n-1)/n = 1 - 1/n$. Since the appropriate events are all independent, the probability that the random function takes every element to the subset is therefore $(1 - 1/n)^m$.
Now there are n possibilities for the subset of size n-1, so we apply the union bound and say that the probability that our random function is not surjective is at most $n(1 - 1/n)^m$. (Of course, this is an upper bound, and there is an error term; but we’ll return to that in a bit.)
So we want this expression to be smaller than, say, 1/10, which means that $n(1 - 1/n)^m < 1/10$. But when n is large, we have that $(1 - 1/n)^n$ is about 1/e, so $(1 - 1/n)^m$ is about $e^{-m/n}$, and m has to be on the order of $n \log n$!
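To see the threshold numerically, here is a minimal sketch that finds the smallest m for which the union bound n(1 - 1/n)^m falls below 1/10 and compares it with n log n. The function names and the sample values of n are arbitrary choices for illustration.

```python
import math

def union_bound(n: int, m: int) -> float:
    """Upper bound n * (1 - 1/n)^m on the probability that some coupon is still missing."""
    return n * (1 - 1 / n) ** m

def smallest_m(n: int, target: float = 0.1) -> int:
    """Smallest m for which the union bound drops below `target`."""
    m = 1
    while union_bound(n, m) >= target:
        m += 1
    return m

if __name__ == "__main__":
    for n in (10, 100, 1000):
        m = smallest_m(n)
        # Print n, the threshold m, and the ratio m / (n log n).
        print(n, m, round(m / (n * math.log(n)), 2))
```

The ratio approaches 1 only slowly, because with the $e^{-m/n}$ approximation the threshold is really about $n \log(10n) = n \log n + n \log 10$.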
Now we’ll backtrack a bit. How do we know that the union bound was reasonably tight? After all, we counted functions whose image had size n-2 twice! Well, if you go back through the analysis and do inclusion-exclusion, you’ll see that the probability winds up being close to 1 when $m$ is much smaller than $n \log n$. But I don’t know of a computation-free way to argue that $n \log n$ is asymptotically right! Does anyone else?
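This is very much not computation-free, but for small n the inclusion-exclusion sum can simply be evaluated exactly. The sketch below computes the probability that a random function from m elements to n elements is surjective, and also the exact expectation from the standard formula n(1 + 1/2 + … + 1/n), which is not derived in this post. The function names are made up for this example.

```python
from fractions import Fraction
from math import comb, log

def prob_all_collected(n: int, m: int) -> Fraction:
    """Exact P(a uniformly random function from m elements to n elements is surjective)."""
    # Inclusion-exclusion over the set of k values that the function misses.
    return sum(Fraction((-1) ** k * comb(n, k) * (n - k) ** m, n ** m)
               for k in range(n + 1))

def expected_purchases(n: int) -> Fraction:
    """Exact expected number of purchases, n * (1 + 1/2 + ... + 1/n)."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (10, 50, 100):
        scale = n * log(n)
        half, double = int(scale / 2), int(2 * scale)
        print(n,
              round(float(expected_purchases(n)) / scale, 3),   # E / (n log n)
              round(float(1 - prob_all_collected(n, half)), 3),    # P(miss) at m = (n log n)/2
              round(float(1 - prob_all_collected(n, double)), 3))  # P(miss) at m = 2 n log n
```

Comparing the last two columns shows how quickly the probability of missing a coupon moves between the two sides of n log n.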
So how is this “categorical thinking?” Well, it’s not, really. Category theory only really starts to get mildly interesting when you talk about functors, and doesn’t come into its own right until natural transformations are introduced. But if you’ve learned to think categorically, you see morphisms where other people see objects — in this case, a function where others might see a set — and while this is rarely enough to apply abstract-nonsense tools, it is enough to broaden your intuition and see paths you might have otherwise missed. And this is at least as useful.
Galois was of course the first to make highly successful use of the notion of a field. However, if one reads his papers, one sees that he never explicitly gave a name to the concept of an algebraic structure closed under addition, subtraction, commutative multiplication, and division. Dedekind would be the first to do that; he gave the name Körper, or “body,” to what we’d today call a number field. A couple of decades later, E.H. Moore of Chicago would introduce the term “field” in English.
“Körper” caught on fairly quickly among Continental mathematicians, giving us the French corps, and from there it spread to Spanish and Portuguese; in the other direction, the German mutated into Hungarian “test” and Polish “ciało”, both essentially with the same meaning of “body.”
However, in Italian and most of the Slavic languages, the word for “field” is also the agricultural term. This means that the algebraic terminology didn’t solidify until considerably later, probably between the World Wars at earliest. This is understandable; while both Italy and Russia had strong mathematical communities around the turn of the last century, they were somewhat isolated and if nothing else had relatively fewer top-tier algebraists than the French or, especially, the German schools.
What’s really curious is the following: In both Italian and Russian, as I mentioned, the word for English “field” is a literal translation of “field.” In pretty much every language, the word for “ring” can also refer to a thing that you wear on your finger. But in Italian and (several of) the Slavic languages — and in these languages alone, as far as I know — the word for “skew field”, or “division ring”, translates to English as “body”! This seems to me to be a rather exceptional situation — surely either a modification of “ring” or of “field” will do, as in every other language, but it seems not to be the case. So there are two open problems here: