A Son Named Luigi

Suppose that we choose all families with two children, such that one of them is a son named Luigi. Given that a boy is named Luigi with probability p, what is the probability that the other child is a son?

Here is a potential “solution.” Luigi is a younger brother’s name in one of the most popular video games: Super Mario Bros. Probably the parents loved the game and decided to name their first son Mario and the second Luigi. Hence, if one of the children is named Luigi, then he must be a younger son. The second child is certainly an older son named Mario. So, the answer is 1.

The solution above is not mathematical, but it reflects the fact that children’s names are highly correlated with each other.

Let’s try some mathematical models that describe how the parents might name their children and see what happens. It is common to assume that the names of siblings are chosen independently. In this case the first son (as well as the second son) will be named Luigi with probability p. Therefore, the answer to the puzzle above is (2-p)/(4-p).
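The independent-naming answer can be checked with a minimal sketch in exact fractions. It enumerates the four equally likely gender orders, assumes each boy is named Luigi independently with probability p, and conditions on at least one son named Luigi. The function name and the sample value p = 1/100 are illustrative assumptions, not part of the puzzle.

```python
from fractions import Fraction
from itertools import product

def prob_other_is_son(p):
    """P(two sons | at least one son named Luigi), names chosen independently."""
    p = Fraction(p)
    luigi_total = Fraction(0)     # probability of at least one son named Luigi
    luigi_two_sons = Fraction(0)  # same event, restricted to two-son families
    for genders in product("BG", repeat=2):   # BB, BG, GB, GG, each with probability 1/4
        boys = genders.count("B")
        no_luigi = (1 - p) ** boys            # every boy in the family misses the name
        weight = Fraction(1, 4) * (1 - no_luigi)
        luigi_total += weight
        if boys == 2:
            luigi_two_sons += weight
    return luigi_two_sons / luigi_total

p = Fraction(1, 100)
print(prob_other_is_son(p))   # exact value from the enumeration
print((2 - p) / (4 - p))      # closed form quoted in the text
```

The two printed values agree for any p between 0 and 1, excluding p = 0, where no family satisfies the condition.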

The problem with this model is that there is a noticeable probability that the family has two sons, both named Luigi.

As parents usually want to give different names to their children, many researchers suggest the following naming model to avoid naming two children in the same family with the same name. A potential family picks a child’s name at random from a distribution list. Children are named independently of each other. Families in which two children are named the same are crossed out from the list of families.

There is a problem with this approach. When we cross out families we may disturb the balance in the family gender distributions. If we assume that boys’ and girls’ names are different then we will only cross out families with children of the same gender. Thus, the ratio of different-gender families to same-gender families will stop being 1/1. Moreover, it could happen that the number of boy-boy families will differ from the number of girl-girl families.
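To see the size of the imbalance, here is a small sketch of the crossing-out model under the assumption that boys' and girls' names never coincide. The two name distributions are hypothetical toy values chosen only to make the arithmetic visible.

```python
from fractions import Fraction

def surviving_family_shares(boy_names, girl_names):
    """Shares of BB, mixed, and GG families after same-name families are crossed out."""
    same_boy = sum(q * q for q in boy_names.values())    # P(two sons share a name)
    same_girl = sum(q * q for q in girl_names.values())  # P(two daughters share a name)
    kept = {
        "BB": Fraction(1, 4) * (1 - same_boy),
        "mixed": Fraction(1, 2),                          # BG and GB are never crossed out
        "GG": Fraction(1, 4) * (1 - same_girl),
    }
    total = sum(kept.values())
    return {kind: share / total for kind, share in kept.items()}

# Hypothetical toy distributions: boys' names are more concentrated than girls'.
boys = {"Mario": Fraction(1, 2), "Luigi": Fraction(1, 2)}
girls = {"Ann": Fraction(1, 4), "Beth": Fraction(1, 4), "Mary": Fraction(1, 2)}
shares = surviving_family_shares(boys, girls)
print(shares)
```

With these toy numbers the mixed families outnumber the same-gender families, and the boy-boy and girl-girl shares differ from each other.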

There are several ways to adjust the model. Suppose there is a probability distribution of names that is used for the first son. If another son is born, the name of the first son is crossed out from the distribution and following that we proportionately adjust the probabilities of all other names for this family. In this model the probability of naming the first son by some name and the second son by the same name changes. For example, the most popular name’s probability decreases with consecutive sons, while the least popular name’s probability increases.
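A minimal sketch of this adjustment, assuming a hypothetical three-name distribution in which one name is twice as popular as each of the other two. The function name and the names themselves are illustrative only.

```python
from fractions import Fraction

def second_son_distribution(first):
    """Name distribution for a second son when the first son's name is removed
    and the remaining probabilities are rescaled proportionately."""
    second = {name: Fraction(0) for name in first}
    for taken, p_taken in first.items():
        for name, p_name in first.items():
            if name != taken:
                # First son took `taken`; the rest are rescaled by 1 / (1 - p_taken).
                second[name] += p_taken * p_name / (1 - p_taken)
    return second

# Hypothetical toy distribution for first sons.
first = {"Mario": Fraction(1, 2), "Luigi": Fraction(1, 4), "Wario": Fraction(1, 4)}
second = second_son_distribution(first)
print(second)
```

In this toy case the popular name drops from 1/2 to 1/3 for a second son, while each of the rarer names rises from 1/4 to 1/3.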

I like this model, because I think that it reflects real life.

Here is another model, suggested by my son Alexey. Parents give names to their children independently of each other from a given distribution list. If they give the same name to both children the family is crossed out and replaced with another family with children of the same genders. The advantage of this model is that the first child and the second child are named independently from each other with the same probability distribution. The disadvantage is that the probability distribution of names in the resulting set of families will be different from the probability distribution of names in the original preference list.
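One possible reading of this model can be sketched as follows. The sketch assumes that boys' and girls' names never coincide, so only same-gender families are ever replaced, and that replacement amounts to conditioning two-son families on having different names. The function name and the toy distribution are hypothetical.

```python
from fractions import Fraction

def replacement_model(boy_names, target):
    """Exact results for the replacement model, under the assumptions stated above.

    Returns (P(two sons | a son has the target name),
             frequency of the target name among sons in two-son families)."""
    p = boy_names[target]
    clash = sum(q * q for q in boy_names.values())   # P(two sons get the same name)
    # Two-son families: ordered name pairs, conditioned on the names being different.
    bb_with_target = 2 * p * (1 - p) / (1 - clash)
    joint_two_sons = Fraction(1, 4) * bb_with_target
    # One-son families (BG or GB) are never replaced, so the son keeps the original odds.
    joint_one_son = Fraction(1, 2) * p
    answer = joint_two_sons / (joint_two_sons + joint_one_son)
    share_in_bb = bb_with_target / 2                 # per-son frequency of the name
    return answer, share_in_bb

# Hypothetical toy distribution of boys' names.
boys = {"Mario": Fraction(1, 2), "Luigi": Fraction(1, 4), "Wario": Fraction(1, 4)}
answer, share = replacement_model(boys, "Luigi")
print(answer)   # P(other child is a son | a son named Luigi)
print(share)    # frequency of Luigi among sons in two-son families
```

Under these assumptions the answer simplifies to (1-p)/(2-p-S), where S is the sum of the squared name probabilities, and the frequency of Luigi among sons in two-son families moves away from the original 1/4, which illustrates the disadvantage mentioned above.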

I would like my readers to comment on the models and how they change the answer to the original problem.



  1. Tanya Khovanova:

    This piece was inspired by an email I received from JeffJo:


    I tripped across your paper about the Tuesday Boy problem on arXiv.
    Unfortunately, I doubt that your argument, which I agree with, will convince
    anybody. When it comes to trivial probability problems like this one, people
    have a way of making up reasons for their own answer to be right, and any
    other answer to be wrong. Which is why such controversies exist in the first
    place. About your argument, they will say “Wednesday” is a different
    problem, not an alternate event in the same problem.

    What is doubly ironic is that (1) in 1889, using an almost identical problem
    as a cautionary tale, Joseph Bertrand warned the world about the
    incorrectness of the arguments that lead to the 1/3 answer; and (2) most
    published accounts that claim 1/3 is correct will also discuss either the
    Monty Hall Problem, or the Three Prisoners Problem, which are
    mathematically (but not logically, in some senses) equivalent to the one
    Bertrand used. Many even link them to it. They always use the solution
    method that would get 1/2 in the Two Child Problem.

    Say a random process produces a set of equally likely arrangements. Say you
    learn information X about the arrangement of one trial, such that some
    arrangements become impossible. It is tempting, but generally incorrect as
    Bertrand warned us, to say the remaining cases are still equally likely.
    Instead, you need to know the probability that you would learn X in each of
    the remaining cases.

    All of these problems fit a pattern where you could only learn X in A cases,
    you would learn X half of the time in B cases, and you can’t learn X in C
    cases. If you ask for the conditional probability for the A cases or B cases
    as a set, which these problems also always do, the answers are
    P(A|X)=A/(A+B/2) and P(B|X)=(B/2)/(A+B/2). And technically, the denominator
    should be (A+B/2+0*C). I point that out because it helps to see what the
    formula is doing with these probabilities.

    In Monty Hall (you pick from three doors, one of which has a prize; Monty
    Hall opens one you didn’t pick that does not, and offers to let you switch
    to the remaining door), most people count the two cases that remain
    possible. They say the answer is that the probability for both remaining
    doors is 1/2. But A=1 is the case where you originally chose incorrectly and
    Monty had to open the only other door without the prize, B=1 is the case
    where you chose correctly and Monty could open either of the other two
    doors, and C=1 is the case where you chose incorrectly but Monty opened the
    door with the prize (i.e., it is impossible). The probability your door
    holds the prize is P(B|X) = (1/2)/(1+1/2)=1/3, and that the other unopened
    door holds it is P(A|X)=1/(1+1/2)=2/3.

    In Three Prisoners (either you, or one of two other convicts, will be
    pardoned; the warden knows who but can’t say, but you convince him to tell
    you which of the others won’t get it), most people think you tricked the
    warden and your chances have gone up to 1/2.
    But A=1 is the case the other prisoner the warden didn’t name gets the
    pardon so he had only one choice to name, B=1 is the case where you get it
    and he had to choose between two names, and C=1 is the case where the named
    prisoner gets it. Your chances are still P(B|X)=1/3.

    Bridge offers a similar kind of problem, and they call this the Principle of
    Restricted Choice (you are missing the King and Queen of a suit; East had a
    chance where it was best for him to play one of them if he could, and played
    the King; what are the chances he has the Queen?). Most people would say it
    could be in two places, so the probability is 1/2 it is in either. But A=1
    is the case where he had only the King so he had to play it, B=1 is the case
    where he had both cards, and C=2 are the cases where he did not have the
    King. P(B|X) = (1/2)/(1+1/2)=1/3; the case where his choice of cards to play
    was restricted – King only – are increased in probability. I mention this
    because it actually gets tested, and has been verified by experience.

    In Bertrand’s Box Paradox (you are told that three identical boxes each hold
    two coins; one holds two gold coins, one has two silver coins, and one has
    one of each; a box is chosen, and someone looks in and pulls out a gold
    coin; what is the probability that the box holds two gold coins?), most
    people will say this box is either GG or GS, so the probability it is GG is
    1/2. That is wrong. The arrangement is either G1G2-G1, G1G2-G2,
    or G1S2-G1, where I indicated the coin you were shown, so the answer is 2/3.
    A conditional probability analysis gets the same formulas as before, but is
    logically different since the choices work differently. A=1 is the case
    where the chosen box has two gold coins, B=1 is the case where it holds a
    gold and a silver coin, and C=1 is the case where it holds two silver coins.

    Now, in Bertrand’s Box Paradox replace silver coins with bronze ones, and
    add a second box with one of each kind. This is now identical, even in the
    labels it uses, to the Two Child Problem. A=1 is the BB case, B=2 is the BG
    and GB cases, and C=1 is the GG case. P(BB|X)=1/(1+2/2)=1/2.


    But you said that you had never seen a similar problem about girls. There
    have been some published, with what appears to be a long history linking
    them that makes it similar to Tuesday Boy. In 2008, it went just a little
    less viral than that one did in 2010.

    In 1988, John Allen Paulos of Temple University published a book titled
    “Innumeracy,” comparing the lack of mathematical skills to illiteracy, the
    lack of reading skills. It was updated sometime in the 1990s, and he
    apparently clarified his. I read the original, but this quote is the edited
    one: “Consider now some randomly selected family of four which is known to
    have at least one daughter. Say Myrtle is her name. Given this, what is the
    conditional probability that Myrtle’s sibling is a brother? Given that
    Myrtle has a younger sibling, what is the conditional possibility that her
    sibling is a brother? The answers are, respectively, 2/3 and 1/2.” The
    version I recall was more like “Consider now some randomly selected family
    of four. What is the probability that Myrtle has a brother?”

    In 1995, J. Laurie Snell of Dartmouth edited a collection of articles titled
    “Topics in Contemporary Probability and Its Applications.” In it, he
    co-authored an article with Robert Vanderbei of Princeton, titled “Three
    Bewitching Paradoxes.” You can preview the appropriate selection at
    ook_result&ct=result&resnum=1&ved=0CBMQ6AEwAA#v=onepage&q&f=false. They said
    that Paulos changed the problem when he called the girl Myrtle. Assume that
    the probability a girl would be named Myrtle is p. Because we know the name,
    just like knowing “Tuesday,” the answer to the more classic question, that
    she has a sister, should be (2-p)/(4-p). This is wrong; and not just for the
    “Myrtle-centric” arguments you might provide. There are at least two other
    errors that I’ll get to. But for now note that a first-order approximation
    to this function of p is 1/2-p/8.

    In 2008, Leonard Mlodinow of Caltech published a book called “The
    Drunkard’s Walk: How Randomness Rules Our Lives.” He asked both the
    “classic” version, and the “name” version. These quotes may not be exactly
    how they appeared in the book, but come from a NY Times on-line review: “You
    know that a certain family has two children, and that at least one is a
    girl. But you can’t recall whether both are girls. What is the probability
    that the family has two girls?” And for the second version, he added “you
    remember that at least one is a girl with a very unusual name (that, say,
    one in a million females share).” He used the name “Florida” as an example.
    I think it is quite possible that Mlodinow based this problem on Snell &
    Vanderbei’s assessment of Paulos’ problem; but that is pure speculation.

    Mlodinow hinted at Snell & Vanderbei’s first error, that they included a
    family with two girls named Myrtle/Florida in their count (it’s also
    possible that Gary Foshee got his by removing the concern). I assume he
    didn’t want to try to redistribute that probability in the table, so he
    argued that the p^2 term was too small to be concerned with, and the proper
    answer was (2-p)/(4-p), which is “very nearly 1/2.” But in fact, by his other
    assumptions (which include the second error I mentioned above) you don’t
    need to do any complicated calculations. If the probability an older Florida
    has a younger sister is 1/2, and the probability a younger Florida has an
    older sister is 1/2, then the Law of Total Probability (which you use in
    your argument with Jack) says the Florida-centric answer is exactly 1/2!
    This is different than when people try, incorrectly, to use this law for the
    classic problem. The law requires independent events, which we have if names
    aren’t repeated, but not if the names are left out and “Tuesday” is used.

    An Italian mathematician named D’Agostini published a paper, rebutting
    Mlodinow’s answer, on arXiv last January. He derived 1/2 as the answer, in a
    quite roundabout and incorrect way, based on the assumption that you can’t
    have two girls named Florida. It was wrong because you also must assume
    every other name is similarly limited, which produces some interesting
    results. It turns out that the table of the probability for the nine family
    types (it’s 3×3 based on boys, non-Floridas, and Floridas) is not symmetric!
    To see why, imagine a culture with only three girls’ names: Ann, Beth, and
    Mary. The names Ann and Beth are equally popular, but Mary is twice as
    popular as either one. It is easy to see that the probability the first girl
    in a family gets named Mary is 1/2. But that means that half of the second
    girls – who must be the younger in a 2-child family – can’t be named Mary,
    while the other half have a 2/3 probability. And that in turn means the
    overall probability a girl with an older sister will be named Mary is 1/3,
    while it is 1/2 if she has an older brother! Similarly, the probabilities
    for Anns (or Beths) are 1/3 if she has an older sister, and 1/4 if she has
    an older brother. It is a general property that common names become less
    probable, and uncommon ones become more probable.

    I have derived the exact answer to the probability Florida has a sister,
    based on the assumed distribution function for names, but it is rather
    pointless since there is no good distribution to plug into it. But it does
    have an interesting property. It depends on a function that has similar
    properties to a moment-generating function, and if m is the moment under
    that function, a first-order approximation is 1/2 + (m-p)/8. It is a
    function that is essentially parallel to Snell and Vanderbei’s, but strictly
    greater! And if we divide names into “common” or “uncommon” based on this
    moment, common names mean the probability of a sister is less than 1/2,
    uncommon names mean it is greater, and the difference from 1/2 for “very
    unusual names” depends more on m than on p so it is observable!

    But Mlodinow’s problem is not ambiguous at all. The Florida-centric
    assumption is flat-out wrong, based on the wording above. You don’t have to
    use age in the “classic” problem to divide family types into the four
    groups. Any measure that is independent of gender works as well. I prefer to
    alphabetize the children’s names. You could also order them by where they
    sit, relative to mother, at the dinner table, or – and this is what provides
    the answer – whichever child has a name that is more memorable to Leonard
    Mlodinow. It does not matter that this ordering can’t be defined in general;
    it does exist. Using it reduces the Florida problem to the Older Girl
    problem, regardless of any funky properties caused by the probability
    distribution function for names. The answer is an unambiguous 1/2.

    And I claim this same argument – ordering by however you learned one gender
    – applies to most versions of the Two Child Problem.
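The counting rule from the email above, P(A|X)=A/(A+B/2) and P(B|X)=(B/2)/(A+B/2), can be sketched in a few lines. The function name and the dictionary of examples are illustrative. The A and B counts are the ones stated in the email for each problem.

```python
from fractions import Fraction

def posterior(a, b):
    """Return (P(A|X), P(B|X)) when X is always learned in the `a` forced cases
    and learned half of the time in the `b` free-choice cases."""
    a, b = Fraction(a), Fraction(b)
    total = a + b / 2          # the impossible C cases contribute 0 * C
    return a / total, (b / 2) / total

# (A, B) counts as given in the email for each problem.
examples = {
    "Monty Hall": (1, 1),
    "Three Prisoners": (1, 1),
    "Restricted Choice": (1, 1),
    "Bertrand's Box": (1, 1),
    "Two Child": (1, 2),
}
for name, (a, b) in examples.items():
    p_a, p_b = posterior(a, b)
    print(f"{name}: P(A|X) = {p_a}, P(B|X) = {p_b}")
```

The first four problems share the counts A=1, B=1 and so give 2/3 and 1/3, while the Two Child counts A=1, B=2 give 1/2 for both.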

  2. Fifth Linkfest:

    […] Tanya Khovanova: A Son Named Luigi […]

  3. JeffJo:

    “Suppose there is a probability distribution of names that is used for the first son. If another son is born, the name of the first son is crossed out from the distribution and following that we proportionately adjust the probabilities of all other names for this family. In this model the probability of naming the first son by some name and the second son by the same name changes. For example, the most popular name’s probability decreases with consecutive sons, while the least popular name’s probability increases.”

    This clearly is the model to use; but you seem to balk at doing so – probably because the changes depend on the entire, unknowable distribution. But the entire distribution can be characterized with just one parameter, that can be expressed in a number of forms.

    1. Assume the set of all possible names is {NI}, with distribution {PI}, for the first son. Note that sum(all I, PI)=1.

    2. The second son can’t have the same name NJ as the first; but we can assume that the remaining names retain the same relative probabilities. So the distribution is {PI/(1-PJ)} for I≠J.

    3. Prob(2nd son named NI) = sum(all J≠I, PJ*(PI /(1-PJ)))
    = PI * sum(all J≠I, PJ /(1-PJ))

    4. We can now define QI=PI/(1-PI) and Q=sum(all I, QI). Here, QI is a number that is slightly more than PI, and Q is slightly more than 1. The actual value for Q depends on the entire distribution, but we don’t need to know that value.

    5. Prob(2nd son named NI) = PI*(Q-QI). As you can see, this now depends only on the parameter Q and the probability PI, since QI is itself a function of PI. In practice, some of these probabilities must increase and some must decrease so that the sum remains 1.

    6. And since Q-QI is a monotonically decreasing function of PI, it is the more common names that decrease in probability, and the rarer ones that increase. Let C be the probability of a name that doesn’t change for the second son, so QC=C/(1-C). There may not be an actual name with this probability, but the Intermediate Value Theorem, together with this monotonicity, guarantees exactly one such probability exists. It happens to make a good dividing line between “common” and “uncommon.” Then, Q-QC=Q-C/(1-C)=1, so Q=1/(1-C). Any one of the three parameters (Q, QC, or C) can be used by itself to describe the distribution, but it will be easier to continue using Q in this derivation for a little while.

    7. We can now make a table of the probability for each of the nine possible family combinations that include or exclude a boy with any particular name NI. Here, “boy” means a boy not named NI, and all probabilities are multiplied by 4 to eliminate denominators:

                              Younger Child
                  NI           Boy              Girl
    Older   NI    0            PI               PI
    Child   Boy   PI*(Q-QI)    1-PI*(1+Q-QI)    1-PI
            Girl  PI           1-PI             1

    The term in the center cell was chosen to make the sum of all nine equal 4.

    8. The definition of conditional probability says P(2 boys|NI) = P(2 boys and NI)/P(NI) = (1+Q-QI)/(3+Q-QI) from the table above. It can also be written (2+QC-QI)/(4+QC-QI), or (2+2C*PI-C-3PI)/(4+4C*PI-3C-5PI). The second version compares most directly to the incorrect answer that allows duplicate names, (2-PI)/(4-PI).

    9. I won’t show the derivation, but you can expand the denominator in a power series. Ignoring any terms with two or more of these small probabilities multiplied together, the old, incorrect formula is approximated by 1/2-PI/8, and the new one by 1/2+C/8-PI/8. When the number of possible names is large, and it is, this is a very good approximation. It should also be true that C is of the order of magnitude of 1/(# names of reasonable expectancy). By that I mean you should count John and Rufus and Luigi, but not John-Jacob-Jinglehimmer-Smith.
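The closed form from steps 1 to 8 above, (1+Q-QI)/(3+Q-QI) with QI=PI/(1-PI), can be checked against a direct enumeration. This is a minimal sketch in exact fractions. The function names and the three-name toy distribution are hypothetical, and the enumeration assumes the name-removal model described in step 2.

```python
from fractions import Fraction

def two_boys_given_name(names, target):
    """Exact P(two boys | a son named target) by enumeration. Assumes the second son's
    name is drawn from the first-son distribution with the first son's name removed."""
    p = names[target]
    # P(second son gets the target name), summed over the older brother's name.
    second = sum(q * p / (1 - q) for n, q in names.items() if n != target)
    two_boys = Fraction(1, 4) * (p + second)   # target is the older or the younger son
    one_boy = Fraction(1, 2) * p               # BG or GB: the only son is the target
    return two_boys / (two_boys + one_boy)

def closed_form(names, target):
    """(1 + Q - QI) / (3 + Q - QI), with QI = PI / (1 - PI) and Q the sum of all QI."""
    q_total = sum(q / (1 - q) for q in names.values())
    q_i = names[target] / (1 - names[target])
    return (1 + q_total - q_i) / (3 + q_total - q_i)

# Hypothetical toy distribution for first sons.
names = {"Mario": Fraction(1, 2), "Luigi": Fraction(1, 4), "Wario": Fraction(1, 4)}
for target in names:
    p = names[target]
    print(target, two_boys_given_name(names, target), closed_form(names, target),
          (2 - p) / (4 - p))   # last column: the formula that allows duplicate names
```

For each name the first two columns agree, and both exceed the duplicate-allowing value in the third column, in line with the comparison in step 9.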
