This clearly is the model to use; but you seem to balk at doing so – probably because the changes depend on the entire, unknowable distribution. But for this purpose the entire distribution can be characterized with just one parameter, which can be expressed in a number of forms.

1. Assume the set of all possible names is {NI}, with distribution {PI}, for the first son. Note that sum(all I, PI)=1.

2. The second son can’t have the same name NJ as the first; but we can assume that the remaining names retain the same relative probabilities. So the distribution is {PI/(1-PJ)} for I≠J.

3. Prob(2nd son named NI) = sum(all J≠I, PJ*(PI/(1-PJ)))

= PI * sum(all J≠I, PJ/(1-PJ))

4. We can now define QI=PI/(1-PI) and Q=sum(all I, QI). Here, QI is a number that is slightly more than PI, and Q is slightly more than 1. The actual value for Q depends on the entire distribution, but we don’t need to know that value.

5. Prob(2nd son named NI) = PI*(Q-QI). As you can see, this now depends only on the parameter Q and the probability PI, since QI is itself a function of PI. In practice, some of these probabilities must increase and some must decrease so that the sum remains 1.

6. And since Q-QI is a monotonically decreasing function of PI, it is the more common names that decrease in probability, and the rarer ones that increase. Let C be the probability of a name that doesn’t change for the second son, so QC=C/(1-C). There may not be an actual name with this probability, but the Intermediate Value Theorem, together with that monotonicity, guarantees exactly one such probability exists. It happens to make a good dividing line between “common” and “uncommon.” Then, Q-QC=Q-C/(1-C)=1, so Q=1/(1-C). Any one of the three parameters (Q, QC, or C) can be used by itself to describe the distribution, but it will be easier to continue using Q in this derivation for a little while.

7. We can now make a table of the probability for each of the nine possible family combinations that include or exclude a boy with any particular name NI. Here, “boy” means a boy not named NI, and all probabilities are multiplied by 4 to eliminate denominators:

                        Younger Child
                NI           Boy              Girl
Older   NI      0            PI               PI
Child   Boy     PI*(Q-QI)    1-PI*(1+Q-QI)    1-PI
        Girl    PI           1-PI             1

The term in the center cell was chosen to make the sum of all nine equal 4.

8. The definition of conditional probability says P(2 boys|NI) = P(2 boys and NI)/P(NI) = (1+Q-QI)/(3+Q-QI) from the table above. It can also be written (2+QC-QI)/(4+QC-QI), or (2+2C*PI-C-3PI)/(4+4C*PI-3C-5PI). The second version compares most directly to the incorrect answer that allows duplicate names, (2-PI)/(4-PI).

9. I won’t show the derivation, but you can expand the denominator in a power series. Ignoring any terms with two or more of these small probabilities multiplied together, the old, incorrect formula is approximated by 1/2–PI/8, and the new one by 1/2+C/8–PI/8. When the number of possible names is large, and it is, this is a very good approximation. It should also be true that C is of the order of magnitude of 1/(# names of reasonable expectancy). By that I mean you should count John and Rufus and Luigi, but not John-Jacob-Jingleheimer-Schmidt.
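Steps 3 through 9 can be checked numerically. The following is only a minimal sketch: the four names and their probabilities are invented for illustration, and the function names are hypothetical labels for the steps above, not anything from a library.

```python
from fractions import Fraction as F

# Hypothetical first-son name distribution; the values are illustrative only.
P = {"John": F(1, 2), "Rufus": F(1, 4), "Luigi": F(1, 8), "Otto": F(1, 8)}

def second_son_exact(P):
    """Steps 2-3: condition on the first son's name J and renormalize the rest."""
    return {i: sum(P[j] * P[i] / (1 - P[j]) for j in P if j != i) for i in P}

def q_total(P):
    """Step 4: Q is the sum over all names of QI = PI / (1 - PI)."""
    return sum(p / (1 - p) for p in P.values())

def second_son_formula(P):
    """Step 5: Prob(2nd son named NI) = PI * (Q - QI)."""
    Q = q_total(P)
    return {i: p * (Q - p / (1 - p)) for i, p in P.items()}

def two_boys_given_name(p, Q):
    """Step 8: P(2 boys | NI) = (1 + Q - QI) / (3 + Q - QI)."""
    q = p / (1 - p)
    return (1 + Q - q) / (3 + Q - q)

Q = q_total(P)
C = 1 - 1 / Q  # step 6: Q = 1/(1-C), so C = 1 - 1/Q
exact = second_son_exact(P)
formula = second_son_formula(P)

for name, p in P.items():
    approx = F(1, 2) + C / 8 - p / 8  # step 9, first-order terms only
    print(name, exact[name], two_boys_given_name(p, Q), float(approx))
```

With these made-up numbers, John (more common than C) becomes less likely as a second son's name and the other three become more likely. The step 9 approximation is rough here because the probabilities are not small; it is meant for realistic distributions with many names.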

Tanya,

I tripped across your paper about the Tuesday Boy problem on arXiv. Unfortunately, I doubt that your argument, which I agree with, will convince anybody. When it comes to trivial probability problems like this one, people have a way of making up reasons for their own answer to be right, and any other answer to be wrong. Which is why such controversies exist in the first place. About your argument, they will say “Wednesday” is a different problem, not an alternate event in the same problem.

What is doubly ironic is that (1) in 1889, using an almost identical problem as a cautionary tale, Joseph Bertrand warned the world about the incorrectness of the arguments that lead to the 1/3 answer; and (2) most published accounts that claim 1/3 is correct will also discuss either the Monty Hall Problem or the Three Prisoners Problem, which are mathematically (but not logically, in some senses) equivalent to the one Bertrand used. Many even link them to it. Yet for those problems they always use the solution method that would get 1/2 in the Two Child Problem.

Say a random process produces a set of equally likely arrangements. Say you learn information X about the arrangement of one trial, such that some arrangements become impossible. It is tempting, but generally incorrect as Bertrand warned us, to say the remaining cases are still equally likely. Instead, you need to know the probability that you would learn X in each of the remaining cases.

All of these problems fit a pattern: there are A cases where you would always learn X, B cases where you would learn X half of the time, and C cases where you can’t learn X. If you ask for the conditional probability for the A cases or B cases as a set, which these problems also always do, the answers are P(A|X)=A/(A+B/2) and P(B|X)=(B/2)/(A+B/2). Technically, the denominator should be (A+B/2+0*C). I point that out because it helps to see what the formula is doing with these probabilities.
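The pattern can be written as a small Python function. This is only a sketch: `posterior` and its argument layout are hypothetical names chosen for illustration, and the cases within each group are assumed to be equally likely beforehand.

```python
from fractions import Fraction as F

def posterior(groups):
    """groups maps a label to (number of equally likely cases, P(learn X | case)).

    Each group is weighted by count * P(X | case), then normalized.
    """
    weights = {label: n * F(p) for label, (n, p) in groups.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# A cases always reveal X, B cases reveal it half the time, C cases never do.
result = posterior({"A": (1, 1), "B": (1, F(1, 2)), "C": (1, 0)})
print(result)
```

The C group contributes the 0*C term to the denominator, which is why it drops out without being ignored.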

In Monty Hall (you pick from three doors, one of which has a prize; Monty Hall opens one you didn’t pick that does not, and offers to let you switch to the remaining door), most people count the two cases that remain possible. They say the probability for each of the remaining doors is 1/2. But A=1 is the case where you originally chose incorrectly and Monty had to open the only other door without the prize, B=1 is the case where you chose correctly and Monty could open either of the other two doors, and C=1 is the case where you chose incorrectly but Monty opened the door with the prize (i.e., it is impossible). The probability your door holds the prize is P(B|X) = (1/2)/(1+1/2)=1/3, and that the other unopened door holds it is P(A|X)=1/(1+1/2)=2/3.
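The Monty Hall numbers can be reproduced by exact enumeration. This is a sketch under two stated assumptions: the prize is equally likely to be behind each door, and Monty picks uniformly at random when he has two doors he may open. The door numbering and the function name are arbitrary.

```python
from fractions import Fraction as F

def monty_hall(pick=0, opened=2):
    """Posterior for the prize location, given your pick and the door Monty opened."""
    weight = {}
    for prize in range(3):
        # Monty may open any door that is neither your pick nor the prize.
        options = [d for d in range(3) if d not in (pick, prize)]
        p_open = F(1, len(options)) if opened in options else F(0)
        weight[prize] = F(1, 3) * p_open
    total = sum(weight.values())
    return {prize: w / total for prize, w in weight.items()}

doors = monty_hall()
print(doors)
```

Door 0 is the B case (Monty had a choice), door 1 is the A case (Monty was forced), and door 2 is the C case.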

In Three Prisoners (either you, or one of two other convicts, will be pardoned; the warden knows who but can’t say, but you convince him to tell you which of the others won’t get it), most people think you tricked the warden and your chances have gone up to 1/2. But A=1 is the case where the other prisoner, the one the warden didn’t name, gets the pardon, so he had only one choice to name; B=1 is the case where you get it and he had to choose between two names; and C=1 is the case where the named prisoner gets it. Your chances are still P(B|X)=1/3.

Bridge offers a similar kind of problem, which players call the Principle of Restricted Choice (you are missing the King and Queen of a suit; East had a chance where it was best for him to play one of them if he could, and played the King; what are the chances he has the Queen?). Most people would say it could be in two places, so the probability is 1/2 it is in either. But A=1 is the case where he had only the King so he had to play it, B=1 is the case where he had both cards, and C=2 are the cases where he did not have the King. P(B|X) = (1/2)/(1+1/2)=1/3; the case where his choice of cards to play was restricted – King only – is increased in probability. I mention this because it actually gets tested, and has been verified by experience.

Bertrand’s Box Paradox works the same way (you are told that three identical boxes each hold two coins; one holds two gold coins, one has two silver coins, and one has one of each; a box is chosen, and someone looks in and pulls out a gold coin; what is the probability that it is the box with two gold coins?). Most people will say this box is either GG or GS, so the probability it is GG is 1/2. That is wrong. The arrangement is either G1G2-G1, G1G2-G2, or G1S2-G1, where I indicated the coin you were shown, so the answer is 2/3. A conditional probability analysis gets the same formulas as before, but is logically different since the choices work differently. A=1 is the case where the chosen box has two gold coins, B=1 is the case where it holds a gold and a silver coin, and C=1 is the case where it holds two silver coins.
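The box calculation can be sketched the same way. The assumptions here are that each box is equally likely to be chosen and that the coin pulled out is chosen at random from that box; the names are illustrative only.

```python
from fractions import Fraction as F

# Contents of the three boxes: G = gold, S = silver.
boxes = {"GG": ("G", "G"), "GS": ("G", "S"), "SS": ("S", "S")}

def box_given_gold(boxes):
    """Posterior over boxes, given that a randomly drawn coin was gold."""
    weight = {}
    for name, coins in boxes.items():
        p_gold = F(coins.count("G"), len(coins))  # chance this box shows gold
        weight[name] = F(1, len(boxes)) * p_gold
    total = sum(weight.values())
    return {name: w / total for name, w in weight.items()}

box_posterior = box_given_gold(boxes)
print(box_posterior)
```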

Now, in Bertrand’s Box Paradox replace silver coins with bronze ones, and add a second box with one of each kind. This is now identical, even in the labels it uses, to the Two Child Problem. A=1 is the BB case, B=2 is the BG and GB cases, and C=1 is the GG case. P(BB|X)=1/(1+2/2)=1/2.
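The contrast between the two answers can be made concrete. This sketch assumes the four family types are equally likely and, for the first function, that the gender you learn belongs to a child chosen at random; both function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# The four equally likely (older, younger) families: BB, BG, GB, GG.
families = list(product("BG", repeat=2))

def two_boys_random_reveal():
    """P(BB | you learned 'boy'), weighting each family by its chance of showing a boy."""
    num = den = F(0)
    for fam in families:
        w = F(1, 4) * F(fam.count("B"), 2)
        den += w
        if fam == ("B", "B"):
            num += w
    return num / den

def two_boys_by_counting():
    """The tempting method: count the families that remain possible as equally likely."""
    remaining = [fam for fam in families if "B" in fam]
    return F(sum(fam == ("B", "B") for fam in remaining), len(remaining))

print(two_boys_random_reveal(), two_boys_by_counting())
```

The first function is the A/(A+B/2) calculation; the second is the count that produces 1/3.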

+++++

But you said that you had never seen a similar problem about girls. There have been some published, with what appears to be a long history linking them that makes it similar to Tuesday Boy. In 2008, it went just a little less viral than that one did in 2010.

In 1988, John Allen Paulos of Temple University published a book titled “Innumeracy,” comparing the lack of mathematical skills to illiteracy, the lack of reading skills. It was updated sometime in the 1990s, and he apparently clarified his wording. I read the original, but this quote is the edited one: “Consider now some randomly selected family of four which is known to have at least one daughter. Say Myrtle is her name. Given this, what is the conditional probability that Myrtle’s sibling is a brother? Given that Myrtle has a younger sibling, what is the conditional possibility that her sibling is a brother? The answers are, respectively, 2/3 and 1/2.” The version I recall was more like “Consider now some randomly selected family of four. What is the probability that Myrtle has a brother?”

In 1995, J. Laurie Snell of Dartmouth edited a collection of articles titled “Topics in Contemporary Probability and Its Applications.” In it, he co-authored an article with Robert Vanderbei of Princeton, titled “Three Bewitching Paradoxes.” You can preview the appropriate selection at https://books.google.com/books?id=dJpTR-jfd1kC&pg=PA355&lpg=PA355&dq=%22Topics+in+Contemporary%22+%22three+bewitching+paradoxes%22&source=bl&ots=7LqAzc0zeh&sig=pSrHLcNx1iDhDa7_6VZEadpRGbA&hl=en&ei=16lBTc3oKoTQgAeFg-y6AQ&sa=X&oi=book_result&ct=result&resnum=1&ved=0CBMQ6AEwAA#v=onepage&q&f=false. They said that Paulos changed the problem when he called the girl Myrtle. Assume that the probability a girl would be named Myrtle is p. Because we know the name, just like knowing “Tuesday,” the answer to the more classic question, that she has a sister, should be (2-p)/(4-p). This is wrong; and not just for the “Myrtle-centric” arguments you might provide. There are at least two other errors that I’ll get to. But for now note that a first-order approximation to this function of p is 1/2-p/8.
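For reference, the (2-p)/(4-p) figure can be reproduced by enumeration. The sketch below deliberately makes the assumption that produces that formula: each girl is named Myrtle with probability p independently of her sibling, so a family with two Myrtles is allowed. The function name and the child-type labels are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def two_girls_given_name(p):
    """P(two girls | at least one girl with the name), duplicate names allowed."""
    p = F(p)
    # Each child is a boy, a girl with the name, or a girl with another name.
    child = {"boy": F(1, 2), "named": p / 2, "other": (1 - p) / 2}
    num = den = F(0)
    for older, younger in product(child, repeat=2):
        if "named" not in (older, younger):
            continue  # the condition "at least one girl with the name" fails
        w = child[older] * child[younger]
        den += w
        if "boy" not in (older, younger):
            num += w
    return num / den

p = F(1, 100)  # illustrative value only
print(two_girls_given_name(p), F(1, 2) - p / 8)
```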

In 2008, Leonard Mlodinow of Caltech published a book called “The Drunkard’s Walk: How Randomness Rules Our Lives.” He asked both the “classic” version, and the “name” version. These quotes may not be exactly how they appeared in the book, but come from a NY Times on-line review: “You know that a certain family has two children, and that at least one is a girl. But you can’t recall whether both are girls. What is the probability that the family has two girls?” And for the second version, he added “you remember that at least one is a girl with a very unusual name (that, say, one in a million females share).” He used the name “Florida” as an example. I think it is quite possible that Mlodinow based this problem on Snell & Vanderbei’s assessment of Paulos’ problem; but that is pure speculation.

Mlodinow hinted at Snell & Vanderbei’s first error, that they included a family with two girls named Myrtle/Florida in their count (it’s also possible that Gary Foshee got his problem by removing that concern). I assume he didn’t want to try to redistribute that probability in the table, so he argued that the p^2 term was too small to be concerned with, and the proper answer was (2-p)/(4-p) which is “very nearly 1/2.” But in fact, by his other assumptions (which include the second error I mentioned above) you don’t need to do any complicated calculations. If the probability an older Florida has a younger sister is 1/2, and the probability a younger Florida has an older sister is 1/2, then the Law of Total Probability (which you use in your argument with Jack) says the Florida-centric answer is exactly 1/2! This is different from when people try, incorrectly, to use this law for the classic problem. The law requires mutually exclusive events, which we have if names aren’t repeated, but not if the names are left out and “Tuesday” is used instead.

An Italian mathematician named D’Agostini published a paper, rebutting Mlodinow’s answer, on arXiv last January. He derived 1/2 as the answer, in a quite roundabout and incorrect way, based on the assumption that you can’t have two girls named Florida. It was wrong because you also must assume every other name is similarly limited, which produces some interesting results. It turns out that the table of the probability for the nine family types (it’s 3×3 based on boys, non-Floridas, and Floridas) is not symmetric! To see why, imagine a culture with only three girls’ names: Ann, Beth, and Mary. The names Ann and Beth are equally popular, but Mary is twice as popular as either one. It is easy to see that the probability the first girl in a family gets named Mary is 1/2. But that means that half of the second girls – who must be the younger in a 2-child family – can’t be named Mary, while the other half have a 2/3 probability. And that in turn means the overall probability a girl with an older sister will be named Mary is 1/3, while it is 1/2 if she has an older brother! Similarly, the probabilities for Anns (or Beths) are 1/3 if she has an older sister, and 1/4 if she has an older brother. It is a general property that common names become less probable, and uncommon ones become more probable.
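The Ann, Beth, and Mary numbers can be checked directly. This sketch assumes, as in the example, that a second girl's name is drawn from the same distribution with her older sister's name removed and the rest renormalized; the function name is illustrative.

```python
from fractions import Fraction as F

# First-girl name probabilities from the example: Mary is twice as popular.
first = {"Ann": F(1, 4), "Beth": F(1, 4), "Mary": F(1, 2)}

def younger_sister_names(first):
    """Name distribution for a girl who has an older sister (no repeated names)."""
    second = {name: F(0) for name in first}
    for older, p_older in first.items():
        remaining = 1 - p_older  # probability mass left once 'older' is excluded
        for name, p in first.items():
            if name != older:
                second[name] += p_older * p / remaining
    return second

second = younger_sister_names(first)
print(second)
```

A girl with an older brother keeps the original distribution, so the comparison is between `second` and `first`.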

I have derived the exact answer to the probability Florida has a sister, based on the assumed distribution function for names, but it is rather pointless since there is no good distribution to plug into it. But it does have an interesting property. It depends on a function that has similar properties to a moment-generating function, and if m is the moment under that function, a first-order approximation is 1/2 + (m-p)/8. It is a function that is essentially parallel to Snell and Vanderbei’s, but strictly greater! And if we divide names into “common” or “uncommon” based on this moment, common names mean the probability of a sister is less than 1/2, uncommon names mean it is greater, and the difference from 1/2 for “very unusual names” depends more on m than on p so it is observable!

But Mlodinow’s problem is not ambiguous at all. The Florida-centric assumption is flat-out wrong, based on the wording above. You don’t have to use age in the “classic” problem to divide family types into the four groups. Any measure that is independent of gender works as well. I prefer to alphabetize the children’s names. You could also order them by where they sit, relative to their mother, at the dinner table, or – and this is what provides the answer – by which child has the name that is more memorable to Leonard Mlodinow. It does not matter that this ordering can’t be defined in general; it does exist. Using it reduces the Florida problem to the Older Girl problem, regardless of any funky properties caused by the probability distribution function for names. The answer is an unambiguous 1/2.

And I claim this same argument – ordering by however you learned one gender – applies to most versions of the Two Child Problem.