Conditional Probability and “He Said, She Said”
by Peter Winkler
As a writer of books on mathematical puzzles I am often faced with delicate issues of phrasing, none more so than when it comes to questions about conditional probability. Consider the classic “X has two children and at least one is a boy. What is the probability that the other is a boy?”
It is reasonable to interpret this puzzle as asking you “What is the probability that X has two boys, given that at least one of the children is a boy” in which case the answer is unambiguously 1/3—given the usual assumptions about no twins and equal gender frequency.
This puzzle confounds people *legitimately*, however, because most of the ways in which you are likely to find out that X has at least one boy contain an implicit bias which changes the answer. For example, if you happen to meet one of X’s children and it’s a boy, the answer changes to 1/2.
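The two readings can be compared by brute-force enumeration. The following is a minimal sketch under the usual assumptions (no twins, boys and girls equally likely); the function names, and the modelling of "happening to meet one child" as a uniform pick between the two children, are illustrative assumptions rather than anything specified in the puzzle.

```python
from fractions import Fraction
from itertools import product

# All equally likely (older, younger) sex pairs for a two-child family.
families = list(product("BG", repeat=2))

def two_boys_given_at_least_one_boy():
    # Condition on the event "at least one of the two children is a boy".
    kept = [f for f in families if "B" in f]
    return Fraction(sum(f == ("B", "B") for f in kept), len(kept))

def two_boys_given_met_child_is_boy():
    # Assumed scenario: you meet one child, picked uniformly at random
    # from the two, and that child turns out to be a boy.
    outcomes = [(f, i) for f in families for i in (0, 1)]
    kept = [(f, i) for f, i in outcomes if f[i] == "B"]
    return Fraction(sum(f == ("B", "B") for f, _ in kept), len(kept))

print(two_boys_given_at_least_one_boy())  # 1/3
print(two_boys_given_met_child_is_boy())  # 1/2
```

The only difference between the two functions is how the information is assumed to have reached you, which is the implicit bias at issue.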
Suppose the puzzle is phrased this way: X says “I have two children and at least one is a boy.” What is the probability that the other is a boy?
Put this way, the puzzle is highly ambiguous. Computer scientists, cryptologists and others who must deal carefully with message-passing know that what counts is not what a person says (even if she is known never to lie) but *under what circumstances would she have said it.*
Here, there is no context and thus no way to know what prompted X to make this statement. Could he instead have said “At least one is a girl”? Could he have said “Both are boys”? Could he have said nothing? If you, the one faced with solving the puzzle, are desperate to disambiguate it, you’d probably have to assume that what really happened was: X (for some reason unconnected with X’s identity) was asked whether it was the case that he had at least one son, and, after being warned (by a judge?) that he had to give a yes-or-no answer, said “yes.” An unlikely scenario, to say the least, but necessary if you want to claim that the solution to the puzzle is 1/3.
Consider the puzzle presented (according to Alex Bellos) by Gary Foshee at the recent 9th Gathering for Gardner:
I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?
If the puzzle was indeed put exactly this way, and your life depended on defending any particular answer, God help you. You cannot answer without knowing, for example, what the speaker would have said if he had one boy and one girl, and the boy was born on Wednesday. Or if he had two boys, one born on Tuesday and one on Wednesday. Or two girls, both born on Tuesday. Et cetera.
Now, there is nothing mathematically wrong (given the usual assumptions, including X being random) about saying that “The probability that X has two sons, given that at least one of X’s two children is a boy born on Tuesday, is 13/27.” But if that is to be turned into an unambiguous puzzle attached to a presumed situation, some serious hypothesizing is necessary. For instance: you get on the phone and start calling random people. Each is asked if he or she has two children. If so, is it the case that at least one is a boy born on a Tuesday? And if the answer is again yes, are the children both boys? Theoretically, of the times you reach the third question, the fraction of pollees who say “yes” should tend to 13/27.
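The figure 13/27 can be checked by counting. This is a minimal sketch of the phone-poll reading, assuming each child is independently a boy or girl with equal probability and equally likely to be born on any of the seven weekdays; the variable names and the choice of the integer 1 to stand for Tuesday are arbitrary labels.

```python
from fractions import Fraction
from itertools import product

# A child is a (sex, weekday) pair; weekdays are 0..6, with 1 standing for Tuesday.
children = list(product("BG", range(7)))
families = list(product(children, repeat=2))  # 196 equally likely two-child families

TUESDAY_BOY = ("B", 1)

# Families that would answer "yes" to "is at least one a boy born on a Tuesday?"
qualifying = [f for f in families if TUESDAY_BOY in f]

# Of those, the families in which both children are boys.
both_boys = [f for f in qualifying if f[0][0] == "B" and f[1][0] == "B"]

answer = Fraction(len(both_boys), len(qualifying))
print(len(both_boys), len(qualifying), answer)  # 13 27 13/27
```

The count is 27 rather than 28 because the family with two Tuesday-born boys satisfies the condition through either child but is still a single family.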
Kind of takes the fun out of the puzzle, though, doesn’t it? Kudos to Gary for stirring up controversy with a quickie.
The author says: “You cannot answer without knowing, for example, what the speaker would have said if he had one boy and one girl, and the boy was born on Wednesday. Or if he had two boys, one born on Tuesday and one on Wednesday. Or two girls, both born on Tuesday. Et cetera.” Sounds like nonsense to me. If you know that your friend is a mythical average Joe, the answer is 13/27. The trouble is that there are no average Joes, not that you don’t know what he would have said under the different circumstances. On the other hand, if you knew all the potential utterances of your friend under all the possible combinations of sexes and days of birth of his children, you could have figured out the probability of his second child being a son too. So there is some merit to the argument by the author.
2 May 2010, 8:29 pm
He also concludes: “Kind of takes the fun out of the puzzle, though, doesn’t it?”
2 May 2010, 9:20 pm
Well, it depends of what is fun for you. For some it’s fun to understand what will happen under some realistic circumstances. For others it’s fun to throw some artificial puzzles at people and enjoy watching them making fools of themselves.
I meant … it depends on what is fun for you.
2 May 2010, 9:25 pm
Sorry for sloppy editing.
This whole discussion of the Tuesday Son problem reminds me of a “paradox” I concocted a few years ago, the Cable Guy paradox.
Suppose you have an appointment for a cable guy to install cable. He told you he’ll come between 1:00 and 2:00. As it gets closer and closer to 2:00 with no cable guy appearing, what happens to the probability that he’ll show up in the next instant (or, to avoid probability of a measure 0 event, what’s the probability that he’ll show up between now and (now+2:00)/2)? In a certain intuitive sense, if you assume this particular cable guy has a probability p of making his appointments *in general*, then as time goes to 2:00 with no cable guy, the probability of him suddenly showing up seems like it should approach p. But of course that’s nonsense. In reality, by 1:59, you basically give up hope.
2 May 2010, 9:38 pm
Right, this particular cable guy may have a probability p of making his appointments *in general*, but if he hasn’t shown up by 1:59, it’s not probability p for you to be serviced, but probability 1-p of being screwed. Enough of this juvenile nonsense already, I’m out of here!
2 May 2010, 11:01 pm
I would like to address Xamuel’s Cable Guy paradox.
The Cable Guy makes his appointment if he arrives no earlier than 1:00 but before 2:00. There are 60 minutes in an hour, and given no other information, we can assume he is equally likely to arrive during each of those 60 minutes.
So the probability that he arrives in any given minute within that window would seem to be p/60.
In the case given, 59 of the 60 minutes have elapsed. So the chances that he will make his appointment are the chances that he arrives in the next minute given that he has not arrived yet: (p/60)/((p/60)+(1-p)). Depending on the value of p, it may be reasonable to give up hope, but the math supports this conclusion rather than contradicting it.
But let’s say that the Cable Guy has a 99% success rate in meeting appointments. This means that 1 time out of every 100, he misses his appointment. But for the rest of the appointments, 1 time out of every 60, he arrives at the last minute. So at 1:59, it’s actually more likely that he will make his appointment than not! For every 2000 appointments, he will miss 20, and arrive at the last minute 33 times. For p=0.99, then, the chances of his making his appointment when it is already 1:59 are 33/53.
This assumes that missing the appointment means arriving late. If arriving before 1:00 is a distinct possibility that has already been eliminated, the odds might change. So if p is the probability that he makes the appointment, q is the probability that he arrives early, r is the probability that he arrives late, and s is the probability that he does not show up at all, then p+q+r+s=1. In that case, the probability that he will arrive in the next minute when it is already 1:59 would be (p/60)/((p/60)+r+s).
3 May 2010, 4:28 pm
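The arithmetic above can be reproduced with exact fractions. This is a minimal sketch, assuming arrival is uniform over the appointment window and that missing the appointment means arriving late or not at all (so q = 0 and r + s = 1 - p); the function name and parameters are illustrative.

```python
from fractions import Fraction

def prob_still_makes_appointment(p, minutes_elapsed=59, window=60):
    """P(arrives within the window | has not arrived after minutes_elapsed).

    Assumes a uniform arrival time within the window whenever the
    appointment is kept, and no possibility of arriving early.
    """
    p = Fraction(p)
    # Probability mass of on-time arrivals that are still possible.
    remaining = p * Fraction(window - minutes_elapsed, window)
    # Probability of missing the appointment (late or no-show).
    missed = 1 - p
    return remaining / (remaining + missed)

print(prob_still_makes_appointment(Fraction(99, 100)))  # 33/53
print(prob_still_makes_appointment(Fraction(1, 2)))     # 1/61
```

With p = 0.99 the result matches the 33/53 figure, while for a less reliable cable guy the same formula gives a chance small enough to justify giving up hope at 1:59.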
I saw Gary’s G4G9 talk, and my memory of it was that he posed the question as
I have two children. One is a boy born on Tuesday. What is the probability I have two boys?
[He was invited in to rephrase it as “a boy born on *a* Tuesday,” (as you have it via Alex Bellos, above) but to my mind clearly showed reluctance to do that]
These puzzles always give me a big headache and in particular I have no strong opinion whether that matters, but I thought I’d pass on this information.
4 May 2010, 11:15 pm
The way I put it is that, in probability, there is no such thing as a statement, only an answer to a question. So “One is a boy born on a Tuesday” has little meaning until you know what question is being answered.
24 May 2010, 10:26 am
This problem has been making me nuts since I first read about it the other day.
Seems to me that everyone’s overlooking the key factor here, which isn’t so much about the question being asked and whether or not it’s vague, but the mechanism proposed for producing the answer. That is, the probability hinges not on which father you picked, but on what exactly you’ve established that the father is going to say, when he announces his result.
That is, if the father will announce “My first child is a son, my second child is a son” then the odds of him making that statement (everything being equally probable) are 1/4. Because there are only four distinguishable outcomes. If in that scenario I was to tell you, before the answer was revealed, that one of the children was born on a Tuesday it wouldn’t change anything – there are still 4 distinguishable outcomes, thus the odds are still 1/4. If I told you one of the children was a boy, then now there are only 3 possible outcomes, and the odds are 1/3. Telling you that one child is a boy born on Tuesday doesn’t provide any additional information, no more so than telling you one child is a son and I had a salad for lunch would – because the number of distinguishable outcomes hasn’t changed.
Note that if the father was to announce “My other child is a son,” then the odds of that child being a son or daughter are 1/2 (and it doesn’t matter what the first child is, or on what day it was born). Why? Because there are only two possibilities, and the problem didn’t establish any conditions or dependencies.
Finally, it’d be similar if the father were to announce without specifying the order, e.g., “two boys,” “two girls,” or “one of each,” then each outcome is 1/3, and if you stipulate one is a boy it’s now 1/2 that he has two boys.
Given the way the original puzzle was worded, I think it’s reasonable to conclude that Foshee intended the first case, that he would announce “the son born on tuesday is my first child, the second child is also a boy” in which case there are 3 possible outcomes given the stipulation and thus the odds are 1/3.
–s
27 May 2010, 12:16 pm
Would it be helpful to suggest that one should keep in mind two meanings of ‘the probability of X is p’? One is frequency — if a certain set of circumstances repeats a large number of times, then in a proportion p of these cases, the outcome will be X. The other meaning is ‘degree of certainty’ — a set of circumstances is presented to me, and I have confidence p that the outcome will be X.
28 May 2010, 10:33 am
The two usually come together very easily — by supposing that the particular case in which one finds oneself has been sampled at random from a large number of sets of circumstances, which are equivalent in all relevant respects.
When puzzles like this have an unequivocally correct answer, it is because they have been carefully framed according to the frequency or random sampling model.
The trouble is that in this form, they may make good examples for a class test in Probability 101, but they don’t make good _puzzles_: ‘Who cares?’ would be the commonest response to ‘Given that a child picked at random from a family picked at random from a large set of two-child families, is a boy, what is the probability that both children in this family are boys?’
When such a puzzle is made interesting by being framed in ordinary speech — as Gary Foshee apparently did — then commonly no answer is unequivocally correct.
The clearest way to see where this equivocation arises is by using the Bayes formulation; to calculate the probability unequivocally, one would need to know the probability that the speaker will say ‘I have at least one boy’ (rather than ‘I have at least one girl’) in the case that s/he has one of each.
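The dependence on the speaker's habits can be made explicit with Bayes' rule. This is a minimal sketch, assuming a parent of two boys always says "I have at least one boy", a parent of two girls never does, and a parent of one of each says it with probability q; the parameter q and the function name are illustrative assumptions.

```python
from fractions import Fraction

def two_boys_given_says_boy(q):
    """P(two boys | speaker says 'I have at least one boy').

    q is the assumed probability that a parent with one boy and one girl
    chooses the 'boy' statement rather than the 'girl' statement.
    """
    prior = {"BB": Fraction(1, 4), "mixed": Fraction(1, 2), "GG": Fraction(1, 4)}
    likelihood = {"BB": Fraction(1), "mixed": Fraction(q), "GG": Fraction(0)}
    # Total probability of hearing the 'boy' statement.
    evidence = sum(prior[k] * likelihood[k] for k in prior)
    return prior["BB"] * likelihood["BB"] / evidence

print(two_boys_given_says_boy(1))               # 1/3: always mentions the boy
print(two_boys_given_says_boy(Fraction(1, 2)))  # 1/2: mentions either sex equally often
```

The result simplifies to 1/(1+2q), so every answer between 1/3 and 1 corresponds to some assumption about how the speaker chooses what to say.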
May I try again?
Here is a bit of imaginary dialogue:
Math Whiz: ‘… so if someone tells you she has two children and at least one son, the probability is 1/3 that she has two sons!’
AC: ‘…and, likewise, if she tells you she has two children and at least one daughter, the probability is 1/3 that she has two daughters?’
MW: ‘Of course…..’
AC: ‘so, whatever she tells you, the chances are 1/3 that if she has two children, they will be two sons or two daughters? I’ve always been able to convince myself that 1/2 randomly selected 2-child families will have same-sex children….’
==== silence =======
30 May 2010, 2:34 pm
I was thinking along similar lines, then I ran across https://en.wikipedia.org/wiki/Boy_or_Girl_paradox where they discuss the ambiguities (for the simpler question).
When I read “This is the first year since at least 1990 that Tuesday wasn’t the biggest birth day” at this page, https://www.babycenter.com/0_22-surprising-facts-about-birth-in-the-united-states_1372273.bc (based on 1996 data), I thought there is just much more to this boy-puzzle-question, and adding in the distribution of identical twins makes it just too silly altogether.
Stephan
14 June 2010, 4:42 pm
Build an event tree in Excel for the Boy Born on Tuesday puzzle. Call the two children the Revealed Child and the Unrevealed Child. Put in the probabilities that would be estimated if the only information available was that there were 2 children. The tree will calculate the probability of 2 boys as 1/4.
14 June 2010, 5:26 pm
Now change the probability that the revealed child is a boy from 1/2 to 1 and the probability that the revealed child is a girl from 1/2 to 0. The tree will now calculate the probability of 2 boys given that the revealed child is certainly a boy. The calculated probability of 2 boys changes to 1/2 not 1/3.
Now change the probability that the revealed child was born on Tuesday from 1/7 to 1 and the probability of Not Born on Tuesday from 6/7 to 0. The tree will now calculate the probability of 2 boys given that the revealed child is certainly a boy born on Tuesday. The calculated probability of 2 boys remains at 1/2.
What is wrong with this procedure?
Having thought more about the Tuesday Boy puzzle I now know why my solution is correct.
16 June 2010, 7:25 am
If someone tells you he has 2 children and at least 1 of them is a boy he has made a distinction between the two children. One has had information about it revealed and the other has not. Therefore the two children can be labeled as Revealed and Hidden.
What are the possible Revealed/Hidden combinations that can exist before the information is revealed? Answer: B/B, B/G, G/B, G/G. After the information is revealed the possible combinations are B/B and B/G. G/B is not possible because the revealed child is a boy. Therefore the probability of 2 boys given that at least one is a boy is one half.
The same logic applies to the Tuesday Boy puzzle. The probability of 2 boys given that at least one is a boy born on Tuesday is one half.
I have sent this solution to New Scientist where I first read about this puzzle.
I made the OP’s point on the Usenet group rec.puzzles way back in 1997 in response to a wrongly-phrased FAQ: https://groups.google.com/group/rec.puzzles/msg/34e1421b187033fb
I dread getting asked a question like this in an interview – I can picture the interviewer phrasing it poorly and me stubbornly insisting on correcting him/her…
17 June 2010, 4:44 am
The left-handed boy problem « Division by Zero:
[…] deceptively simple problem quickly made the rounds. The knee-jerk answer is , of course—the gender of one child doesn’t change the […]5 August 2010, 8:53 pm
I found this old blog after it was referenced by Jason Rosenhouse.
Peter said “It is reasonable to interpret this puzzle as asking you ‘What is the probability that X has two boys, given that at least one of the children is a boy’ in which case the answer is unambiguously 1/3—given the usual assumptions about no twins and equal gender frequency.” I maintain that this is *not* unambiguous.
In most fields of math, it would be. Information that is “given” to you is meant to be accepted as true. But that is not enough in probability. You need to construct an event in the sample space that represents what outcomes are possible. To do that, you need to be “given” information that tells you not only what to include in the event, but what to exclude as well. That is, you need both a necessary and a sufficient condition, and being “given” information about one outcome only constitutes a necessary condition.
The error is easy to make in probability, tho, because the word “given” is used formally in similar context. In the usual reading of a conditional probability, P(A|B) means “the probability that an outcome in event A occurred, given that an outcome in event B occurred.” But what is being *given* here is not information about B, as in Peter’s statement above, but the already-defined set itself.
The unambiguous wording Peter intended to make is “What is the probability that X has two boys, given that X was chosen from the set of families where at least one of the children is a boy.”
21 November 2011, 3:29 pm
For whatever it’s worth, this problem resurfaced at https://www.quora.com/Probability/What-is-the-reasoning-behind-Gary-Foshees-boy-born-on-a-Tuesday-problem-where-providing-additional-apparently-irrelevant-information-changes-the-resulting-probabilities
To this day, people get it wrong and defend it fervently.
21 July 2014, 4:52 pm