I thought about titling this post, “In Which My Readers Grow Bored With Me.” But, I’ve been thinking about this all day, so I’ve decided to press on. I have simplified the argument here to bare essentials, though, so I hope you’ll excuse the lack of nuance. It’s a blog post, not a journal article.
In an interesting post on statistics, Tony mentions the Monty Hall Problem and the controversy surrounding the problem when it was written about by Marilyn vos Savant in her Parade column in 1990. This problem has always bugged me because it is often stated in a misleading fashion. In fact, I think that the statement of the problem in vos Savant’s column (quoted by Wikipedia) can’t quite be definitively answered.
Let’s look at vos Savant’s version of the problem:
Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?
The fact that the host has a priori knowledge of the location of the car is extremely important (and is sometimes omitted from the statement of the problem). The implication here, which is frankly a bit of a stretch, is that of the two doors not selected by the contestant, the host will always open a door with a goat behind it. (There are, in fact, numerous other ways that the host might behave. Maybe sometimes the contestant is forced to open the door that she initially chose. Maybe sometimes the host actually opens the door with the car thereby showing you that you lost. Any possibility in which the host is not required to reveal a goat would lead to a different analysis.)
But, if you understand the basics of how the game is intended to work, then it turns out that switching doors is at least a weakly dominant action. Here’s the analysis usually presented: If I initially chose the door with the car behind it, which occurs 1/3 of the time, then the host will reveal one of the other two doors and, by switching, I will lose. If I initially didn’t choose the door with the car behind it, though, which occurs 2/3 of the time, then the host will be forced to open the door with the other goat behind it and I will win by switching. Thus, 1/3 of the time I win by staying put and 2/3 of the time I win by switching. Sounds pretty good. But, this does not answer the question posed. To answer the question posed, I don’t need to know the probability of a win when I use the strategy “switch no matter which door the host opens” – which is what we just computed. Instead, I need to compute P(Car Behind #1|Door #3 Opened).
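The 1/3 versus 2/3 split in the usual analysis is easy to check by brute-force enumeration. Here is a minimal sketch, assuming the car is equally likely to be behind each door and that the host always opens a goat door other than the contestant's pick. The function name `win_probabilities` is mine, not anything from vos Savant's column.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def win_probabilities(pick=1):
    """Exact win probabilities for 'always stay' and 'always switch'.

    Assumptions (not stated in the original problem): the car is equally
    likely to be behind each door, and the host always opens a goat door
    other than the contestant's pick.
    """
    stay = Fraction(0)
    switch = Fraction(0)
    for car in DOORS:
        p_car = Fraction(1, 3)
        if car == pick:
            # Host may open either remaining door; switching loses either way.
            stay += p_car
        else:
            # Host is forced to open the only other goat door; switching wins.
            switch += p_car
    return stay, switch

stay, switch = win_probabilities()
print(stay, switch)  # 1/3 2/3
```

Note that this only scores the blanket strategies. It never conditions on which door the host actually opened.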
In this case, though, you can’t actually compute P(Car Behind #1 | Door #3 Opened). The reason is that the problem still doesn’t reveal the strategy used by the host, nor does it even clue you in that this might be relevant. Suppose we try to compute this probability: P(Car Behind #1 | Door #3 Opened) = P(Car Behind #1 AND Door #3 Opened) / P(Door #3 Opened) = P(Car Behind #1) P(Door #3 Opened | Car Behind #1) / P(Door #3 Opened) = (1/3) P(Door #3 Opened | Car Behind #1) / P(Door #3 Opened). And neither of the remaining probabilities in this expression can be computed. Specifically, the problem doesn’t tell you what the host will do in the case where you initially chose the door with the car. In that case, the host can open either door. Actually giving a definite answer as to whether or not it is strictly better to switch requires more knowledge than the problem provides. Let me provide an example.
Suppose the host uses the following (admittedly contrived) rule: If the contestant chooses door #1 and this is the door with the car behind it, then always open door #3. Now, if door #3 gets opened, then the probability that the car is behind door #1 is 1/2! If the host is using this rule, then he opens door #3 if and only if the car is behind either door #1 or door #2 (which are, a priori, equally likely events). Hence, switching and staying put have the same probabilities of winning.
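To make the dependence on the host's rule concrete, here is a small sketch of the conditional probability, assuming the contestant picked door #1 and the host never reveals the car. The parameter `q` is a hypothetical knob of my own: the probability that the host opens door #3 (rather than #2) when the car is behind door #1. The contrived rule above is `q = 1`; a host who flips a fair coin is `q = 1/2`.

```python
from fractions import Fraction

def p_car_behind_1_given_door_3(q):
    """P(Car Behind #1 | Door #3 Opened) when the contestant picked door #1.

    q is a hypothetical parameter: the probability that the host opens
    door #3 (rather than #2) when the car is behind door #1.
    """
    prior = Fraction(1, 3)
    # Host opens #3 with probability q if the car is behind #1, always if
    # it is behind #2, and never if it is behind #3.
    p_open_3 = prior * q + prior * 1 + prior * 0
    return prior * q / p_open_3

print(p_car_behind_1_given_door_3(Fraction(1)))     # 1/2
print(p_car_behind_1_given_door_3(Fraction(1, 2)))  # 1/3
```

With the coin-flipping host, the familiar answer (stay wins 1/3 of the time) comes back. With the contrived host, it is a toss-up.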
Now, it turns out that for any rule the host uses, switching will be at least as good as staying put. In fact, given that door #3 was opened, for every rule other than the one described in the previous paragraph, switching is strictly better than staying put. Thus, if you don’t know what the host is up to, you definitely ought to switch doors. But you can’t actually know whether or not it is to your advantage to switch in this specific case without more information about how the host will behave.
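One way to convince yourself of the "at least as good" claim is to sweep over host rules and check the probability that staying put wins, whichever door gets opened. This is only a sketch under my own parameterization: the contestant picked door #1, the host always reveals a goat, and `q` (hypothetical, not from the problem statement) is the probability that he opens door #3 when the car is behind door #1.

```python
from fractions import Fraction

def posterior_car_behind_pick(q, opened):
    """P(car behind door #1 | host opened `opened`), contestant picked #1.

    q is a hypothetical parameter: the probability that the host opens
    door #3 (rather than #2) when the car is behind door #1.
    """
    prior = Fraction(1, 3)
    if opened == 3:
        # Car behind #1: opened with probability q. Car behind #2: forced.
        like_car1, like_other = q, 1
    elif opened == 2:
        # Car behind #1: opened with probability 1 - q. Car behind #3: forced.
        like_car1, like_other = 1 - q, 1
    else:
        raise ValueError("host can only open door 2 or door 3")
    evidence = prior * like_car1 + prior * like_other
    return prior * like_car1 / evidence

# Sweep host rules q = 0, 1/10, ..., 1 and both doors the host might open.
qs = [Fraction(k, 10) for k in range(11)]
worst = max(posterior_car_behind_pick(q, d) for q in qs for d in (2, 3))
print(worst)  # 1/2
```

On this grid, staying put never wins more than half the time, and when door #3 is the one opened it reaches 1/2 only at `q = 1`, the contrived rule.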
I agree with vos Savant in the sense that you can’t go wrong with switching doors in this particular problem. One can prove that the strategy “always switch” is strictly better than the strategy “always stay put.” Even better, the strategy “always switch” weakly dominates every other strategy regardless of what the host does. But, in certain pathological cases it may not be “to your advantage” to switch. I am crystal clear about the probability concepts involved. But, I strongly feel that this problem has lots of unstated assumptions and nuances that are far from obvious. It isn’t just that it defies expectations – the problem is, in fact, under-specified.