When Data Confounds Our Intuition

Suppose you’ve been selected to participate in a game show. The game show host asks you to pick one of three doors. Behind one, the grand prize awaits. Behind the other two are goats. You choose Door 1. Then the host opens Door 3, revealing a goat. The host prompts you again, “Would you like to select Door 2?” Should you choose it?

This statistics question rose to fame in 1990 when Marilyn Vos Savant answered a reader’s version of it in her Parade Magazine column, advising the contestant to switch. In the weeks that followed, Vos Savant received more than 10,000 letters pronouncing her wrong. One thousand of these letters had been penned by PhDs, and many bore the insignia of prestigious universities.

A professor of mathematics at Georgetown University wrote to Vos Savant, “You are utterly incorrect. How many irate mathematicians are needed to get you to change your mind?” Another from George Mason University piled on, “You blew it!… As a professional mathematician, I’m very concerned with the general public’s lack of mathematical skills. Please help by confessing your error and, in the future, being more careful.” Even Paul Erdős, the famed Hungarian mathematician, refused to believe it.

To the chagrin of the ten thousand vehement and highly educated contradictors, the result stands. Vindicated by computer simulation, Vos Savant exposed the often counterintuitive nature of probability.

There are many different ways to explain why switching to Door 2 gives you a 2/3 (about 67%) chance of winning the grand prize (a car, in the original puzzle). This is the simplest I’ve found. When you choose Door 1, you have a 1/3 chance of winning the car (1 in 3 doors). That leaves a 2/3 chance that the car is not behind Door 1. The host knows where the car is and always opens a door hiding a goat, so opening Door 3 doesn’t change the odds for your original pick: there is still a 2/3 chance that the car is not behind Door 1. But now only one other unopened door remains, Door 2. So there is a 2/3 chance that the car is behind Door 2. The additional information that Door 3 hides a goat improves your chances, and this can be proven using Bayes’ Theorem.
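For readers who would rather check than trust the argument, here is a minimal Python sketch of both routes: a simulation of the game and the exact Bayes’ Theorem calculation. It is an illustration, not the simulation that settled the original debate. The function names (play, win_rate, posterior_door2) are made up for this example, and the code assumes the standard rules: the host knows where the prize is, always opens a goat door other than yours, and picks at random when two goat doors are available.

```python
import random
from fractions import Fraction


def play(switch, rng):
    """Play one round; return True if the contestant wins the prize."""
    doors = [1, 2, 3]
    prize = rng.choice(doors)
    pick = 1  # the contestant starts with Door 1, as in the story
    # Assumption: the host knowingly opens a goat door that is not the pick.
    host = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one door that is neither the pick nor the opened door.
        pick = next(d for d in doors if d != pick and d != host)
    return pick == prize


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the chance of winning over many simulated rounds."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


def posterior_door2():
    """Exact P(prize behind Door 2 | you picked Door 1, host opened Door 3)."""
    prior = Fraction(1, 3)  # each door is equally likely at the start
    # P(host opens Door 3 | prize location), given that you picked Door 1.
    likelihood = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}
    evidence = sum(prior * likelihood[d] for d in likelihood)
    return prior * likelihood[2] / evidence


if __name__ == "__main__":
    print("stay  :", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
    print("Bayes :", posterior_door2())
```

The simulated win rate for switching should land near 2/3 and for staying near 1/3, matching the exact posterior. If the host opened a door at random and merely happened to reveal a goat, the likelihoods in posterior_door2 would change and so would the answer, which is why the assumption about the host matters.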

But even after that explanation, the answer remains unintuitive.

These confounding conclusions aren’t rare. K.C. Cole, a professor at USC, explained our challenge in perceiving the relative sizes of quantities in an article entitled “Why You Didn’t See It Coming.”

Both $1 million and $1 billion sound like “a lot,”… [But] even those who understand the true scale of the chasm between those numbers intellectually don’t always “get it” viscerally. It feels like the difference between a million and a billion is closer to a factor of three than a factor of 1,000. That’s because our brain naturally works using something like a logarithmic scale, so that it can condense information like vast ranges in loudness and brightness efficiently. That can get us into trouble…

These blind spots confuse us. And when presented with data that confounds our expectations, only 10% of the time will we trust the data rather than our intuition, according to the Economist Intelligence Unit’s Decisive Actions: How Businesses Make Decisions report. 57% of the time, survey respondents would reanalyze the data to rule out problems with data collection or math errors.

Our ultimate goal with data is to defeat bias. In “The Philosophy of Data,” New York Times Op-Ed columnist David Brooks articulates two ways data exposes when our hunches are just plain wrong.

First, it’s really good at exposing when our intuitive view of reality is wrong. For example, every person who plays basketball and nearly every person who watches it believes that players go through hot streaks, when they are in the groove, and cold streaks, when they are just not feeling it. But Thomas Gilovich, Amos Tversky and Robert Vallone found that a player who has made six consecutive foul shots has the same chance of making his seventh as if he had missed the previous six foul shots.

Second, data can illuminate patterns of behavior we haven’t yet noticed. For example, I’ve always assumed that people who frequently use words like “I,” “me,” and “mine” are probably more egotistical than people who don’t. But as James Pennebaker of the University of Texas notes in his book, “The Secret Life of Pronouns,” when people are feeling confident, they are focused on the task at hand, not on themselves. High status, confident people use fewer “I” words, not more.

More than simply explaining what has occurred in the past and why, data is a powerful tool to expose our biases and point the way to the right decision, especially when the data contradicts our instincts. If 1,000 PhDs can be fooled by a counterintuitive probability problem, no one is safe from bias.
