When presented with figures, numbers, and statistics, it’s easy to take the conclusions as fact. Numbers in a spreadsheet carry a finality, an exactitude that belies how inaccurate they can be.
In 2005, Stanford professor John Ioannidis turned the world of research and statistics on its head. He published “Why Most Published Research Findings Are False.” Ioannidis’ paper cast doubt on decades of research. More than 75% of experimental results published in the world’s best journals couldn’t be replicated. The faulty conclusions resulted from bad experimental design, biases in the data, and statistical tools used incorrectly. A crisis of truth ripped through the research community. For the first time in about 90 years, researchers and statisticians are re-evaluating their methods.
One of the major problems with data analysis is the imperfect methods we use. Every student in a statistics class has come across p-values. Despite taking three statistics courses in college, I found them impossible to understand each time because their interpretation isn’t intuitive. A p-value measures how often a result at least as extreme as the one observed would happen by chance alone, assuming there is no real effect. But p-values don’t answer the question most people care about: what are the odds the hypothesis about the data is correct?
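To make the gap between those two questions concrete, here is a minimal sketch in Python. It applies Bayes' rule to estimate how often a statistically significant result reflects a real effect. The function name and the inputs are assumptions chosen for illustration, not figures from any study: `prior` is the share of tested hypotheses that are actually true, `power` is the chance a test detects a real effect, and `alpha` is the significance threshold.

```python
def prob_hypothesis_true(prior, power=0.8, alpha=0.05):
    """Chance that a statistically significant result reflects a real effect.

    prior: assumed share of tested hypotheses that are actually true
    power: assumed chance the test detects a real effect
    alpha: significance threshold (false positive rate when there is no effect)
    """
    true_positives = power * prior          # real effects that reach significance
    false_positives = alpha * (1 - prior)   # null effects that reach significance anyway
    return true_positives / (true_positives + false_positives)


# Same test, same threshold, very different odds depending on the prior.
for prior in (0.5, 0.1, 0.01):
    odds = prob_hypothesis_true(prior)
    print(f"prior={prior:.2f}  P(hypothesis true | significant)={odds:.2f}")
```

Under these assumed inputs, if only one in ten hypotheses being tested is true, a result that clears p < 0.05 is correct about 64% of the time, not 95%. The p-value alone can't tell you that, because it says nothing about the prior.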
In addition to dissolving faith in the research process, bad data encourages wrong decision-making. In most cases, bad data is worse than no data at all. This is as true for the research community as it is for startup management teams.
Because we tend to take numbers as fact, and because the methods we use have high error rates, data has to be the start of the decision-making process, not the end. As Richard Royall, professor of statistics at Johns Hopkins, says, when faced with data it’s critical to ask three questions:
What does the data imply? What do I believe? What should I do, considering both?
Published 2014-02-13 in Data Analysis