2 minute read / Mar 19, 2013 / data analysis / best practices / culture
Which Data Biases Challenge Your Startup?
Steve Sinofsky, an executive at Microsoft for 24 years, penned an insightful post on the five data biases plaguing product decisions. It can be easy for any founder, product manager, marketer, or engineer to accept a data point at face value as the rationale for a decision.
But understanding the nuances and biases of the data, and questioning the data, is often just as important as the result. The corollary argued in his post is that data isn't strategy: data can't be blindly used to inform product design and decision making, because the data might be "lies, damned lies and statistics".
Steve highlights five data biases seen often in his career:
Representation. No data can represent all people using (or who will use) a product. So who was represented in the data?
Forward or backward looking. When looking at product usage, the data shows how the product was used, but not how it will be used down the road (assuming changes). Is the data justifying the choice or informing the choice?
Contextual. The data depends on context in which it is collected, so if the user interface is sub-optimal or drives a certain pattern the data does not necessarily represent a valid conclusion. Did the data consider that the collection was itself flawed?
Counter-intuitive. The data is viewed as counter-intuitive and does not follow the conventional wisdom, so something must be wrong. How could the data overlook what is obvious?
Causation or correlation. With data you can see an end state, but it is not always clear what caused that end state. If something is used a lot, crashes a lot, or is avoided, there might be many reasons, most not readily apparent or at least open to debate, that cause that end state. Is the end state merely coincident with some variables, or do those variables cause the end state?
In my day-to-day work, the correlation-or-causation bias is the one I encounter most. When evaluating a business's performance, it's easy to assume that one metric (a new feature release) causes another to increase (engagement), but making that leap isn't always correct.
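To see how easily that leap goes wrong, here is a minimal toy simulation (the metric names and numbers are entirely made up for illustration): a hidden confounder, such as seasonality, lifts both feature usage and engagement, so the two metrics correlate strongly even though neither causes the other.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weekly data: a seasonal trend plus gradual growth
# (the confounder) lifts both metrics at the same time.
weeks = np.arange(52)
confounder = np.sin(weeks / 52 * 2 * np.pi) + weeks * 0.02

# Neither metric influences the other; both just track the confounder.
feature_usage = confounder + rng.normal(0, 0.1, size=52)
engagement = confounder + rng.normal(0, 0.1, size=52)

# The correlation comes out very high anyway.
r = np.corrcoef(feature_usage, engagement)[0, 1]
print(f"correlation between metrics: {r:.2f}")
```

A dashboard showing these two series would make the causal story look obvious, which is exactly why the question "what else could be driving both?" is worth asking before acting on the data.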
Which data biases do you encounter most often? Tell me in this Branch.