There’s no quicker way to lose a user or buyer of your software than to lose their trust. The software didn’t save my data. The database suffered corruption. The website is down frequently. Data integrity is a challenge every company storing data faces. Machine learning SaaS startups face another trust risk – one introduced by probability.
When Nate Silver forecasted the successful election of Barack Obama in 2008 with nearly 100% accuracy across districts, probability theory shone. The real world matched the likely predictions. Fast forward eight years, and the new President wasn’t the projected winner.
In both the 2008 and 2016 analyses, the math may have been correct and the theory consistent. But in 2008, the results built trust in data and in 2016, the outcome eroded it. That’s part of human nature.
Many machine learning systems also rely on probability. A programmer encodes a threshold into the machine learning model. The system uses that threshold to decide when a probability is high enough to draw a conclusion. That probability is sometimes called a confidence score.
For example, the minimum probability that this image contains a cat. The confidence level that sacré bleu should be translated as “Oh my gosh!” not “sacred blue.” The likelihood that taking the Van Wyck in rush hour is faster than the Belt Parkway to Manhattan from JFK.
What should those minimum probabilities be before a computer system makes a recommendation? 80%? 90%? 95%? Raise the threshold and the number of false positives, or type 1 errors, decreases. Far fewer mongooses in your cat search results.
But raising the confidence threshold too far invites false negatives, or type 2 errors. The system asserts an image doesn’t contain a cat. On further inspection, you can see, plain as day, there’s a feline in that JPEG.
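To make the tradeoff concrete, here is a minimal sketch in Python. The scores and labels are invented for illustration, and `scored_images` and `count_errors` are hypothetical names rather than part of any real library. It counts both kinds of error for the same set of images as the threshold rises.

```python
# Hypothetical data: (model's probability that the image contains a cat,
# whether the image really contains a cat). These numbers are made up.
scored_images = [
    (0.97, True),
    (0.91, True),
    (0.88, False),  # a mongoose the model finds very cat-like
    (0.84, True),
    (0.72, True),
    (0.65, False),
    (0.40, False),
    (0.15, False),
]


def count_errors(scored, threshold):
    """Return (false_positives, false_negatives) at the given threshold.

    An image is labeled "cat" when its probability is >= threshold.
    """
    false_positives = sum(
        1 for p, is_cat in scored if p >= threshold and not is_cat
    )
    false_negatives = sum(
        1 for p, is_cat in scored if p < threshold and is_cat
    )
    return false_positives, false_negatives


for threshold in (0.60, 0.80, 0.90, 0.95):
    fp, fn = count_errors(scored_images, threshold)
    print(f"threshold={threshold:.2f}  type 1 errors={fp}  type 2 errors={fn}")
```

On this toy data, the type 1 errors fall from two to zero as the threshold climbs from 0.60 to 0.95, while the type 2 errors rise from zero to three. Real models will show different numbers, but the two counts generally pull against each other in the same way.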
Machine learning SaaS companies must find equilibrium on this Goldilocks slackline: a machine learning system that is neither too strict nor too lenient. If the product falls too far toward either extreme, it may lose the user’s trust, and eventually their business.
How best to manage this risk? The chatbot surge has taught me one principle of human/robot interaction. Setting the user’s expectations of the system’s capabilities is paramount. Underpromise and overdeliver. The converse leads to distrust.
The second way is to determine which type of error is more palatable to a user. In the case of email spam detection, it is better to let in a few irrelevant messages than to classify an email from Mom as spam.
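One way to act on that preference is to put a rough cost on each type of error and pick the threshold with the lowest total. The sketch below assumes a lost legitimate email is ten times as costly as a spam message reaching the inbox. That ratio, the sample scores, and the names `total_cost` and `cheapest_threshold` are all illustrative assumptions, not figures from any real product.

```python
# Hypothetical data: (model's spam probability, whether the message
# really is spam). These numbers are made up.
scored_emails = [
    (0.99, True),
    (0.93, True),
    (0.85, False),  # the email from Mom
    (0.80, True),
    (0.55, True),
    (0.30, False),
    (0.10, False),
]

# Assumed costs: trashing a legitimate email hurts far more than
# letting one spam message through.
COST_FALSE_POSITIVE = 10.0  # legitimate email classified as spam
COST_FALSE_NEGATIVE = 1.0   # spam delivered to the inbox


def total_cost(scored, threshold):
    """Sum the assumed cost of every misclassification at this threshold."""
    cost = 0.0
    for p, is_spam in scored:
        flagged = p >= threshold
        if flagged and not is_spam:
            cost += COST_FALSE_POSITIVE
        elif not flagged and is_spam:
            cost += COST_FALSE_NEGATIVE
    return cost


def cheapest_threshold(scored, candidates):
    """Return the candidate threshold with the lowest total cost."""
    return min(candidates, key=lambda t: total_cost(scored, t))


candidates = [0.50, 0.70, 0.90, 0.95]
best = cheapest_threshold(scored_emails, candidates)
print(f"lowest-cost threshold: {best:.2f}")
```

With these assumed costs, the 0.90 threshold wins: it lets two spam messages through but keeps Mom’s email out of the spam folder. Change the cost ratio and the preferred threshold moves with it, which is why the choice is a product decision as much as a modeling one.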
All of these next-generation machine learning products will rely on building the trust of the user. At some point, those products must decide when the probability is good enough to recommend shifting marketing budget to a new campaign, classify an image, translate a word, trash a spam email, or escalate an error.
That product decision should not be taken lightly. The ML system’s confidence score has a direct bearing on how much users will trust the product.