May 27, 2024

The AI Trust Fall

Suppose I asked you, “When someone turns in a work assignment, how accurate is it? 80%, 90%, 95% or perhaps 100%?”

We don’t think this way about coworkers’ spreadsheets. But we will probably think this way about AI & this will very likely change the way product managers on-board users.

When was the last time you signed up for a SaaS & wondered : Would the data be accurate? Would the database corrupt my data? Would the report be correct?

But today, with every AI product now tucking a disclaimer at the bottom of the page, we will be wondering. “Gemini may display inaccurate info, including about people, so double-check its responses” & “ChatGPT/Claude can make mistakes. Check important info” are two examples.

In the early days of this epoch, mistakes will be common. Over time, less so, as accuracy improves.

The more important the work, the greater people’s need to be confident the AI is correct. We will demand much better than human error rates. Self-driving cars provide an extreme example of this trust fall. Waymo & Cruise have published data arguing self-driving cars are 65-94% safer than human drivers.

Yet, 2/3 of Americans surveyed by the AAA fear them.

We suffer from a cognitive bias : we judge work performed by a human as more trustworthy because we understand the human’s biases & limitations. AIs are a Schrödinger’s cat stuffed in a black box. We don’t comprehend how the box works (yet), nor can we believe our eyes about whether the feline is dead or alive when we see it.

New product on-boarding will need to mitigate this bias.

One path may be starting with low-value tasks where the software-maker has exhaustively tested the potential inputs & outputs. Another tactic may be to provide a human-in-the-loop to check the AI’s work. Citations, references, & other forms of fact-checking will be a core part of the product experience. Independent testing might be another path.

As with any new colleague, first impressions & a series of small wins will determine the user’s trust. Severe errors later will erode confidence, which must then be rebuilt - likely with the help of human support teams who will explain, develop tests for the future, & reassure users.

I recently asked a financial LLM to analyze NVIDIA’s annual report. A question about the company’s increase in dividend amount vaporized its credibility, raising the question : is it less work to do the analysis myself than to check the AI’s work?

That will be the trust fall for AI. Will the software catch us if we trust it?

