I believe machine learning will drive the next big wave of innovation in consumer web services. The very same technologies that power Google’s search and Netflix’s video recommendation engine will become far more common and useful, perhaps even predominant, in the consumer web.
Every great consumer product has a little bit of magic. Apple employs static software and hardware design to anticipate user needs - to create that magic.
Compared to user experience design and forethought, intelligence, when it works, is far more compelling and impressive. Most of the time, though, it’s easy to be disappointed. If you’re anything like me, the first time you played with Siri, you tried to find places where it failed. We expect, and to a degree have been trained to expect, the technology to fail. Like Dorothy, we all look behind the curtain to find the little man making the magic, the Mechanical Turk.
But ML and deep learning have reached a point that makes it harder and harder to pull back the curtain and expose weak technology. Google and Nuance speech recognition are truly useful because they work with very high degrees of accuracy. Netflix’s content recommendation, Facebook’s EdgeRank, LinkedIn recommendations, Twitter’s Discover, Google Now - each one of these products improves every day.
For machine learning to create magic, the technology requires three things: large amounts of data, the infrastructure to process the data, and the algorithms to extract learning. First, the data sets relevant to consumer products are the data sets consumers create themselves: watch lists, click streams, expressions of intent to buy, and sharing patterns. With the growth of smartphone and tablet penetration, time spent on the web, and identity systems, more and more of human interaction with services and with others is recorded (for better or worse). These data sets grow at exponential rates on an individual basis, improving the predictive ability for each user.
Second, the infrastructure to process the data is readily available. Google, Facebook, LinkedIn and others have open sourced Hadoop, Cassandra, Storm and other data processing systems that run on commodity infrastructure and scale horizontally. These systems are capable of analyzing data at close to the rate at which users produce it.
Last, the combination of huge amounts of available data and the infrastructure to process it opens the doors to deep learning. This finally makes possible techniques like artificial neural networks, which were previously constrained by a lack of processing capacity.
In the next five years, machine learning will appear in everything from books to dating sites, from travel recommendations to insurance quotes. And when we open the doors below the chess board, we won’t find a chess grandmaster masquerading, but a data center.
12 March 2013