Google F1 and The Cascade of Innovation New Databases Create

The NoSQL movement launched officially in 1999 but rose to prominence much later, perhaps closer to 2008, when Hadoop and other key-value technologies came into vogue.

Today, it’s hard to argue with the success of the movement. Large banks, insurance companies, biotech companies and dotcoms rely on NoSQL to power their services and inform their most important decisions.

I first saw MapReduce at Google. I’ll never forget the conversation I shared with a search engineer who described MapReduce. Separate the data, process each chunk individually, and reassemble it. That’s how it works, he told me. It was fascinating and unlike the SQL mechanics I knew.
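The split, process, reassemble pattern the engineer described can be sketched in a few lines of Python. This is a toy, single-process word count, not Google's MapReduce implementation; the function names (`split_into_chunks`, `map_chunk`, `reduce_counts`, `word_count`) and the sample sentences are made up for illustration.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(lines, chunk_size):
    """Separate the data into fixed-size chunks."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def map_chunk(chunk):
    """Process one chunk on its own: count the words it contains."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reassemble: merge two partial counts into one."""
    return left + right


def word_count(lines, chunk_size=2):
    chunks = split_into_chunks(lines, chunk_size)
    # A real cluster would process these chunks in parallel on many machines.
    partials = [map_chunk(chunk) for chunk in chunks]
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    docs = ["sql is old", "nosql is new", "sql is back"]
    print(word_count(docs).most_common(2))
```

Because each chunk is counted independently, the answer does not depend on how the data is split, which is what lets the work spread across machines.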

Despite wide adoption inside Google, not all products ran BigTable/Hadoop. AdWords and AdSense, the ad systems that generate the lion’s share of Google’s revenue, relied on a MySQL instance. When I joined, the Ads DB was sharded across 59 machines. When I left, I counted more than 70. And these were massive machines.

Since 2008, and perhaps well before, SQL interfaces for Hadoop/MapReduce have been developed because most people who interact with databases have learnt SQL, but very few write MapReduce. These bridge technologies enable more developers and users to leverage the database.
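To make the appeal of a SQL bridge concrete, here is a rough sketch showing that one `GROUP BY` query asks for the same result as a hand-written map and reduce. It uses Python's built-in `sqlite3` purely as a stand-in SQL engine; it is not how any Hadoop SQL layer is implemented, and the `clicks` table, its columns, and its data are invented for the example.

```python
import sqlite3
from collections import defaultdict

# Invented sample data: (campaign, clicks) pairs.
ROWS = [("search", 3), ("display", 1), ("search", 4), ("video", 2), ("display", 5)]


def clicks_by_campaign_mapreduce(rows):
    """Hand-written version: emit (key, value) pairs, group by key, then sum each group."""
    grouped = defaultdict(list)
    for campaign, clicks in rows:  # map and shuffle: group values under their key
        grouped[campaign].append(clicks)
    # reduce: collapse each group to a single total
    return {campaign: sum(values) for campaign, values in grouped.items()}


def clicks_by_campaign_sql(rows):
    """SQL version: one declarative query expresses the same computation."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE clicks (campaign TEXT, clicks INTEGER)")
        conn.executemany("INSERT INTO clicks VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT campaign, SUM(clicks) FROM clicks GROUP BY campaign"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(clicks_by_campaign_mapreduce(ROWS) == clicks_by_campaign_sql(ROWS))
```

The person writing the query never has to think about keys, shuffles, or reducers, which is the point of the translation layer.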

This is true at Google too. But these translation layers create bottlenecks and are imperfect translations of the underlying technology, which is why the company built a new database called F1.

I read about F1’s breathtaking performance with incredulity. F1 promises the performance and transaction support of an SQL database with the scale of a NoSQL database. Most importantly, it enables anyone who speaks SQL to query large volumes of data rapidly. F1 is a database technology almost without compromises (setting aside the read and write latencies).

The pace and breadth of innovation in databases are accelerating. Each cycle enables more data to be processed, faster. Because databases are at the core of almost every application, and as machine learning becomes a critical part of many products, these database improvements cascade into our daily lives in the form of intelligent assistants like Google Now, improved music recommendations in Pandora, and customer prioritization software like Infer. These database innovations will create new opportunities for startups for years to come, both in the form of new infrastructure and in new applications that take advantage of these database capabilities.
