Semantic Cultivators: The Critical Future Role to Enable AI
By 2026, AI agents will consume 10x more enterprise data than humans, but with none of the contextual understanding that prevents catastrophic misinterpretations.
This is the main argument of the presentation I shared yesterday.
Historically, our data pipelines have served people. We’ve architected complex pipelines to ingest, filter, and transform information into different systems of record: cloud data warehouses, security information and event management (SIEM) systems, and observability platforms.
We then interpreted these outputs and acted upon them.
But very soon the end consumers won’t be people. So we need to fundamentally reconsider the interface between these systems of record and their transformed data.
People thrive in ambiguity because we’re great at contextual interpretation. When a VP of Sales mentions revenue, a CFO understands the demarcation between bookings, billings, GAAP revenue, and contracted ARR. Humans navigate these nuances effortlessly; machines don’t.
What happens when your AI agent pulls “customer acquisition cost” data but doesn’t recognize that marketing measures it by campaign spend, sales calculates it based on AE + BDR costs, & finance includes fully-loaded employee costs?
The result: expensive nonsense masquerading as intelligence.
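To make that ambiguity concrete, here is a minimal sketch of what one slice of a semantic layer might look like: a registry that stores each team’s definition of the same metric explicitly, so a consumer has to choose a definition rather than guess. The class, metric keys, and formulas below are hypothetical, invented for illustration rather than drawn from any particular tool.

```python
from dataclasses import dataclass

@dataclass
class MetricDefinition:
    """One unambiguous definition of a business metric."""
    name: str         # canonical metric name
    owner: str        # team that owns this definition
    sql: str          # the agreed formula against the warehouse
    description: str  # context a consumer needs before using it

# Hypothetical registry: three legitimate but different CAC definitions.
SEMANTIC_LAYER = {
    "customer_acquisition_cost.marketing": MetricDefinition(
        name="customer_acquisition_cost",
        owner="marketing",
        sql="SUM(campaign_spend) / COUNT(DISTINCT new_customer_id)",
        description="Campaign spend only; excludes headcount.",
    ),
    "customer_acquisition_cost.sales": MetricDefinition(
        name="customer_acquisition_cost",
        owner="sales",
        sql="SUM(ae_cost + bdr_cost) / COUNT(DISTINCT new_customer_id)",
        description="AE + BDR costs; excludes marketing spend.",
    ),
    "customer_acquisition_cost.finance": MetricDefinition(
        name="customer_acquisition_cost",
        owner="finance",
        sql="SUM(fully_loaded_gtm_cost) / COUNT(DISTINCT new_customer_id)",
        description="Fully loaded employee and program costs.",
    ),
}

def resolve(metric: str, owner: str) -> MetricDefinition:
    """Force the caller, human or agent, to name whose definition it wants."""
    key = f"{metric}.{owner}"
    if key not in SEMANTIC_LAYER:
        raise KeyError(f"No agreed definition for {metric!r} owned by {owner!r}")
    return SEMANTIC_LAYER[key]
```

The specific data structure doesn’t matter. What matters is that the disagreement lives in the layer, where it can be curated, instead of being rediscovered inside every agent prompt.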
To combat this misinformation, the teams formerly responsible for maintaining and monitoring pipelines will become cultivators of a constantly evolving collection of cross-domain semantic layers that answer the questions from AI agents via MCP or another protocol layer.
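What might the cultivator’s interface to agents look like? Below is a minimal sketch assuming the official `mcp` Python SDK’s `FastMCP` helper and the hypothetical registry from the previous sketch (saved here as `semantic_layer.py`); the tool names are illustrative, not a standard.

```python
# Sketch: exposing the semantic layer to AI agents over MCP.
# Assumes the official `mcp` Python SDK and the hypothetical registry above.
from mcp.server.fastmcp import FastMCP

from semantic_layer import SEMANTIC_LAYER, resolve  # the registry sketched above

mcp = FastMCP("semantic-layer")

@mcp.tool()
def list_metric_definitions(metric: str) -> list[dict]:
    """Return every team's definition of a metric so the agent can't guess."""
    return [
        {"owner": d.owner, "sql": d.sql, "description": d.description}
        for d in SEMANTIC_LAYER.values()
        if d.name == metric
    ]

@mcp.tool()
def get_metric_sql(metric: str, owner: str) -> str:
    """Return the agreed formula for one team's definition of a metric."""
    return resolve(metric, owner).sql

if __name__ == "__main__":
    mcp.run()  # serves the tools over stdio to a connected agent
```

Note the design choice: the tools return definitions and agreed formulas rather than raw answers, which keeps the agent accountable for naming whose definition it used.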
The major question in all this is how to deliver the semantic layer. Historically, it’s been difficult to sell a semantic layer as a standalone product. Looker was successful with its LookML language, and other companies have developed their own query languages, which to some extent enforce a loose semantic layer.
The coming years will see a major shift as enterprises realize that their most valuable digital asset isn’t their data lake or their AI models—it’s the semantic layer that makes those investments meaningful.
Software is the business of selling promotions, and no one has been promoted for implementing a semantic layer. However, many people will be promoted for massively improving the accuracy of AI systems across data, security, and observability.
The semantic layer is the keystone of that project and, consequently, the most strategic part of any data pipeline today.