70% of production AI teams use open source models. 72.5% connect agents to databases, not chat interfaces. This is what 375 technical builders actually ship - & it looks nothing like AI Twitter.
70% of teams use open source models in some capacity. 48% describe their strategy as mostly open. 22% commit to only open. Just 11% stay purely proprietary.
Agents in the field are systems operators, not chat interfaces. We thought agents would mostly call APIs. Instead, 72.5% connect to databases. 61% to web search. 56% to memory systems & file systems. 47% to code interpreters.
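To make that concrete, here is a minimal sketch of what an agent's tool surface looks like under those numbers: a database query & a file read, dispatched by tool name rather than answered from chat. The schema, paths, & function names are hypothetical, not from the survey.

```python
import sqlite3
from pathlib import Path

# Hypothetical tools an agent runtime would expose via tool calling.
# The orders.db schema and file layout are illustrative only.

def query_orders(db_path: str, customer_id: int) -> list[tuple]:
    """Database tool: run a parameterized query against an internal system."""
    with sqlite3.connect(db_path) as conn:
        return conn.execute(
            "SELECT id, status, total FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchall()

def read_config(path: str) -> str:
    """File-system tool: read state the agent needs before acting."""
    return Path(path).read_text()

# Minimal registry the model's tool-call output gets dispatched against.
TOOLS = {"query_orders": query_orders, "read_config": read_config}
```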
The center of gravity is data & execution, not conversation. Sophisticated teams build MCP servers to access their own internal systems (58%) & external APIs (54%).
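A sketch of what exposing an internal system over MCP can look like, assuming the official `mcp` Python SDK's FastMCP interface; the server name, table, & tool body are placeholders:

```python
# Requires: pip install mcp  (Model Context Protocol Python SDK)
import sqlite3
from mcp.server.fastmcp import FastMCP

# Wrap an internal system (here, a local SQLite database) as an MCP server
# so any MCP-capable agent can call it. Names below are illustrative.
mcp = FastMCP("internal-orders")

@mcp.tool()
def order_status(order_id: int) -> str:
    """Look up an order's status in the internal database."""
    with sqlite3.connect("orders.db") as conn:
        row = conn.execute(
            "SELECT status FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
    return row[0] if row else "not found"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```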
Synthetic data powers evaluation more than training. 65% use synthetic data for eval generation versus 24% for fine-tuning. This points to a near-term surge in eval-data marketplaces, scenario libraries, & failure-mode corpora before synthetic training data scales up.
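A minimal sketch of that eval-generation pattern: synthesize adversarial cases into a failure-mode corpus stored as JSONL. The `generate` stub stands in for any LLM client, & the scenarios are invented for illustration.

```python
import json

def generate(prompt: str) -> str:
    """Stub for an LLM call; swap in your provider's SDK here."""
    return '{"input": "example", "expected_behavior": "escalate to human"}'

# Synthesize a small failure-mode corpus for evals, not fine-tuning rows:
# each case pairs an adversarial input with the behavior we expect.
SCENARIOS = ["ambiguous refund request", "conflicting account records"]

def build_eval_set(path: str = "evals.jsonl") -> None:
    with open(path, "w") as f:
        for scenario in SCENARIOS:
            case = generate(
                f"Write a user message exhibiting: {scenario}. "
                "Return JSON with keys 'input' and 'expected_behavior'."
            )
            f.write(json.dumps({"scenario": scenario, "case": case}) + "\n")

build_eval_set()
```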
The timing reveals where the stack is heading. Teams need to verify correctness before they can scale production.
88% use automated methods for improving context. Yet context remains the #1 pain point in deploying AI products. This gap between tooling adoption & problem resolution points to a fundamental challenge.
The tools exist. The problem is harder than better retrieval or smarter chunking can solve.
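To see why, consider the standard automated pipeline, sketched below with a toy bag-of-words embedding (real systems use vector models, but the failure mode is the same): retrieval ranks chunks by similarity to the query, & similarity to the query is not the same thing as relevance to the task.

```python
from collections import Counter
import math

def chunk(text: str, size: int = 200) -> list[str]:
    """Split text into fixed-size word windows - the 'smarter chunking' knob."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; stands in for a vector model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Top-k by similarity: the best match is not always what the task needs."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```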
Context remains the true challenge & the biggest opportunity for the next generation of AI infrastructure.
Explore the full interactive dataset here or read Lauren’s complete analysis.