<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:content="http://purl.org/rss/1.0/modules/content" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Tomasz Tunguz</title>
    <link>https://www.tomtunguz.com/</link>
    <description>Recent content on Tomasz Tunguz</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Mon, 15 Jun 2026 00:00:00 +0000</lastBuildDate>
    
        <atom:link href="https://www.tomtunguz.com/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>The Golden Age of AI Applications</title>
      
      <link>https://www.tomtunguz.com/golden-age-of-applications/</link>
      <pubDate>Mon, 15 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/golden-age-of-applications/</guid>
      
      <description>&lt;p&gt;We&amp;rsquo;re entering the golden age of AI applications. Three recent developments confirm it.&lt;/p&gt;
&lt;p&gt;The Fable retraction shows regulatory risk. Nadella&amp;rsquo;s thesis shows strategic consensus. Salesforce&amp;rsquo;s acquisition shows market validation.&lt;/p&gt;
&lt;p&gt;First, the US government shut down Fable access&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; &amp;amp; the software ecosystem roared with many responses : Bring it back! Open-source &amp;amp; local models have become essential! Don&amp;rsquo;t rely on a single model!&lt;/p&gt;
&lt;p&gt;Satya Nadella published an AI ecosystem thesis.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; He argued that for a healthy ecosystem, the moat can&amp;rsquo;t be the model. Instead, human expertise &amp;amp; the system around the model (the harness&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;) must be the moat.&lt;/p&gt;
&lt;p&gt;And Salesforce announced the acquisition of Fin, formerly Intercom, for $3.6b.&lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt; The founders &amp;amp; management team repositioned the company through the AI upheaval. Fin used open-source models to maximize price/performance.&lt;/p&gt;
&lt;p&gt;Building AI applications is hard for different reasons than SaaS. It&amp;rsquo;s not a lack of engineers, or the challenges of uptime, or the demands of faster releases.&lt;/p&gt;
&lt;p&gt;AI applications present three new disciplines to master : picking the right models, developing the hill-climbing loop, &amp;amp; evaluating the performance of the system for each company. All three answer one question : how much intelligence can I squeeze out of my token budget?&lt;/p&gt;
&lt;p&gt;Models are tricky. Budgets prevent defaulting everyone to state-of-the-art. The legion of other models each have a personality. Kimi K2.6 is fast &amp;amp; a great creative writer but less precise. Qwen 3.6 27b is a small model with legendary performance, but it&amp;rsquo;s a bit of a donkey. It stops suddenly in the middle of a toolchain call &amp;amp; requires a good prodding to push on. GLM 5.1 is an excellent coding model, but a plodder.&lt;/p&gt;
&lt;p&gt;Loops, the critical problem-definition exercise of this era, are hard to design. Systems design is an entire discipline (see Donella Meadows&amp;rsquo; excellent work on it&lt;sup id=&#34;fnref:5&#34;&gt;&lt;a href=&#34;#fn:5&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;5&lt;/a&gt;&lt;/sup&gt;). What is the best way to define a loop so an agentic system improves? This field is novel &amp;amp; challenging because the models &amp;amp; infrastructure move quickly.&lt;/p&gt;
&lt;p&gt;Evaluating the performance of model + loop is ongoing labor. Most companies won&amp;rsquo;t want to staff a team for each piece of workflow software they run. AI systems are complex, finicky engines.&lt;/p&gt;
&lt;p&gt;The nuances of tuning the carburetors &amp;amp; the timing belts of these complex beasts are tasks better assigned to a few vendors who can deliver maximum intelligence per dollar&lt;sup id=&#34;fnref:6&#34;&gt;&lt;a href=&#34;#fn:6&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;6&lt;/a&gt;&lt;/sup&gt; &amp;amp; amortize the costs across a broader population.&lt;/p&gt;
&lt;p&gt;The companies that master these three disciplines will own the golden age.&lt;/p&gt;
&lt;hr&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://fortune.com/2026/06/13/anthropic-disables-fable-mythos-export-controls-national-security-threat/&#34;&gt;Anthropic Pulls Fable 5 After U.S. Government Directive&lt;/a&gt; — Fortune, June 13, 2026.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://x.com/satyanadella/status/2066182223213293753&#34;&gt;A Frontier Without an Ecosystem Is Not Stable&lt;/a&gt; — Satya Nadella, June 14, 2026.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://tomtunguz.com/harnessing-ai/&#34;&gt;Harnessing AI&lt;/a&gt; — tomtunguz.com.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://www.salesforce.com/news/press-releases/2026/06/15/salesforce-signs-definitive-agreement-to-acquire-fin/?bc=HL&#34;&gt;Salesforce Signs Definitive Agreement to Acquire Fin&lt;/a&gt; — Salesforce, June 15, 2026.&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:5&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://tomtunguz.com/10-best-books-of-2025/&#34;&gt;10 Best Books of 2025&lt;/a&gt; — Donella Meadows&amp;rsquo; Thinking in Systems.&amp;#160;&lt;a href=&#34;#fnref:5&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:6&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://tomtunguz.com/tokens-per-result/&#34;&gt;Tokens Per Result&lt;/a&gt; — tomtunguz.com.&amp;#160;&lt;a href=&#34;#fnref:6&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>A CEO&#39;s Cost of Capital Advantage</title>
      
      <link>https://www.tomtunguz.com/personal-cost-of-capital/</link>
      <pubDate>Fri, 12 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/personal-cost-of-capital/</guid>
      
      <description>&lt;p&gt;SpaceX IPOs today. One hallmark of the largest IPO in history : Elon Musk&amp;rsquo;s astoundingly low cost of capital. Despite raising 25x more than the typical founder, Musk retained ownership in the top decile.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:304px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_304,c_fill,g_auto,q_auto,f_auto/y4yeymzs0wrauqjiwoho&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/y4yeymzs0wrauqjiwoho&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_850,c_fill,g_auto,q_auto,f_auto/y4yeymzs0wrauqjiwoho&#34;
    alt=&#34;Musk has raised 25x more than most &amp;amp; kept top decile ownership&#34;
    width=&#34;756&#34;
    height=&#34;425&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;Some founders raise $2m for an idea. Others raise $15m. Yet others raise hundreds of millions.&lt;/p&gt;
&lt;p&gt;Inverting&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;, we can say some people have a high cost of capital &amp;amp; others a low cost of capital.&lt;/p&gt;
&lt;p&gt;At inception, cost of capital is purely personal. Founders &amp;amp; an idea. No business exists yet to evaluate. Over time, the combination of the team &amp;amp; the business&amp;rsquo;s performance dictates cost of capital.&lt;/p&gt;
&lt;p&gt;Early wins lower the cost of the next raise. Cheaper capital funds bigger bets. Bigger bets produce bigger wins. Musk&amp;rsquo;s trajectory from Zip2 to PayPal to Tesla to SpaceX is the flywheel in motion.&lt;/p&gt;
&lt;p&gt;The flywheel attracts capital from everywhere. Tesla&amp;rsquo;s retail ownership is 7x higher than the S&amp;amp;P 500 average&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;. SpaceX allocated a large portion of the offering to retail as well.&lt;/p&gt;
&lt;p&gt;Musk raised more capital than nearly any founder in history &amp;amp; retained more ownership than most. His personal cost of capital made that possible.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;&amp;ldquo;Invert, always invert&amp;rdquo; is a mental model popularized by Charlie Munger, originating from 19th-century mathematician Carl Gustav Jacob Jacobi.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://www.webull.com/news/13249038316839936&#34;&gt;https://www.webull.com/news/13249038316839936&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>The AI Glass Ceiling</title>
      
      <link>https://www.tomtunguz.com/upper-bound-corporate-ai/</link>
      <pubDate>Wed, 10 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/upper-bound-corporate-ai/</guid>
      
      <description>&lt;p&gt;We&amp;rsquo;ve reached the upper bound of AI.&lt;/p&gt;
&lt;p&gt;Not in the sense that performance won&amp;rsquo;t improve. On the contrary, AI will improve AI.&lt;/p&gt;
&lt;p&gt;But Anthropic&amp;rsquo;s Fable release has imposed a glass ceiling. How do you release the most powerful model in the world to everyone without destroying kingdoms?&lt;/p&gt;
&lt;p&gt;Strong guardrails. It&amp;rsquo;s easy to trigger a gentle reminder of verboten topics : ask for a description of a plant cell or a detailed description of a modern large language model, or ask a question about software security.&lt;/p&gt;
&lt;p&gt;But if we remain within the playground, Fable is the most powerful AI yet. Stripe compressed months of engineering into days : a 50-million-line Ruby codebase migrated in a single day, a refactor across tens of thousands of lines completed in 45 minutes.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;In my testing, Fable doubled inference performance on local models, besting the efforts of other state-of-the-art systems. Adding 10-15 percentage points on key benchmarks compared to typical improvements of 2 percentage points, Fable represents a genuine leap.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;We&amp;rsquo;re still understanding the best ways of using AI : techniques change every day. RAG, Plan/Act, Ralph Wiggum loops, /goals, structured prompting, MCP. How many fashions have we seen when the seasons of AI trends are measured in days?&lt;/p&gt;
&lt;p&gt;Systems this powerful need to be phased in to allow the backbones of technology, banking, &amp;amp; energy to harden themselves in anticipation of increasingly powerful attacks.&lt;/p&gt;
&lt;p&gt;The glass ceiling exists. It was inevitable for stability. It will rise over time, but for now there&amp;rsquo;s a vast area underneath its curve.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://www.anthropic.com/news/claude-fable-5-mythos-5&#34;&gt;Stripe&amp;rsquo;s Ruby migration using Claude&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://anthropic.com/claude-fable-5-mythos-5-system-card&#34;&gt;Claude Fable 5 &amp;amp; Claude Mythos 5 System Card — Anthropic Research&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>The Substitution Wave in AI</title>
      
      <link>https://www.tomtunguz.com/inflation-deflation-ai/</link>
      <pubDate>Sun, 07 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/inflation-deflation-ai/</guid>
      
      <description>&lt;p&gt;Three forces are reshaping the AI cost structure :&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Foundation labs are moving up the stack into applications,&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;li&gt;Frontier model prices keep rising for the smartest models,&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;li&gt;Open-source models have crossed the good enough threshold for most use cases.&lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt; &lt;sup id=&#34;fnref:5&#34;&gt;&lt;a href=&#34;#fn:5&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The natural response from AI buyers is substitution.&lt;/p&gt;
&lt;p&gt;Coinbase&lt;sup id=&#34;fnref:6&#34;&gt;&lt;a href=&#34;#fn:6&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;6&lt;/a&gt;&lt;/sup&gt; :&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;At Coinbase we&amp;rsquo;re working hot on routing prompts to cheaper models where appropriate, &amp;amp; in some cases have been able to keep costs roughly flat, while token usage continues to grow exponentially.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Lindy&lt;sup id=&#34;fnref:7&#34;&gt;&lt;a href=&#34;#fn:7&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;7&lt;/a&gt;&lt;/sup&gt; :&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Pulled the trigger today &amp;amp; switched 100% of Lindy traffic to DeepSeek v4, churning from Anthropic models. Saves us millions of $ &amp;amp; we&amp;rsquo;re actually seeing an &lt;em&gt;increase&lt;/em&gt; in performance on many core use cases. Transformative for the business.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Harvey&lt;sup id=&#34;fnref:8&#34;&gt;&lt;a href=&#34;#fn:8&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;8&lt;/a&gt;&lt;/sup&gt; :&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;On a 100-task slice of our Legal Agent Benchmark (LAB), SFT moved Kimi 2.6&amp;rsquo;s all-pass rate from 11% to 15%, beating Opus&amp;rsquo; 14%. But the cost gap was even more striking : $84 vs $954 across the same 100 tasks, or ~11x cheaper.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Cursor went further. They post-trained Kimi K2.5 into their own production model, Composer.&lt;sup id=&#34;fnref:9&#34;&gt;&lt;a href=&#34;#fn:9&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Composer 2.5 is exceptionally intelligent &amp;amp; up to 10x more efficient than similarly capable models.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Coinbase&amp;rsquo;s quote shows where the savings go : costs flat, tokens exponential. Buyers don&amp;rsquo;t pocket the discount — they spend it on more intelligence.&lt;/p&gt;
&lt;p&gt;Closed models are getting more expensive at the frontier; open models are getting cheaper at parity. The choice is which slope you want under your unit economics.&lt;/p&gt;
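&lt;p&gt;Harvey&amp;rsquo;s figures can be restated as dollars per passing task. A minimal sketch, assuming the all-pass rate on a 100-task slice maps directly to a count of passed tasks (15 &amp;amp; 14) ; the function name is illustrative &amp;amp; not from Harvey :&lt;/p&gt;

```python
def cost_per_pass(total_cost, tasks, pass_rate):
    """Dollars spent per task that passed (assumes pass_rate is a fraction of tasks)."""
    passed = round(tasks * pass_rate)
    return total_cost / passed

# Costs and pass rates as quoted by Harvey for the same 100 tasks.
kimi_sft = cost_per_pass(84, 100, 0.15)
opus = cost_per_pass(954, 100, 0.14)

print(f"Kimi 2.6 + SFT: ${kimi_sft:.2f} per passing task")
print(f"Opus: ${opus:.2f} per passing task")
print(f"Raw cost ratio: {954 / 84:.1f}x")
print(f"Per-pass ratio: {opus / kimi_sft:.1f}x")
```

&lt;p&gt;Under that assumption the gap widens slightly once quality is counted : roughly 11x on raw cost, roughly 12x per passing task.&lt;/p&gt;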
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:412px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_412,c_fill,g_auto,q_auto,f_auto/zoiothfpwaee5fqfrjq5&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/zoiothfpwaee5fqfrjq5&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_1152,c_fill,g_auto,q_auto,f_auto/zoiothfpwaee5fqfrjq5&#34;
    alt=&#34;Ramp cost curve framing for AI buyers and app purveyors&#34;
    width=&#34;756&#34;
    height=&#34;576&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://x.com/Law360/status/2062263047578673481&#34;&gt;https://x.com/Law360/status/2062263047578673481&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://theoryvc.com/blog-posts/are-foundation-models-and-application-companies-friends-or-foes&#34;&gt;https://theoryvc.com/blog-posts/are-foundation-models-and-application-companies-friends-or-foes&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://tomtunguz.com/ai-model-inflation/&#34;&gt;https://tomtunguz.com/ai-model-inflation/&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://tomtunguz.com/using-local-ai-to-work-faster/&#34;&gt;https://tomtunguz.com/using-local-ai-to-work-faster/&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:5&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://tomtunguz.com/the-thriving-ecosystem-of-open-models/&#34;&gt;https://tomtunguz.com/the-thriving-ecosystem-of-open-models/&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:5&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:6&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://x.com/brian_armstrong&#34;&gt;https://x.com/brian_armstrong&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:6&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:7&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://x.com/Altimor/status/2062389885437366342&#34;&gt;https://x.com/Altimor/status/2062389885437366342&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:7&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:8&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://x.com/harvey/status/2062218656420167785&#34;&gt;https://x.com/harvey/status/2062218656420167785&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:8&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:9&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://x.com/cursor_ai/status/2056415414977187904&#34;&gt;https://x.com/cursor_ai/status/2056415414977187904&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:9&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>The Minimill of AI</title>
      
      <link>https://www.tomtunguz.com/using-local-ai-to-work-faster/</link>
      <pubDate>Fri, 05 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/using-local-ai-to-work-faster/</guid>
      
      <description>&lt;p&gt;A laptop on my desk now handles 78% of my AI work, with the rest sent to the cloud. The shift came out of my &lt;a href=&#34;https://tomtunguz.com/the-pi-agent-skill-distillation/&#34;&gt;skill distillation work&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s how it works.&lt;/p&gt;
&lt;p&gt;I create tasks in Asana. An agent sees the task (scheduling, email triage, research, a CRM update) &amp;amp; classifies it as easy or hard. If it&amp;rsquo;s straightforward, a local model on my Mac handles it in seconds. If it&amp;rsquo;s complex, the same model routes it to a cloud model.&lt;/p&gt;
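&lt;p&gt;The routing step can be sketched in a few lines. This is a rough illustration only : the real classifier is a local model, while the stand-in below uses a hypothetical lookup of task kinds. The names Task, EASY_KINDS, classify &amp;amp; route are assumptions, as is the choice of which kinds count as easy.&lt;/p&gt;

```python
from dataclasses import dataclass

# Hypothetical list of task kinds treated as easy; the actual rules are not published.
EASY_KINDS = {"scheduling", "email triage", "crm update"}

@dataclass
class Task:
    kind: str
    description: str

def classify(task):
    """Stand-in for the local model's easy/hard call."""
    return "easy" if task.kind in EASY_KINDS else "hard"

def route(task):
    """Send easy tasks to the local lane, everything else to the cloud lane."""
    return "local" if classify(task) == "easy" else "cloud"

tasks = [
    Task("scheduling", "Find 30 minutes with the team next week"),
    Task("research", "Summarize three competitors' pricing pages"),
    Task("crm update", "Log yesterday's call notes"),
]
lanes = [route(t) for t in tasks]
local_share = lanes.count("local") / len(lanes)
print(lanes, f"{local_share:.0%} local")
```

&lt;p&gt;Swapping the lookup for a model call changes only classify ; the two-lane routing stays the same.&lt;/p&gt;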
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:653px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_653,c_fill,g_auto,q_auto,f_auto/px49a9ub9nd1cyoaltbt&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/px49a9ub9nd1cyoaltbt&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_1828,c_fill,g_auto,q_auto,f_auto/px49a9ub9nd1cyoaltbt&#34;
    alt=&#34;Local router replacing a single queue with a two-lane scheduler&#34;
    width=&#34;756&#34;
    height=&#34;914&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;Across the last seven days, daily peaks reached 88%.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:312px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_312,c_fill,g_auto,q_auto,f_auto/sjfy7dxj6fmpsir8cjvw&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/sjfy7dxj6fmpsir8cjvw&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_872,c_fill,g_auto,q_auto,f_auto/sjfy7dxj6fmpsir8cjvw&#34;
    alt=&#34;Daily share of model route decisions handled locally, May 29 to June 4&#34;
    width=&#34;756&#34;
    height=&#34;436&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;As the workload grew, the two-lane design paid off. Throughput jumped about 25%, average task duration fell from 47 seconds to 19, &amp;amp; queue age dropped from 73 seconds to four. Nothing about the work changed. Small, fast tasks simply stopped waiting behind big, slow ones.&lt;/p&gt;
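&lt;p&gt;A toy model shows why separating the lanes cuts waiting time. The durations &amp;amp; labels below are invented for illustration, not measured data, &amp;amp; the two-lane case assumes one worker per lane, so part of the gain comes from the extra worker.&lt;/p&gt;

```python
def total_wait(durations):
    """Sum of start delays when one worker runs tasks in FIFO order."""
    waited, clock = 0, 0
    for seconds in durations:
        waited += clock
        clock += seconds
    return waited

# Toy workload, all queued at once; labels and durations are illustrative.
tasks = [("hard", 60), ("easy", 5), ("easy", 5), ("hard", 60), ("easy", 5), ("easy", 5)]

# One lane: every task waits behind everything queued before it.
one_lane = total_wait([s for _, s in tasks]) / len(tasks)

# Two lanes: easy tasks only wait behind other easy tasks.
easy = [s for label, s in tasks if label == "easy"]
hard = [s for label, s in tasks if label == "hard"]
two_lane = (total_wait(easy) + total_wait(hard)) / len(tasks)

print(f"one lane: {one_lane:.1f}s average wait")
print(f"two lanes: {two_lane:.1f}s average wait")
```

&lt;p&gt;In the single lane, each five-second task inherits the full delay of the sixty-second task ahead of it ; in its own lane it waits only behind other short tasks.&lt;/p&gt;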
&lt;p&gt;The task factory that uses &lt;a href=&#34;https://tomtunguz.com/the-pi-agent-skill-distillation/&#34;&gt;distilled skills&lt;/a&gt; is now humming along with 25% more throughput, queue age down 94%, &amp;amp; a much more responsive system. For now, the cloud handles the hard fifth. The Mac handles the rest.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s the minimill of agentic work. Nucor&amp;rsquo;s minimills&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; started small, capital-light, &amp;amp; close to demand; within a generation they outflanked the integrated steel giants.&lt;/p&gt;
&lt;p&gt;Every laptop, phone, &amp;amp; edge device with enough memory to host a distilled model becomes its own minimill : routing locally, paying cloud rates only for the hard fifth. Tens of millions of these will proliferate inside companies in the next few years, each one quietly absorbing much of the work that today shows up on a hyperscaler invoice.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Nucor began in the 1960s by melting scrap steel in electric-arc furnaces rather than smelting iron ore in giant integrated blast-furnace mills. Each minimill was a fraction of the size &amp;amp; cost of an integrated plant, sited near regional demand, &amp;amp; ran on flexible, lower-cost labor. The integrated mills dismissed minimills as fit only for low-grade products like rebar. Over the next thirty years Nucor moved up-market into sheet steel &amp;amp; structural beams, &amp;amp; by 2014 had become the largest steel producer in the United States, while most of the integrated giants (Bethlehem, LTV, National) had gone bankrupt. Clayton Christensen used the story as the canonical example of disruptive innovation in &lt;em&gt;The Innovator&amp;rsquo;s Dilemma&lt;/em&gt;.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Intelligence Per Dollar</title>
      
      <link>https://www.tomtunguz.com/tokens-per-result/</link>
      <pubDate>Wed, 03 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/tokens-per-result/</guid>
      
      <description>&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:288px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_288,c_fill,g_auto,q_auto,f_auto/fczixgwvrhqhqt5uxfto&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/fczixgwvrhqhqt5uxfto&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_806,c_fill,g_auto,q_auto,f_auto/fczixgwvrhqhqt5uxfto&#34;
    alt=&#34;Screenshot 2026-06-02 at 9.22.43 PM&#34;
    width=&#34;756&#34;
    height=&#34;403&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;Yesterday Microsoft added a new metric to a model release card, one that will likely become a standard.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;Average token usage.&lt;/p&gt;
&lt;p&gt;In the first row, the Microsoft model hits 71.6 on SWE-Bench Verified using about a third of the tokens Claude Haiku 4.5 burns.&lt;/p&gt;
&lt;p&gt;Benchmarks are now measured on two dimensions : overall performance &amp;amp; the cost to achieve that intelligence.&lt;/p&gt;
&lt;p&gt;This is yet another sign that the era of subsidies&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;, tokenmaxxing&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;, &amp;amp; all-out performance for many use cases is over.&lt;/p&gt;
&lt;p&gt;Even the most valuable companies in the world cannot afford state-of-the-art intelligence for every conceivable use case.&lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt; Uber capped employee AI spending after blowing through its budget in four months.&lt;sup id=&#34;fnref:5&#34;&gt;&lt;a href=&#34;#fn:5&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;5&lt;/a&gt;&lt;/sup&gt; Salesforce is spending $300M on Anthropic tokens &amp;amp; has frozen engineering hires.&lt;sup id=&#34;fnref:6&#34;&gt;&lt;a href=&#34;#fn:6&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;This new dual benchmark answers the buyer&amp;rsquo;s only question : what is my intelligence per dollar?&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:270px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_270,c_fill,g_auto,q_auto,f_auto/hnlfpw6c8qaurohqluul&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/hnlfpw6c8qaurohqluul&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_756,c_fill,g_auto,q_auto,f_auto/hnlfpw6c8qaurohqluul&#34;
    alt=&#34;Screenshot 2026-06-03 at 5.49.00 AM&#34;
    width=&#34;756&#34;
    height=&#34;378&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;Artificial Analysis already benchmarks this.&lt;sup id=&#34;fnref:7&#34;&gt;&lt;a href=&#34;#fn:7&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;7&lt;/a&gt;&lt;/sup&gt; GPT 5.5 &amp;amp; Claude Opus 4.8 land within a point of each other on the Intelligence Index, around 60. Running the index costs $3,357 on GPT 5.5 &amp;amp; $4,685 on Opus 4.8. Same answer, 40% more expensive.&lt;/p&gt;
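&lt;p&gt;As a rough check on the arithmetic above, here is a minimal Python sketch. The index score &amp;amp; the two run costs are the figures quoted in this post; the function names are illustrative, &amp;amp; treating both models as scoring exactly 60 is a simplifying assumption.&lt;/p&gt;

```python
# Figures quoted in the post (Artificial Analysis Intelligence Index).
# Assumption: both models are treated as scoring exactly 60.
INDEX_SCORE = 60
COST_TO_RUN = {"GPT 5.5": 3357, "Claude Opus 4.8": 4685}  # US dollars


def points_per_thousand_dollars(score, cost):
    """Index points bought per 1,000 dollars spent running the benchmark."""
    return score / cost * 1000


def premium_percent(cheaper, pricier):
    """How much more the pricier run costs, as a percentage of the cheaper run."""
    return (pricier / cheaper - 1) * 100


gpt_cost = COST_TO_RUN["GPT 5.5"]
opus_cost = COST_TO_RUN["Claude Opus 4.8"]

for name, cost in COST_TO_RUN.items():
    print(name, round(points_per_thousand_dollars(INDEX_SCORE, cost), 1))

print(round(premium_percent(gpt_cost, opus_cost), 1))
```

&lt;p&gt;On these figures GPT 5.5 buys roughly 17.9 index points per $1,000 against 12.8 for Opus 4.8, &amp;amp; the Opus run costs about 39.6% more, which rounds to the 40% cited.&lt;/p&gt;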
&lt;p&gt;Model companies must now compete on both dimensions. The application layer will compete one level up, on dollars per outcome : what a closed ticket, a shipped PR, or a resolved support case actually costs.&lt;/p&gt;
&lt;p&gt;Every layer in the stack now has to price the same way the customer thinks : per result, not per token.&lt;/p&gt;
&lt;hr&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://microsoft.ai/news/introducingmai-code-1-flash/&#34;&gt;Introducing MAI-Code-1-Flash&lt;/a&gt; — Microsoft announces a new coding model with average token usage on the release card.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://tomtunguz.com/ai-model-inflation/&#34;&gt;The Unsustainable Subsidy&lt;/a&gt; — The era of AI subsidies is ending.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://tomtunguz.com/tokenmaxxing/&#34;&gt;Tokenmaxxing&lt;/a&gt; — Models that game benchmarks with extra tokens are losing their edge.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://www.windowscentral.com/microsoft/microsoft-cancels-claude-code-licenses-shifting-developers-to-github-copilot-cli-a-move-likely-driven-by-financial-motives&#34;&gt;Microsoft cancels Claude Code licenses, shifting developers to GitHub Copilot CLI&lt;/a&gt; — Microsoft cancelled Claude Code licenses across its Experiences and Devices division (Windows, Microsoft 365, Outlook, Teams, Surface) after engineering usage outran budgets.&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:5&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://techcrunch.com/2026/06/02/uber-caps-employee-ai-spending-after-blowing-through-budget-in-four-months/&#34;&gt;Uber caps employee AI spending after blowing through budget in 4 months&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:5&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:6&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://enterprisedna.co/resources/news/salesforce-300m-anthropic-tokens-engineer-hiring-freeze-2026/&#34;&gt;Salesforce Spends $300M on AI, Freezes Engineering Hires&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:6&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:7&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://artificialanalysis.ai/&#34;&gt;AI Model &amp;amp; API Providers Analysis&lt;/a&gt; — Independent analysis of AI model costs.&amp;#160;&lt;a href=&#34;#fnref:7&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>The Thriving Ecosystem of Open Models</title>
      
      <link>https://www.tomtunguz.com/the-thriving-ecosystem-of-open-models/</link>
      <pubDate>Tue, 02 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/the-thriving-ecosystem-of-open-models/</guid>
      
      <description>&lt;blockquote&gt;
&lt;p&gt;Competition is a discovery procedure.
— Friedrich Hayek&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And developers are discovering the value of open models.&lt;/p&gt;
&lt;p&gt;OpenRouter offers a useful view into the model market.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; It is not the whole AI economy. But it is close to the API frontier, where developers can switch models quickly, compare price-performance daily, &amp;amp; route each request to the best available option.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:304px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_304,c_fill,g_auto,q_auto,f_auto/pgrtbnrcmuh2kafp1oi3&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/pgrtbnrcmuh2kafp1oi3&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_850,c_fill,g_auto,q_auto,f_auto/pgrtbnrcmuh2kafp1oi3&#34;
    alt=&#34;Stacked chart of open versus closed model token share on OpenRouter&#34;
    width=&#34;756&#34;
    height=&#34;425&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;Since 2025, open models have grown sharply on OpenRouter. In the latest model-level snapshot, open-weight models generated 69.1% of named open-versus-closed token volume. Closed models produced 30.9%.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:304px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_304,c_fill,g_auto,q_auto,f_auto/e9ygofyo8luvha2peyf6&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/e9ygofyo8luvha2peyf6&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_850,c_fill,g_auto,q_auto,f_auto/e9ygofyo8luvha2peyf6&#34;
    alt=&#34;Open-model demand jumps with launches&#34;
    width=&#34;756&#34;
    height=&#34;425&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;New models attract developer attention &amp;amp; large-scale testing, after which token use surges. Each cluster of releases sustains a new plateau of token volume.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:304px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_304,c_fill,g_auto,q_auto,f_auto/rtbwjm5pleqmxdfzvzfy&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/rtbwjm5pleqmxdfzvzfy&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_850,c_fill,g_auto,q_auto,f_auto/rtbwjm5pleqmxdfzvzfy&#34;
    alt=&#34;Open-model leadership keeps changing hands&#34;
    width=&#34;756&#34;
    height=&#34;425&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;Just as in the closed-model ecosystem, the competition among open models means rapid innovation &amp;amp; leaderboard changes.&lt;/p&gt;
&lt;p&gt;DeepSeek&amp;rsquo;s early lead gave way to MiniMax &amp;amp; Kimi models in late 2025 &amp;amp; early 2026. Later, launches from MiMo, Qwen (Alibaba&amp;rsquo;s open-weight model family), Hy3 (Tencent&amp;rsquo;s open-weight model release), &amp;amp; DeepSeek reshuffled share again.&lt;/p&gt;
&lt;p&gt;Arcee, a US lab, has made a strong appearance recently.&lt;/p&gt;
&lt;p&gt;Open models still represent a fraction of overall inference, but the thriving competition, increasing usage, &amp;amp; surge of experimentation suggest developers are increasingly willing to route production traffic to them.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Source data: &lt;a href=&#34;https://openrouter.ai/rankings&#34;&gt;OpenRouter rankings &amp;amp; usage data&lt;/a&gt;, analyzed from weekly token-volume snapshots in the OpenRouter analysis dataset.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>The AI Skepticism Map</title>
      
      <link>https://www.tomtunguz.com/ai-shorts/</link>
      <pubDate>Mon, 01 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/ai-shorts/</guid>
      
      <description>&lt;p&gt;With Michael Burry &lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; &amp;amp; Leopold Aschenbrenner &lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; placing heavy short trades on AI, questions about GPU depreciation, &amp;amp; the Saaspocalypse, how negative is the financial market on AI?&lt;/p&gt;
&lt;p&gt;We can look at the percentage of shares sold short, a bet the stock will decline.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:304px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_304,c_fill,g_auto,q_auto,f_auto/efhbgcdt1zv5fhknjjix&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/efhbgcdt1zv5fhknjjix&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_850,c_fill,g_auto,q_auto,f_auto/efhbgcdt1zv5fhknjjix&#34;
    alt=&#34;AI shorts have edged higher&#34;
    width=&#34;756&#34;
    height=&#34;425&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;Across software, semiconductor, neocloud, data center, &amp;amp; hyperscaler stocks, the median short interest (short shares / total shares) has increased by about 24% in the last quarter.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:338px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_338,c_fill,g_auto,q_auto,f_auto/qfcljxtzpzm57b8twiyc&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/qfcljxtzpzm57b8twiyc&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_946,c_fill,g_auto,q_auto,f_auto/qfcljxtzpzm57b8twiyc&#34;
    alt=&#34;AI cloud and neoclouds have the gloomiest sky&#34;
    width=&#34;756&#34;
    height=&#34;473&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;One segment stands out for gloomy skies in the cloud: the GPU data center businesses, whose shorted shares have grown 60% in the last year &lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;. AI cloud and neocloud companies have the highest current median short interest at 16.8% of float.&lt;/p&gt;
&lt;p&gt;The negative sentiment for SaaS &amp;amp; Dev Tools is a more abrupt &amp;amp; recent phenomenon. Developer tools and infrastructure software follow at 9.5%. Enterprise SaaS and AI apps sit at 8.9%.&lt;/p&gt;
&lt;p&gt;Hyperscalers are at the other end of the spectrum. Their median short interest is 1.1%. NVIDIA, the defining AI infrastructure stock, is also lightly shorted: 1.2%.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:338px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_338,c_fill,g_auto,q_auto,f_auto/q9kptfdiq5sojoi5h2o4&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/q9kptfdiq5sojoi5h2o4&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_946,c_fill,g_auto,q_auto,f_auto/q9kptfdiq5sojoi5h2o4&#34;
    alt=&#34;Enterprise AI apps saw the sharpest rise&#34;
    width=&#34;756&#34;
    height=&#34;473&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;Semiconductor stocks saw a decrease in short-selling. With memory makers like Micron up 742% this year &lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt;, &amp;amp; many ecosystem CEOs pointing to memory &amp;amp; storage as the limiting factor, the newest trillion-dollar companies are all memory makers.&lt;/p&gt;
&lt;p&gt;The stocks with the most actively bearish bettors? Most of these are small or mid-cap companies. The updated chart below adds market capitalization to each company label. The largest AI winners are mostly absent.&lt;/p&gt;
&lt;p&gt;SoundHound AI is 36.3% short. C3.ai is 32.2%. BigBear.ai is 29.4%. Applied Digital is 28.0%. UiPath is 22.0%. TeraWulf is 21.3%.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:338px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_338,c_fill,g_auto,q_auto,f_auto/cx2motwu7izvlnzsola0&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/cx2motwu7izvlnzsola0&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_946,c_fill,g_auto,q_auto,f_auto/cx2motwu7izvlnzsola0&#34;
    alt=&#34;Small AI names dominate the short book&#34;
    width=&#34;756&#34;
    height=&#34;473&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;This is the market&amp;rsquo;s current AI skepticism map.&lt;/p&gt;
&lt;p&gt;The skepticism is concentrated in companies whose AI exposure still depends on future capital access, future demand, or future operating leverage.&lt;/p&gt;
&lt;p&gt;That distinction matters. If short interest were rising uniformly across AI semiconductors, hyperscalers, and software, the message would be broad fatigue with the AI trade. Instead, the data suggest a more specific view: memory has become critical &amp;amp; in short supply; software &amp;amp; devtools businesses need to prove their worth post-AI; &amp;amp; businesses reselling GPUs have more than their fair share of doubters about current prices versus long-term value.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://x.com/michaeljburry/status/2060897772782375243&#34;&gt;https://x.com/michaeljburry/status/2060897772782375243&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://finance.yahoo.com/markets/stocks/articles/nvidia-corporation-nvda-leopold-aschenbrenner-212121469.html&#34;&gt;https://finance.yahoo.com/markets/stocks/articles/nvidia-corporation-nvda-leopold-aschenbrenner-212121469.html&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://tomtunguz.com/the-other-leverage-in-software-and-ai/&#34;&gt;https://tomtunguz.com/the-other-leverage-in-software-and-ai/&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://www.google.com/finance/beta/quote/MU:NASDAQ?window=YTD&#34;&gt;https://www.google.com/finance/beta/quote/MU:NASDAQ?window=YTD&lt;/a&gt;&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Skill Distillation</title>
      
      <link>https://www.tomtunguz.com/the-pi-agent-skill-distillation/</link>
      <pubDate>Fri, 29 May 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/the-pi-agent-skill-distillation/</guid>
      
      <description>&lt;p&gt;I&amp;rsquo;ve been using state-of-the-art models to teach small models running on my computer how I work.&lt;/p&gt;
&lt;p&gt;My personal agent, based on &lt;a href=&#34;https://github.com/earendil-works/pi&#34;&gt;Pi&lt;/a&gt;, runs my inbox, my deal pipeline, my blog publishing, my calendar, &amp;amp; my research. It looks less like a chatbot &amp;amp; more like a small operating system.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:315px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_315,c_fill,g_auto,q_auto,f_auto/ziwcr39iporaaarazp5t&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/ziwcr39iporaaarazp5t&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_882,c_fill,g_auto,q_auto,f_auto/ziwcr39iporaaarazp5t&#34;
    alt=&#34;The Pi Agent architecture : QMD procedural memory, SKILL.md playbooks, &amp;amp; the agent loop with tools &amp;amp; MCP&#34;
    width=&#34;756&#34;
    height=&#34;441&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;The first layer is &lt;strong&gt;&lt;a href=&#34;https://github.com/tobi/qmd&#34;&gt;QMD&lt;/a&gt;&lt;/strong&gt;, a local markdown knowledge base of about eighty workflow files in &lt;code&gt;~/memories&lt;/code&gt;. Before answering any procedural question, the agent searches QMD for the right playbook.&lt;/p&gt;
&lt;p&gt;The second layer is &lt;strong&gt;Skills&lt;/strong&gt;, atomic &lt;code&gt;SKILL.md&lt;/code&gt; files that describe one job each. The skills are written by a frontier model. So are the evaluations that grade them. The same system writes, tests, and rewrites each skill until accuracy converges. It also checks recall against QMD, so the right keywords always surface the right skill.&lt;/p&gt;
&lt;p&gt;The third layer is the &lt;strong&gt;Agent Loop&lt;/strong&gt;, a model running Plan → Tool Call → Observe → Refine, calling out to seventeen Rust APIs &amp;amp; a handful of MCP integrations.&lt;/p&gt;
&lt;!--[if mso | IE]&gt;
&lt;v:rect xmlns:v=&#34;urn:schemas-microsoft-com:vml&#34; fill=&#34;true&#34; stroke=&#34;false&#34; style=&#34;width:540px;height:298px;&#34;&gt;
  &lt;v:fill type=&#34;tile&#34; src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_540,h_298,c_fill,g_auto,q_auto,f_auto/lej9l80oienqfse5y0l6&#34; /&gt;
  &lt;v:textbox style=&#34;mso-fit-shape-to-text:true&#34; inset=&#34;0,0,0,0&#34;&gt;
&lt;![endif]--&gt;
&lt;div style=&#34;margin:0 auto;max-width:756px;&#34;&gt;&lt;a href=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/q_auto,f_auto/lej9l80oienqfse5y0l6&#34; target=&#34;_blank&#34; style=&#34;display:block;text-decoration:none;&#34;&gt;&lt;img
    src=&#34;https://res.cloudinary.com/dzawgnnlr/image/upload/w_1512,h_834,c_fill,g_auto,q_auto,f_auto/lej9l80oienqfse5y0l6&#34;
    alt=&#34;Skill distillation : a frontier model authors SKILL.md files that smaller local models execute&#34;
    width=&#34;756&#34;
    height=&#34;417&#34;
    style=&#34;display:block;width:100%;max-width:756px;height:auto;border:0;cursor:pointer;&#34;
    loading=&#34;lazy&#34;
  /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;!--[if mso | IE]&gt;
  &lt;/v:textbox&gt;
&lt;/v:rect&gt;
&lt;![endif]--&gt;
&lt;p&gt;One of the techniques I&amp;rsquo;ve started to use is &lt;strong&gt;skill distillation&lt;/strong&gt;. A frontier model (Opus 4.7, GPT-5.1, or Gemini 3 Pro) authors &amp;amp; refines the skill files. A smaller model (Qwen 35B or Gemma 26B) running locally executes them. The teacher transfers procedural knowledge to the student through markdown. The skill is inspectable, versionable, &amp;amp; hot-swappable.&lt;/p&gt;
&lt;p&gt;This is fundamentally different from classical knowledge distillation, which compresses a big model&amp;rsquo;s soft probability outputs into a smaller model&amp;rsquo;s weights. It&amp;rsquo;s different from instruction tuning, which bakes behavior into weights through prompt-response pairs. It&amp;rsquo;s different from RAG, which retrieves facts.&lt;/p&gt;
&lt;p&gt;Skill distillation retrieves &lt;em&gt;procedures&lt;/em&gt;. The smaller model doesn&amp;rsquo;t have to know how to evaluate a company. It just has to know how to follow the steps.&lt;/p&gt;
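&lt;p&gt;To make the retrieval step concrete, here is a minimal Python sketch. It is not how QMD or Pi work internally : the skill names, the skill text, &amp;amp; the word-overlap scoring are all assumptions chosen for illustration, standing in for whatever search the real system uses.&lt;/p&gt;

```python
# Hypothetical skill library: in the post these are SKILL.md files on disk.
# The names and contents here are invented for illustration.
SKILLS = {
    "triage-inbox": "Skill: triage inbox\nSteps:\n1. List unread email\n2. Label each by sender priority",
    "publish-post": "Skill: publish blog post\nSteps:\n1. Run the spell check\n2. Build the site\n3. Deploy",
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphabetic words."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return set(cleaned.split())


def find_skill(request, skills):
    """Return the name of the skill sharing the most words with the request."""
    wanted = tokenize(request)
    return max(skills, key=lambda name: len(wanted.intersection(tokenize(skills[name]))))


def build_prompt(request, skills):
    """Put the retrieved procedure ahead of the request for the small model."""
    name = find_skill(request, skills)
    return "Follow these steps exactly.\n\n" + skills[name] + "\n\nRequest: " + request


print(find_skill("publish my new blog post", SKILLS))
```

&lt;p&gt;The point of the sketch is the division of labor : the procedure lives in a text file that can be read &amp;amp; versioned, &amp;amp; the small model only receives the steps alongside the request.&lt;/p&gt;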
&lt;p&gt;Every night a system runs through historical logs to understand what new skills should be generated, mirroring the loop that &lt;a href=&#34;https://www.youtube.com/watch?v=B246K_G7mHU&#34;&gt;Pete Koomen described at Y Combinator&lt;/a&gt; earlier this week.&lt;/p&gt;
&lt;p&gt;The frontier model becomes a teacher. The library becomes the company&amp;rsquo;s institutional knowledge. The student becomes whichever model happens to be cheapest this quarter.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Security in the Age of AI Agents: Office Hours with Jonathan Jaffe</title>
      
      <link>https://www.tomtunguz.com/jonathan-jaffe-office-hours-post-event/</link>
      <pubDate>Thu, 28 May 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.tomtunguz.com/jonathan-jaffe-office-hours-post-event/</guid>
      
      <description>&lt;p&gt;When security practitioners become engineers, the mission changes from managing people to architecting the automated policies that govern an agentic world.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.linkedin.com/in/jonathanjaffe/&#34;&gt;Jonathan Jaffe&lt;/a&gt;, CISO at Lemonade, joined me on Office Hours to discuss what this means for how we build, secure, &amp;amp; operate AI systems when both sides are automated.&lt;/p&gt;
&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
      &lt;iframe allow=&#34;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen&#34; loading=&#34;eager&#34; referrerpolicy=&#34;strict-origin-when-cross-origin&#34; src=&#34;https://www.youtube.com/embed/mmhEpifpmgg?autoplay=0&amp;amp;controls=1&amp;amp;end=0&amp;amp;loop=0&amp;amp;mute=0&amp;amp;start=0&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; title=&#34;YouTube video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;

&lt;p&gt; &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;AI is just as powerful for defenders as it is for attackers.&lt;/strong&gt; The fear narrative underestimates this fact. Defenders harden everywhere, simultaneously, because every vendor in the stack is also racing to ship.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;There are tens of thousands of attack targets out there. The chances that you&amp;rsquo;re going to be one of those is small. At the same time, all of the vendors that you use will also have access to this to improve their services.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;The window of exploitability is narrowing.&lt;/strong&gt; Yes, AI will write more vulnerable code. But AI-written code also gets reviewed, pen-tested, &amp;amp; patched faster than any human pipeline. Plus, the total number of bugs within a particular piece of software is finite. As the velocity of resolving bugs increases, software will become far more resilient.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Security teams are becoming engineering teams.&lt;/strong&gt; At Lemonade, every security person is an engineer. They built their own AI platform with agents on top of it. One agent reads threat intel. Another checks whether the vulnerable method is actually called in production code.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;Automation is the only way you can deal with the scale of what&amp;rsquo;s coming at us now.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Every agent needs an identity.&lt;/strong&gt; On a single endpoint, we could be running 200 or 10,000 agents, but each one of them needs to be numbered and then governed by policy at the point of action.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;Every agent needs to have an identity, and more than that, you need a way to control policy for all of these agents in a much more complex way than current identity and access management systems do.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Modern agentic security engineering is rapidly transforming, and we should expect to see significantly hardened systems as a result. It&amp;rsquo;s a bright future for security and security professionals.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m grateful to Jonathan for sharing his insights at Office Hours!&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>
