Pocket Power : From State of the Art to Your Phone in 23 Months

Two years ago, the idea of useful AI on your phone was fantastical. Siri couldn’t finish a sentence. Local models hallucinated nonsense.

Last week, Google released Gemma 4 E4B[1], a free model that matches GPT-4o and runs entirely on your phone.[2]

The next few weeks promise even more advanced pocket models. The market expects new releases from DeepSeek[3], Qwen[4], Kimi[5] & Minimax[6].

Frontier models don’t stay frontier for long. Within three to four months, you can run a model with similar performance on your laptop; within 23 months, you can run one on your phone.

Tokenmaxxing

Two days ago, I burnt 250 million tokens in a single day.

That’s up 20x in six weeks. This practice, called tokenmaxxing, is the deliberate maximization of token consumption. The question : how much electricity can we turn into useful work?
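To put the 20x figure in perspective, here is a rough back-of-the-envelope sketch. It uses only the two numbers above & assumes, purely for illustration, that usage compounded at a steady weekly rate; the variable names are made up for this example.

```python
# Figures from the text: 250 million tokens in one day, up 20x in six weeks.
tokens_today = 250_000_000
growth_multiple = 20
weeks = 6

# Implied daily consumption six weeks earlier.
tokens_six_weeks_ago = tokens_today / growth_multiple

# Assumption (not from the text): growth compounded at a constant weekly rate.
weekly_factor = growth_multiple ** (1 / weeks)

print(f"Six weeks ago : {tokens_six_weeks_ago:,.0f} tokens per day")
print(f"Implied weekly growth : {weekly_factor - 1:.0%}")
```

Under that assumption, consumption grew by roughly two-thirds every week. The real ramp may well have been lumpier.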

The secret is parallelization. Structure a plan at the start of the day that allows multiple agents to work simultaneously. METR research shows the latest models can now work autonomously for 12 hours, up from 1 hour a year ago. Here’s the ramp once I started implementing a daily plan :

Marketing in the Agentic Era

Lena Waters Office Hours

On April 9th at 10:00 AM PDT, Lena Waters will kick off a new version of Office Hours.

Lena led marketing at Notion, Grammarly, & DocuSign. At Notion, she was CMO during the company’s AI product transition. She guided the shift from product-led growth to enterprise expansion while the company deepened its position in AI-powered work. At Grammarly, she oversaw marketing as the writing assistant added AI features. At DocuSign, she managed enterprise go-to-market strategy.

Veblen & Jevons Walk Into a Data Center

Jevons & Veblen walk into a data center.

The dominant motif around AI has been Jevons Paradox[1] : the cheaper a product becomes, the more it is consumed.

Token prices dropped 10-20x over the past 18 months & demand exploded in response.

Anthropic surged past $19 billion in run-rate last month, up from $9 billion at the end of 2025.[2] OpenAI topped $25 billion in annualized revenue in February, a 17% increase in two months.[3]
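A quick sketch of what those growth figures imply. The inputs come straight from the paragraph above; the annualized number assumes, hypothetically, that OpenAI sustains the same two-month pace for a full year, which is an extrapolation & not a forecast.

```python
# Figures from the text, in billions of dollars.
openai_run_rate_bn = 25.0
openai_growth_two_months = 0.17
anthropic_now_bn = 19.0
anthropic_prior_bn = 9.0

# Implied OpenAI run-rate two months before the $25 billion mark.
openai_two_months_ago_bn = openai_run_rate_bn / (1 + openai_growth_two_months)

# Assumption: the 17% two-month pace repeats for six consecutive periods.
openai_annualized_growth = (1 + openai_growth_two_months) ** 6 - 1

# Anthropic's run-rate multiple between the two dates cited.
anthropic_multiple = anthropic_now_bn / anthropic_prior_bn

print(f"OpenAI two months earlier : ${openai_two_months_ago_bn:.1f}B")
print(f"OpenAI annualized growth at this pace : {openai_annualized_growth:.0%}")
print(f"Anthropic run-rate multiple : {anthropic_multiple:.2f}x")
```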

A New Axis of Competition

Would you choose one software over another because it has a proprietary model with better performance?

Two companies shipped custom AI models today (three in a week counting Cursor![1]), raising that question. Intercom launched Apex 1.0, a model for answering customer support tickets.[2] Chroma released Context-1, a model for multi-hop agent search.[3]

Apex 1.0 beats GPT-5.4 & Claude Opus 4.5 on customer service tasks.[2] Context-1 scores 97% on agent search benchmarks.[3] One Intercom gaming customer saw resolution rates jump from 68% to 75%.[2]

AI's Bundling Moment

The SaaS era was defined by unbundling : find a workflow, optimize it, own it. Salesforce chose sales automation. Slack chose chat. Dropbox chose file sharing. Point solutions won by perfecting single workflows. The playbook : own one pain point, expand from there.

AI is moving faster than anyone predicted. When models change every 42 days, buyers can’t assemble a best-of-breed stack. They want a platform they can trust for three to five years.

Cursor, Kimi & the Open Source Imperative

Last week, Cursor launched Composer 2 to over one million daily active users.[1] Within hours, a developer discovered Cursor had built its flagship model on top of Moonshot AI’s Kimi K2.5, a Chinese open-source model.[2]

Moonshot AI’s response? “This is the open model ecosystem we love to support.”[3]

Cursor’s model is at near parity with state-of-the-art at one-eighth the price.[4] It’s also no coincidence that the editor powering Cursor, VS Code, is open-source.

The Pricing Power of Agents

In 2025, we predicted that 2026 would be the year agents would earn as much as a person.

It’s already happening.

In markets where there’s a labor shortage and an urgent need to hire people, we are seeing agents command 75%, 85%, even 100% of a human equivalent salary. This is faster than we were anticipating.

The first-order benefit is completing the work.

But there are second-order benefits that are now starting to appear. Training agents is significantly faster since all materials can be presented at once & in parallel to the AI.

The Robotic Tortoise & the Robotic Hare

I set up a race today between two robots.

My Mac on the left vs Claude Code on the right. Both tasked with building a payment app on Stripe’s new Tempo blockchain. Same prompts, same task, side by side.

Opus 4.5 is about 20% smarter than Qwen 35B on benchmarks. And it’s likely 50x larger. The hare should have won. It didn’t.

The local model finished in 2 minutes. Claude took over 6. I asked Claude to score both outputs : local model 6.5, Claude 4.5.[1]

The 12x Bet on AI

For every dollar hyperscalers earn from AI today, they’re spending twelve dollars to build more capacity.[1] That’s the bet embedded in $575 billion of capital expenditure this year.[2]
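As a rough sanity check, the two figures above can be combined to back out an implied AI revenue number. This sketch assumes the 12-to-1 ratio applies to the full $575 billion, which the text does not state explicitly, so treat the result as an illustration & not a reported figure.

```python
# Figures from the text.
capex_bn = 575.0              # hyperscaler capital expenditure this year, $B
spend_per_revenue_dollar = 12.0

# Assumption: the 12:1 ratio is measured against the entire CapEx total.
implied_ai_revenue_bn = capex_bn / spend_per_revenue_dollar

# The multiple AI revenue would need to reach just to equal one year of CapEx.
required_revenue_multiple = capex_bn / implied_ai_revenue_bn

print(f"Implied AI revenue today : ${implied_ai_revenue_bn:.1f}B")
print(f"Growth needed to match one year of CapEx : {required_revenue_multiple:.0f}x")
```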

How fast does AI revenue need to grow to pay back this data center mortgage?

[Chart : Hyperscaler CapEx vs Cash from Operations, 2016-2026]

From 2020 to 2024, hyperscalers issued an average of $20 billion in bonds annually.[3] In 2025, that jumped to $96 billion. In 2026, it will reach $159 billion.[3] Morgan Stanley projects $1.5 trillion over the next few years.[4]
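The step-ups in bond issuance are easier to see as multiples. This is a minimal sketch using only the three annual figures quoted above; the dictionary keys are just labels chosen for this example.

```python
# Annual hyperscaler bond issuance from the text, in billions of dollars.
issuance_bn = {
    "2020-2024 average": 20.0,
    "2025": 96.0,
    "2026 (projected)": 159.0,
}

baseline = issuance_bn["2020-2024 average"]

# Multiple of the 2020-2024 average for each later year.
multiple_2025 = issuance_bn["2025"] / baseline
multiple_2026 = issuance_bn["2026 (projected)"] / baseline

# Year-over-year growth from 2025 to 2026.
growth_2025_to_2026 = issuance_bn["2026 (projected)"] / issuance_bn["2025"] - 1

print(f"2025 vs 2020-2024 average : {multiple_2025:.1f}x")
print(f"2026 vs 2020-2024 average : {multiple_2026:.2f}x")
print(f"2025 to 2026 growth : {growth_2025_to_2026:.0%}")
```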
