Two years ago, the idea of useful AI on your phone was fantastical. Siri couldn’t finish a sentence. Local models hallucinated nonsense.

Last week, Google released Gemma 4 E4B[1], a free model that matches GPT-4o and runs entirely on your phone.[2]

The next few weeks promise even more advanced pocket models. The market expects new releases from DeepSeek[3], Qwen[4], Kimi[5] & MiniMax[6].

Frontier models don’t stay frontier for long. Within three to four months of a frontier release, you can run a model of similar performance on your laptop; about 23 months after release, you can run the same capability on your phone.

Chart: Parameters required for a GPT-4o-level HumanEval score: 450x compression in 23 months

Three forces are driving this compression. Better algorithms: distillation and reinforcement learning squeeze more capability into fewer parameters. Talent density: the biggest prizes in capitalism attract the best minds in the field; these are the fastest-growing software companies in history. And capital: a trillion dollars invested in the data centers that power training.

Over 23 months, the same capability that needed 1.8 trillion parameters has shrunk to fit in 4 billion: a 450x compression. At this rate, the phone in your pocket will run today’s frontier models before you upgrade it.
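The arithmetic behind that rate can be sketched as follows. The parameter counts are the post’s; the constant-rate assumption and the halving-time extrapolation are mine, not a claim from the post:

```python
import math

frontier_params = 1.8e12  # parameters at GPT-4o-level capability (per the post)
pocket_params = 4e9       # Gemma 4 E4B's parameter count (per the post)
months = 23               # elapsed time between the two

compression = frontier_params / pocket_params  # 450x

# If compression proceeds at a constant exponential rate (an assumption),
# the parameter count needed for fixed capability halves every:
halving_time = months / math.log2(compression)

print(f"{compression:.0f}x compression")
print(f"required parameters halve every ~{halving_time:.1f} months")
```

Under that assumption, the required parameter count halves roughly every two to three months, which is what makes the “before you upgrade your phone” extrapolation plausible.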


  1. Google AI Edge Gallery on iOS App Store ↩︎

  2. Gemma 4 E4B matches or exceeds GPT-4o across multiple benchmarks including MATH, GSM8K, GPQA Diamond & HumanEval. Full benchmark comparison ↩︎

  3. DeepSeek’s new AI model ↩︎

  4. Qwen 3.6 ↩︎

  5. Kimi K3 ↩︎

  6. MiniMax M2.5 release ↩︎