Two years ago, the idea of useful AI on your phone was fantastical. Siri couldn’t finish a sentence. Local models hallucinated nonsense.
Last week, Google released Gemma 4 E4B[1], a free model that matches GPT-4o and runs entirely on your phone.[2]
The next few weeks promise even more advanced pocket models. The market expects new releases from DeepSeek[3], Qwen[4], Kimi[5] & Minimax[6].
Frontier models don’t stay frontier for long. Within three to four months of a frontier release, you can run a model with similar performance on your laptop; 23 months later, you can run the same model on your phone.
Three forces are driving this compression. Better algorithms: distillation and reinforcement learning squeeze more capability into fewer parameters. Talent density: the biggest prizes in capitalism attract the best minds in the field; these are the fastest-growing software companies in history. And capital: a trillion dollars invested in the data centers powering training.
In 23 months, the same capability that needed 1.8 trillion parameters now fits in 4 billion parameters. A 450x compression. At this rate, the phone in your pocket will run today’s frontier models before you upgrade it.
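The arithmetic above can be checked in a few lines. This is a back-of-envelope sketch; the variable names are mine, and the 1.8-trillion-parameter figure is the reported estimate the text relies on:

```python
# Back-of-envelope check of the compression claim in the text.
frontier_params = 1.8e12   # parameters, frontier model (reported estimate)
pocket_params = 4e9        # parameters, Gemma 4 E4B

# Total compression over the 23-month gap: 450x, as stated.
compression_ratio = frontier_params / pocket_params

# Implied average month-over-month compression factor.
months = 23
monthly_factor = compression_ratio ** (1 / months)  # roughly 1.3x per month

print(f"{compression_ratio:.0f}x total, ~{monthly_factor:.2f}x per month")
```

Compounding at roughly 1.3x per month is what makes the trend feel sudden: the ratio looks flat for a year, then collapses.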
---
1. Gemma 4 E4B matches or exceeds GPT-4o across multiple benchmarks, including MATH, GSM8K, GPQA Diamond & HumanEval. Full benchmark comparison.