In yesterday’s post (which an agent pushed in raw outline form via email!), I wrote about the future of AI email. What does that future cost?

chart_monthly_cost_by_model

If you are using state-of-the-art model ranging, it costs between $22 to $130 per month. Would you pay for that? At work, I imagine, many would. Let’s take the middle case of $26/month raw cost.

A software company seeking 75% gross margin would charge about $350 per year for that product excluding hosting & serving costs. So let’s call it a $500 per year list with a 15% discount at scale.

A Google Enterprise plan is $11-18/month. A fully agentic solution would then cost about twice as much.

chart_all_models_comparison

Smaller models help. They cut cost by 10 to 20x, but we can do better.

By running the models locally, when the cost plummets to zero : users’ GPU does the work.

It’s this type of cost optimization that I have done crudely here that I think will define the next 12 to 24 months of AI software : determining which components can be executed deterministically, like the email filters, which are just rules. And the next is matching the model to the workload.

With some basic heuristics and techniques we can drop the overall cost by 100x. Given the tremendous shortage of GPUs, this segmentation of inference is inevitable.