I spent $3,234.96 on AI last month

Hey there,

One month. $3,234.96. Every major AI model tested to death.

Here's what the receipts actually say.

I ran ~3.52 BILLION tokens through these things. Not a typo.

And the results surprised even me.

The full damage.

It was an expensive month, because I actually built something.

Anthropic: $1,860.69 • 66%
OpenAI: $853.70 • 30%
Other: $95.57 • 4%

And that was just Cursor, excluding my Codex ($200) and Claude Code ($225) subscriptions.

On those two I burned through 287.8M and 599.6M tokens respectively, an additional 887.4M tokens.
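
If you want to check my math, here's a rough Python sketch of how the numbers above add up. The figures come straight from the breakdown. The variable names are just for illustration, and it assumes the $3,234.96 headline is the Cursor bill plus the two subscriptions.

```python
# Figures copied from the breakdown above; names are illustrative only.
cursor_spend = {"Anthropic": 1860.69, "OpenAI": 853.70, "Other": 95.57}
subscriptions = {"Codex": 200.00, "Claude Code": 225.00}

# Cursor usage on its own, rounded to cents to avoid float noise.
cursor_total = round(sum(cursor_spend.values()), 2)

# Assumption: the headline number is Cursor usage plus both subscriptions.
grand_total = round(cursor_total + sum(subscriptions.values()), 2)

# Tokens run through the two subscriptions, in millions.
subscription_tokens_m = round(287.8 + 599.6, 1)

print(f"Cursor only: ${cursor_total:,.2f}")            # Cursor only: $2,809.96
print(f"With subscriptions: ${grand_total:,.2f}")      # With subscriptions: $3,234.96
print(f"Subscription tokens: {subscription_tokens_m}M")  # Subscription tokens: 887.4M
```

So the percentages are shares of the $2,809.96 Cursor bill, not of the full $3,234.96.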

But what did I learn?

Anthropic is not dead.

So here's why I spent $1,860.69 on Claude models this month.

Opus 4.7 just came out two weeks ago and while the whole world is shitting on it, I actually like it.

It's the type of model that is very creative, mostly follows your instructions, and writes great code.

My biggest hot take is that it's actually just as good as GPT-5.5, but only... outside of Claude Code.

Think of Claude Opus like the convertible supercar.

It's incredibly fun to drive, but it's more expensive, and you wouldn't use it to tow a truck out of the mud.

Codex/GPT-5.5 is now the 2nd best app.

Second only to Cursor, which still performs better overall (but at a cost).

It's the one I reach for the most when I don't use Cursor, and the one I recommend to my non-technical clients.

Since it got fast (and more expensive), it has become the #1 AI coding model in the world.

But it still sucks at design, which is why I used Claude models so much.

[Image: Claude vs Codex - can you tell which one is which?]

Think of Codex like the SUV of coding models.

It also got very expensive with 5.5, but it's pulling its weight.

Why I don't use other models.

Grok 4.20, Gemini 3.1 Pro, GLM 5.1 and Composer-2 make up that 4% share.

Honestly, I keep trying. Every few weeks, I'd give them another shot.

They're all... fine. GLM 5.1 is actually impressive, and so is Composer-2.

But neither can compete with the SOTA (state-of-the-art) models we have today.

A "fine" status doesn't earn a spot in my stack.

What I didn't expect.

I thought I'd find one model to rule them all, but I didn't.

What I found is that the best builders use both SOTA models, the way a chef uses different knives.

You wouldn't use a steak knife to butter your toast.

Codex for autonomy and thinking hard.

Opus for creative design problems.

GLM 5.1 or Composer-2 (or now Kimi K2.5) for when you're on a budget.

What this means for you.

Stop trying every new model that drops on Twitter.

The noise is designed to keep you confused and away from building.

Use the best two. Learn them well, and actually ship something.

Both offer subscriptions, and for $100-200/mo you get best-in-class.

The $3,234.96 was an expensive test for me. But you got the notes for free.

Talk next week,
Rob

P.S. Next month I'm opening some very limited spots in my new AI Architect Program. If you're an established business with a real idea, a thing you want to build, and you need a mix of one-on-one and group coaching support to get it done, this will be for you. Reply "SHIP" and we'll put you on a special launch list.

Robin Ebers