I was staring at the OpenRouter usage charts this morning, and something looked wrong.
Usually, when a new model launches, you see a gradual adoption curve. Developers test it, benchmark it, and slowly migrate traffic. But the January 2026 data for Gemini 3 Flash Preview didn’t look like migration. It looked like a stampede.
At the same time, the trend line for Claude Haiku 4.5, the model that was supposed to define the “cheap intelligent” category just three months ago, wasn’t just flat. It was fading out.
I needed to understand why. So I dug into the feature sets. I checked the pricing tables. I looked for the alternatives that other LLM providers were offering. And finally, I realized where the model went.
It wasn’t a mystery. It was a market correction.
The Data: 236B vs 63.5B

The numbers on OpenRouter tell a brutal story of displacement.
The top public application using Gemini 3 Flash Preview, “Lemonade,” consumed 236 billion tokens this month. In comparison, the top public app for Claude Haiku 4.5, “Claude Code,” consumed 63.5 billion tokens.
That is a 3.7x difference in volume (236B / 63.5B). Even the third-ranked app on Gemini 3 Flash (Roo Code, 58.4B) nearly matches the entire volume of the #1 app on Haiku.
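For anyone who wants to check my arithmetic, here is a quick sketch using only the three figures quoted above. The variable and function names are my own, and the numbers are whatever OpenRouter showed when I looked, so treat it as a sanity check rather than a dataset.

```python
# Token volumes quoted in this post, in billions of tokens for the month.
LEMONADE_FLASH_B = 236.0    # top public app on Gemini 3 Flash Preview
ROO_CODE_FLASH_B = 58.4     # third-ranked app on Gemini 3 Flash Preview
CLAUDE_CODE_HAIKU_B = 63.5  # top public app on Claude Haiku 4.5


def volume_ratio(numerator_b: float, denominator_b: float) -> float:
    """Return how many times larger the first volume is than the second."""
    return numerator_b / denominator_b


top_vs_top = volume_ratio(LEMONADE_FLASH_B, CLAUDE_CODE_HAIKU_B)
third_vs_top = volume_ratio(ROO_CODE_FLASH_B, CLAUDE_CODE_HAIKU_B)

print(f"Top Flash app vs top Haiku app: {top_vs_top:.1f}x")
print(f"Third Flash app as a share of top Haiku app: {third_vs_top:.0%}")
```

The second figure is what I mean by "nearly matches": the third-ranked Flash app sits a little over nine-tenths of the way to Haiku's number one.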
This isn’t just about one app being popular. It’s about where the automated, high-volume workloads have settled. They have settled on Google.
The Strategic Failure of “Haiku”

To understand this shift, we have to look at the strategies of the two companies.
Anthropic’s Strategy (The Brain):
Anthropic is building intelligence. Their focus is clearly on the high end: Claude Sonnet 4.5 and the massive, reasoning-heavy Claude Opus. Haiku 4.5 was meant to be their “gateway drug,” a cheaper entry point into the ecosystem. But they priced it at $1.00 per 1M input tokens, treating it as a premium “lite” model.
Google’s Strategy (The Factory):
Google looked at the same market and saw a commodity. They released Gemini 3 Flash at $0.50 per 1M input tokens with a 1-million-token context window. They didn’t try to make it a “junior partner” to a bigger model; they built it as a standalone industrial engine.
The result? Haiku 4.5 got squeezed.
It’s not smart enough to compete with Claude Sonnet 4.5 (the real brains). And crucially, it is no longer cheap enough to be the worker bee. Developers who need “smart” go to Sonnet or Opus (which explains why Opus usage remains healthy). Developers who need “volume” go to Flash. Haiku exists in a vacuum.
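To put the two input prices in concrete terms, here is a rough sketch of what a high-volume month would cost on each. It rests on a loud assumption: I treat the whole 236B-token volume as input tokens, which is almost certainly not true of a real workload, and I ignore output pricing entirely because I haven't quoted it here. The function and dictionary names are mine.

```python
# Input prices as quoted in this post, in USD per 1M input tokens.
PRICE_PER_M_INPUT = {
    "Claude Haiku 4.5": 1.00,
    "Gemini 3 Flash": 0.50,
}


def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of `tokens` input tokens at a given USD price per 1M tokens."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload: a month the size of the top Flash app's volume,
# counted as if every token were an input token (an assumption, not data).
monthly_tokens = 236_000_000_000

costs = {
    model: input_cost_usd(monthly_tokens, price)
    for model, price in PRICE_PER_M_INPUT.items()
}

for model, cost in costs.items():
    print(f"{model}: ${cost:,.0f} per month (input only)")

savings = costs["Claude Haiku 4.5"] - costs["Gemini 3 Flash"]
print(f"Difference: ${savings:,.0f} per month")
```

At this scale the gap is a six-figure line item every month, which is the kind of number that gets a migration approved.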
The $0.50 Trap and the Context Moat

I tried to verify if this was purely price-driven, or if something else was at play. I compared the specs side-by-side.
The difference isn’t just the 50% lower input price. It’s the 1-million-token context window.
For an app like “Kilo Code” (75.4B tokens on Flash), this context window changes the architecture. You don’t need complex RAG pipelines. You don’t need to selectively retrieve files. You just dump the entire codebase into the prompt and let the model work.
With Haiku 4.5’s 200k limit, you are constantly managing state, truncating history, and adding engineering overhead. Google effectively commoditized “laziness.” And in software engineering, the path of least resistance (and lowest cost) always wins.
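Here is a minimal sketch of that overhead. The repo and its per-file token counts are made up, the 20k-token reserve for instructions and output is my own guess, and the greedy "smallest files first" selection is only a stand-in for whatever retrieval logic a real tool would use. The point is simply that one model needs the selection step and the other does not.

```python
# Context limits as quoted in this post, in tokens.
CONTEXT_LIMITS = {
    "Claude Haiku 4.5": 200_000,
    "Gemini 3 Flash": 1_000_000,
}

RESERVED_TOKENS = 20_000  # assumed budget for instructions and the reply


def fits_in_context(file_tokens: dict[str, int], context_limit: int) -> bool:
    """True if every file plus the reserved budget fits in one prompt."""
    return sum(file_tokens.values()) + RESERVED_TOKENS <= context_limit


def select_files(file_tokens: dict[str, int], context_limit: int) -> list[str]:
    """Greedy stand-in for retrieval: keep the smallest files that fit."""
    budget = context_limit - RESERVED_TOKENS
    chosen = []
    for path, tokens in sorted(file_tokens.items(), key=lambda kv: kv[1]):
        if tokens <= budget:
            chosen.append(path)
            budget -= tokens
    return chosen


# Hypothetical codebase of 600k tokens; the file names and sizes are invented.
repo = {
    "api.py": 90_000,
    "models.py": 120_000,
    "ui.py": 150_000,
    "tests.py": 240_000,
}

for model, limit in CONTEXT_LIMITS.items():
    if fits_in_context(repo, limit):
        print(f"{model}: send the whole repo")
    else:
        print(f"{model}: must select, e.g. {select_files(repo, limit)}")
```

With these invented numbers, the 200k window can hold only one of the four files, so someone has to write and maintain the code that decides which one.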
Conclusion: The Market Has Spoken
Haiku 4.5 was a success in October because it had no competition. It is a failure in January because its price and context window did not adapt.
The OpenRouter data confirms that the market has bifurcated.
1. High Intelligence: Developers stick with Anthropic (Sonnet/Opus) for complex reasoning tasks where price is secondary.
2. High Volume: Developers have aggressively moved to Google (Gemini 3 Flash) for swarm agents, coding loops, and data processing.
Unless Anthropic rethinks the “Haiku” tier, perhaps with a Haiku 5 at $0.40 per 1M input tokens, this gap will only widen. The “middle child” of AI models has no future.
