I do question this finding:
> the small model category as a whole is seeing its share of usage decline.
It's important to remember that this data is from OpenRouter, an API service. Small models are exactly the ones that can be self-hosted.
It could be the case that total small model usage has actually grown, but people are self-hosting rather than using an API. OpenRouter would not be in a position to determine this.
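To make that concrete, here is a toy calculation with entirely made-up numbers (none of them come from the OpenRouter report). It only shows that small models' share of API traffic can fall while total small-model usage, including self-hosting, grows:

```python
# Hypothetical token volumes (arbitrary units), invented purely for illustration.
def api_share(small_api: float, large_api: float) -> float:
    """Fraction of API traffic that goes to small models."""
    return small_api / (small_api + large_api)

# Period 1: all small-model usage goes through the API.
p1 = {"small_api": 40, "small_self_hosted": 0, "large_api": 60}
# Period 2: small-model usage doubles overall, but most of it moves to self-hosting.
p2 = {"small_api": 30, "small_self_hosted": 50, "large_api": 90}

share_p1 = api_share(p1["small_api"], p1["large_api"])
share_p2 = api_share(p2["small_api"], p2["large_api"])

total_small_p1 = p1["small_api"] + p1["small_self_hosted"]
total_small_p2 = p2["small_api"] + p2["small_self_hosted"]

print(f"API share of small models: {share_p1:.0%} -> {share_p2:.0%}")
print(f"Total small-model usage:   {total_small_p1} -> {total_small_p2}")
```

An API provider only ever sees the `small_api` and `large_api` columns, so it cannot distinguish this scenario from a genuine decline.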
I'm pretty surprised by that, but I guess it also selects for people who would use OpenRouter.
I'd be interested in a clarification on the reasoning vs non-reasoning metric.
Does this mean the reasoning total is (input + reasoning + output) tokens? Or is it just (input + output)?
Obviously the reasoning tokens would add a ton to the overall count, so it would be interesting to see an apples-to-apples comparison with non-reasoning models.
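A minimal sketch of the two ways the total could be counted. The `Usage` class, its field names, and the numbers are all hypothetical; this is not how OpenRouter is known to report it, just the two candidate definitions side by side:

```python
from dataclasses import dataclass

@dataclass
class Usage:
    """Hypothetical per-request token counts (field names are assumptions)."""
    input_tokens: int
    output_tokens: int
    reasoning_tokens: int = 0  # zero for a non-reasoning model

def total_including_reasoning(u: Usage) -> int:
    # Definition A: input + reasoning + output
    return u.input_tokens + u.reasoning_tokens + u.output_tokens

def total_excluding_reasoning(u: Usage) -> int:
    # Definition B: input + output only
    return u.input_tokens + u.output_tokens

# Made-up example: same prompt and answer length, with and without reasoning.
reasoning_model = Usage(input_tokens=200, output_tokens=300, reasoning_tokens=2000)
plain_model = Usage(input_tokens=200, output_tokens=300)

print(total_including_reasoning(reasoning_model))  # 2500
print(total_excluding_reasoning(reasoning_model))  # 500
print(total_including_reasoning(plain_model))      # 500
```

Under definition A the reasoning model looks five times heavier for the same visible work; under definition B the two are identical. Which one the chart uses changes the picture a lot.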
- Take in the user query (input tokens)
- Break that into a game plan. Ex: "Based on user query: {query} generate a plan of action." (reasoning tokens)
- Answer (output tokens)
Because the reasoning step runs in a loop until it has worked through its action plan, it frequently uses far more tokens than the input and output steps combined.
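A toy simulation of the three steps above, to show where the tokens pile up. Everything here is a stand-in: `fake_plan` replaces a real model call, the whitespace-based `count_tokens` replaces a real tokenizer, and the loop structure is just one reading of the description, not a claim about how any particular model works internally:

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())

def fake_plan(query: str) -> list[str]:
    # Hypothetical planner; a real system would ask the model for this.
    return ["restate the question", "list relevant facts", "draft an answer"]

def run(query: str) -> dict[str, int]:
    usage = {"input": count_tokens(query), "reasoning": 0, "output": 0}

    # Step 2: turn the query into a plan (counted as reasoning tokens).
    plan_prompt = f"Based on user query: {query} generate a plan of action."
    usage["reasoning"] += count_tokens(plan_prompt)

    # The reasoning loop: one pass per plan step, each emitting scratch work.
    for step in fake_plan(query):
        thought = f"Working on step: {step}. " + "intermediate note " * 20
        usage["reasoning"] += count_tokens(thought)

    # Step 3: the short visible answer (output tokens).
    usage["output"] = count_tokens("Here is the final answer.")
    return usage

usage = run("What is the capital of France?")
print(usage)
```

Even in this tiny example the reasoning count dwarfs input plus output, and it grows with every extra step in the plan.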