I'm very happy using it to just "do things". When in-depth debugging or a massive plan is needed, I'd go with something better, but for going through the motions afterwards? It works.
"MiniMax M2.1: Significantly Enhanced Multi-Language Programming, Built for Real-World Complex Tasks" could be an IDE, a UI framework, a performance library, or, or...
Whatever benchmark Opus is ahead in should be treated as a very important metric of proper generalization in models.
These days, by default I just use Sonnet/Haiku. In most cases it's more than good enough for me. It's plenty with the $20 plan.
With MiniMax or GLM-4.7, some people like me are just looking for Sonnet-level capability at a much cheaper price.
At the moment Opus is the only model I can trust: even when it generates "refactoring work", it can actually do the refactoring.
I’m also managing a few projects and teams. One way I’m getting value from my GLM subscription is by building a daily GitHub PR summary bot using a GitHub Action. It’s good enough for me to keep up with the team and to monitor higher-risk PRs.
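For anyone curious what that bot might look like in practice, here's a minimal sketch of the summarizing script a scheduled Action could run. To be clear, everything specific here is my assumption, not the commenter's actual setup: the repo name, endpoint URL, and model name are placeholders, and I'm assuming GLM is exposed through an OpenAI-compatible chat completions API.

```python
# Minimal sketch of the PR-summary script a scheduled GitHub Action could run.
# Assumptions (mine, not the commenter's setup): GLM is served through an
# OpenAI-compatible chat completions endpoint, and GITHUB_TOKEN / GLM_API_KEY
# are provided as Action secrets. REPO, the endpoint URL, and the model name
# are placeholders.
import json
import os
import urllib.request

REPO = "your-org/your-repo"  # hypothetical placeholder


def github_get(path: str):
    """Fetch a GitHub REST API path and return the parsed JSON."""
    req = urllib.request.Request(
        f"https://api.github.com{path}",
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def summarize(pr_digest: str) -> str:
    """Ask the model for a lead-oriented summary that flags risky PRs."""
    payload = json.dumps({
        "model": "glm-4.6",  # placeholder model name
        "messages": [
            {"role": "system",
             "content": "Summarize these open PRs for a team lead. "
                        "Call out higher-risk changes explicitly."},
            {"role": "user", "content": pr_digest},
        ],
    }).encode()
    req = urllib.request.Request(
        "https://api.example.com/v1/chat/completions",  # placeholder URL
        data=payload,
        headers={
            "Authorization": f"Bearer {os.environ['GLM_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


if __name__ == "__main__":
    prs = github_get(f"/repos/{REPO}/pulls?state=open&per_page=50")
    digest = "\n".join(
        f"#{p['number']} {p['title']} (by {p['user']['login']})" for p in prs
    )
    print(summarize(digest) if digest else "No open PRs today.")
```

From there, a workflow on a daily cron just runs the script and posts the output wherever the team reads it (an issue comment, Slack, email).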
Right now I’m using GLM more as an agent/API rather than as a coding tool. Claude works best for agentic coding for me.
I’m on the Claude $20 plan and I usually start with Haiku, then switch to Sonnet or Opus for harder or longer tasks.
Claude Code with GLM seems OK to me. I just use it as a backup LLM in case I hit usage limits, but for some light refactoring it did the job well.
Are you also facing issues with Claude Code and GLM?
Whereas with Sonnet/Haiku, I'm all but guaranteed to have AI assistance throughout my entire coding session. This matters more to me right now. Just a tradeoff I'm willing to make.
I think it's still not on the $20 plan though, which is sad.
> Claude Opus 4.5, our frontier coding model, is now available in Claude Code for Pro users. Pro users can select Opus 4.5 using the /model command in their terminal.

> Opus 4.5 will consume rate limits faster than Sonnet 4.5. We recommend using Opus for your most complex tasks and using Sonnet for simpler tasks.
This compresses to: “We are updating our model, MiniMax, to 2.1. Agent harnesses exist and Agents are getting more capable.”
A good model and agent harness, pointed at the task of writing this post, might suggest less verbosity and complexity. It comes off as fake and hype-chasing to me, even if your model is actually good. I disengage there.
I saw y'all give a lightning talk recently and it was similarly hype-y. Perhaps this is a translation or cultural thing.
Is it a cultural thing?
Most people here are big-company worker bees who take zero risks and do very little of substance.
In these organizations, it’s common for large groups of people to get together in “meetings” and endlessly nitpick surface-level details of unimportant things while completely missing the big picture because it’s far too complex to allow for easy opinions or smart-sounding critique.
You are more than welcome to pick whatever model or software you choose to trust; that is totally fine. However, that is vastly different from bad-mouthing a model or software just because its release notes contain a single sentence you don't like.
Took me like 5 prompt iterations until it finally listened.
But it's very good: better than Flash 3.0 in terms of code output and reasoning, while being cheaper.
One of the demos shows them using Claude Code, which is interesting. And the next sections are titled 'Digital Employee' and 'End-to-End Office Automation'. Their ambitions obviously go beyond coding. A sign of things to come...
MiniMax is like 100x more honest.
Are you still writing code by hand?
https://contextarena.ai/?needles=8
https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...
https://artificialanalysis.ai/leaderboards/models
https://gorilla.cs.berkeley.edu/leaderboard.html
https://github.com/lechmazur/confabulations
It's nice and simple in the overview mode though. Breaks it down into an intelligence ranking, a coding ranking, and an agentic ranking.
“We're excited for powerful open-source models like M2.1 […]”
Yet as far as I can tell, this model isn’t open at all. Not even open weights, never mind open source.
When is someone going to vibe code Objective-C 3.0? Borrowing all of the actual good things that have happened since 2.0 is closer than you'd think thanks to LLVM and friends.