I gave it a short prompt and it gave me an OpenSCAD model with everything parametrized. I printed it in TPU with no changes and it was nearly perfect on the first try. Claude had put in a 0.3 mm subtraction in the x/y dimensions; I lowered it to 0.1 mm and now it's perfect.
Much easier shape than ancient Roman architecture but still very cool how easy it was.
My Antigravity (forced) replacement for Gemini CLI requires me to log on via browser every time I use it, and my Antigravity IDE won't update at all, so:
If it's ok I'd prefer they just work on reaching a baseline acceptable rollout before worrying about being Top in anything.
PS: the actual title:
OpenSCAD LLM Benchmark: Building the Pantheon
So far I like it much more than Gemini CLI (my previous daily driver for personal projects). Seems more mature and "feels more intelligent" (very subjective ofc)
As a side note, Autodesk released an agentic assistant for Fusion back in December. Six months later it is still quite bad.
Why is this medium ranked, and not on par with the best two?
A model that knows more in general will often be better at specific tasks. E.g., if you ask a model to "make a program that estimates the annual production of a solar installation", it needs to have been trained on a lot more than just Python code.
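To make that concrete, here's a rough sketch of what such a program boils down to. The function name, the parameters, and the 0.8 default performance ratio are my own assumptions for illustration, not anything from the benchmark. The Python is trivial; the part the model has to "know" is the domain formula (peak power × peak sun hours × days × system losses).

```python
def estimate_annual_kwh(peak_kw, peak_sun_hours_per_day, performance_ratio=0.8):
    """Very rough annual energy estimate in kWh.

    peak_kw: rated DC peak power of the installation (kWp).
    peak_sun_hours_per_day: site-average equivalent full-sun hours per day.
    performance_ratio: assumed lump factor for inverter, wiring,
        temperature and soiling losses (0.8 is only a placeholder).
    """
    if peak_kw < 0 or peak_sun_hours_per_day < 0:
        raise ValueError("peak_kw and peak_sun_hours_per_day must be non-negative")
    if not 0 < performance_ratio <= 1:
        raise ValueError("performance_ratio must be in (0, 1]")
    # Energy = power * equivalent full-sun hours per year * losses
    return peak_kw * peak_sun_hours_per_day * 365 * performance_ratio


if __name__ == "__main__":
    # Hypothetical 5 kWp system at a site with 4 peak sun hours per day
    print(f"{estimate_annual_kwh(5, 4):.0f} kWh/year")
```

A real estimate would need per-month irradiance, panel tilt and orientation, shading, and so on, which is exactly the kind of knowledge that isn't in Python code alone.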
And next year Google will probably sunset Antigravity.
If it doesn't make Google billions, don't trust them.
I can't imagine why (or for whom) that'd be kept alive.
funny how some of their projects have undisclosed budgets and profits.
My point is that with every new model release, the expectations grow. I don't know how else to say that.
Where are the normal people :/
Don't get me wrong, I don't think AI coding is a bad thing. For East Asians like myself, it levels the playing field with Westerners, so as long as you rigorously review the AI's output, it's a perfectly viable tool.
However, the absolute farce we just witnessed with the Antigravity 2.0 update really raises doubts about whether 'vibe coding' can actually be trusted. If even a behemoth like Google is dropping the ball like this, it says a lot.
I'm sorry, but that sounds exactly like almost every single Google "product" out there. They seem to only care about throwing stuff over the wall as quickly as possible, and you'd have a hard time finding a single Google product that doesn't also feel filled with fragmented choices, like every project of theirs has a different project manager every week.