Some details on the timeline are not quite precise and would benefit from links to sources so that everyone can verify them. For example, HyperCLOVA is listed as 204B parameters, but it seems it used 560B parameters (https://aclanthology.org/2021.emnlp-main.274/).
Models that take visual input seem more focused on identifying what is in an image than on what a human would perceive in it, and most interfaces lack any form of automated feedback mechanism that would let the models look at what they have made.
In short, I have made some fun things with AI but I still end up doing CSS by hand.
It's in the timeline though? Or are you saying that one should somehow be highlighted, even though none of the other ones are? It seems to be just chronological order, with no entry more or less visible than the others, as far as I can see.
This keeps bothering me: why do they need several iterations to arrive at a correct solution instead of getting it right the first time? Prompts like "repeat solving it until it is correct" don't help.
No, all the models are designed to be "helpful", but different companies see that as different things.
If you're seeing the model deliberately creating errors so you have something to fix, then that sounds like something is fundamentally wrong in your prompt.
Besides that, I'm guessing "repeat solving it until it is correct" is a condensed version of your actual prompt, or is that verbatim what you send to the model? If so, you need to give it more detail for it to actually be able to execute something like that.
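To make "more details" concrete, here is a rough sketch of what a retry loop could look like when the correctness check lives outside the model and its error message is fed back into the next prompt. Everything here is hypothetical: `generate` stands in for whatever model call you use, `check` for whatever test or linter you have, and `fake_model` only exists so the example runs without a real API.

```python
from typing import Callable, Optional, Tuple


def solve_with_feedback(
    generate: Callable[[str], str],          # hypothetical: wraps your model call
    check: Callable[[str], Optional[str]],   # hypothetical: returns None if OK, else an error message
    task: str,
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Ask for a solution, verify it externally, and feed the failure back."""
    prompt = task
    for attempt in range(1, max_attempts + 1):
        candidate = generate(prompt)
        error = check(candidate)
        if error is None:
            return candidate, attempt
        # The model cannot know it was wrong unless we tell it what failed.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{candidate}\n\n"
            f"It failed this check: {error}\nFix it."
        )
    return None, max_attempts


# Stand-in model for demonstration: wrong first, right once it sees feedback.
def fake_model(prompt: str) -> str:
    return "4" if "failed" in prompt else "5"


def check_sum(answer: str) -> Optional[str]:
    return None if answer.strip() == "4" else f"expected 2 + 2, got {answer}"


result, attempts = solve_with_feedback(fake_model, check_sum, "What is 2 + 2?")
```

The point of the sketch is that "until it is correct" only means something if there is a check the model's output is actually run against. Without one, the model has nothing to iterate on.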