Grok seems in general better at being "human" in ways that are hard to define: e.g. if I ask it "does this message roughly convey things correctly, to the level it can given this length", it will likely answer like a human would (either a yes or a change suggestion that sticks to the tone and length), while ChatGPT would write a dissertation on the message that still doesn't clear anything up.
Recently I've noticed that Grok seems to have gotten really good at dictation too (that feature where you click the mic to ask it something). ChatGPT has like 90-95% accuracy with my accent, the speech input on Android's Gboard something like 75%, and Grok surprisingly gets something like 98% of my words correct.
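To make those percentages a bit more tangible, here is a minimal sketch that turns word accuracy into misheard words per 100 spoken. The accuracy figures are just the rough personal estimates quoted above, not measured benchmarks, and the function name is made up for illustration.

```python
def errors_per_100_words(accuracy: float) -> float:
    """Convert a word-level accuracy (0..1) into expected wrong words per 100."""
    return round((1.0 - accuracy) * 100, 1)


# Rough personal estimates from the comment above (assumptions, not benchmarks).
estimates = {
    "ChatGPT (low end)": 0.90,
    "ChatGPT (high end)": 0.95,
    "Gboard": 0.75,
    "Grok": 0.98,
}

for name, acc in estimates.items():
    print(f"{name}: about {errors_per_100_words(acc)} wrong words per 100")
```

Seen that way, the gap between 95% and 98% is the difference between fixing five words and fixing two in every hundred, which is probably why it feels so noticeable in practice.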
They all did pretty well at a more "formal" tone, but GPT4.1 was the only one that didn't make me cringe with a "casual" tone.
Twitter language has started seeming normal casual to us, rather than us using normal casual language in Twitter.
Even if 95% of the spam gets actively reported and dealt with, that still leaves a ton of nonsense on the platform, getting fed into the LLM. And spam has only gotten worse over the years, as the barrier to entry has lowered and lowered.
You know people lie, right? Especially when the lie casts them in a better light and/or makes them more money.
Especially you, as an 'English-as-a-second-language' speaker: the Hitler Grok doesn't like people like you :|
Just wish they would finally put some work into their apps, it's the only thing keeping me from actually subscribing to SuperGrok:
- No MCP / connected apps support. It's been teased but here we are, still not available. I can't connect Grok to anything, so I can't use it for serious work
- Projects are still not available in the app so as soon as you move something into a project, it's gone from all the native apps
- No way to add artifacts (like generated markdown docs) directly to a project, we have to export to PDF/markdown and re-import. And there isn't even a way to export artifacts. This makes serious project work hard because we can't dynamically evolve projects with new information
- No memory, no ability to look up other chats, each chat is completely new
- No voice mode in projects at all
If someone from xAI is reading this, please consider adding some of these.
Not saying they should create their own grok-code harness, just allowing usage in existing ones would already be beneficial. But that's probably what the Cursor acquisition is going to do eventually
Anyone remember why Oracle was named Oracle?
Indeed, the update did not go unnoticed. By Tuesday, Grok was calling itself "MechaHitler."...
https://www.npr.org/2025/07/09/nx-s1-5462609/grok-elon-musk-...
Grok is definitely a reliable source of truthful sane rational information.
Grok has tool use, no? Why would you also need MCP? What does MCP add?
Asked if he knew anything about OpenAI's "safety card," Musk smiled and replied: "Safety card? Why would it be a card?"
https://www.axios.com/2026/04/30/musk-openai-safety-grok
Low relevancy in spite of cluster size and musical chair gas generators for time being:
Later in his testimony, Musk was asked about a claim he made last summer that xAI would soon be far beyond any company besides Google. In response, he ranked the world’s leading AI providers, saying Anthropic held the top spot, followed by OpenAI, Google, and Chinese open source models. He characterized xAI as a much smaller company with just a few hundred employees.
https://techcrunch.com/2026/04/30/elon-musk-testifies-that-x...
(Affiliated with no AI company, just surprised to read this yesterday - how could Elon miss model cards…concerning…, & the fact money can’t buy success every time.)
I don't like Musk or Grok. But not knowing what's a safety card is not a signal of anything IMO.
system-cards
https://www.anthropic.com/system-cards
You’d have to be asleep at the wheel. For years:
Claude 2
July 2023
Read system card
But users don’t need to know you’re 100% right, you shouldn’t need to know this inside baseball (you didn’t pollute & compute & gain the responsibility).
I think there's a surprising number of actually useful applications in this sort of grey area for a slightly-less guardrailed, near-frontier model (also the grok-fast models are cheap!).
The user above you could have explained what uncensored models he believes are more capable than Grok. Maybe the Chinese open-weights models are superior to Grok at the moment.
Also, I don't know tons about uncensored models because I don't use them. But I do see posts on r/localllama about "abliterated models". Those are models which have been fine tuned to remove safety filters almost entirely while maintaining predictive efficacy.
Has nothing to do with China. People can do this to any open text model as far as I know.
Democrats have no loyalty to their own sex offenders. Look how we treated the California governor candidate, or Anthony Weiner, or literally every other sex pest found in our party. Some of them who didn’t even deserve it get canceled, like Al Franken.
Diddling and then defending it and doubling down is literally a maga problem.
Those models are 1T parameters total and 30B or 40B active, this might make abliteration impractical.
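For a sense of scale, a back-of-the-envelope sketch using the figures in the comment above (1T total, 30B or 40B active). The 2 bytes per weight is my own assumption (16-bit weights), and none of this settles whether abliteration is actually impractical; it only shows how much has to be loaded even though a small slice is active per token.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Share of a mixture-of-experts model's weights used for a single token."""
    return active_params / total_params


def weight_size_tb(total_params: float, bytes_per_param: int = 2) -> float:
    """Raw weight size in decimal terabytes (assumes 16-bit weights by default)."""
    return total_params * bytes_per_param / 1e12


TOTAL = 1e12  # 1T total parameters, as quoted in the comment
for active in (30e9, 40e9):  # 30B or 40B active, as quoted in the comment
    print(f"{active / 1e9:.0f}B active = {active_fraction(TOTAL, active):.0%} of the weights")

print(f"all weights at 16 bits: about {weight_size_tb(TOTAL):.0f} TB")
```

So only 3-4% of the weights fire per token, but anything that edits the model still has to touch the full set.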
About Musk, yes, there is correspondence. The only confirmed meeting appears to be a 30 minute visit at Epstein's house together with Musk's wife at the time.
As for photos you mention, a quick search tells me there is one photo of Musk and Maxwell at a 2014 Vanity Fair Oscar Party.
I find most commentary on here and other platform like Reddit extremely exaggerated compared to what is actually confirmed. Users seem hellbent on linking Musk to pedophilia-related allegations.
I also know he has stated that he has had direct involvement in Grok's direction. So it's no surprise to me that Grok was generating CSAM. I also genuinely would not be surprised if Grok offered advice for sex trafficking, etc.
All publicly available evidence and discussions from the guy himself.
When the documents were released they found several like the one below, saying things like "What day/night will be the wildest party on =our island?" [0]
The "our" part is especially interesting as it implies he didn't just visit, but had an ownership stake.
Other emails were found with Epstein making excuses to avoid having Musk visit, and Musk's own child publicly stated that the emails were authentic and aligned with her memory of the events. [1]
[0] https://www.justice.gov/epstein/files/DataSet%2010/EFTA01762...
[1] https://www.threads.com/@vivllainous/post/DUMBh2Vkk8D?xmt=AQ...
The first question was around setting up timers for a Fox ESS battery in Home Assistant and disconnecting Fox ESS from the cloud. The second was around cornering speed in Sunnypilot and Frogpilot.
Somewhat niche but if an AI is confidently telling you something wrong it's hard to work with.
But they all do that. It just comes with the territory. Grok will absolutely do the same thing another time you try it.
Like yeah tonally I guess there are. But with regard to references and information? You’re literally just using three different slot machines and claiming one is hot.
I suppose though I shouldn’t be that surprised then since Vegas and every other casino on Earth has been built on duping people in that exact way.
the smartest among them just make the tests complicated and biased; the less intelligent just cherry pick.
of course, would you really expect anyone to do real research in this economy?
People are mostly using GLM and Deepseek via API and Gemma4 and Mistral finetunes locally.
It seems to me like the roleplay market is comparatively old and mature and users have developed cost consciousness and like models to follow their workflow/preferences. So something like Opus is liked for its smartness but considered too expensive and opinionated.
Might be an interesting data point for how the other markets might develop in the future.
I'm not sure I see how that's possible, given their image/video generation seems to be heavily censored. Do they have some alternative product besides "Imagine" or whatever it's called, that people use for generating CSAM?
Judging by https://old.reddit.com/r/grok (but I haven't validated it myself), it seems like people are complaining more about how censored the model is, than anything else, maybe that's not actually true in reality?
There are image models out there with 0 restrictions, even available on HuggingFace or CivitAI, I'm guessing those are way more widely used for things like CSAM than any centralized platform with moderation.
I think the proportion of people generating images that way is likely very low. Though I am sure it is possible.
Here are some links
https://arstechnica.com/tech-policy/2026/01/x-blames-users-f...
https://9to5mac.com/2026/02/17/eu-also-investigating-as-grok...
Concerning.
Obviously, I assumed we all are familiar with our local laws to not unwittingly commit crimes here :)
> I think the proportion of people generating images that way is likely very low
So probably a far cry from "holding the world record for the biggest generator of CSAM" given the amount of local alternatives available? Would be my guess at least, but obviously also hard to know for sure.
> Though I am sure it is possible.
How can you be sure of this? I've tried just now to get Grok to generate even sexually explicit material with adults, and it's unable to, all of the requests are getting moderated and censored. Are you claiming that instead of prompting "A man and a woman having sex" you put "A man and a child having sex" and then the moderation doesn't censor it? Somehow I find that hard to believe, but as you say, I'm not gonna test that either, so I guess we'll never know for sure.
At the same time, in this corner of the world, acting Minister for Justice (also known for trying to push through Chat Control), and NGO Save the Children, have been working to make legal the generation of CSAM for law enforcement use. So that would certainly make the industry legitimate, and you would already have a customer.
https://www.justitsministeriet.dk/pressemeddelelse/regeringe...
edit: to clarify for you, here's an example.
Model A advocates for single-payer healthcare, while Model B prefers the current US healthcare system. So on that one axis, A is more progressive than B. Neither of them needs to be racist for that calculation.
Grok if anything reduces populism because fake claims can be debunked
I hope the Cursor guys help them catch up to be closer to frontier models because they badly need help in it.
Nonetheless, the 10 Billion and 60 Billion deal with Cursor is weird as hell. I can only imagine that he wants to throw as much money at all of his shit before the IPO.
He probably wants the training data
Margins are going up for the 2 frontier model providers like crazy, and I don't expect it to go down more, I think we have seen the cheapest token prices already.
Still, my impression is that Gemini hallucinates too much while Grok is always less capable than competitors, so it's not worth using.
Pricing is also quite surprising, compared to comparable competitors. I guess they have tons of capacity or really want to bring over more people.
I hope Meta finally comes around, too. I want those sweet, sweet billionaire subsidized tokens.
I am old and cynical - I have no illusions, but I also have my limits and a semblance of moral compass. We, as citizens, can vote with ballots, but also with money.
And, no, I am not someone who keeps boycotting companies for every little grievance (was on the receiving end of that nonsense twice).
You're getting like 40k in tokens a year for $2400. A whole lotta people are about to be sad when they realize they bet their competency on that lasting forever.
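To put rough numbers on that, a minimal sketch that takes the $40k and $2400 figures above at face value. They are the commenter's estimates, not published pricing, and the variable names are mine.

```python
# Commenter's rough estimates (assumptions, not published pricing).
TOKEN_VALUE_PER_YEAR = 40_000  # USD of tokens consumed per year at API rates
SUBSCRIPTION_PER_YEAR = 2_400  # USD paid per year for the subscription

monthly_price = SUBSCRIPTION_PER_YEAR / 12
subsidy_ratio = TOKEN_VALUE_PER_YEAR / SUBSCRIPTION_PER_YEAR
implied_subsidy = TOKEN_VALUE_PER_YEAR - SUBSCRIPTION_PER_YEAR

print(f"subscription: ${monthly_price:.0f}/month")
print(f"token value per dollar paid: {subsidy_ratio:.1f}x")
print(f"implied subsidy: ${implied_subsidy:,} per year")
```

If those estimates are anywhere near right, the subscription covers only about 6% of what the same usage would cost at API rates.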
I don't think there's a single thread on Xitter where people don't delegate some question to grok.
(There's a separate conversation of failure modes, and whether it's a good thing, and how much control Elon had when he doesn't like Grok's "woke" responses)
Expensive miscalculation.
No way am I going to use a model where the backing has such blatantly obvious brain washing goals.
(ran this on arena.ai direct chat and also tried to write this gist inspired by how simon writes his gists about pelicans)
Edit: just realized that I made pelican riding a bike instead of bicycle, which now makes sense as to why it hardened the bicycle to look tankier, going to compare this with pelican riding a bicycle if anybody else shares the pelican riding a bicycle.
You should probably come up with variations, like a beaver riding a scooter or something, just to see what's what :)
beaver riding a scooter: https://gist.github.com/SerJaimeLannister/f6de26bd0d0817e056...
pelican riding a bicycle: https://gist.github.com/SerJaimeLannister/f6de26bd0d0817e056...
Personal opinion, but the beaver one looks especially bad compared to the pelicans. Can we be sure that this grok-4.3 model hasn't been trained on pelicans? Simonw says in his blog post that he will try other creatures, so I hope he does, but it does feel to me like the model/xAI is trying to cheat. Hope Simonw tests it out more.
Edit: Also added turtle riding a scooter, something which literally has images online or heck even teenage mutant ninja turtles and I thought that it would be able to pass this but it wasn't even able to generate this: https://gist.github.com/SerJaimeLannister/f6de26bd0d0817e056...
This literally looks more avocado than turtle. Perhaps this could be a bug from arena.ai or something else too, not sure but at this point waiting for simon's analysis.
Thanks for generating those!
(Also it puts Opus 4.7 universally above Opus 4.6, and I may be wrong but this doesn't seem to match the experience of most/many/some people. I think it's widely recognized that Anthropic is severely lacking compute and Opus 4.7 is a cost-saving measure)
But then, Anthropic employees don't have rate limits, right?
Update, I noted that Grok 4.3 is in the "Most attractive quadrant", that's cool! It is also in the top 5 highest in "AA-Omniscience Index", good! Really good.
It says #1 for speed but then in the chart it's #2. Also says #10 for intelligence but then it's #7 in the chart.
Politically motivated models can still do a lot of damage that affects me (or "have a lot of impact" depending on whether you like the politics or not) even if I don't engage with them myself.
Even with Grok it's only broadening things to the creepy corporate right of Silicon Valley.
I hate giving Elon any money. The man is a net negative to society but … if the models are objectively better then logically I must, no?