"No mine is the most dangerous"
"Nuh uh mine is"
"Mine could kill everyone!"
"Mine could do it faster!"
"Prove it!!!"
This is where we are
I'm sure their marketing department is ecstatic but you guys are far more hype-based than what you're calling out.
This AISLE benchmark is interesting in this matter: https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag...
And the recently discovered Copy Fail by Xint code is another proof that the gating is overblown: https://xint.io/blog/copy-fail-linux-distributions
I'm not entirely up to date on each week's LLM hype train/scandal but last I heard there was no public access to it or public-trusted 3rd parties that can review model's capabilities
People don't become bad guys just because they lie. The consequences of their actions (and their lies) matter more. Take Elon Musk for instance, he has always been a recognized liar, even when he was a good guy. What changed? Before, he was famous for making the electric car people actually wanted to drive, and cool rockets. Then came the politics: supporting the party most of his fans disliked, being responsible for many government job losses, in particular in the field of environmental preservation (ironic for a supporter of "green" energy), etc...
assuming mythos is a paper tiger: great marketing, keep going
assuming mythos is for real: err, does this have to be explained?
>ChatGPT: This content was flagged for possible cybersecurity risk. If this seems wrong, try rephrasing your request. To get authorized for security work, join the Trusted Access Cyber program.
Put up velvet ropes outside… leak out rumors about the horrors inside. Whether it’s LLMs or carnies with tents full of “freaks” it’s the same playbook.
Watching OpenAI tumble from the clear market leader into “hey guys us too!” territory has been insightful.
I personally am ready to buy the drop when this bubble pops.
Not sure about the security capabilities and haven't tested it all that well, as I usually just use hosted models, but I do find myself using it and it's been quite successful for parsing unstructured data, writing small focused scripts and translations.
The fact that I retain control of the data itself makes it incredibly useful, as I work in an environment where I can't just paste internal stuff into Codex.
But since it's run locally on a toaster testing it is out of scope for me. It takes a fairly long time to do anything.