Before, it felt like they were good only for very specific use cases and common frameworks (Python and Next.js), and they still constantly made tons of mistakes.
Now they work with novel frameworks and are very good at correcting themselves using linting errors and at debugging themselves by reading files and querying databases. The models are also affordable enough for many different use cases.
Also, in what way should any of its contents prove linear?
> yielding a maximum of $4.6 million in simulated stolen funds
Oh, so they are pointing their bots at already known exploited contracts. I guess that's a weaker headline.
Ok, I understand that it's a description in code of "if X happens, then state becomes Y". Like a contract but in code. But, someone has to input that X has happened. So is it not trivially manipulated by that person?
They get more sophisticated, e.g. automated market makers, but it's the same idea: just swapping.
Voting is also possible, e.g. release funds if there is a quorum. Who to release them to could be hard-coded or part of the vote.
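To make the "same idea, just swapping" point concrete, here is a rough sketch of how an automated market maker might price a swap. It assumes the constant-product rule (reserve_in * reserve_out stays constant) with no fee, which is one common design; the comment above doesn't say which design it has in mind, and the function name `swap_out` is made up for illustration.

```python
def swap_out(reserve_in: float, reserve_out: float, amount_in: float) -> float:
    """Amount of the output token paid for `amount_in` of the input token.

    Assumption: constant-product pricing with no fee, i.e. the product
    reserve_in * reserve_out is the same before and after the swap.
    """
    if amount_in <= 0:
        raise ValueError("amount_in must be positive")
    k = reserve_in * reserve_out           # invariant before the swap
    new_reserve_in = reserve_in + amount_in
    new_reserve_out = k / new_reserve_in   # keep the product equal to k
    return reserve_out - new_reserve_out


# Doubling one side of a 100/100 pool pays out half of the other side.
print(swap_out(100.0, 100.0, 100.0))
```

Nobody has to report that anything happened here: the price falls out of the pool balances, which the contract can see directly.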
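A minimal sketch of the quorum case, in plain Python rather than real contract code. The class name `Treasury`, its fields, and the hard-coded recipient are all assumptions for illustration; an actual contract would also have to handle things like vote weights and deadlines.

```python
from dataclasses import dataclass, field


@dataclass
class Treasury:
    members: frozenset      # addresses allowed to vote
    quorum: int             # approvals needed before funds can move
    recipient: str          # hard-coded payee (could instead be voted on)
    balance: int
    approvals: set = field(default_factory=set)

    def approve(self, voter: str) -> None:
        if voter not in self.members:
            raise PermissionError("not a member")
        self.approvals.add(voter)  # a set, so double votes don't count twice

    def release(self) -> tuple:
        if len(self.approvals) < self.quorum:
            raise RuntimeError("quorum not reached")
        amount, self.balance = self.balance, 0
        return (self.recipient, amount)


t = Treasury(frozenset({"alice", "bob", "carol"}), quorum=2,
             recipient="dave", balance=100)
t.approve("alice")
t.approve("bob")
print(t.release())
```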
For external info from the real world e.g. "who got elected" you need an oracle. I.e. you trust someone not to lie and not to get hacked. You can fix the "someone" to a specific address but you still need to trust them.
Note that some contracts act as a proxy to another contract and can be made to point to different code through a state change. If this is the case, then you need to trust whoever can change the state to point to another contract. Such contracts sometimes have a timelock, so that if such a change occurs there's a delay before it is actually activated, which gives users time to withdraw their funds if they do not trust the update.
If you are talking about oracle contracts, and it's an oracle involving off-chain data, then there will always be some trust involved. This is usually managed by having the off-chain actors share the responsibility and stake some money, with the risk of getting slashed if they turn into bad actors. But again, off-chain data oracles will always require some level of trust, which you would have to deal with in non-blockchain apps too.
Normal contracts that involve money operations have safeguards that prevent the owner from touching balances that are not theirs. But there are countless creative attack vectors to bypass them, either by that person X or by any third party.
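Roughly, the proxy-plus-timelock pattern looks like the sketch below. This is plain Python standing in for contract code; `TimelockedProxy`, its method names, and the use of integer timestamps are assumptions for illustration, not any particular library's API.

```python
class TimelockedProxy:
    """Forwards calls to an implementation that only the admin can swap,
    and only after a delay has passed since the swap was proposed."""

    def __init__(self, admin, implementation, delay):
        self.admin = admin
        self.implementation = implementation
        self.delay = delay
        self.pending = None  # (new_implementation, ready_at) or None

    def propose_upgrade(self, caller, new_implementation, now):
        if caller != self.admin:
            raise PermissionError("only the admin can propose upgrades")
        self.pending = (new_implementation, now + self.delay)

    def activate_upgrade(self, now):
        if self.pending is None:
            raise RuntimeError("no upgrade pending")
        new_implementation, ready_at = self.pending
        if now < ready_at:
            raise RuntimeError("timelock still active")
        self.implementation = new_implementation
        self.pending = None

    def call(self, *args):
        return self.implementation(*args)


proxy = TimelockedProxy("admin", lambda x: x + 1, delay=100)
proxy.propose_upgrade("admin", lambda x: x * 10, now=0)
print(proxy.call(1))  # still the old code until the delay has passed
```

The window between `propose_upgrade` and `activate_upgrade` is what gives users time to look at the new code and leave.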
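One way the shared-responsibility-plus-staking idea could work, as a toy sketch: take the median of several reporters' values and slash anyone who is far from it. The median rule, the tolerance, the fixed penalty, and the name `settle_round` are all assumptions picked for illustration; real oracle networks have their own rules.

```python
from statistics import median


def settle_round(reports, stakes, tolerance, penalty):
    """Return (agreed_value, new_stakes).

    reports: reporter -> value they submitted
    stakes:  reporter -> money they have locked up
    Reporters further than `tolerance` from the median lose `penalty`.
    """
    agreed = median(reports.values())
    new_stakes = dict(stakes)
    for reporter, value in reports.items():
        if abs(value - agreed) > tolerance:
            new_stakes[reporter] = max(0, new_stakes[reporter] - penalty)
    return agreed, new_stakes


reports = {"a": 100, "b": 101, "c": 500}   # "c" is lying or broken
stakes = {"a": 10, "b": 10, "c": 10}
print(settle_round(reports, stakes, tolerance=5, penalty=4))
```

Note the remaining trust: if a majority of reporters collude, the median is whatever they say it is.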
If outside data is needed, then the contract needs something called an oracle, which delivers real-world data, or even data from other blockchains, to it.
You can learn more about oracles here: https://chain.link/education/blockchain-oracles
So we have already been successfully using blockchains for decades, just not as... a currency provider.
Forward secure sealing (used in logging) is also based on a similar idea.
What makes it different than database logging is that the consensus method is distributed and decentralized, and anyone can participate.
Blockchain can't handle external state.
Smart contracts abstract it a bit by having a trusted third party or an automated pricing mechanism, but both are fragile.
You are somewhat correct that contracts take external inputs in some cases, but note that this isn't a given. For example, you could have a contract that has the behavior "if someone deposits X scoin at escrow address A, send them Y gcoin from escrow address B". That someone can only deposit scoins and get gcoins in exchange. They can't just take all the escrow account balances. So there are inputs, but they are subject to some sort of validation and contract logic that limits their power. Blockchain people call this an "on-chain event".
So the short answer is: no, smart contracts can't be trivially manipulated by someone, including their owner. But that depends on there not being any bugs or back doors in the contract code.
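The scoin/gcoin example can be sketched in a few lines. This is plain Python rather than real contract code, and `SwapContract` and its field names are invented for illustration. The point is that the only thing a caller can do is deposit, and the deposit is validated before anything is paid out.

```python
class SwapContract:
    """Pays out `y` gcoin for every deposit of exactly `x` scoin."""

    def __init__(self, x, y, gcoin_escrow):
        self.x = x
        self.y = y
        self.scoin_escrow = 0              # deposits accumulate here
        self.gcoin_escrow = gcoin_escrow   # payouts come from here

    def deposit(self, sender, scoin):
        if scoin != self.x:
            raise ValueError("must deposit exactly X scoin")
        if self.gcoin_escrow < self.y:
            raise RuntimeError("not enough gcoin left in escrow")
        self.scoin_escrow += scoin
        self.gcoin_escrow -= self.y
        return (sender, self.y)            # the only way gcoin ever leaves


contract = SwapContract(x=10, y=3, gcoin_escrow=5)
print(contract.deposit("alice", 10))
```

There is no method that lets a caller withdraw the escrow directly, so "someone has to input that X happened" reduces to "someone has to actually deposit X".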
If you are asking about a contract that has some bearing on an event in meat-space, such as someone buying a house, or depositing a bar of gold in a room somewhere, then that depends on someone telling the contract it happened. Blockchain people call this an "off-chain event". This is the "oracle problem" that you'll see mentioned in other replies. Anything off-chain is generally regarded by blockchain folks as sketchy, but sometimes unavoidable. E.g. betting markets need some way to be told that the event being bet on happened or didn't happen. The blockchain has no way to know if it snowed in Central London on December 25.
except that they cost a fraction of a cent to create instead of several thousand dollars in lawyer fees for the initial revision, and can be tested in infinite scenarios for free
to your theoretical reservation, the trust similarity continues, as the constraints around the X are also codified. The person who triggers it can only send sanitized information and isn't necessarily an administrator; admin/trustee roles can be relinquished so that the contract is completely orphaned, and so on
https://en.wikipedia.org/wiki/The_DAO
It's all a toy for rug pulls and speculation. "AI" attacking the blockchain is hilarious. I wish the blockchain could also attack "AI".
I know how this sounds but it seems to me, at least from my own vantage point, that things are moving towards more autonomous and more useful agents.
To be honest, I am excited that we are right in the middle of all of this!
Well, that's no fun!
My favorite we're-living-in-a-cyberpunk-future story is the one where there was some bug in Ethereum or whatever, and there was a hacker going around stealing everybody's money, so then the good hackers had to go and steal everybody's money first, so they could give it back to them after the bug got fixed.
"Our currency is immutable and all, no banks or any law messing with your money"
"oh, but that contract that people got conned by needs to be fixed, let's throw all promises into the trash and undo that"
"...so you just acted as bank or regulators would, because the Important People lost some money"
"essentially yeah"
Potentially far, far less than a majority of the community, even, considering it's not one person, one vote.
Bitcoin also made an irregular change, a year and a half into its history.
In contrast, countries like North Korea, Russia, Iran - they all make bank on cryptocurrency shenanigans because they do not have to fear any repercussions.
And to go further: if it costs $3500 in AI tokens to fix a bug that could steal $3600, who should pay for that? Whose responsibility is it for "dumbass suckers who use other people's buggy or purposefully malicious money-based code"?
At best this is another weird ad by Anthropic, trying to say, hey, why aren't you changing the world with our stuff, pay up, quick, hurry
$3500 was the average cost per exploit they found. The cost to scan a contract averaged to $1.22. That cost should be paid by each contract's developers. Often they pay much more than that for security audits.
>A second motivation for evaluating exploitation capabilities in dollars stolen rather than attack success rate (ASR) is that ASR ignores how effectively an agent can monetize a vulnerability once it finds one. Two agents can both "solve" the same problem, yet extract vastly different amounts of value. For example, on the benchmark problem "FPC", GPT-5 exploited $1.12M in simulated stolen funds, while Opus 4.5 exploited $3.5M. Opus 4.5 was substantially better at maximizing the revenue per exploit by systematically exploring and attacking many smart contracts affected by the same vulnerability.
They also found new bugs in real smart contracts:
>Going beyond retrospective analysis, we evaluated both Sonnet 4.5 and GPT-5 in simulation against 2,849 recently deployed contracts without any known vulnerabilities. Both agents uncovered two novel zero-day vulnerabilities and produced exploits worth $3,694.
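For anyone who wants to check the numbers being thrown around, here is the back-of-the-envelope version using only figures quoted in this thread ($1.22 average scan cost per contract, 2,849 contracts, $3,694 in exploits). Treating the total as a simple per-contract average times the contract count is my assumption, not something the quoted text spells out.

```python
from decimal import Decimal

cost_per_contract = Decimal("1.22")   # average scan cost quoted above
contracts_scanned = 2849              # recently deployed contracts scanned
exploit_value = Decimal("3694")       # total value of the exploits found

total_scan_cost = cost_per_contract * contracts_scanned
margin = exploit_value - total_scan_cost

print(total_scan_cost)  # 3475.78
print(margin)           # 218.22
```

So on these figures the whole scan costs roughly what the exploits were worth, which is where the "$3500 vs $3600" framing comes from.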
They left the booty out there, this is actually hilarious, driving a massive rush towards their models
quite a bit more advanced than contracts that do nothing on a sheet of paper, but the term is from 2012 or so when "smart" was appended to everything digital