Mine is a daily bash cronjob that fetches a text-based database and uses grep to build an nftables-apply script with all the IPs for the blocked ASNs. I keep meaning to share it, but it's embarrassingly messy and I haven't had time to clean it up...
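Not the actual script, but the idea above can be sketched roughly like this in Python. Everything specific here is an assumption for illustration: the database format (one whitespace-separated `prefix ASN` pair per line), the table name `inet filter`, and the set names `blocked_v4` / `blocked_v6`, which are assumed to already exist with `flags interval`. The function only builds the text you would later feed to `nft -f`.

```python
import ipaddress


def build_nft_script(db_lines, blocked_asns,
                     table="inet filter",
                     set_v4="blocked_v4", set_v6="blocked_v6"):
    """Build nft commands that refill two pre-existing interval sets.

    db_lines: iterable of "PREFIX ASN" lines (hypothetical format).
    blocked_asns: set of ASN numbers as strings, e.g. {"64500"}.
    """
    v4, v6 = [], []
    for line in db_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = line.split()
        if len(parts) < 2:
            continue  # malformed line
        prefix, asn = parts[0], parts[1].upper()
        if asn.startswith("AS"):
            asn = asn[2:]  # accept both "AS64500" and "64500"
        if asn not in blocked_asns:
            continue
        try:
            net = ipaddress.ip_network(prefix, strict=False)
        except ValueError:
            continue  # not a valid prefix, ignore it
        (v4 if net.version == 4 else v6).append(net)

    out = []
    for name, nets in ((set_v4, v4), (set_v6, v6)):
        out.append(f"flush set {table} {name}")
        # Merge adjacent/overlapping prefixes so the set has no overlaps.
        merged = list(ipaddress.collapse_addresses(nets))
        if merged:
            elems = ", ".join(str(n) for n in merged)
            out.append(f"add element {table} {name} {{ {elems} }}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = [
        "10.0.0.0/25 64500",
        "10.0.0.128/25 64500",
        "192.0.2.0/24 64501",
        "2001:db8::/32 AS64500",
    ]
    print(build_nft_script(sample, {"64500"}), end="")
```

Collapsing the prefixes first keeps the generated script shorter and avoids handing nft overlapping intervals.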
The previous infringement case with Anthropic said that while training an AI was transformative and not itself an infringement, pirating works for that purpose was still definitely infringement all by itself. The settlement was $1.5bn, or close to $3k for each of the 500k works they pirated, so if Zuckerberg pirated "millions" (plural), it is quite plausible his settlement could be $6bn.
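For anyone who wants to check the arithmetic, here is a back-of-the-envelope sketch. The only inputs are the figures quoted above; reading "millions" as exactly 2 million works is my assumption (it is the smallest number the plural allows), and the function names are made up for illustration.

```python
def per_work_rate(settlement_usd, works):
    """Average payout per pirated work implied by a settlement."""
    return settlement_usd / works


def projected_settlement(works, rate_usd):
    """Scale the per-work rate to a different number of works."""
    return works * rate_usd


if __name__ == "__main__":
    rate = per_work_rate(1_500_000_000, 500_000)    # figures quoted above
    print(rate)                                     # 3000.0
    # Assumption: "millions" read as the minimum plural, 2 million works.
    print(projected_settlement(2_000_000, rate))    # 6000000000.0
```

A higher count scales linearly, so 5 million works at the same rate would come to $15bn.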
He bought the best protection around for breaking the law.
IIRC, Facebook's cash is more like $81-82 billion.
There's some interesting exceptions, like how Musk has managed to sell Tesla shares totalling more or less as much as the business itself has made in total lifetime revenue; but even then, Musk's theoretical net worth is very different from how much he could get if he was forced to sell all his shares suddenly.
Owner-CEOs like Musk and Zuckerberg get all the effects of such randomness, but the only examples I can think of such people getting into billion-dollar legal troubles tend to be examples which go on to sink their companies completely, so I'm not sure what impact a fine of "merely" 10% of cash reserves would have on investor confidence as expressed in share price. And this is not the only legal case Meta's facing right now.
MacKenzie Scott (Jeff Bezos' ex-wife) shows it can be turned into real money. As of December 2025, she had given away $7.1 billion in 2025 charitable donations, and $26.3 billion since 2019.
In reality there is the ability to execute on the shares to turn them into real money.
Jeff Bezos holds less than 10% of Amazon stock himself. That is still a huge amount of money, a not insignificant share of which can be turned into "real" money, and even with some decline in price it remains a phenomenal amount.
In that same time period the stock valuation has more than doubled.
Yes, there are specialized products catered to billionaires. But those aren't getting them better rates than someone with a $200k portfolio (Zuck is not conventionally a less risky borrower than the Options Clearing Corporation!). They exist to work around the fact that some borrowers can't just casually liquidate their stock on the open market, let alone at face value. By all accounts these products are more expensive than retail.
Mostly this is an expensive (but maybe still less expensive than taxes, depending on the rate environment—it's more of a no-brainer in ZIRPland) way to diversify out of a single-stock portfolio without selling by adding leverage. At Zuck's age, it's still very unlikely to make sense to borrow instead of sell to spend. He's been known to pay real taxes in the past.
I've wondered what the legalese justification is for letting liability evaporate as it does so often with corps. So far the reasons I'm left with are 'shrugs' and 'the relevant provisions (seemingly? apparently?) simply don't apply', neither of which is any good.
I was going to make a joke about how we should attach magnets to Aaron Swartz' corpse, since that'd make for a pretty potent energy source, given how fast he must be spinning. But honestly, I think he would have seen this sort of thing coming, given how his case was handled and how things really haven't gotten any better.
This does not comfort me.
All the Aaron Swartzes of the future could freely share scientific papers with the world.
The rate at which they were spidering and scraping was so far beyond what any other supposedly legit spider was doing, it seemed like the logical explanation.
But a multi-billion dollar corporation downloading millions of copyrighted creative works so that they can reshape the entire labor market by training a new type of artificial intelligence model on that data set? Meh, sounds like Silicon Valley disruption, give the man a medal!
it is wrong to advocate for everyone to be treated equally unjustly. better to advocate for the removal of the bad laws/structures
I doubt Meta has deleted their local copy though ...
How are these fruits "stolen" if they still have what was allegedly stolen?
Dowling v. United States, 473 U.S. 207 (1985): The Supreme Court ruled that the unauthorized sale of phonorecords of copyrighted musical compositions does not constitute "stolen, converted or taken by fraud" goods under the National Stolen Property Act
And even if, arguendo, sure, it's stolen. The purpose of copyright is "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries".
And you would be hard pressed to prove that LLMs haven't advanced the arts and sciences, so at bare minimum it's transformative, i.e. fair use.
> Authors have sued AI companies for copyright infringement before - and lost.
So, basically nothing will come out of this
So not only are these publishers rightfully pissed, Meta didn’t even give them the $6.99 for each epub to begin with. They’ve stolen the whole thing as part of this “fair use” campaign to destroy human authorship free of even the most basic remuneration.
I'd much rather prosecution focus on Zuck's more serious crimes against privacy and civilization as a whole. But maybe this is a small start?
That's not revenue.
RICO specifically cites "criminal infringement of a copyright" as laid out in 18 U.S. Code § 2319. If the CEO tells his employees to download hundreds of thousands of works illegally in order to carry out his money-making scheme, how is that not organized crime even if (dubiously) LLM training on the material is fair use?
-----
RICO: https://www.law.cornell.edu/uscode/text/18/part-I/chapter-96
Definitions: https://www.law.cornell.edu/uscode/text/18/1961
> As used in this chapter — (1) “racketeering activity” means (A)[...]; (B) any act which is indictable under any of the following provisions of title 18, United States Code: [...], section 2319 (relating to criminal infringement of a copyright),[...]
18 U.S. Code § 2319 - Criminal infringement of a copyright: https://www.law.cornell.edu/uscode/text/18/2319
-----
edit:
> 18 U.S. Code § 1962 - Prohibited activities
> (c) It shall be unlawful for any person employed by or associated with any enterprise engaged in, or the activities of which affect, interstate or foreign commerce, to conduct or participate, directly or indirectly, in the conduct of such enterprise’s affairs through a pattern of racketeering activity[...].
https://www.law.cornell.edu/uscode/text/18/1962
From the lawsuit:
“Meta — at Zuckerberg’s direction — copied millions of books, journal articles, and other written works without authorization, including those owned or controlled by Plaintiffs and the Class, and then made additional copies of those works to train Llama,” the suit says. “Zuckerberg himself personally authorized and actively encouraged the infringement. Meta also stripped [copyright management information] from the copyrighted works it stole. It did this to conceal its training sources and facilitate their unauthorized use.”
Until we progress as a society to the point that we can put this system behind us we should at least fight to make enforcement uniform. In fact, uniform enforcement is probably a good starting point for arguing for abolition, as the pain of that enforcement is felt by proles and elites alike.
Corporations believe in copyright so if they "break" it they should get punished for breaking rules they made up themselves.
Generally the law should be more strict for corporations than for real people.
edit: People downvoting can you argue why you disagree? I do think it's fair for the law to be more strict on the powerful rather than on the powerless.
If this was you or me, we would be in prison for decades and have a fine in the millions. Time for these people to feel consequences.
As someone said, they will probably settle for around 6 billion, which is the same as, say, a $100 fine for us.
I'm all for strong justice, but you want to imprison an executive for decades for copyright violations?
Ah, found it:
>In April 2023, a 54-year-old programmer named Gary Bowser was released from prison having served 14 months of a 40-month sentence. Good behaviour reduced time behind bars, but now his options are limited. For a while he was crashing on a friend’s couch in Toronto. The weekly physical therapy sessions, which he needs to ease chronic pain, were costing hundreds of dollars every week, and he didn’t have a job. And soon, he would need to start sending cheques to Nintendo. Bowser owes the makers of Super Mario $14.5m (£11.5m), and he’s probably going to spend the rest of his life paying it back.
I'm not even a tiny bit supportive, but there is precedent.
https://www.theguardian.com/games/2024/feb/01/the-man-who-ow...
Why should Zuckerberg be exempt?
Now, I personally find the idea of imprisoning people for copyright offenses horrific, but I don't think it's remotely insane that someone else might come to that conclusion, given that we broadly accept it as a society.
[0] https://www.ussc.gov/sites/default/files/pdf/research-and-pu...
Zuckerberg may be CEO, controlling shareholder, and on the board of Meta, but he didn't break copyright law, Meta did. So if there were to be a consequence, Meta would pay out the fine. Not sure how you jail a company.
Now, in a company with a real corporate governance structure, the board would look at the loss incurred by said fine, look at Zuckerberg, and immediately fire him for causing the loss. However, like I said before, Zuck's in charge of Meta, so that's not going to happen, and the fine is unlikely to be enough to drastically impact the company's profitability enough to sink his shares, which are the main repository of his wealth. So if he thinks he can make himself richer violating copyright law in the future, he will likely direct Meta to do so.
TL;DR, in the famous words of Bender from Futurama, "Hooray, the system fails again!"
I'm still stuck on how Z telling Meta (or the relevant people at Meta, whatever) to go out there and do illegal shit doesn't make a court say that he's functionally done said illegal shit, or at least encouraged the company to do so, and that he should thus be liable for that. It's not like there's much plausible deniability here. It'd be one thing if the lower ranks thought it'd be fine and did it of their own accord. It's quite another for Z to tell people to go nuts doing illegal shit.
The DMCA makes facilitation of copyright infringement illegal. Telling people to do copyright infringement is surely facilitation of copyright infringement. Surely then, Z having broken the DMCA is a fairly open and shut case, modulo calculating the damages. But apparently not?
> the fine is unlikely to be enough to drastically impact the company's profitability enough to sink his shares
You lack imagination :-) but you've identified both the problem and the solution.
It could be possible to construct a legalistic jail for a company whereby if it has committed the type of crime that a human could be jailed for, then it could be frozen for the duration, say ten years, and all its assets, shareholder funds, contracts, everything were frozen and impounded.
Of course this seems completely ludicrous because it’s so “out there” but it’s worth having the thought experiment. Things like “corporate manslaughter” really have few consequences for the corporation itself - if it was actually jailed for twenty years and shareholders and officers left frozen out and on pause, then it might be the kind of punishment that really counted for something.
You jail the CEO and the others will stand up and take note.
"But they'll complain" who gives a fuck.
I always heard that criminals should be thrown in jail, it's time we started doing it to the real criminals.
Fines don't do anything to deter bad behavior. Either:
* The company pays
* They pay and the company mysteriously increases next year's comp / grants a "loan" / etc
* D&O insurer pays
In all three cases the money comes out of the shareholders' hides. It provides zero personal deterrence. The payoff matrix, as seen by a sociopath, makes it rational to always defect against the common good.
The only punishment that can really focus attention is physical imprisonment in a facility they can't choose.
SOX did this for financial reporting and gee shucks it turned out executives can follow the law after all!
They stole the life's work of millions of people.
In less civilized times, they likely would have been drawn and quartered by strong horses, and had their limbs dragged to the 4 corners of the continent as a warning to anyone else that would consider doing it again.
I know there's a complaint that AI can verbatim repeat that work. But so can human savants. No one is suing human savants for reading their books.
Producing copyrighted material, of course. Training on copyrighted material... I just don't see it.
EDIT: Making a perfectly valid point, but it's unpopular, so down I go.
Sarah Silverman as the most prominent example.
The AI won't even know where the page of text it's seeing came from, and people will avoid your book as they can just ask the AI. So you make less money. (Talking about specialized technical books here.)
A machine training on all copyrighted materials in the world for commercial purposes at an industrial scale makes it disproportionate.
If a company hired hundreds of savants, then it would be illegal for them to read books?
I don't follow.
And even if we grant that those savants are also very skilled at creating "market substitutes" based on their training that are capable of competing with the original works, their maximum creative output would only be a relatively small number of new works, because they can only work at human speed.
Can you cite something in the copyright laws themselves that suggests this scale distinction?
This principle is quite universal and can be found in many places, including the US constitution and US (supreme) court decisions, many international jurisdictions, treaties and conventions.
I don't understand why it should be allowed for one savant to study and answer questions about one book, but wrong for a company to hire one million savants to answer questions about one million books.
And I'm asking where in the law or case law this is supported.
Suppose they did, and some guy was filling stadiums regularly to hear him recite an entire audio book. That would probably get the attention of someone's lawyers.
If it's illegal for AIs it should be illegal for humans, too. Is that really what you're arguing? It should be illegal for savants to read books?
Read a book, that's fine. Write a book, that's fine. Read a book and then write a book that is 99.9% the same as the book that you read and sell it for profit without a license from the original author, that's infringement.
That's what all these lawsuits are about - it's the training not the reproduction. I already agreed in my first comment that the reproduction is off limits.
In this case, it appears that Meta torrented illegal copies of the work to do the training. Obviously that's bad. But conflating that with training itself doesn't follow.
Pirating content is illegal, regardless of whether it is done to train an LLM.
Usage of LLMs trained on unlicensed content (basically all of them) might or might not be illegal.
Using any method to reproduce a copyrighted work by using that original as input in a way that supplants the market value of the original is probably illegal.
At least that is my rudimentary understanding.
I don't think anyone thinks that all training is a copyright violation if all the training data is licensed. For example, an LLM trained on CC0 content would be fine with basically everyone.
The problem is that training happens on data that is not licensed for that use. Some of that data also is pirated which makes it even clearer that it is illegal.
If you supplant the value of the original with the original as input then you probably have some legal questions to answer.
It's a "rules for thee and not for me" argument.
The distinction isn't particularly clear cut with an open source model. If it is able to reproduce copyright-protected work with high fidelity such that the works produced would be derivative, that's like trying to get around laws against distribution of protected works by handing them to you in a zip file.
It's a kind of copyright washing to hand you the data as a binary blob and an algorithm to extract them out of it. That wouldn't really fly with any other technology.
And that's really where a lot of the value is mind you, these models are best thought of as lossily compressed versions of their input data. Otherwise Facebook ought to be perfectly fine to train them on public domain data.
That seems very possible to me, and undermines the "training is copyright violation" argument. It's not the training, it's the output.
Yes it's very different. Humans need to eat, sleep, and pay taxes. You also have to pay them competitive wages.
There's nothing in the law to support your argument either. The law however does say, very unambiguously, that copying without permission isn't allowed. There aren't exceptions for "training" just because it's superficially similar to a human activity (reading a book). A human isn't allowed to hand-copy Harry Potter. Even if they bought all the Harry Potter books.
How about granting AI all the other rights too, for example the right to vote? (sarcasm)
Just from a rational argumentation point of view. Clearly if a law is written saying as much, then sure. But there is no copyright law like that yet.
Correct. Because until very recently there was no need.
(Copied from a comment of mine written more than three years ago: <https://news.ycombinator.com/item?id=33582047>)
Learning from copyrighted content is legal - for both humans and AI. If Meta is in hot water for anything, it's piracy and/or storage of copyrighted material.
Royalties are owed and continuously owed as these models are deployed and doing inference. How is it any different to paying a small pittance to someone every time a song is played?
The LLaMA models were released openly. Copies exist everywhere in the world. You aren't going to be able to charge someone for running `llama.cpp`; a court order ceases to have practical relevance at that point.
"I made enough copies for everyone" isn't a valid defense for copyright infringement.
Second, royalties are not required to cite a source.
Can you imagine how disastrous it would be to everything from news reporting to scientific publishing if that was the case?
Also, I believe performing covers is legal.
I don’t get why the training process doesn’t count as any other form of transformation, but then I’m not a lawyer.
Tired of the double standard where CEOs get away with it when bad things happen (because they can’t be everywhere all the time) but get all the benefits when the company makes a great profit (because they’re personally driving results!).