The story of DeepSeek is incredibly inspirational: The founder being a PhD in computer science, completely bootstrapping his AI efforts by doing quantitative trading, and even as they reached the frontier in the hottest subfield being more open than any other lab about what they were doing.
In general I find the attitude of the Chinese AI labs (and government) to be refreshingly not "AGI-pilled" and focused on the correct downsides of AI (the effect on youth employment and the messing up of higher education).
in the early 2000s in california universities you'd get marked down for citing wikipedia. so the good souls told everyone "see the number in brackets[2] after what you're trying to cite the article for? just click that then click the archive.org or whatever link there, then cite that."
Now? i think wiki is considered a valid source? or has it flopped back to being "unreliable"?
[0]: https://en.wikipedia.org/wiki/Wikipedia:Citing_Wikipedia
The issue is that Wikipedia can be wrong and you’d only know that by going to the source (or lack thereof), or checking other sources.
Just like how people should use AI for research, I guess.
a bing or google or wiki search to get the primary or secondary sources is okay, but if i use chat.deepseek.com instead, suddenly it isn't okay.
We're lucky to have China imposing competition on the western AI megacorps.
If it wasn't for China, I would probably have to spend $100/mo on AI instead of $10 like I do currently while using DeepSeek and MiMo (opencode Go plan).
And while I could do so comfortably, I feel for those who can't. It must feel incredibly isolating to only watch others have access to expensive models to leverage their careers.
I hope SoTA AI becomes a universal right because it will contribute to too much income disparity otherwise.
The second they get a hold of the market, Chinese Big Tech will be as bad or worse than US Big Tech.
We're lucky to have DeepSeek.
It's a smart move to make everyone dependent on them.
p.s. how would such "subsidization" work on such a scale? if you think the EVs, PV panels, etc are cheap because the govt, like, just covers the loss on every sale(?), where do they get all that surplus finance to cover labour and resources?
have you considered 'subsidies' can be used for accelerating R&D for national interest rather than some monopolistic plot
China's subsidies are comparatively much shorter.
https://www.oecd.org/en/publications/subsidies-and-the-solar...
https://www.oecd.org/content/dam/oecd/en/publications/report...
It's crazy how much you get out of Deepseek V4 Flash alone.
I have unlimited tokens at work, then I go home and what do I do? Spend $200 per month? No, def not.
When Anthropic increased the limits for their $20 plan, I started coding with it again on a private project and it was fun and I did a lot in those 4 weeks.
We've had a taste, and damned if I'm going to have the "means of production" snatched from me already?
I assume it will get reposted at some point.
Notes on DeepSeek:
We visited the company HQ last Tuesday. It was founded in 2023 by Liang Wenfeng and operated out of his hedge fund, High-Flyer, until somewhat recently. The company released their R1 model in January 2025, so it was interesting to see what they’ve been doing.
The company is located in an unmarked, 12-story building in Hangzhou. There is no DeepSeek branding visible from the street or lobby. I asked why this is, and the team demurred and said, “Well, there are many companies in this building, and we are not special.” They want to keep a low profile.
We met with their Head of Data and Head of Infrastructure. The company only has 300 employees. They are at least an order-of-magnitude smaller than Anthropic, and don’t care to scale further just yet. Their Head of Infrastructure, in particular, was young; maybe 30 years old and apparently one of the best AI buildout and energy experts in the country. (We briefly walked through the labs, and everybody seemed young. There was a lot of discussion; it felt like an exciting and energetic place.)
Lots of competition is coming from Alibaba (Qwen), ByteDance, and Moonshot (Kimi). People in China seem to mostly use Kimi or DeepSeek. Young people use VPNs to access Claude, though Anthropic has blockers around usage in China and makes it difficult. Poaching between groups is common, just like in the U.S. DeepSeek has a reputation as being really smart and “cool,” maybe similar to Anthropic. Big labs are mostly in Beijing, near Tsinghua and Peking University, with Hangzhou as the main exception (DeepSeek and Alibaba/Qwen are there).
The DeepSeek team reads western AI writers. They listen to Dwarkesh and read Gwern. The people we met with said they had never met with any employees from Anthropic. They were not at all concerned with some kind of hostile / AGI takeover scenario. They kept bringing up job loss (which is already high amongst youth in China) as their main concern. When we asked if they do red teaming on their models, they said no. In China, AI models are not regulated directly; the government instead has restrictions on how those models can be used in software, services, etc.
As a whole, China seems to treat AI as just another technology, rather than as some kind of singularity moment. National attention is still on basic needs and infrastructure buildouts, and on providing more medicines for people. The “dreams of singularity” seem like a luxury or distant consideration.
We asked the DeepSeek team: “What has the highlight been so far? What are your plans for an exit?” And they said that their highlight and great achievement was R1. They did not gesture at a future model or vision, but rather seemed proudest of what they’ve already done. They are content for now to remain ~6 months behind U.S. companies while maintaining a lower profile and team size.
It isn't that you need a "moderate" opinion to be a NYT editor; the historical evidence on media bias is that the people involved are actually extremists and often way out of line with any sane moderate opinion on basic subjects like whether it is good to be permanently at war. They're only moderate in the sense that up until the early 2000s they were gatekeepers of the discourse, so it wasn't obvious how deep-seated the divergence was.
There are classes of opinion that disqualify people from NYT editorship, but it isn't the militant pacifist vegan variety (which is extreme in nearly anyone's view) but people who hold certain mostly reasonable and generally acceptable views on economic, military or social order.
This 1988 model of the flow of information in free societies and their media gatekeepers was probably correct. Nearly 40 years later it is not. The digital content flows in free societies are so diverse today that widely read content extremely critical of whichever parties or power-holders you'd like to read about is everywhere and easy to find. Not the case in authoritarian systems.
Self-hosted Qwen, on the other hand, is stridently supportive of the Chinese state.
I posted the answers I got here https://swelljoe.com/post/open-model-censorship/
I also don't like how easily manipulated they are. They should have seen through Persona. They shouldn't have touched Persona with a 10 foot pole. Persona is not the answer to anything.
this doesn't sound believable, or at least it seems off. competent ai engineers should have good intuition about how agents work, and what happens when they don't do what you want them to do: https://www.forbes.com/sites/boazsobrado/2026/03/11/alibabas...
also if those engineers do read gwern and watch dwarkesh, then shouldn't they have picked up on talk of x-risk? this doesn't add up
What's brutal is that Google, which started this AI revolution, has literally the worst coding model! I tried 3.5 Flash last week (stupid me still pays for Ultra because of Google One's storage), and before I gave up on 3.1 Pro, I saw a coding agent hallucinate for the first time in months, even at the highest effort level!
Meanwhile, I've tried DeepSeek with the DeepSeek TUI (now CodeWhale), and it didn't do any worse than Codex or Claude Code. I know there are benchmarks and all, some of them gamed, I'm sure, but in real-world experience, DeepSeek is absolutely amazing for its price! If you have software engineering skills and are not an accidental vibe-coder, honestly, try it out and stop burning money. I'm sure you will get even better results with OpenCode! Human Intelligence + Artificial Intelligence beats the highest AI model without the guidance of a HI!
Meanwhile, I burned through my entire budget on the $200 Max for Fable 5, for a modest-sized project in Python using its own CLI coding agent. What a waste!
I keep hearing "always use the bestest model" - no, always use the most practical one for the job! I got so many issues with Fable on a very small project that even Copilot found that it's simply not worth it for 99% of your tasks!
This is a refreshing perspective.
I don't have enough information to say whether the Chinese leadership sees AI "just as the next technology" or is more cautious because of its double-edged nature. But given the immense effort to build their own AI/GPU chips, plus the government's multi-billion fund for AI buildout, a directive for fast-paced integration at large scale, and a sweeping national education reform for AI, I don't think it can be seen as similar to other ordinary techs.
[0] https://www.reuters.com/world/china/china-prepares-295-billi...
[1] https://www.globalneighbours.org/en/articles/china-unveils-n...
[2] https://english.www.gov.cn/news/202606/10/content_WS6a296017...
Not saying it's a bad thing, but the US and EU have limited exports of chips and lithography equipment to China for decades.
There is literally nothing else China can do to secure their supply of chips. They would do it even without the AI bubble.
It's military tech now and this is not just about LLMs. Autonomous flying killbots need GPUs too.
Further on. Refreshing indeed.
I hypothesize that, rather than slowly having it disperse in society and allow people to harness it in ways they don't want, they might as well accelerate everything until AI becomes the totalitarian Swiss Army knife - which they can make use of in the best way of course.
Let's see what will happen.
[0] https://www.washingtonpost.com/national-security/2026/03/11/...
Nothing to do with AI and can happen in any war. Do some research, check satellite imagery:
https://goo.gl/maps/ZoAXkw1iFwyF7exQ8?g_st=ac
PS: I'm not trying to defend bombing schools, but posting that it's "AI" that's responsible is the opposite of what you need to do if you care.
It's the military - there were specific people who found this location for the strike, then some senior officers who chose it without checking, and specific people who executed it. And it's all logged with a "paper" trail in the chain of command.
It was all people with specific names who were responsible for avoiding bombing schools. They failed. Not "AI".
https://oldcc.gov/our-programs/public-schools-military-insta...
I never said AI is responsible. I pointed out the US is clearly the one using AI in dystopian ways.
It's trivial for me to download one of their models and run it on my Spark, and there's all sorts of ways to strip out their Tiananmen-denialism or whatever.
If/when the memory price crunch dissipates, even more so. And so far it's only China I see as making moves to increase production capacity on memory, too.
If anything the centralization of capital into US-based Anthropic and OpenAI is far more terrifying from the perspective you're outlining.
> It got me thinking, though--the most successful founders do not set out to create companies. They are on a mission to create something closer to a religion, and at some point it turns out that forming a company is the easiest way to do so. [1]
From the notes this part sat with me as the real difference:
> As a whole, China seems to treat AI as just another technology, rather than as some kind of singularity moment. National attention is still on basic needs and infrastructure buildouts, and on providing more medicines for people. The “dreams of singularity” seem like a luxury or distant consideration.
Meanwhile... In the fantasy land over here in the US we're constantly being told that it's "coming", "almost here", "too powerful for us to give you access to", "of national security importance!". Or... FUD.
And while there may be trace amounts of truth in those overzealous statements, we haven't seen a significant improvement in much outside of software development relative to the spend and environmental impact.
Expert in buildout or expert in distillation?
1) China distills and is therefore morally bad.
As you rightly point out, that's not a great argument.
2) China distills and is therefore possibly not that competent.
I think that makes sense. If they only catch up to the frontier through distillation then 1) Their model will never be as good as the model they are distilling from. 2) They will never reach the frontier - they need someone else to do it first.
“All they do is copy.”
And now, oops, they are world leaders in EVs, batteries, solar, drones, just to name a few of the biggest consumer-facing things.
You gotta start somewhere and you can start at page 1 or page 10 and that time, energy and cost you saved starting 9 pages later can be put into making whatever it is you're building better than the original.
The US, and every other country, is full of derivatives or straight up copies. No one is getting super mad at the generic cheerios at the grocery store. It's hypocrisy.
I think deepseek at least has done enough innovative work that you could grant them a baseline of competency.
In general, there are enough papers coming out of China to suggest that there are quite a few people there who know what they are doing.
I also have a soft spot for deepseek because they write such readable papers. I don't have a degree in anything but with a little work I can understand their papers - which I really appreciate.
But I still think my point stands - if you need distillation you won't be SOTA
I heard that argument more than one year ago, when chain of thought and reasoning cycles started to be hidden to protect against distillation.
Meanwhile, models such as DeepSeek and MiMo are nothing short of excellent nowadays.
Ever since I switched away from OpenAI to DeepSeek I never felt the need to go back.
Together it gives me plenty of head room/model performance for $40ish/mo, plus letting me compare the various models over time.
Originally I'd been using the Z.AI plan (that I'm still grandfathered into for <1 yr) as my cheap plan, but it wasn't keeping up with SOTA progress and is slow/limited now. So I subscribed to the Opencode Go plan and use Deepseek Flash V4 almost exclusively, and it is insane how much usage I can get for $10/mo.
I did the math on my Flash usage vs. what I'm paying Opencode and I'm typically not even exceeding $10 in API costs! So it's actually sustainable not rugpull pricing at least for me. I can pound it with requests/agentic loops and have it running for 30 min doing whatever the fuck and check back and have spent literal pennies for what would have cost $30+ on my work's Github Copilot plan.
I know the enterprise world works under different rules and isn't price sensitive in the same ways as an individual, but I truly don't see how it is sustainable for the US AI giants to maintain something like a 25x+ markup for a 1.25x performance benefit in the long term.
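To put a rough number on that trade-off, here is a quick sketch that takes the 25x and 1.25x figures above at face value. Both are ballpark impressions rather than measurements, and the function name is just made up for illustration:

```python
def price_per_performance_ratio(price_markup: float, performance_gain: float) -> float:
    """How many times more you pay per unit of performance.

    price_markup: how many times more expensive the pricier model is.
    performance_gain: how many times better it performs (rough estimate).
    """
    if performance_gain <= 0:
        raise ValueError("performance_gain must be positive")
    return price_markup / performance_gain


# Ballpark figures from the comment above, not benchmark results.
ratio = price_per_performance_ratio(price_markup=25.0, performance_gain=1.25)
print(f"Paying about {ratio:.0f}x more per unit of performance")
```

So even after crediting the pricier model for being better, you would still be paying roughly 20x more for each unit of performance, if those two guesses are anywhere near right.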
IMO it does help explain the recent emphasis on secret, scary "super models" like Mythos to muddy the waters for decision makers with hype and FOMO at a time when companies are beginning to seriously scrutinize their token spending for the first time.
I canceled ChatGPT because I would be on vacation. Codex was pretty great, but I thought "Let me put 10 bucks on Deepseek API and plug it into Claude Code".
I was completely blown away. I found it even better than Claude or Codex. And those 10 bucks? It lasted for more than a month.
I don't see myself coming back to Claude/OpenAI.