DeepClaude – Claude Code agent loop with DeepSeek V4 Pro, 17x cheaper
170 points
4 hours ago
| 16 comments
| github.com
| HN
aftbit
3 hours ago
[-]

    #!/bin/sh
    export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
    export ANTHROPIC_AUTH_TOKEN=sk-secret
    export ANTHROPIC_MODEL=deepseek-v4-flash
    export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
    exec claude "$@"
reply
aaurelions
2 hours ago
[-]
It seems like any project that makes fun of Claude is bound to reach the top spot on Hacker News. Even if it’s just a project consisting of four lines of code.
reply
btbuildem
1 hour ago
[-]
This in essence is what allows one to use any model with CC -- including local.
reply
nadermx
2 hours ago
[-]
The AI wars have begun
reply
stingraycharles
33 minutes ago
[-]
This has been possible since the beginning.
reply
vitaflo
3 hours ago
[-]
I'm not exactly sure what the point of this is. DeepSeek already has instructions for using its API with many CLIs, including Claude Code directly:

https://api-docs.deepseek.com/quick_start/agent_integrations...

reply
croes
2 hours ago
[-]
From vibe coders for vibe coders
reply
2ndorderthought
1 hour ago
[-]
I don't always copy paste vibe coded project readme mds into Claude code and ask them to rewrite it but when I do... actually that's all I do now because my goal in life is to make wealthy overvalued companies wealthier.
reply
kordlessagain
22 minutes ago
[-]
Problem?
reply
2ndorderthought
3 hours ago
[-]
There probably isn't a point. Someone didn't understand something, didn't research it, so they one-shotted their first thought and sent it to the front page of HN and all of their socials. It's the future bruh
reply
ttoinou
3 hours ago
[-]
I thought the tool format wasn't exactly the same? So plugging any AI into Claude Code requires a format conversion
reply
selcuka
1 hour ago
[-]
DeepSeek has a dedicated Anthropic-compatible endpoint [1].

[1] https://api-docs.deepseek.com/guides/anthropic_api

reply
ricardobeat
2 hours ago
[-]
Many of them expose “anthropic-compatible” APIs for this very purpose.
reply
crooked-v
2 hours ago
[-]
I'm curious how well it actually works. I tried Deepseek with Hermes and Opencode and it seemed extremely bad about using some of the basic tools given, like the Hermes holographic memory tools, even with system prompt instructions strongly pointing them out.
reply
justech
2 hours ago
[-]
If you're looking for Claude Code alternatives, I would first suggest looking into pi.dev or opencode for your harness. Then for models, you can choose from OpenCode Go (IMO the most cost-effective at the moment), OpenRouter, or direct from DeepSeek. Better IMO to go the Kimi route and just buy a subscription from kimi.com
reply
wolttam
2 hours ago
[-]
I’m going to throw my harness in the ring: https://codeberg.org/mlow/lmcli
reply
Aeroi
2 hours ago
[-]
agreed. OpenCode is a strong base, and with a couple of modifications it can become a very effective harness. For my side project mouse.dev I've been combining parts from OpenCode, Claude Code, and Hermes to build a cloud agent architecture that works well from mobile.
reply
CharlesW
2 hours ago
[-]
> OpenCode is a strong base, and with a couple modifications it can become a very effective harness.

I personally didn't find it to be competitive with Claude Code as a harness. Can I ask how you modified it to perform better?

reply
Aeroi
1 hour ago
[-]
I haven’t run formal evals but i improved the experience for my own needs and it feels noticeably better with these modifications.

- Claude-style subagents
- an MCP layer for higher-level tools
- Cursor-style control plane modes like Ask, Plan, Debug, and Build

The MCP layer lets the harness use things like GitHub file/code read, PR creation, web search/fetch, structured user questions, plan-mode switching, user skills, and subagents.

So the improvement is mostly from better UI/UX orchestration and tool access. There are some things from Hermes that are interesting as well.

Most of my focus has been on applying this stack to sandboxed cloud agents so you can properly code and work from mobile devices.

I can't definitively say that the stack is better or worse than Claude code, more just tuned for my use case I guess.

reply
bakugo
2 hours ago
[-]
> I would first suggest looking into pi.dev

Looked into this one. Thought it was suspicious that it only had 7 open issues on github. Turns out they have a bot that auto-closes every single issue just because.

I honestly have no words.

reply
LPisGood
1 hour ago
[-]
The idea is for it to be extremely minimal, which strikes me as a very opinionated stance, and not opinions I agree with.
reply
aaurelions
2 hours ago
[-]
Another very cost-effective option is Ollama Cloud. In a month of use, I only hit the 5-hour limit once, when I ran 8 agents simultaneously for 2 hours.
reply
postatic
2 hours ago
[-]
definitely worth it - have ollama cloud, opencode and hermes all running to test them out, working great so far.
reply
_345
3 hours ago
[-]
If you're okay with Sonnet-level performance, this sounds like a straight upgrade. But I find that Sonnet messes up too much for it to be worth cost-optimizing down to it or another Sonnet-level model. Glad to have this as an option though
reply
2ndorderthought
3 hours ago
[-]
A lot of people are having good experiences doing things like using opus for designing and using locally hosted qwen3.6 for implementation.

I could see a serious cost reduction story by using opus for design and deepseek for implementation.

Personally I would avoid anthropic entirely. But I get why people don't.
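A hypothetical sketch of that split, reusing the DeepSeek endpoint from the script upthread (the prompts, filenames, and model name are placeholders, not a recommendation):

```shell
#!/bin/sh
# Stage 1: plan with the default (stronger) model.
# -p runs a single prompt non-interactively and prints the result.
claude -p "Write a detailed implementation plan for SPEC.md" > plan.md

# Stage 2: implement with a cheaper model through an
# Anthropic-compatible endpoint (values are placeholders).
ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic \
ANTHROPIC_AUTH_TOKEN="$DEEPSEEK_API_KEY" \
ANTHROPIC_MODEL=deepseek-v4-flash \
  claude -p "Implement the plan in plan.md"
```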

reply
girvo
3 hours ago
[-]
Like me: that’s what I do. Either Opus 4.7 or GLM 5.1 for planning, write it out to a markdown file, then farm it out to Qwen 3.6 27B on my DGX Spark-alike using Pi. Works amusingly well all things considered.
reply
brianjking
17 minutes ago
[-]
How are you interacting with GLM 5.1? Via the Claude Code harness? I really wish they'd release a fully multimodal model already.
reply
2ndorderthought
3 hours ago
[-]
How is GLM 5.1? I haven't tried it yet but have been meaning to
reply
girvo
2 hours ago
[-]
It's surprisingly good. It quite handily beats MiniMax 2.7 and Qwen 3.5 Plus in my testing (I haven't tested 3.6 Plus though). It's far better than Sonnet, and often equivalent to Opus for the web development and OCaml tasks I'm using it for. It definitely isn't Opus 4.7, but it's good enough to earn its keep and is substantially cheaper.
reply
sshine
1 hour ago
[-]
I agree with this. And also: it uses more thinking time to get there. So while you get a lot of tokens on their plan, the peak 3x token-usage multiplier plus the extra thinking means you run into the rate limit anyway.
reply
girvo
1 hour ago
[-]
True, though with the $20-equivalent plan used only for planning I don't hit those limits often, vs Claude where Pro can literally hit limits with a single prompt haha
reply
aftbit
3 hours ago
[-]
What hardware are you using to power this?
reply
girvo
2 hours ago
[-]
> DGX Spark-alike

Probably wasn't clear enough if you don't know what that is already, apologies

It's an Asus Ascent GX10, which is a little mini PC with 128GB of LPDDR5X as shared memory for an Nvidia GB10 "Blackwell" (kind of, it's a long story) GPU and a MediaTek ARM CPU.

reply
aftbit
1 hour ago
[-]
Ah yeah I saw that, I was just curious which particular mini-PC you were using. I was considering picking up one of the various AI Max 395 boxes before the RAMpocalypse but didn't take the plunge. Thanks for the response!
reply
girvo
1 hour ago
[-]
I heavily considered one of the AMD Strix Halo boxes, but part of the reason I wanted this was to learn CUDA :)
reply
chrsw
2 hours ago
[-]
I keep re-learning this lesson: I chug along with a lesser model then throw a problem at it that's too complex. Then I try different models until I give up and bring in Opus 4.6 to clean up.
reply
brianwawok
2 hours ago
[-]
And I keep using Opus to, like, make git commits. Really just need a smart router that is actually smart, vs having to micromanage the model
reply
willio58
2 hours ago
[-]
I don’t find this with sonnet at all. As long as I have a solid Claude.md and periodically review the output and enforce good code practices via basic CI gates I’ve rarely ever found myself having to switch to opus
reply
2ndorderthought
1 hour ago
[-]
You might be surprised then at how well cheaper models solve your problems
reply
nclin_
12 minutes ago
[-]
Is claude code the best coding harness? Anyone running evals on that?
reply
ahmadyan
6 minutes ago
[-]
In my anecdotal experience, it is not. The same model, Opus, works better in 3P harnesses such as Factory Droid or Amp.

Claude Code, on the other hand, is the most subsidized one, both for consumers (through the Max subscription) and for enterprises (token discounts). It is also heavily optimized for cost, especially token caching and reduced thinking, at the expense of quality.

reply
alexdns
3 hours ago
[-]
obviously vibe coded (co-authored) + the prices don't even match
reply
2ndorderthought
3 hours ago
[-]
It's going to be real hard to find headlines that weren't vibe coded from here on out unfortunately.
reply
SchemaLoad
2 hours ago
[-]
Unless I actually know the author I assume everything here is vibeslop and full of mistakes.

Maybe I need to switch to some news publication that actually does real research and writing still. Because public forums like this have been completely destroyed by LLMs.

reply
cyanydeez
3 hours ago
[-]
welp, pack it it in boys, it was nice conceptualizing all you as real humans on the internet. I guess I'll just have to go touch grass if I want to feel parasocial.
reply
dragontamer
2 hours ago
[-]
I mean, we have the tech and community to actually build in person meetups and sign CRT certificates, right?

If we touch grass in person and swap certificate requests, we can actually rebuild a trust network.

This is a pretty old problem with regards to clubs / secret societies and whatnot. And with certificates / PKI, our modern security tools have solved all the technical problems.

reply
2ndorderthought
2 hours ago
[-]
I wish I could be invited to a secret club of guaranteed humans. Someone hand me a certificate next time you see me! Also don't stab me kthxbye
reply
cyanydeez
2 hours ago
[-]
Unfortunately, a lot of what's happening in the tech world seems to be from some super serious AI cults, so I'm not sure going offline like this is any better.
reply
2ndorderthought
2 hours ago
[-]
Yea but we could have fun. Play some DnD. Drink tea or whiskey. Eat pizza pie. Light saber battle. Buy a megaphone and hang out at a street corner telling passersby they are perfectly acceptable and worthy of kindness and love
reply
inciampati
1 hour ago
[-]
poorly vibe coded. machines can check details easily, use them.
reply
dopeepsreaddocs
53 minutes ago
[-]
Did... Did you just ask an AI to one-shot something that normally amounts to no more than setting two env variables?
reply
vagab0nd
1 hour ago
[-]
This has become a problem for me. I like trying new things. But I also know that in about a week, there's going to be a better/cheaper setup. And a week after that. And ideally I'd like to get some coding done when I'm not tinkering with the tools.

So I think I'll stay with CC for now.

reply
kordlessagain
17 minutes ago
[-]
CC can use Ollama as well, including having Ollama proxy to its cloud models. It's brilliant, and works with a single Ollama command that doesn't mess with CC at all (so you can run them at the same time).
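A sketch of what that looks like, assuming Ollama's Anthropic-compatible endpoint on its default port (the model name is a placeholder for whatever you've pulled locally or a cloud model the daemon proxies to):

```shell
#!/bin/sh
# Point Claude Code at a local Ollama server instead of Anthropic.
# Assumption: Ollama exposes an Anthropic-compatible API on port 11434.
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama   # any non-empty value; local Ollama ignores it
export ANTHROPIC_MODEL=qwen3-coder   # placeholder model name
exec claude "$@"
```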

If you are interested, I've built an agentic terminal that helps manage these types of things better: https://deepbluedynamics.com/hyperia

reply
orliesaurus
3 hours ago
[-]
Is there a way to do this directly using the Claude Code CLI (which I already have installed) and OpenRouter?
reply
jubilanti
2 hours ago
[-]
Here's a one-liner:

    ANTHROPIC_BASE_URL="https://openrouter.ai/api" \
    ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY" \
    ANTHROPIC_DEFAULT_SONNET_MODEL="deepseek/deepseek-v4-flash" \
    CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
    claude
reply
theanonymousone
3 hours ago
[-]
Yes, from Claude Code themselves: https://code.claude.com/docs/en/llm-gateway
reply
gnat
3 hours ago
[-]
This repo's README explains how it works and you can do it yourself. claude looks for environment variables that say which API endpoint to talk to, which key to pass, which model name to use for haiku/sonnet/opus-level workloads, etc.
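As a sketch of that mechanism (the endpoint and key are placeholders; the tier-override variable name is an assumption based on recent Claude Code releases and may vary by version):

```shell
#!/bin/sh
# Claude Code reads these before falling back to Anthropic's defaults.
export ANTHROPIC_BASE_URL=https://example.com/anthropic  # Anthropic-compatible API endpoint
export ANTHROPIC_AUTH_TOKEN=sk-placeholder               # key for that endpoint
export ANTHROPIC_MODEL=provider-model-name               # default (sonnet-tier) model
export ANTHROPIC_SMALL_FAST_MODEL=provider-small-model   # haiku-tier background tasks
exec claude "$@"
```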
reply
Lihh27
2 hours ago
[-]
the wrapper is basically env var glue. You’re still betting the whole loop on Anthropic's closed client.
reply
game_the0ry
2 hours ago
[-]
Cost engineering [1] will be the next hot topic for AI.

[1] A fancier way of saying "reducing cost."

reply
fHr
57 minutes ago
[-]
layer on layer on layer to refactor a bunch of lines xD
reply
deadbabe
2 hours ago
[-]
I had a call with our CTO and we are pivoting away from Claude Code to DeepClaude because the cost savings are too substantial to ignore.
reply
2ndorderthought
3 hours ago
[-]
Oh shoot now the next CC upgrade will blow your subscription for doing this
reply
esafak
3 hours ago
[-]
Why wouldn't you use something open source like OpenCode, which already supports DSv4 and has more features than CC?
reply
CharlesW
2 hours ago
[-]
Coding harnesses make a big difference, and OpenCode is notably less effective than Claude Code (1) in my experience, (2) with the models I've tried it on. (I've not yet tried it with DSv4.)
reply
dlx
2 hours ago
[-]
As someone who does use other models with CC, I am curious about opencode, what extra features does it have that you find essential?
reply
esafak
2 hours ago
[-]
I like being able to add a wide array of models, define perms for agents and subagents, turn MCPs on and off at will, and be able to fix bugs I find in it.
reply
dlx
2 hours ago
[-]
fair enough...any drawbacks that you've found?
reply
esafak
2 hours ago
[-]
Its UI isn't as slick, and it has bugs, but so does CC and you can submit a PR to have them fixed in OC.
reply
ttoinou
3 hours ago
[-]
More features than CC ?

Also, opencode tracks you by default. It's not safe. Every first prompt you send is routed through their servers and logged, and they can use your data however they want

reply
sedawkgrep
2 hours ago
[-]
I thought this was debunked a while ago?
reply
esafak
2 hours ago
[-]
I could not find any evidence of prompt logging. The code is open; can you point me to it?
reply
morpheos137
3 hours ago
[-]
Anthropic messed up big time: the harness works with any muh commodity LLM. Meanwhile VCs were duped by the myth of FOOM AGI. Probably not a coincidence that Anthropic is enmeshed with the sci-fi fan fic forum known as LessWrong. The world wants useful tools. The Bay Area bubble, in contrast, thrives on Mythos.
reply
bwfan123
4 minutes ago
[-]
> anthropic messed up big time harness works with any muh commodity LLM

That surprised me too. The intelligence is at the client, and by making that open, Anthropic has commoditized the coding agent.

reply
hgyyy
2 hours ago
[-]
I think OAI and Anthropic will be OK for a year or two. But after that, if they still continue to earn revenue from selling tokens to firms/software engineers, they will be in serious trouble.

The American firms are not demonstrating escape velocity, and as long as China offers something somewhat comparable at a very low price to compensate for any difference in quality, they will not generate enough cash flow to finance reinvestment. I highly doubt they'll be able to keep raising external financing for numerous periods from here on out - they gotta start showing strong financials and that they are running away from the open-source models.

reply
LeFantome
50 minutes ago
[-]
The performance gap will likely close as Chinese hardware improves. This is happening very rapidly.

Already DeepSeek v4 is being hosted on Huawei Ascend 950. What do you think those cost relative to NVIDIA gear?

reply
morpheos137
1 hour ago
[-]
I wouldn't put it past the US gov to ban foreign models. They tried to ban TikTok. What is being demonstrated here is that Silicon Valley cannot withstand a competitive market.
reply
LeFantome
52 minutes ago
[-]
Good luck banning Open Source models.

Not only that but other countries are very unlikely to follow suit, so it is just a straight-up productivity tax on the US.

reply
morpheos137
25 minutes ago
[-]
Yeah, see the Nvidia/China US-gov self-own. The assumption seems to be that 1.4 billion people in a middle-income country are dependent on 300 million for tech.
reply