Ornith-1.0: self-improving open-source models for agentic coding

15 points

by danboarder

1 hour ago

| 2 comments

| github.com

CharlesW

18 minutes ago

Previously: https://news.ycombinator.com/item?id=48709744

https://swelljoe.com/post/will-it-mythos/: "Poor performer here, only found the one bug that almost every model found, despite its performance on other benchmarks being excellent for its size. […] It also performs poorly in a chat without tools, exhibiting an enthusiasm for hallucination. I’m currently working on a replication of this with full tool access, including bash/Python, which may allow this model to be competitive."

kennywinker

18 minutes ago

Can anyone explain what the story is here? Is this just a re-skinned Qwen? Who is deepreinforce-ai and why isn’t this model listed on their website?

How does it self-improve? Does the model change on disk, or does it just get better during a single context run?

simonw

13 minutes ago

It doesn't self-improve; that's a misleading headline.

As far as I can tell they trained it by running their own reinforcement learning on top of Qwen and Gemma 4 (not sure how they combined weights from both, or if they used Qwen as the basis and Gemma 4 to help train?) - so the "self-improving" is about their training process, not how you use the weights.

wmedrano

1 minute ago

Looking at the model weights, they seem to be Qwen fine-tunes, with the 31B being a Gemma 4 fine-tune.

kennywinker

11 minutes ago

Gotcha. That makes more sense. We ran the model to train the model -> “self-improving”.

CharlesW

18 minutes ago

Yes, it's "another post-train of Qwen 3.5/3.6 MoE intended for agentic use", according to https://swelljoe.com/post/will-it-mythos/.