We decreased our LLM costs with Opus
36 points
1 hour ago
| 4 comments
| mendral.com
| HN
wxw
59 minutes ago
> We switched to the "triager" pattern: a Haiku agent with a very specific and narrow job. Is this issue already tracked or not? If it is, stop right there. If not, escalate to Opus.

> 4 out of 5 failures never reach Opus. A triager match costs around 25x less than a full investigation.

The title feels misleading. Why clickbait on that when you can just be genuine about the architecture?
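For concreteness, the pattern the quote describes boils down to something like this (a minimal sketch; the function names and relative costs are placeholders, not from the article):

```python
# Sketch of the "triager" pattern: a cheap model answers one narrow
# question, and only unmatched issues escalate to the expensive model.
# Names and cost figures are illustrative, not from the article.

TRIAGE_COST = 1.0          # relative cost of one cheap triager call
INVESTIGATION_COST = 25.0  # a full investigation is quoted as ~25x more

def triage(issue, known_issues):
    """The triager's narrow job (stands in for a Haiku call):
    is this issue already tracked?"""
    return issue in known_issues

def handle(issue, known_issues):
    """Return (action, relative cost) for one incoming failure."""
    if triage(issue, known_issues):
        return ("already-tracked", TRIAGE_COST)
    # Escalate: stands in for a full Opus investigation.
    return ("investigate", TRIAGE_COST + INVESTIGATION_COST)
```

With the quoted hit rate, 4 of 5 failures stop at the triager, so the average cost per failure is (4*1 + 1*26) / 5 = 6.0 instead of 25.0 for Opus-only.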

idorosen
53 minutes ago
The submitted title doesn't match the article's actual title: “We Upgraded to a Frontier Model and Our Costs Went Down”.
stingraycharles
17 minutes ago
It’s still misleading, though.
cadamsdotcom
47 minutes ago
I have rewritten the article to be slightly shorter:

“Let a cheap agent decide if the expensive one is needed.”

a_t48
6 minutes ago
Sounds like L1 vs L2 support :)
saltyoldman
15 minutes ago
I do a similar thing with a "planner agent" that uses the cheapest model available (I think it's openai-gpt-5.2-mini or something, at like 20 cents per 1M tokens). It more or less emits a plan name and a task list, with a recommended model for each task. It's not perfect, but many of our tasks are handled fine by lighter-weight models; for code generation or fixing we upgrade to a more expensive model, while planning and decisions are done cheaply. Keep in mind our tasks are relatively constrained, so planning with a cheap agent makes sense here. An open-ended agent would likely need a more expensive call for planning.
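Roughly this shape (a minimal sketch; the model names, Task fields, and example tasks here are made up, not the actual setup):

```python
# Sketch of a planner-agent flow: a cheap model emits a plan name plus a
# task list with a recommended model per task, and a dispatcher routes
# each task accordingly. Everything here is illustrative.
from dataclasses import dataclass

CHEAP_MODEL = "cheap-mini"    # placeholder for the ~$0.20/1M-token model
EXPENSIVE_MODEL = "frontier"  # placeholder for the codegen-grade model

@dataclass
class Task:
    description: str
    model: str  # recommended by the planner, per task

def plan(request: str) -> tuple[str, list[Task]]:
    """Stands in for the cheap planner call: plan name + task list."""
    tasks = [
        Task("summarize the failing test output", CHEAP_MODEL),
        Task("write the code fix", EXPENSIVE_MODEL),  # codegen upgrades
    ]
    return ("fix-failing-test", tasks)

def dispatch(tasks: list[Task]) -> list[str]:
    """Route each task to the model the planner recommended."""
    return [f"{t.model}: {t.description}" for t in tasks]
```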
whalesalad
46 minutes ago
Looking at the diagram, is this seriously a case of replacing basic functional concepts like "write to clickhouse" or "have we seen this before" with a model? Couldn't those be actual function calls in some language?

It just seems wasteful all around. Having an agent in the critical path when a regular expression (or similar) could do the job seems odd. Yeah, Haiku is cheap, but re.match() is cheaper.
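i.e. something like this, sketching the regex-first idea (`KNOWN_SIGNATURES` and `call_model` are hypothetical, not from the article):

```python
# Sketch of the "re.match() is cheaper" point: try deterministic checks
# first, and only fall back to a model call when they can't decide.
import re

# Hypothetical known-failure signatures, checked for free.
KNOWN_SIGNATURES = [
    re.compile(r"TimeoutError: .* after \d+s"),
    re.compile(r"ConnectionReset"),
]

def seen_before(error_text: str) -> bool:
    """Pure-function check: no tokens spent."""
    return any(sig.search(error_text) for sig in KNOWN_SIGNATURES)

def call_model(error_text: str) -> str:
    """Placeholder for the actual Haiku/Opus call."""
    return "escalated"

def triage(error_text: str) -> str:
    if seen_before(error_text):
        return "tracked"           # regex match, zero model cost
    return call_model(error_text)  # only now pay for an LLM call
```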
