OpenAI Codex Cloud, Claude Code for the web, Google's Jules, and I think Devin (which I've not tried) are four examples.
I like that "asynchronous coding agent" is more specific than "asynchronous agent" - I don't have a firm idea of what an "asynchronous agent" is.
One catch though is that the asynchronous coding agents are getting less asynchronous. Claude Code for the web lets you prompt it while it's running which makes it feel much more like regular Claude Code.
And I agree, "async agents" makes little sense
And here's Google using the term to describe Jules - https://news.ycombinator.com/item?id=44813854
So fairly large players are using “async agent” to mean something specific, which seems enough to warrant defining it. It also makes sense that it’s far less common than “autonomous agent”, since “async” is mostly used by technical folks, which is a much smaller audience. I’m definitely in that sf/swe/tech/startup information bubble, but that's where this stuff is taking off.
You can use the idea to spin off background agent tasks that can then be seamlessly merged back into context when they complete.
The example above is a product-specific approach, but the idea should be applicable in other environments: it's really an attempt to integrate long-running background tasks while continuing with the existing context in an interactive manner.
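A minimal sketch of that spin-off-and-merge idea, using hypothetical names rather than any specific product's API: the session keeps an interactive context, background tasks run on the side, and their results get folded back in when they resolve.

```typescript
// Hypothetical sketch: spin off a background task, keep the interactive
// session going, and merge the result back into context when it completes.

type Message = { role: "user" | "assistant" | "system"; content: string };

class Session {
  private context: Message[] = [];
  private pending: Promise<void>[] = [];

  // Fire off a long-running task without blocking the interactive loop.
  spinOff(label: string, task: () => Promise<string>): void {
    const p = task().then((result) => {
      // Merge the completed work back into context as a system note.
      this.context.push({ role: "system", content: `[${label} finished] ${result}` });
    });
    this.pending.push(p);
  }

  // A normal interactive turn continues with whatever context exists right now.
  say(content: string): void {
    this.context.push({ role: "user", content });
  }

  // Wait for any outstanding spin-offs before wrapping up.
  async drain(): Promise<Message[]> {
    await Promise.all(this.pending);
    return this.context;
  }
}
```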
Once you start working with automation programs (AKA agents) in an interactive, human-in-the-loop fashion, you will naturally run into these kinds of problems.
We've all seen sci-fi movies with AI assistants that seamlessly work with humans in a back-and-forth manner; async spin-offs are essential for making that work in practice for long-running background tasks.
When working with LLMs, one of my primary concerns is keeping tabs on their operating assumptions. I often catch them red-handed running with assumptions like they were scissors, and I’m forced to berate them.
So my ideal “async agents” are agents that keep me informed not of the outcome of a task, but of the assumptions they hold as they work.
I’ve always been a little slow recognizing things that others find obvious, such as “good enough” actually being good enough. I obtusely disagree. My finish line isn’t “good enough”, it’s “correct”, and yes, I will die on that hill still working on the same product I started as a younger man.
Jokes aside, I really would like to see:
1. Periodic notifications informing me of important working assumptions.
2. The ability to interject and course correct, likely requiring a bit of backtracking.
3. In addition to periodic working-assumption notifications, I'd also like periodic "mission statements", worded in the context of the current task, as assurance that the agent still has its eye on the ball (a rough sketch of such an update follows below).
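A rough sketch of what such a periodic update could look like; every name here is illustrative rather than taken from any existing tool.

```typescript
// Hypothetical shape for the periodic updates described above.
interface AgentCheckpoint {
  taskId: string;
  missionStatement: string;     // restated in the context of the current task
  workingAssumptions: string[]; // what the agent currently believes to be true
  sinceLastCheckpoint: string;  // brief summary of work done since the last update
  needsConfirmation: boolean;   // true when an assumption is risky enough to pause on
}

// The human (or a supervising process) can interject and force a course correction.
function reviewCheckpoint(cp: AgentCheckpoint): "continue" | "course-correct" {
  return cp.needsConfirmation ? "course-correct" : "continue";
}
```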
The real question is what happens when the background job wants attention. Does that only happen when it's done? Does it send notifications? Does it talk to a supervising LLM? The author is correct that it's the behavior of the invoking task that matters, not the invoked task.
(I still think that guy with "Gas Town" is on to something, trying to figure out how to connect up LLMs as a sort of society.)
the interesting design question you're pointing at, what happens when it wants attention, is where the real complexity lives. in practice i've found three patterns:
(1) fire-and-forget with a completion webhook
(2) structured checkpointing, where the agent emits intermediate state that a supervisor can inspect
(3) interrupt-driven, where the agent can escalate blockers to a human or another agent mid-execution
most "async agent" products today only implement (1) and call it a day. but (2) and (3) are where the actual value is: being able to inspect a running agent's reasoning mid-task and course-correct before it burns 10 minutes going down the wrong path.
the supervision protocol is the product, not the async dispatch.
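A sketch of what patterns (2) and (3) could look like in code; the names and types here are assumptions for illustration, not any real framework's API.

```typescript
// Pattern (2): the agent emits checkpoints a supervisor can inspect.
// Pattern (3): the agent can escalate a blocker mid-execution instead of grinding on.

type Verdict = "proceed" | "redirect" | "abort";

interface Checkpoint {
  step: number;
  summary: string;  // what the agent just did
  plan: string;     // what it intends to do next
  blocker?: string; // set when the agent needs help (pattern 3)
}

type Supervisor = (cp: Checkpoint) => Promise<Verdict>;

async function runWithSupervision(
  steps: Array<() => Promise<Checkpoint>>,
  supervise: Supervisor,
): Promise<void> {
  for (const step of steps) {
    const cp = await step();
    const verdict = await supervise(cp); // a human or another agent can course-correct here
    if (verdict === "abort") return;
    if (verdict === "redirect") {
      // A real system would rewind or re-plan here; this sketch just stops early.
      return;
    }
  }
}
```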
https://en.wikipedia.org/wiki/Society_of_Mind
>The Society of Mind is both the title of a 1986 book and the name of a theory of natural intelligence as written and developed by Marvin Minsky.
>In his book of the same name, Minsky constructs a model of human intelligence step by step, built up from the interactions of simple parts called agents, which are themselves mindless. He describes the postulated interactions as constituting a "society of mind", hence the title. [...]
>The theory
>Minsky first started developing the theory with Seymour Papert in the early 1970s. Minsky said that the biggest source of ideas about the theory came from his work in trying to create a machine that uses a robotic arm, a video camera, and a computer to build with children's blocks.
>Nature of mind
>A core tenet of Minsky's philosophy is that "minds are what brains do". The society of mind theory views the human mind – and any other naturally evolved cognitive system – as a vast society of individually simple processes known as agents. These processes are the fundamental thinking entities from which minds are built, and together produce the many abilities we attribute to minds. The great power in viewing a mind as a society of agents, as opposed to the consequence of some basic principle or some simple formal system, is that different agents can be based on different types of processes with different purposes, ways of representing knowledge, and methods for producing results.
>This idea is perhaps best summarized by the following quote:
>What magical trick makes us intelligent? The trick is that there is no trick. The power of intelligence stems from our vast diversity, not from any single, perfect principle. —Marvin Minsky, The Society of Mind, p. 308
That puts Minsky either neatly in the scruffy camp, or scruffily in the neat camp, depending on how you look at it.
https://en.wikipedia.org/wiki/Neats_and_scruffies
Neuro-symbolic AI is the modern name for combining both; the idea goes back to the neat/scruffy era, the term to the 2010s. In 1983 Nils Nilsson argued that "the field needed both".
https://en.wikipedia.org/wiki/Neuro-symbolic_AI
For example, combining Gary Drescher’s symbolic learning with LLMs grounds the symbols: the schema mechanism discovers causal structure, and the LLM supplies meanings, explanations, and generalization—we’re doing that in MOOLLM and spell it out here:
MOOLLM: A Microworld Operating System for LLM Orchestration
See: Schema Mechanism: Drescher's Causal Learning
https://github.com/SimHacker/moollm/blob/main/designs/LEELA-...
Also: LLM Superpowers for the Gambit Engine:
https://github.com/SimHacker/moollm/blob/main/designs/LEELA-...
Schema Mechanism Skill:
https://github.com/SimHacker/moollm/blob/main/skills/schema-...
Schema Factory Skill:
https://github.com/SimHacker/moollm/blob/main/skills/schema-...
Example Schemas:
https://github.com/SimHacker/moollm/tree/main/skills/schema-...
> "minds are what brains do"
And "a man is what he does".
What we have from people who were there:
Greg Benford (physicist and SF author, present that day) stated publicly: "I was there. Minsky turned her down. Told me about it." [InstaPundit, Aug 2019, quoting Benford: https://instapundit.com/339725/ ]
>Typical Crap Journalism from NYT:
>“In a deposition unsealed this month, a woman testified that, as a teenager, she was told to have sex with Marvin Minsky, a pioneer in artificial intelligence, on Mr. Epstein’s island in the Virgin Islands. Mr. Minsky, who died in 2016 at 88, was a founder of the Media Lab in the mid-1980s.”
>Note, never says what happened. If Marvin had done it, she would say so. I know; I was there. Minsky turned her down. Told me about it. She saw us talking and didn’t approach me.
https://en.wikipedia.org/wiki/Gregory_Benford
Minsky was there with his wife, told her about the approach, and told Benford right afterward. So we have a first‑hand, on-the-record account that he declined, plus the fact that he immediately told his wife and a colleague. There is no evidence he “did” anything.
So: (1) the allegation that he did something is unsupported by the testimony and contradicted by an eyewitness; (2) even if it weren’t, “a man is what he does” has nothing to do with whether Society of Mind or his other theories are valid. Newton’s physics and Minsky’s cognitive architecture stand or fall on evidence and argument, not on moral purity. Conflating a disputed personal allegation with the worth of his ideas is a smear, not an argument.
David Henkel-Wallace (gumby) has posted about this before on HN:
https://news.ycombinator.com/item?id=22015840
>gumby on Jan 10, 2020:
>I know several people who were at that island and have discussed this event; one even told me that he remembered it because Marvin came over to him and said "this woman just offered to have sex with me." Also Gloria, his wife, was there, though I haven't asked her about it (and wouldn't). This seems believable to me.
>OTOH I did read Giuffre's deposition and she says not just that she was told by Epstein to proposition various people but that it happened. I find that very hard to believe having known him so long, but she made that statement under oath. Also I'm not sure Marvin was famous enough to be worth making up a story about (as opposed to, say, a famous heir to a throne).
Gumby was mistaken in claiming the deposition says “it happened”; he was very likely inferring it from the same transcript. What "happened" is she was told to have sex with him, but there is absolutely no evidence or testimony that he did, and there is evidence from Greg Benford that he didn't.
Gwern draws the same distinction:
https://news.ycombinator.com/item?id=20774197
Look for yourself here:
https://www.documentcloud.org/documents/7010864-virginia-giu...
Now do you have anything interesting to say about his theories, other than trying to smear him?
* Durable execution: long‑running, resumable workflows with persistence, replay, and timeouts.
* Actors: isolated entities that own their state and logic, process one message at a time, and get concurrency by existing in large numbers (regardless of whether the runtime uses threads, async/await, or processes under the hood).
Combine the two and you get a "Durable actor", which seems close to what the article calls an “async agent”: a component that can receive messages, maintain state, pause/resume, survive restarts, and call out to an LLM or any other API.
And since spawning is already a primitive in the actor model, the article’s "subagent" fits naturally here too: it’s just another actor the first one creates.
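A minimal "durable actor" sketch under those assumptions: one message at a time, state owned by the actor, and a snapshot persisted after every message so the actor can be restarted and resumed. The storage interface is hypothetical.

```typescript
// Hypothetical persistence layer.
interface Store {
  load(id: string): Promise<unknown | null>;
  save(id: string, state: unknown): Promise<void>;
}

abstract class DurableActor<S, M> {
  private queue: M[] = [];
  private processing = false;

  constructor(readonly id: string, private store: Store, private state: S) {}

  // Subclasses define how a message transforms the actor's state
  // (this is where an LLM or any other API call would happen).
  abstract handle(state: S, msg: M): Promise<S>;

  // Survive restarts: reload the last persisted state.
  async restore(): Promise<void> {
    const saved = await this.store.load(this.id);
    if (saved !== null) this.state = saved as S;
  }

  // Mailbox: messages are queued and handled strictly one at a time.
  send(msg: M): void {
    this.queue.push(msg);
    if (!this.processing) void this.drain();
  }

  private async drain(): Promise<void> {
    this.processing = true;
    while (this.queue.length > 0) {
      const msg = this.queue.shift()!;
      this.state = await this.handle(this.state, msg);
      await this.store.save(this.id, this.state); // durability point after each message
    }
    this.processing = false;
  }
}
```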
the analogy that clicked for me was a turn-based telephone call—only one person can talk at a time. you ask, it answers, you wait. even if the task runs for an hour, you're waiting for your turn.
we kept circling until we started drawing parallels to what async actually means in programming. using that as the reference point made everything clearer: it's not about how long something runs or where it runs. it's about whether the caller blocks on it.
Something is async when it takes longer than you're willing to wait without going off to do something else.
"takes longer than you're willing to wait" describes the UX, not the architecture. the engineering question is: does the system actually free up the caller's compute/context to do other work, or is it just hiding a spinner?
most agent frameworks i've worked with are the latter - the orchestrator is still holding the full conversation context in memory, burning tokens on keep-alive, and can't actually multiplex. real async means the agent's state gets serialized, the caller reclaims its resources, and resumption happens via an event - same as the difference between setTimeout with a polling loop vs. actual async/await with an event loop.
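A rough sketch of the two shapes being contrasted; the persistence and event plumbing here are assumed, not taken from any particular framework.

```typescript
// "Hiding a spinner": the caller's context stays resident and it just polls.
async function pollUntilDone(check: () => Promise<boolean>, intervalMs = 1000): Promise<void> {
  while (!(await check())) {
    await new Promise((resolve) => setTimeout(resolve, intervalMs)); // nothing is freed while waiting
  }
}

// "Real async": serialize the agent's state, release the caller's resources,
// and resume later when an event (webhook, queue message) delivers the handle.
interface AgentSnapshot {
  conversation: string[];
  step: number;
}

async function suspendAgent(
  save: (snapshot: AgentSnapshot) => Promise<string>, // returns a durable handle
  snapshot: AgentSnapshot,
): Promise<string> {
  return save(snapshot); // the caller can now drop the in-memory context entirely
}

async function resumeAgent(
  load: (handle: string) => Promise<AgentSnapshot>,
  handle: string,
): Promise<AgentSnapshot> {
  return load(handle); // triggered by the completion event, not by a polling loop
}
```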
Another is many turns inside a single LLM call — multiple agents (or voices) iterating and communicating dozens or hundreds of times in one epoch, with no API round-trips between them.
That’s “speed of light” vs “carrier pigeon”: no serialization across the boundary until you’re done. We wrote this up here: Speed of Light – MOOLLM (the README has the carrier-pigeon analogy and a 33-turn-in-one-call example).
Speed of Light vs Carrier Pigeon: The fundamental architectural divide in AI agent systems.
https://github.com/SimHacker/moollm/blob/main/designs/SPEED-...
The Core Insight: There are two ways to coordinate multiple AI agents:
Carrier Pigeon: agents interact between LLM calls; latency: 500 ms+ per hop; precision: degrades each hop; cost: high (re-tokenize everything).
Speed of Light: agents interact during one LLM call; latency: instant; precision: preserved; cost: low (one call).
MCP = Carrier Pigeon
Each tool call:
stop generation →
wait for external response →
start a new completion
N tool calls ⇒ N round-trips
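A generic sketch of that contrast; this is not MCP's actual API, just an illustration of where the boundary crossings happen.

```typescript
type LLMCall = (prompt: string) => Promise<string>;

// Carrier pigeon: every tool use stops generation, crosses the boundary,
// and starts a new completion. N tools => N full round-trips.
async function carrierPigeon(llm: LLMCall, tools: Array<(q: string) => Promise<string>>) {
  let context = "task";
  for (const tool of tools) {
    const request = await llm(context); // stop generation, ask for a tool request
    const result = await tool(request); // external hop
    context += `\n${result}`;           // everything is re-tokenized on the next call
  }
  return llm(context);                  // final completion
}

// Speed of light: everything the agents need is already in context, so the
// multi-agent back-and-forth happens inside a single generation.
async function speedOfLight(llm: LLMCall, preloadedSkills: string) {
  return llm(`${preloadedSkills}\ntask: play out the agents' turns in one pass`);
}
```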
MOOLLM Skills and agents can run at the Speed of Light. Once loaded into context, skills iterate, recurse, compose, and simulate multiple agents — all within a single generation. No stopping. No serialization.

vs "agent runs for a long time, tells the user over human interfaces when it's done", e.g. sends a Slack message, or something like Gemini Deep Research.
an extension would be agents that are triggered by events and complete autonomously, touching human interfaces only when they get stuck.
there's a bit of a quality difference rather than an exactly functional one, in that the agent mostly doesn't need human interaction beyond a starting prompt and a notification of completion or stuckness. even if i'm not blocking on a result, it can't immediately need babying or i can't actually leave it alone.
There's also the concept of a daemon process that looks for work to do and tells you about it without being prompted.
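A tiny sketch of that daemon idea; the work-finding and notification functions are placeholders.

```typescript
// Wakes on an interval, looks for work, and notifies without being prompted.
async function daemon(
  findWork: () => Promise<string[]>,
  notify: (item: string) => Promise<void>,
  intervalMs = 60_000,
): Promise<void> {
  while (true) {
    for (const item of await findWork()) {
      await notify(item); // unprompted: the human never asked for this update
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```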
For async agents and for "life" I sort of have a blurry shape of what the thing is in my head, but spirituality is the strangest one. There is no shape. The word is utter bullshit, yet people can carry the word and use it without realizing it has no shape or meaning. It's not that I don't get what "spirituality" is. I "get" it as much as everyone else; it's just that I've taken the extra step to uncover the utter meaninglessness of the word.
Don't spend too much time thinking about this stuff. It's not profound. You are spending time debating and discussing a linguistic issue. Pinpointing the exact arbitrary definition of some arbitrary set of letters and sounds we call a "word" is an exercise in both arbitrariness and pointlessness.
Practically speaking, it means they often operate within a larger system that, due to its open-ended nature, produces emergent behavior, meaning behavior that was not explicitly designed.
In other words: triggering without a human ≠ async by itself. What matters is whether the caller blocks on the agent’s work, as opposed to how or when it was kicked off.
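That caller-side distinction in miniature (illustrative only): the same function is "async" in both cases, but only one calling style actually frees the caller.

```typescript
async function runAgent(task: string): Promise<string> {
  return `done: ${task}`; // stand-in for a long-running agent
}

async function blockingCaller() {
  const result = await runAgent("refactor"); // caller blocks until the agent finishes
  console.log(result);
}

async function nonBlockingCaller() {
  const handle = runAgent("refactor"); // work is kicked off; the caller keeps going
  // ... do other things ...
  handle.then(console.log); // come back when the result arrives
}
```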