Ask HN: For those of you building AI agents, how have you made them faster?
2 points | 13 hours ago | 1 comment
Because of the coordination across multiple systems and the chaining of LLM calls, a lot of agents today can feel really slow. I'd love to know how others are tackling this:

- How are you all identifying performance bottlenecks in agents?

- What types of changes have gotten you the biggest speedups?

For us, we vibe-coded a profiler to identify slow LLM calls. Sometimes we could then swap in a faster model for that step, or we'd realize we could shrink the input by eliminating unnecessary context tokens. For steps requiring external access (browser use, API calls), we've moved to fast-starting external containers plus thread pools for parallelization. We've also experimented with UI changes to mask some of the latency.
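As a rough illustration of the thread-pool idea, here's a minimal sketch in Python using `concurrent.futures`. The `fetch` function is a stand-in I made up to simulate an I/O-bound external step (a browser action or API call); the point is just that independent I/O-bound steps overlap when run in a pool instead of adding up:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(source: str) -> str:
    """Stub standing in for a real external call (browser step, API request)."""
    time.sleep(0.1)  # simulate I/O latency
    return f"result from {source}"

sources = ["api_a", "api_b", "api_c", "api_d"]

# Sequential: latencies add up (~0.4s here).
start = time.perf_counter()
sequential = [fetch(s) for s in sources]
seq_time = time.perf_counter() - start

# Parallel: the four waits overlap in a thread pool (~0.1s here).
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(fetch, sources))
par_time = time.perf_counter() - start

print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

This only helps when the steps really are independent; if one call's output feeds the next, you're back to chaining.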

What other performance-enhancing techniques are people using?

codingdave
10 hours ago
To start with, I'm not coordinating across multiple systems or chaining LLM calls. I write all data to a central data store, run a state machine on it, and each LLM call then operates independently.

At that point, you can measure and optimize each call as needed and it is fairly straightforward.
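The pattern described above might look something like the following sketch (all names here are hypothetical; `fake_llm` stands in for a real model call). A central store holds all data, a state machine decides the next step, and each LLM call reads from and writes back to the store independently, so each transition can be timed and optimized on its own:

```python
import time

def fake_llm(prompt: str) -> str:
    """Stub for a real LLM call."""
    return f"answer({prompt})"

# Central data store: all state lives here, not in a call chain.
store = {"state": "plan", "data": {}, "timings": {}}

def step(store: dict) -> dict:
    """One state-machine transition: read the store, make one
    independent LLM call, write the result back, advance the state."""
    state = store["state"]
    start = time.perf_counter()
    if state == "plan":
        store["data"]["plan"] = fake_llm("make a plan")
        store["state"] = "execute"
    elif state == "execute":
        store["data"]["result"] = fake_llm(store["data"]["plan"])
        store["state"] = "done"
    # Per-state timing makes it easy to see which call is slow.
    store["timings"][state] = time.perf_counter() - start
    return store

while store["state"] != "done":
    step(store)

print(store["data"]["result"])
```

Because each transition touches only the store, you can measure, cache, or swap the model for any single state without disturbing the others.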
