FilterHN

Show HN: Why single agents suck at math proofs

3 points

by austinbaggio

1 hour ago

| past

| 1 comment

| ensue.dev

| HN

Cursor published a great post last week on scaling long-running autonomous coding agents to build a browser.

We’ve been exploring a similar idea, but applied to math proofs

We show how a swarm of agents, coordinated via shared memory, can generate a Lean proof for a non-trivial multi-step math problem-Putnan A2-that a single agent struggled with. We were feeling fancy so we wrote a full post and captured the runtime.

Harness will be public soon(tm) for you to run it yourself.

▲

saidcooldude

1 hour ago

[-]

i thought this work was fun for white box theorem proving. it is interesting to pass the structure to an llm to better understand the problem solving strategy.

using the tree shape as context for other questions was also interesting