Show HN: Paper Lantern – improving Autoresearch with research knowledge
2 points
2 hours ago
| 1 comment
| paperlantern.ai
Hi, we've been working on Paper Lantern - an MCP server that searches 2M+ CS research papers for coding agents. The coding agent describes its problem and PL returns ranked techniques with implementation steps, hyperparameters, and failure modes.

We tested it on Karpathy's autoresearch framework, where the task is to find a better LLM architecture and training config. In autoresearch, the agent proposes an optimization, tries a 5-minute training run, measures the val loss, and keeps the change if the loss went down (or discards it if it went up).
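The loop above can be sketched roughly like this (a minimal sketch; the function names, config keys, and toy proposal rule are ours, not from the autoresearch repo):

```python
import random

def autoresearch_step(config, best_loss, train_and_eval):
    """One iteration of the loop described above: propose a change,
    run a short training job, and keep the change only if the
    validation loss improves."""
    candidate = dict(config)
    # Toy proposal: perturb the learning rate (a real agent proposes
    # architecture and training-config changes).
    candidate["lr"] = config["lr"] * random.choice([0.5, 2.0])
    loss = train_and_eval(candidate)  # stands in for the 5-min run
    if loss < best_loss:
        return candidate, loss        # keep the change
    return config, best_loss          # discard it
```

The greedy keep/discard rule is what makes example 1 below matter: a good idea applied with the wrong coupled hyperparameter gets discarded and never revisited.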

We compared a strong baseline agent (Opus 4.6 + web search) vs that same agent + Paper Lantern.

  - agent + Paper Lantern iterated to a config that got a much lower val loss on 5-min runs  

  - we trained the two final configs for 2 hours: the config from Paper Lantern got a 3.2% lower val loss

Two concrete examples:

  1. Both agents tried halving the batch size. The paper-access agent pulled a 2022 paper and scaled the learning rate by 1/sqrt(2) as the paper prescribed. It worked, and further halving kept working. The web-search agent made the same batch change, got worse loss, and moved on without diagnosing the LR.  

  2. The with-paper-lantern agent also implemented AdaGC (adaptive gradient clipping, arXiv 2502.11034, published Feb 2025) on the first try with no tuning, a technique the baseline agent did not try at all.  
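The scaling rule from example 1 is simple to state in code (a sketch under our own naming; the 2022 paper's exact formulation may differ):

```python
import math

def scale_lr_for_batch(lr, old_batch, new_batch):
    """Square-root learning-rate scaling: when the batch size changes
    by a factor k, scale the learning rate by sqrt(k). Halving the
    batch therefore multiplies the LR by 1/sqrt(2)."""
    return lr * math.sqrt(new_batch / old_batch)
```

This is the coupling the web-search agent missed: it changed the batch size without the compensating LR change, saw worse loss, and attributed the regression to the batch size itself.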

If you want to deep-dive:

  - (code) https://github.com/paperlantern-ai/autoresearch-experiment

  - (blog) https://www.paperlantern.ai/blog/autoresearch

If you want to try Paper Lantern yourself:

  - Quick setup: `npx paperlantern@latest`
parima08
2 hours ago
That's an impressive jump in performance by providing the agent with access to relevant literature.

Is there a breakdown of which wins came from hyperparameter values (where BO would likely match this) vs. wins from techniques the agent wouldn’t have tried without the paper?

reply
paperlantern
1 hour ago
Yes - the blog post has a figure showing all the improvements and how big each one was.

Also, sometimes the baseline agent tries the same idea but doesn't get as big a boost as the baseline + Paper Lantern agent. When we dug into this, the reason was that the baseline tries changes in isolation, whereas the papers describe the interactions between parameters, so the research-backed agent makes multiple coordinated changes at once - combinations the baseline agent never discovers.

reply