Intro to TLA+ for the LLM Era: Prompt Your Way to Victory
35 points
by zdw
2 days ago
| 4 comments
| emptysqua.re
| HN
jmorse3
2 hours ago
[-]
https://www.sigops.org/2026/can-llms-model-real-world-system...

"Running leading LLMs across the eleven systems shows that LLMs are great at producing correct TLA+ syntax but struggle to ensure conformance and appropriate invariants."

reply
zihotki
1 hour ago
[-]
I think that's what author implies by this sentence in the intro:

> It’s still your responsibility to understand your system and define what “correctness” means, and you need a high-level understanding of temporal logic.

reply
leoqa
1 hour ago
[-]
Unfortunately the benefit of TLA+ is the act of modeling your system painstakingly. The actual checker helps confirm your hypothesis, etc. But skipping the modeling and outsourcing it is not ideal. I’ve always struggled reasoning about models my team mates wrote, and will often have to mentally go through the process of arriving at the same abstractions/invariants etc before I can understand it.
reply
UltraSane
5 minutes ago
[-]
Exactly. I view a complete TLA+ specification as a kind of metalanguage that can be used with LLMs to generate code from.
reply
aerodexis
1 hour ago
[-]
I did this myself a few weeks ago and the technique that helped me the most was to compare the TLA output against the race-conditions I could construct by hand. I worked iteratively and unit-tested the model by constructing a model that didn't have certain race protection mechanisms and validating that the model generated the expected race.

You can also work backwards from the races it generates and ensure that they're real races.

reply
esafak
41 minutes ago
[-]
What did you do to get the TLA output?
reply
aerodexis
37 minutes ago
[-]
I was working from a design-doc, not code.

"look at @design-document and generate a TLA+ specification for the interactions between local and remote"

reply
UltraSane
6 minutes ago
[-]
TLA+ should be generated by hand as a extremely detailed spec for LLMs to use to generate code
reply
relativeadv
5 minutes ago
[-]
right? I read the kleppman post sometime ago about formal verification taking off but i could never square away who verifies the verifier.
reply