The repo with a real transformer model (transformer-bp-lean) has 22 axioms and 7 theorems. In Lean, an axiom is something you state without proving. The system takes your word for it. Here the axioms aren't background math, they're the paper's claims:
- "The FFN computes the Bayesian update" [2]. Axiom.
- "Attention routes neighbors correctly" [3]. Axiom.
- "BP converges" [4]. Axiom, with a comment saying it's "not provable in general."
- The no-hallucination corollary [5]. Axiom.
The paper says "formally verified against standard mathematical axioms" about all of these. They are not verified. They are assumed.
The README suggests running `grep -r "sorry"` and treating the empty result as proof the code is complete. In Lean, `sorry` means "I haven't proved this" and emits a compiler warning. `axiom` also means "I haven't proved this" but emits no warning. So the grep comes back clean while 22 claims sit unproved. Meanwhile the godel repo has 4 actual sorries [6] anyway, including "logit and sigmoid are inverses," which the paper treats as proven. That same fact appears as an axiom in the other repo [7]. Same hole, two repos, two different ways to leave it open.
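The difference is easy to see in a toy file (a hypothetical sketch; the declarations and names below are mine, not the repos'):

```lean
-- Stand-ins for whatever the repos actually define:
axiom sigmoid : Float → Float
axiom logit : Float → Float

-- Way 1: `sorry` -- the hole is explicit, and the compiler warns:
-- "declaration uses 'sorry'".
theorem logit_sigmoid_inv (x : Float) : logit (sigmoid x) = x := by
  sorry

-- Way 2: `axiom` -- the same unproved claim, accepted silently.
-- `grep -r "sorry"` comes back clean, but nothing has been proved.
axiom logit_sigmoid_inv' (x : Float) : logit (sigmoid x) = x
```

Both declarations put the same unproved statement into the environment; only the first one leaves a trace that grep can find.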
Final count across all five repos: 65 axioms, 5 sorries, 149 theorems.
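A crude way to reproduce that kind of count is keyword grepping (a sketch only: real Lean declarations can span lines or sit in comments, so treat the numbers as approximate):

```shell
# Sketch: counting declaration keywords with grep (approximate by nature).
# Write a tiny stand-in file so the commands have something to scan.
cat > Demo.lean <<'EOF'
axiom bp_converges : True
theorem trivially_true : True := trivial
EOF

echo "axioms:   $(grep -c '^axiom ' Demo.lean)"
echo "theorems: $(grep -c '^theorem ' Demo.lean)"
# grep -c prints 0 when nothing matches (and exits nonzero), so `|| true`
# keeps a clean-looking file from aborting a `set -e` script.
echo "sorries:  $(grep -c 'sorry' Demo.lean || true)"
```

Run against the real repos, this is the shape of check that yields "65 axioms, 5 sorries" while saying nothing about what the axioms assert.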
Claude (credited on page 1) turned this into "Not an approximation of it. Not an analogy to it. The computation is belief propagation." It builds to a 2-variable toy experiment on 5K parameters presented as the fulfillment of Leibniz's 350-year-old dream, and ends signed "Calculemus."
[1] https://github.com/gregorycoppola/sigmoid-transformer-lean/b...
[2] https://github.com/gregorycoppola/transformer-bp-lean/blob/7...
[3] https://github.com/gregorycoppola/transformer-bp-lean/blob/7...
[4] https://github.com/gregorycoppola/transformer-bp-lean/blob/7...
[5] https://github.com/gregorycoppola/transformer-bp-lean/blob/7...
[6] https://github.com/gregorycoppola/godel/blob/bc1d138/Godel/O...
[7] https://github.com/gregorycoppola/sigmoid-transformer-lean/b...
Why would being a Bayesian network explain why transformers work? Bayesian networks existed long before transformers and never achieved their performance.
I have not taken the time to review the paper, but if the claim stands, it means we might have another tool in our toolbox for understanding transformers.
NNs are as close to continuous as we can get with discrete computing. They’re flexible and adaptable and can contain many “concepts.” But their chief strength is also their chief weakness: these “concepts” are implicit. I wonder if we can get a hybrid architecture that has the flexibility of NNs while retaining discrete concepts like a knowledge base does.
This is incorrect. For example, fuzzy logic[0] can model analog ("continuous") truth beyond discrete digital representations such as 1/0 or true/false.
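A minimal sketch of that point, using the standard Gödel fuzzy connectives (AND = min, OR = max, NOT = 1 − x; the function names are mine):

```python
# Fuzzy (continuous) truth: a degree in [0, 1], not just {0, 1}.
def f_and(a: float, b: float) -> float:
    """Goedel t-norm: conjunction as minimum."""
    return min(a, b)

def f_or(a: float, b: float) -> float:
    """Goedel t-conorm: disjunction as maximum."""
    return max(a, b)

def f_not(a: float) -> float:
    """Standard fuzzy negation."""
    return 1.0 - a

warm = 0.7   # "it is warm" is 0.7 true
humid = 0.4  # "it is humid" is 0.4 true
print(f_and(warm, humid))         # 0.4
print(f_or(warm, f_not(humid)))   # 0.7
```

The connectives agree with Boolean logic at the endpoints 0 and 1 but interpolate continuously in between.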
> Which statistical models disclaim that their output is insignificant if used with non-independent features? Naive Bayes [...]
Ironic, then: if transformers are Bayesian networks, we're using Bayesian networks on non-independent features.
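The independence failure is easy to see numerically (a toy sketch; the function name is mine): feed naive Bayes the same feature twice and it counts the evidence twice, producing a posterior it hasn't earned.

```python
from math import prod

def naive_posterior(prior: float, likelihood_ratios: list[float]) -> float:
    """P(H | features) under the naive independence assumption:
    multiply the prior odds by every feature's likelihood ratio."""
    odds = prior / (1.0 - prior) * prod(likelihood_ratios)
    return odds / (1.0 + odds)

lr = 3.0  # one feature favoring the hypothesis 3:1
print(naive_posterior(0.5, [lr]))       # 0.75
print(naive_posterior(0.5, [lr, lr]))   # 0.9 -- same evidence, counted twice
```

A duplicated (perfectly correlated) feature carries no new information, but the naive product treats it as independent confirmation.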
From "Quantum Bayes' rule and Petz transpose map from the minimum change principle" (2025) https://news.ycombinator.com/item?id=45074143 :
> Petz recovery map: https://en.wikipedia.org/wiki/Petz_recovery_map :
> In quantum information theory, a mix of quantum mechanics and information theory, the Petz recovery map can be thought of as a quantum analog of Bayes' theorem
But there aren't yet enough qubits for quantum LLMs: https://news.ycombinator.com/item?id=47203219#47250262
"Transformer is a holographic associative memory" (2025) https://news.ycombinator.com/item?id=43028710#43029899