DeepSeek Engram: Conditional Memory via Scalable Lookup [pdf]
4 points | 4 hours ago | 1 comment | github.com | HN
alyxya
46 minutes ago
Most LLM improvements change the architecture, the optimizer, or some other part of the model itself. This paper instead describes a technique that adds an external lookup table to the forward pass, with the lookup running in parallel with some of the compute. It's a really interesting idea with a lot of cool engineering work behind it, but it looks too convoluted, and I don't see improvements large enough to justify the complexity.
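To make "a lookup in the forward pass that overlaps with compute" concrete, here is a toy sketch. It is not the paper's implementation: the hash function, the table contents, the additive merge, and every name (`slot`, `TABLE`, `lookup`, `dense_step`, `forward`) are assumptions made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

DIM = 4            # toy vector width (assumption)
TABLE_SIZE = 1024  # toy number of table slots (assumption)


def slot(ngram):
    """Hypothetical deterministic hash of a token n-gram into a table slot."""
    h = 0
    for tok in ngram:
        h = (h * 1000003 + tok) % TABLE_SIZE
    return h


# Toy "external" table: slot -> vector. It stands in for memory that is
# stored outside the model's regular weights.
TABLE = {s: [((s + i) % 7) / 7.0 for i in range(DIM)] for s in range(TABLE_SIZE)}


def lookup(ngram):
    """Fetch the stored vector for an n-gram of token ids."""
    return TABLE[slot(ngram)]


def dense_step(hidden):
    """Placeholder for the ordinary per-layer compute."""
    return [2.0 * x + 1.0 for x in hidden]


def forward(hidden, ngram, pool):
    """Start the lookup, do the dense compute meanwhile, then merge."""
    pending = pool.submit(lookup, ngram)  # lookup runs on a worker thread
    out = dense_step(hidden)              # compute proceeds without waiting
    mem = pending.result()                # block only when the value is needed
    return [o + m for o, m in zip(out, mem)]  # additive merge (assumption)


with ThreadPoolExecutor(max_workers=1) as pool:
    y = forward([0.0, 1.0, 2.0, 3.0], (17, 42), pool)
```

The point of the sketch is only the ordering: the lookup key depends on the input tokens rather than on the hidden state, so the fetch can be issued before the dense compute finishes. Whether that overlap pays for the added complexity is the question raised above.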