Why are your models so big? (2023) (pawa.lt)
15 points | 3 days ago | 3 comments
unleaded
2 minutes ago
Still relevant today. Many of the problems people throw at LLMs can be handled more efficiently with plain text completion than by begging a model 20x the size (and probably well over 20x the cost) to produce the right structured output. https://www.reddit.com/r/LocalLLaMA/comments/1859qry/is_anyo...
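A minimal sketch of the idea, assuming a small local model served through Hugging Face transformers (the model name, prompt framing, and sentiment task are illustrative, not from the linked thread):

    from transformers import pipeline

    # A small completion model stands in for a much larger chat model.
    generator = pipeline("text-generation", model="gpt2")

    review = "The battery died after two days. Avoid."
    # Frame the task so the label is the natural next token,
    # instead of asking a big model to emit structured JSON.
    prompt = f"Review: {review}\nSentiment (positive or negative):"

    out = generator(prompt, max_new_tokens=2, do_sample=False)
    completion = out[0]["generated_text"][len(prompt):].strip()
    print(completion)  # e.g. "negative"

Framing the task so the answer falls out of ordinary next-token completion sidesteps structured-output parsing entirely and works with models far smaller than a chat-tuned one.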
semiinfinitely
3 minutes ago
I honestly don't get why anyone still owns a "computer." Like, have you heard of an iPhone? It fits in your pocket and is literally a supercomputer that connects to the internet. Why would you waste space on a desk with a keyboard and mouse just to browse the web or send an email? It feels totally outdated to sit in a chair to do stuff you can just do on your phone while lying in bed.
siddboots
36 minutes ago
I think I have almost the opposite intuition. The fact that attention models can make sophisticated logical constructions within a recursive grammar, even for a simple DSL like SQL, is kind of surprising. I suspect this ability does depend on training on a much larger and more general corpus, and hence demands the full parameter space we need for conversational writing.