FilterHN
Q8 KV cache lets a 30B model fit 100K context on a 24 GB RTX 5090
2 points by bozdemir 1 hour ago | 0 comments | buraak.com
No one has commented on this post.