FilterHN
Show HN: TurboPrefill – Multi-GPU prefill acceleration for llama.cpp
1 point by trykhlieb 1 hour ago | 0 comments | github.com
TurboPrefill is an attempt to make layer-split multi-GPU configurations spend less time waiting and more time computing during prefill.