Show HN: TurboPrefill – Multi-GPU prefill acceleration for llama.cpp
1 point | 1 hour ago | 0 comments | github.com
TurboPrefill is an attempt to make layer-split multi-GPU configurations spend less time waiting and more time computing during prefill.