TurboPrefill is an attempt to make layer-split multi-GPU configurations spend less time waiting and more time computing during prefill.