MegaScale: Scaling Large Language Model Training to More Than 10k GPUs [pdf] (usenix.org)
1 point by yankcrime 1 year ago | 0 comments
No comments yet