DeepSpeed's Bag of Tricks for Training Large Models | Dark Hacker News