MoE inference optimizations: 15% lower expert load by request reordering (blog.doubleword.ai)
1 point by mezark 1 hour ago | 0 comments