Why Batch Norm Causes Exploding Gradients [2020] | Dark Hacker News