Mechanics of Next Token Prediction with Self-Attention | Dark Hacker News