New DeepSeek paper: Natively Trainable Sparse Attention mechanism | Dark Hacker News