Debugging divergence between engine and transformers logprobs for RL | Dark Hacker News