A collection of small updates from the Anthropic Interpretability team | Dark Hacker News