Show HN: PTS Library – Analyze LLM reasoning through "thought anchors"

I built PTS (Pivotal Token Search), an open-source library for mechanistic interpretability analysis of language models. The core feature is generating "thought anchors" - identifying which specific sentences in a model's reasoning chain significantly impact task success.

What it does:
- Generates chain-of-thought reasoning traces from any LLM
- Uses counterfactual analysis to measure impact of each reasoning step
- Identifies critical sentences that make-or-break task completion
- Exports semantic embeddings for clustering analysis
- Provides systematic failure mode categorization

Example use case: I used PTS to compare Qwen3-0.6B vs DeepSeek-R1-Distill-1.5B on math problems and discovered they have fundamentally different reasoning architectures:
- DeepSeek: concentrated reasoning (fewer, high-impact steps)
- Qwen3: distributed reasoning (impact spread across multiple steps)

Quick start:

  # Generate thought anchors
  pts run --model="your-model" --dataset="gsm8k" --generate-thought-anchors

  # Export for analysis
  pts export --format="thought_anchors" --output-path="analysis.jsonl"

The library implements the thought anchors methodology from Bogdan et al. (2025) with extensions for:
- Comprehensive metadata collection
- 384-dimensional semantic embeddings
- Causal dependency tracking
- Systematic failure analysis

Why this matters: Most interpretability tools focus on individual tokens or attention patterns. Thought anchors operate at the sentence level, revealing which complete reasoning steps actually matter for getting correct answers.

Limitations: Currently focused on mathematical reasoning tasks. Planning to extend to other domains and larger models.

Links:
- GitHub: https://github.com/codelion/pts
- Research example: https://huggingface.co/blog/codelion/understanding-model-rea...
- Generated datasets: Available on HuggingFace

Would appreciate feedback on extending this to other reasoning domains or interpretability approaches.
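
For anyone unfamiliar with the counterfactual step, here is a minimal Python sketch of the idea. It is not the PTS API and not necessarily how the library computes impact: the function names (`sentence_impacts`, `find_anchors`, `toy_success_rate`), the 0.2 threshold, and the prefix-difference scoring are all assumptions made for illustration. In a real run, the success rate for a prefix would presumably be estimated by sampling completions from the model; here a toy function stands in for that.

```python
from typing import Callable, List, Sequence, Tuple


def sentence_impacts(
    sentences: Sequence[str],
    success_rate: Callable[[Sequence[str]], float],
) -> List[float]:
    """Score each sentence by how much the success rate changes once it is
    added to the reasoning prefix (rate with it minus rate without it)."""
    impacts = []
    before = success_rate(sentences[:0])  # success rate with an empty prefix
    for i in range(len(sentences)):
        after = success_rate(sentences[: i + 1])
        impacts.append(after - before)
        before = after
    return impacts


def find_anchors(
    sentences: Sequence[str],
    impacts: Sequence[float],
    threshold: float = 0.2,  # hypothetical cutoff, chosen only for this example
) -> List[Tuple[str, float]]:
    """Keep sentences whose absolute impact meets the threshold."""
    return [(s, d) for s, d in zip(sentences, impacts) if abs(d) >= threshold]


def toy_success_rate(prefix: Sequence[str]) -> float:
    """Toy stand-in for model rollouts: pretend the task succeeds far more
    often once the halving step has appeared in the prefix."""
    if any("48 / 2" in s for s in prefix):
        return 0.75
    return 0.25


trace = [
    "Natalia sold 48 clips in April.",
    "In May she sold half as many, so 48 / 2 = 24.",
    "The total is 48 + 24 = 72.",
]

impacts = sentence_impacts(trace, toy_success_rate)
anchors = find_anchors(trace, impacts)

for sentence, delta in anchors:
    print(f"{delta:+.2f}  {sentence}")
```

With the toy scorer, only the second sentence changes the success rate, so it is the only one flagged. A trace where several sentences each shift the rate a little would correspond to the "distributed" pattern described above, and a trace with one or two large shifts to the "concentrated" one.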