3Blue1Brown Follow-Up: From Hypothetical Examples to LLM Circuit Visualization (peterlai.github.io)
A year later, the field of mechanistic interpretability has advanced significantly, and we can now "decompose" models into interpretable circuits that help explain how LLMs produce predictions. Using the second iteration of an LLM "debugger" I've been working on, I compare the hypothetical representations used in the 3Blue1Brown tutorial with the actual representations I see when extracting a circuit that describes the processing of this specific sentence. If you're into model interpretability, please take a look! https://peterlai.github.io/gpt-circuits/