TransformerLens
TransformerLens is an open-source library for mechanistic interpretability of GPT-style language models. It allows researchers to reverse engineer model algorithms and analyze internal activations.
About
TransformerLens is an open-source Python library for mechanistic interpretability of GPT-2-style language models. Created by Neel Nanda and maintained by Bryce Meyer, it loads more than 50 open-source language models and exposes their internal activations. Researchers can cache any internal activation and attach functions that edit, remove, or replace activations while the model runs. These tools support reverse engineering the algorithms a model has learned from its weights, which helps explain how large language models work internally. The library also includes experimental support for Mamba / SSM architectures, with bridge adapters for Mamba-1 and Mamba-2.
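To make the idea of caching and editing activations concrete, here is a minimal sketch in plain Python. It does not use TransformerLens itself: `ToyModel`, its hook names (`layer0.out`, `layer1.out`), and the helper `zero_first` are all hypothetical, chosen only to illustrate the general pattern of named hook points that can record an activation or replace it mid-run. The real library applies the same idea to tensors inside a transformer.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Vector = List[float]
# A hook receives (activation, hook_name) and may return a replacement.
Hook = Callable[[Vector, str], Optional[Vector]]


class ToyModel:
    """Hypothetical two-layer 'model' with a named hook point after each layer."""

    hook_names = ["layer0.out", "layer1.out"]

    def _apply(self, name: str, value: Vector,
               hooks: Sequence[Tuple[str, Hook]]) -> Vector:
        # Run every hook registered for this name; a non-None result
        # replaces the activation that flows on to the next layer.
        for hook_name, fn in hooks:
            if hook_name == name:
                result = fn(value, name)
                if result is not None:
                    value = result
        return value

    def run_with_hooks(self, x: Vector,
                       hooks: Sequence[Tuple[str, Hook]] = ()) -> Vector:
        h0 = self._apply("layer0.out", [2.0 * v for v in x], hooks)  # layer 0 doubles
        h1 = self._apply("layer1.out", [v + 1.0 for v in h0], hooks)  # layer 1 adds one
        return h1

    def run_with_cache(self, x: Vector) -> Tuple[Vector, Dict[str, Vector]]:
        cache: Dict[str, Vector] = {}

        def save(value: Vector, name: str) -> None:
            cache[name] = list(value)  # store a copy, leave the activation unchanged

        out = self.run_with_hooks(x, [(name, save) for name in self.hook_names])
        return out, cache


def zero_first(value: Vector, name: str) -> Vector:
    """Ablation hook: zero out the first component of the activation."""
    return [0.0] + value[1:]


model = ToyModel()
clean_out, cache = model.run_with_cache([1.0, 2.0, 3.0])
ablated_out = model.run_with_hooks([1.0, 2.0, 3.0], [("layer0.out", zero_first)])

print(clean_out)            # [3.0, 5.0, 7.0]
print(cache["layer0.out"])  # [2.0, 4.0, 6.0]
print(ablated_out)          # [1.0, 5.0, 7.0]
```

Comparing `clean_out` with `ablated_out` shows how much the output depends on the component that was zeroed, which is the basic move behind ablation-style interpretability experiments.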
Pricing & Plans
Open Source
Free