TransformerLens

TransformerLens is an open-source library for mechanistic interpretability of GPT-style language models. It allows researchers to reverse engineer model algorithms and analyze internal activations.

At a glance

Pricing: Open Source

Free tier: Yes

API: Yes

Skill level: Technical

About

What is TransformerLens?

TransformerLens is an open-source Python library for mechanistic interpretability of GPT-2 style language models. Created by Neel Nanda and maintained by Bryce Meyer, it lets users load more than 50 open-source language models and exposes their internal activations. Researchers can cache any internal activation and attach functions (hooks) that edit, remove, or replace activations while the model runs. This supports reverse engineering the algorithms a model has learned from its weights, which helps explain how large language models work internally. The library also includes experimental support for Mamba / SSM architectures, with bridge adapters for Mamba-1 and Mamba-2.
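The cache-and-edit workflow described above might look roughly like the sketch below. It assumes the `transformer_lens` package is installed and uses the `HookedTransformer` interface (`from_pretrained`, `run_with_cache`, `run_with_hooks`); the model name, prompt, and the choice of layer 0, head 3 are arbitrary illustrations rather than anything this listing specifies.

```python
# Assumed illustrative choices, not recommendations.
LAYER = 0
HEAD = 3


def zero_ablate_head(value, hook, head_index=HEAD):
    """Hook function: zero out one attention head's value vectors.

    `value` is expected to have shape [batch, position, head, d_head].
    The hook returns the edited activation, which replaces the original.
    """
    value[:, :, head_index, :] = 0.0
    return value


def run_demo(prompt="The Eiffel Tower is in the city of"):
    # Imported here so the hook above can be read and reused on its own.
    from transformer_lens import HookedTransformer, utils

    model = HookedTransformer.from_pretrained("gpt2")  # downloads weights on first use
    tokens = model.to_tokens(prompt)

    # 1) Cache every internal activation from a normal forward pass.
    logits, cache = model.run_with_cache(tokens)
    pattern = cache["pattern", LAYER]  # attention pattern of the chosen layer
    print("attention pattern shape:", tuple(pattern.shape))

    # 2) Re-run the model with a hook that edits one activation in flight.
    ablated_logits = model.run_with_hooks(
        tokens,
        fwd_hooks=[(utils.get_act_name("v", LAYER), zero_ablate_head)],
    )
    print("max logit change:", (ablated_logits - logits).abs().max().item())

# Call run_demo() to execute the full example.
```

Calling `run_demo()` prints the shape of one cached attention pattern and how far the logits move when the chosen head is ablated; a large change suggests the head matters for that prompt.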

Best used for

Best suited to researchers and developers who need to understand the internal mechanisms of GPT-style language models, analyze the algorithms they have learned, and run mechanistic interpretability studies. It is especially useful for academic research and advanced model debugging.

Common actions

interpret language models

analyze model internals

reverse engineer algorithms

study model behavior

Capabilities

Key features

Load 50+ language models
Expose internal activations
Cache internal activations
Edit/remove/replace activations
Mamba/SSM support

Target Audience

developer
professor

Integrations

Not yet documented

Pricing & Plans

Open Source: Free

FAQs

What kind of language models does TransformerLens support?

TransformerLens supports more than 50 open-source GPT-2 style language models. It also includes experimental bridge adapters for Mamba-1 and Mamba-2 (SSM architectures), which extends analysis beyond transformers.

What is the primary goal of mechanistic interpretability with TransformerLens?

The primary goal is to reverse engineer the algorithms that a trained language model has learned from its weights. TransformerLens facilitates this by exposing internal activations, allowing users to inspect, cache, and manipulate them during model execution.
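The idea of caching an activation and then manipulating it can be illustrated without any library. The sketch below uses a hypothetical two-step `toy_model` (not TransformerLens code) with named hook points: it caches activations from a "clean" input, then patches one cached value into a run on a "corrupted" input to see whether that unit accounts for the difference in output.

```python
from typing import Callable, Dict, List, Optional

Hook = Callable[[List[float]], List[float]]


def toy_model(
    x: List[float],
    hooks: Optional[Dict[str, Hook]] = None,
    cache: Optional[Dict[str, List[float]]] = None,
) -> float:
    """Hypothetical model: double each input, then sum. Each step is a hook point."""
    hooks = hooks or {}

    def hook_point(name: str, act: List[float]) -> List[float]:
        if name in hooks:
            act = hooks[name](act)   # let a hook edit the activation
        if cache is not None:
            cache[name] = list(act)  # store a copy for later inspection
        return act

    hidden = hook_point("hidden", [2.0 * v for v in x])
    out = hook_point("out", [sum(hidden)])
    return out[0]


# Clean run: cache every activation.
clean_cache: Dict[str, List[float]] = {}
clean_out = toy_model([1.0, 2.0], cache=clean_cache)

# Corrupted run: the first input is changed.
corrupt_out = toy_model([0.0, 2.0])


def patch_unit_0(act: List[float]) -> List[float]:
    """Replace hidden unit 0 with its value from the clean run."""
    patched = list(act)
    patched[0] = clean_cache["hidden"][0]
    return patched


# Patched run: corrupted input, but hidden unit 0 restored from the cache.
patched_out = toy_model([0.0, 2.0], hooks={"hidden": patch_unit_0})

print(clean_out, corrupt_out, patched_out)  # 6.0 4.0 6.0
```

Because restoring a single hidden unit recovers the clean output, that unit carries the information that differs between the two inputs. Real interpretability studies apply the same logic to attention heads and neurons in an actual model.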

Is TransformerLens suitable for beginners in mechanistic interpretability?

Yes, given some technical background. The field is young, and TransformerLens aims to lower the barrier to entry: it provides tutorials, resources, and community support that help newcomers get started, even with limited prior experience.

Also listed in

This tool also appears in

Research & Education › Academic Research
