Neel Nanda

Neel Nanda is an independent researcher focused on mechanistic interpretability and understanding neural networks. He previously worked on Anthropic's model diffing team, took part in the ML Alignment & Theory Scholars program, and studied mathematics at the University of Cambridge. He is known for interpretability tooling such as TransformerLens.

Independent mechanistic interpretability researcher and former Anthropic model diffing engineer

1 organization, 1 report

Profile status: updated

4 public sources

alignment, neural networks, mechanistic interpretability

Current frame

Independent mechanistic interpretability researcher and former Anthropic model diffing engineer