Independent mechanistic interpretability researcher and former Anthropic model diffing engineer
Neel Nanda is an independent researcher focused on mechanistic interpretability, the study of how neural networks work internally. He previously worked on Anthropic's model diffing team, took part in the ML Alignment & Theory Scholars (MATS) program, and studied mathematics at the University of Cambridge. He is known for interpretability tooling such as TransformerLens.