LLMpeople
Public Atlas People first, reports as evidence, organizations as context.

On the Biology of a Large Language Model

Interpretability report from Anthropic with 13 connected researchers in the LLMpeople atlas.

Anthropic · 2025-03-27 · 13 researchers
Field: Interpretability
Organization: Anthropic
arXiv: N/A

Canonical link

https://www.anthropic.com/research/tracing-thoughts-language-model

Connected researchers

Researcher · 8 reports

Ethan Perez

Anthropic

Research scientist at Anthropic focused on scalable oversight, AI safety, and language model evaluation; previously worked at New York University and Google.

Anthropic
Researcher · 6 reports

Samuel Marks

Anthropic

Senior research engineer at Anthropic interested in agent foundations, model organisms of misalignment, and human-computer interaction.

Anthropic
Researcher · 4 reports

David Duvenaud

Anthropic

Associate Professor at the University of Toronto whose research spans deep learning, probabilistic modeling, and machine learning methods for science and AI safety.

Anthropic
Canada
Researcher · 3 reports

David Bau

Anthropic

Research scientist at Anthropic and assistant professor of computer science at Northeastern University working on interpretability and model understanding.

Anthropic
United States
Researcher · 1 report

Can Rager

Anthropic

Can Rager is listed as an author of the Anthropic technical report On the Biology of a Large Language Model.

Anthropic
Researcher · 1 report

Eric J. Michaud

Anthropic

Eric J. Michaud is listed as an author of the Anthropic technical report On the Biology of a Large Language Model.

Anthropic
Researcher · 1 report

Yonatan Belinkov

Anthropic

Associate Professor in the Technion Faculty of Data and Decision Sciences and a visiting research professor at Google working on natural language processing and machine learning.

Anthropic
Israel
Researcher · 2 reports

Nikhil Prakash

Anthropic

Nikhil Prakash is a research scientist at Anthropic. His homepage says he recently completed a PhD in computer science at Berkeley advised by Jacob Steinhardt, after earlier work at Mila and Carnegie Mellon University. He studies learning in deep networks and the principles that lead to emergent phenomena.

Anthropic
Researcher · 1 report

Stephen Casper

Anthropic

Alignment science researcher at Anthropic whose work focuses on black-box evaluations, white-box evaluations, and AI risk.

Anthropic
Researcher · 2 reports

Nora Belrose

Anthropic

Nora Belrose is an AI researcher whose work studies neural language models, latent structure, and cognition. She has contributed to Anthropic research on tracing and interpreting reasoning in large language models.

Anthropic
Researcher · 1 report

David Krueger

Anthropic

David Krueger is an assistant professor in robust, reasoning, and responsible AI at the University of Montreal and a core academic member at Mila. His homepage says he trained in deep learning under Yoshua Bengio, Roland Memisevic, and Aaron Courville from 2013 to 2021, was at the University of Cambridge from 2021 to 2024, and founded the nonprofit Evitable in 2025.

Anthropic
Researcher · 1 report

Benjamin Crouzier

Anthropic

Benjamin Crouzier is listed as an author of the Anthropic technical report On the Biology of a Large Language Model.

Anthropic
Researcher · 1 report

Max Tegmark

Anthropic

Max Tegmark is a physicist and professor at MIT whose work spans cosmology, fundamental physics, and the implications of advanced AI systems. After earning his PhD in physics from the University of California, Berkeley in 1994 and undergraduate training at the Royal Institute of Technology in Stockholm, he worked as a postdoctoral researcher at the University of Pennsylvania before joining MIT and later coauthored Anthropic interpretability work on large language models.

Anthropic

LLMpeople is a public atlas for discovering frontier AI researchers with context, provenance, and respect.
