Collective Constitutional AI: Aligning a Language Model with Public Input
Alignment and RLHF report from Anthropic with 25 connected researchers in the LLMpeople atlas.
Connected researchers
Dario Amodei
Anthropic / OpenAI
Co-founder and CEO of Anthropic.
Amanda Askell
Anthropic / OpenAI
Amanda Askell is a philosopher and AI alignment researcher at Anthropic. Her personal site says she previously worked as a research scientist on the policy team at OpenAI.
Jack Clark
Anthropic / OpenAI
Co-founder and Head of Policy at Anthropic. His public biography also notes earlier work as Policy Director at OpenAI, a technical journalist, and author of the Import AI newsletter.
Yuntao Bai
Anthropic
Anthropic researcher whose work includes reinforcement learning from human feedback and Constitutional AI; previously a Sherman Fairchild Postdoctoral Scholar in theoretical high-energy physics at Caltech.
Andy Jones
Anthropic
Anthropic researcher working on machine learning and AI-assisted science; previously built tools for learning from text, images, and tabular data.
Kamal Ndousse
Anthropic
Researcher at Anthropic working on alignment, reasoning, and evaluation for large language models.
Anna Chen
Anthropic
Anthropic author listed on the RLHF, Constitutional AI, Collective Constitutional AI, and Many-shot Jailbreaking reports; that work covers alignment and adversarial evaluation.
Nova DasSarma
Anthropic
Anthropic report author whose public publication record includes work on language model evaluations, AI safety, and model behavior.
Nicholas Joseph
Anthropic
Researcher at Anthropic working on the alignment and evaluation of advanced AI systems.
Saurav Kadavath
Anthropic
Researcher at Anthropic whose public report authorships and scholarly profiles cover language model evaluation, AI safety, and robustness.
Jackson Kernion
Anthropic
Member of Anthropic's Interpretability team, where he works on understanding how large language models work.
Tom Conerly
Anthropic
Anthropic report author whose public publication record includes work on language model calibration, interpretability, and AI safety.
Nelson Elhage
Anthropic
Nelson Elhage is an engineer and researcher at Anthropic, where he works on the pretraining team after earlier work on reverse-engineering large language models. He previously worked at Stripe and Ksplice/Oracle on systems software and is known for open-source systems projects such as livegrep and reptyr.
Zac Hatfield-Dodds
Anthropic
Staff software engineer at Anthropic building systems for AI safety, reliability, and alignment.
Catherine Olsson
Anthropic
Catherine Olsson is an AI alignment researcher and writer whose public website and Anthropic author page describe work on AI safety, interpretability, and building helpful, harmless assistants.
Tom Brown
Anthropic
Research scientist at Anthropic working on model behavior and interpretability.
Sam McCandlish
Anthropic
Sam McCandlish is listed as an author of the Anthropic technical report Collective Constitutional AI: Aligning a Language Model with Public Input.
Ben Mann
Anthropic
A public Anthropic/AWS presentation describes Ben Mann as an Anthropic co-founder and former GPT-3 and API engineer at OpenAI. The previously attached benmann.com homepage is now a parked domain rather than a personal research site.
Jared D. Kaplan
Anthropic
Jared D. Kaplan is a co-founder and Chief Science Officer at Anthropic. Anthropic's public materials also identify him as the company's Responsible Scaling Officer.
Jared Mueller
Anthropic
Jared Mueller is affiliated with Anthropic. Public materials list him as an Anthropic participant at the 2023 Economics of Robots Conference, and the linked arXiv paper lists him as a coauthor on Anthropic's Constitutional AI work.
Joshua Landau
Anthropic
Joshua Landau is affiliated with Anthropic. Public Anthropic research materials list him as a coauthor of Measuring Progress on Scalable Oversight for Large Language Models, and the linked arXiv paper lists him as a coauthor of Constitutional AI.
Timothy Telleen-Lawton
Anthropic
Timothy Telleen-Lawton is an independent researcher focused on inspiring and scaling collective intelligence. His public LessWrong profile says he served as Head of Procurement at Anthropic from 2021 and lists prior work at CFAR and GiveWell.
Nicholas Schiefer
Anthropic
Member of Technical Staff at Anthropic and cofounder of Oulipo Labs, working on language model safety, evaluations, and scientific forecasting.
Herbie Bradley
Anthropic
Computer scientist and machine learning researcher with public work spanning AI systems and alignment-related research.