Multimodal Models
Researchers connected to this field in the public atlas.
Radu Soricut
Google Gemini
Radu Soricut is a Distinguished Scientist at Google DeepMind working on natural language processing and machine learning, with earlier Google Research and Google Translate work.
Xinyun Chen
Google Gemini / Meta AI
Xinyun Chen's homepage identifies her as an AI research scientist at Meta Superintelligence Labs, previously a staff research scientist at Google DeepMind. It also lists a PhD in Computer Science from UC Berkeley and a BS in Computer Science from Shanghai Jiao Tong University.
Jean-Baptiste Alayrac
Google Gemini / Meta AI
DeepMind researcher working on machine learning, computer vision, and structured learning from video and language.
Yang Song
OpenAI / Alibaba Qwen
Research Principal at Meta Superintelligence Labs. He previously led the strategic explorations team at OpenAI and is known for foundational work on score-based diffusion models.
David Dohan
Google Gemini / OpenAI
David Dohan is a computer scientist at OpenAI studying scalable alignment of language models and generally intelligent reasoning systems. His personal site also notes prior work at Google Brain on foundation model programs, code generation, protein engineering, and scientific reasoning.
Chuanqi Tan
Alibaba Qwen / Z.ai
Chuanqi Tan's homepage says he received a PhD from Tsinghua University in July 2019, is currently focused on LLM research and applications, and is also a postdoctoral fellow at the University of Hong Kong.
Kevin Robinson
Google Gemini
Kevin Robinson is a research engineer at Google Research working on evaluations of language models and NLP systems. His Google Research profile says he previously worked as a special education teacher, a software engineer building visualization and analytics systems, and a researcher in K12 computer science education.
Jiahui Yu
Google Gemini
Jiahui Yu is a Research Lead at OpenAI leading the Perception team. His homepage notes prior co-leadership on Gemini Multimodal at Google DeepMind and work on deep learning and high-performance computing.
Ben Wang
Google Gemini / OpenAI
OpenAI's GPT-4 contributions page credits Ben Wang as attention architecture lead for long context. Public profiles identify him as a University of Pennsylvania undergraduate and an OpenAI researcher from 2021 to 2022.
Vedant Misra
Google Gemini
Google researcher and founding member of the Gemini core team. Public pages reviewed say he previously oversaw Algorithms and Reasoning teams at OpenAI and earlier founded Kemvi, which was acquired by HubSpot.
Clément Farabet
Google Gemini
Clément Farabet's homepage says he is building AI at Google DeepMind. It also describes prior leadership in AI infrastructure at NVIDIA and earlier deep learning platform work at Twitter.
Jingren Zhou
MiniMax / Moonshot AI
Jingren Zhou is Chief Technology Officer of Alibaba Cloud. Public speaker biographies describe him as a computer scientist and entrepreneur whose work includes large-scale AI and cloud systems.
Jiabo Ye
Alibaba Qwen
Research scientist in Tongyi Lab whose public homepage and OpenReview profile describe work on large language models, multimodal learning, and visual grounding. His public profiles also list affiliations with Alibaba Group and East China Normal University.
Aakanksha Chowdhery
Google Gemini
Aakanksha Chowdhery is a machine learning researcher based in New York City. She works on large-scale machine learning across pre-training, post-training, inference, and system efficiency, and is known for contributions such as PaLM, Pathways, and Gemini.
Yuntian Deng
Google Gemini
Yuntian Deng is a machine learning researcher whose public work spans language modeling, reasoning, and large multimodal systems.
Yale Song
Alibaba Qwen
Yale Song is an assistant professor in artificial intelligence at Yonsei University and is also affiliated with the Stanford AI Lab while working part-time with Adobe Research.
Furu Wei
Microsoft
Furu Wei is a Distinguished Scientist and Chief Scientist of Microsoft Research Asia. He is listed on Microsoft Research and connected in LLMpeople to Microsoft technical reports including Kosmos, VALL-E, BitNet, and Multilingual E5.
Mohammad Norouzi
Google Gemini
Research scientist and engineer focused on machine learning, computer vision, and natural language processing.
Yuhuai Wu
Google Gemini
Research scientist working on large language models, reasoning, agents, and reinforcement learning.
Li Dong
Microsoft
Li Dong is a Microsoft Research principal researcher focused on human language technologies and machine intelligence.
Mingkun Yang
Alibaba Qwen
Mingkun Yang works on multimodal large language models, embodied AI, and robotics. His public profile says he is a postdoc at Zhejiang University and a research scientist at Qwen.
Shaohan Huang
Microsoft
Shaohan Huang is a senior researcher in the General Artificial Intelligence Group at Microsoft Research Asia in Beijing. OpenReview lists him as a Microsoft researcher and a former master's student at Beihang University.
Yueting Wang
Google Gemini
Research scientist at Google DeepMind working on post-training, large language model evaluation, and multimodal alignment.
Matthias Minderer
Google Gemini
Research Scientist at Google DeepMind in London working on large multimodal models, evaluation, agents, and computer vision; he completed a PhD at the University of Tuebingen and MPI for Intelligent Systems.
Olivier Bachem
Google Gemini
Olivier Bachem is a director and research scientist at Google DeepMind working on reinforcement learning from human feedback, language model post-training, and machine learning at scale. He earned his PhD at ETH Zurich, where he studied coresets and sampling methods for large-scale machine learning.
Peng Wang
Alibaba Qwen
Alibaba Qwen report author whose DBLP profile identifies an Alibaba Group affiliation and Qwen technical report authorship.
Yonghui Wu
Google Gemini
Google researcher whose official profile says he joined Google in September 2008 and has been with Google Brain since January 2015, with research interests spanning information retrieval, machine learning, machine translation, and natural language processing.
Shuming Ma
Microsoft
Co-author of the BitNet b1.58 2B4T Technical Report; the paper's author note states that S. Ma is with Microsoft Research.
Chengzheng Xu
Google Gemini
Chengzheng Xu is a research scientist at Google DeepMind whose public homepage highlights work on vision-language models, multimodal learning, and efficient large-scale machine learning.
David Silver
Google Gemini
Computer scientist and reinforcement learning researcher, Professor at University College London, and former Principal Research Scientist at DeepMind.
Hanie Sedghi
Google Gemini
Senior Staff Research Scientist at Google DeepMind working on machine learning, with a focus on efficient inference and training algorithms for large language and vision-language models.
Kelvin Guu
Google Gemini
Research Scientist at Google DeepMind focused on agents, memory, and reasoning; completed a PhD at Stanford advised by Percy Liang.
Qazi Irfan
Google Gemini
Qazi Irfan is a research scientist at Google DeepMind. His public homepage highlights work spanning multimodal learning, visual reasoning, and efficient large-scale machine learning.
Rishabh Singh
Google Gemini
Rishabh Singh is a research scientist at Google DeepMind working on human-centered AI, programming systems, and AI for software and problem solving. His work spans program synthesis, code intelligence, education, and interactive AI systems.
Will Isaacs
Google Gemini
Founding engineer at Anysphere, previously at Google Brain, UC Berkeley, and Scale AI, interested in machine learning, statistics, and systems.
Vladislav Kolesnikov
Google Gemini
Professor at ISTA working on cryptography and machine learning, with interests including privacy-preserving machine learning, large language models, and algorithmic fairness.
Zhifeng Chen
Google Gemini / Z.ai
Zhifeng Chen's public homepage describes him as a distinguished software engineer at Google Brain focused on large-scale computer systems and machine learning applications.
Koray Kavukcuoglu
Google Gemini
Chief Technology Officer at Google DeepMind, with work spanning machine learning and reinforcement learning.
Yifeng Lu
Google Gemini
Member of Technical Staff at Google DeepMind working on machine learning, natural language processing, and large language models.
Raia Hadsell
Google Gemini
VP of Research at Google DeepMind working on robotics and embodied intelligence, with expertise in machine learning, reinforcement learning, neuroscience, and computer vision.
Hongning Wang
Alibaba Qwen
Associate professor at the University of Virginia and Qwen contributor whose research focuses on personalization and recommender systems, online advertising, and AI systems.
Noam Shazeer
Google Gemini
Distinguished Scientist at Google Research and one of the inventors of the transformer architecture; his work also includes language models, speech recognition, and multi-agent reinforcement learning.
Shilong Liu
Alibaba Qwen
Researcher whose public homepage focuses on computer vision, multimodal foundation models, and embodied AI; publication context connects Shilong Liu to the Qwen2.5-Omni technical report.
Douglas Eck
Google Gemini
Research director at Google working on music AI, multimodal generation, and human-AI interaction. He co-founded the Magenta project and has led widely used work on music generation with neural networks.
Jianwei Niu
Alibaba Qwen
Jianwei Niu is a tenure-track research assistant professor in the School of Data Science at Lingnan University, Hong Kong. His research focuses on multimodal learning, computer vision, and embodied AI.
Brennan Saeta
Google Gemini
Public report authorship links Brennan Saeta to the Google report Gemma 2: Improving Open Language Models at a Practical Size.
Jason Wei
Google Gemini / OpenAI
Public report authorship links Jason Wei to the Gemma 3n Technical Report at Google.
Qinyu Chen
DeepSeek / Alibaba Qwen
Qinyu Chen is listed as an author of the DeepSeek-V3 Technical Report from DeepSeek.
Donald W. McFadden
Google Gemini
Google Gemini report author listed on Gemini, Gemini 1.5, RecurrentGemma, and CodeGemma technical reports, with report-backed work on multimodal models, long-context models, efficient architectures, and code models.
Yipeng Wang
Z.ai
Z.ai report author listed on GLM-Z1, GLM-4.5, GLM-4.1V/4.5V, and GLM-5 materials, with report-backed work on reasoning, coding, agentic, and multimodal models.
Zihan Jiang
Z.ai
Z.ai report author listed on GLM-Z1, GLM-4.5, GLM-4.1V/4.5V, and GLM-5 materials, with report-backed work on reasoning, agentic, and multimodal models.
Aidan Clark
Google Gemini
Aidan Clark's OpenReview profile shows publications on compute-optimal large language model training and high-fidelity speech synthesis. The profile also lists undergraduate studies at the University of California, Berkeley.
Aitor Lewkowycz
Google Gemini
Research scientist at Google DeepMind interested in large language models and mathematical reasoning. He earned a Ph.D. in mathematics from Columbia University.
Carrie Cai
Google Gemini
Carrie Cai is a machine learning researcher with interests in generative modeling, reinforcement learning, and deep learning theory.
Christopher Potts
Google Gemini
Professor of Linguistics and, by courtesy, Computer Science at Stanford University whose research spans natural language semantics, pragmatics, and AI; he directs CSLI.
Johan Schalkwyk
Google Gemini
Johan Schalkwyk is a speech and language researcher whose public profile highlights work on speech recognition, multilingual systems, conversational AI, and large language models.
Sharat Muralidharan
Google Gemini
Research scientist at Google DeepMind and PhD student at Imperial College London. His public site highlights interests in deep learning, reinforcement learning, and multimodal models, with work spanning Gemini, large-scale reinforcement learning, and self-driving.
Yiang Gu
Google Gemini
Research scientist at Google DeepMind working on multimodal large language models. He completed a PhD at Tsinghua University and was a visiting PhD student at UC San Diego.
Yun-Hsuan Sung
Google Gemini
Yun-Hsuan Sung is a machine learning researcher focused on multimodal learning, robotics, and representation learning.
Yuanzhi Zhu
Alibaba Qwen
Yuanzhi Zhu is a Qwen researcher whose public work includes multimodal and audio-language models.
Sebastian Faust
Google Gemini
Research scientist at Google DeepMind.
Aäron van den Oord
Google Gemini
Aäron van den Oord is a Google DeepMind researcher known for generative and sequence-model research.
HyoukJoong Lee
Google Gemini
HyoukJoong Lee is a research scientist at Google DeepMind. His public work includes long-context and multimodal model research, including Gemini 1.5 and Gemini Diffusion.
Ling Chen
Z.ai
Z.ai researcher focused on multimodal large language models and computer vision, with interests in large-model training and post-training.
Yuan Cao
Google Gemini / Moonshot AI
Researcher whose public Google Scholar profile lists Google Research affiliation and publications on multimodal, generative, and reasoning-focused models.
Amelie Royer
Google Gemini
Research scientist at Google DeepMind working on efficient, adaptive systems that learn on the job and collaborate with people. She completed a Ph.D. in machine learning at Mila and Université de Montréal.
Branislav Kveton
Google Gemini
Staff research scientist at Google DeepMind and associate professor at Purdue University working on sequential decision making, machine learning, and algorithms.
C. Le Lan
Google Gemini
Research scientist at Google DeepMind.
Dale Schuurmans
Google Gemini
Professor of computing science at the University of Alberta and Canada CIFAR AI Chair with public work on reinforcement learning, optimization, and scalable machine learning.
Eugene N. Ie
Google Gemini
Eugene N. Ie is a Google DeepMind researcher with public work on machine learning and multimodal language models.
Tiago Cai
Google Gemini
Research scientist at Google DeepMind working on machine learning and large-scale multimodal models.
Tony G. Cai
Google Gemini
Tony G. Cai is a researcher at Google DeepMind and a computer science PhD student at Columbia University. His public research interests include large language models, reinforcement learning, optimization, and robotics.
Junnan Li
Salesforce AI Research
Junnan Li is a report-backed author in the LLMpeople atlas, connected through 3 technical reports.
Wenhui Wang
Microsoft
Co-author of "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits"; the paper's author notes list Wenhui Wang with Microsoft Research.
Xi Chen
Z.ai
Xi Chen is listed as an author of the Z.ai technical report GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.
Yichi Zhang
Moonshot AI / Z.ai
Researcher at Moonshot AI and co-author of the Kimi K2.5 report on visual agentic intelligence.
Yuxuan Hu
Z.ai
Co-author of GLM-4.1V-Thinking and GLM-4.5V, multimodal reasoning models trained with scalable reinforcement learning.
Andrew M. Dai
Google Gemini
Research scientist at Google DeepMind in Mountain View working on machine learning, reinforcement learning, and robotics.
Demis Hassabis
Google Gemini
Founder and CEO of Google DeepMind, leading AI research and product development; his work spans AI, neuroscience, game playing, and structural biology.
D. Sculley
Google Gemini
Research Director at Google working on machine learning, production systems, and sociotechnical AI.
Oriol Vinyals
Google Gemini
Chief Scientist at Google DeepMind and Vice President of Research leading Gemini, with work spanning scalable sequence learning, large language models, games, and robotics.
Rohan Anil
Google Gemini
Rohan Anil is a research scientist at Google DeepMind. His public homepage highlights work on large language models, efficient machine learning systems, and multimodal AI.
Sebastian Borgeaud
Google Gemini
Research scientist at Google DeepMind in London working on agentic reasoning, efficient inference, and large-scale post-training, with a background in high-dimensional statistics and theory.
Vincent Vanhoucke
Google Gemini
Senior Staff Research Scientist at Google DeepMind and CTO of the Gemini app, with work spanning speech, language, vision, and large-scale AI systems.
Yossi Matias
Google Gemini
Vice President of Engineering and Research at Google and site lead for the Google Center in Israel; he also leads Search, Research, and AI for Crisis Response.
Dongxu Li
Salesforce AI Research
Dongxu Li is a report-backed author in the LLMpeople atlas, connected through 2 technical reports.
Steven Hoi
Salesforce AI Research
Steven Hoi is a report-backed author in the LLMpeople atlas, connected through 2 technical reports.
Yaru Hao
Microsoft
Yaru Hao is a report-backed author in the LLMpeople atlas, connected through 2 technical reports.
Zhang Zhang
Z.ai
Public report authorship links Zhang Zhang to the Z.ai report GLM-5: Thinking, Coding, and Agentic Intelligence.
Zhiliang Peng
Microsoft
Zhiliang Peng is a report-backed author in the LLMpeople atlas, connected through 2 technical reports.
Chaoyou Fu
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Dangyang Chen
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Di Fu
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Dongrui Liu
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Gao Huang
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Jiaming Zhang
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Jiangning Zhang
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Jie Wen
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Kaihao Zhang
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Kuang Zhang
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Shitao Xiao
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Tianyu Zou
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Wenbo Huang
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Xingwen Zhang
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Xiwei Wu
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Xu Han
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Yasheng Huang
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Yichao Yan
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Yimin Lin
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Yiqi Li
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Yitian Yuan
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Yue Hu
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Yuheng Li
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Yunxiang Peng
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Yuxin Chen
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Zhang Zhang
BIGAI
Researcher at BIGAI and coauthor of Emu3: Next-Token Prediction is All You Need.
Alina Beygelzimer
Google Gemini
Senior staff research scientist at Google working on algorithms for decision making under uncertainty and online learning.
Avinatan Hassidim
Google Gemini
Professor of Computer Science at the Hebrew University of Jerusalem and Visiting Faculty Researcher at Google, with work spanning algorithms, algorithmic economics, and AI-related decision systems.
Hang Zhang
Alibaba Qwen
Researcher at Alibaba Group working on multimodal large language models; public profile and publication context connect Hang Zhang to the Qwen2-VL technical report.
Hao Xu
Z.ai
Research scientist at Z.ai focused on multimodal understanding and generation, reinforcement learning, AI agents, and end-to-end models. He received a bachelor's degree from Tsinghua University and a master's degree from Peking University.
James Manyika
Google Gemini
James Manyika is a Google leader whose public work focuses on research, technology, and society.
Linjie Li
Alibaba Qwen
Linjie Li is a research scientist at Alibaba Group and a contributor to the Qwen2.5-Omni Technical Report.
Melvin Johnson
Google Gemini
Senior Staff Research Scientist at Google DeepMind working on language modeling, speech recognition, machine translation, and multimodal understanding.
Nan Ding
Google Gemini
Researcher at Google Research whose public work includes multimodal and vision-language modeling, with arXiv publications tied to PaliGemma and related transfer work.
Qingyang Zhang
Alibaba Qwen
Second-year PhD student at Peking University focused on audio-language foundation models, trustworthy AI, and embodied AI; coauthor of Qwen2-Audio.
Quoc V. Le
Google Gemini
VP at Google DeepMind working on deep learning, computer vision, and language understanding.
Xiaoyu Hu
Alibaba Qwen
Research engineer at Alibaba Group working on audio and multimodal foundation models, multimodal RL, and speech processing; coauthor of Qwen2.5-Omni.
Yinghao Li
Alibaba Qwen
Machine learning engineer and researcher interested in large language models and multimodal audio-language systems; coauthor of Qwen2-Audio.
Zhenyang Wu
Z.ai
Research scientist at Z.ai with research interests in multimodal understanding and generation, large language models, and reinforcement learning. He received a bachelor's degree from the University of Science and Technology of China and a master's degree from Tsinghua University.
Zoubin Ghahramani
Google Gemini
VP of Research at Google DeepMind and Professor of Information Engineering at the University of Cambridge, known for work in probabilistic machine learning and Bayesian statistics.
Alexander Vladymyrov
Google Gemini
Alexander Vladymyrov is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Alex Beutel
Google Gemini
Alex Beutel is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Anthony Meng Huat Tiong
Salesforce AI Research
Anthony Meng Huat Tiong is a report-backed author in the LLMpeople atlas, connected through InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning.
Astra Sharma
Google Gemini
Astra Sharma is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Boyang Li
Salesforce AI Research
Boyang Li is a report-backed author in the LLMpeople atlas, connected through InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning.
Chen Li
ByteDance Seed
Chen Li is a report-backed author in the LLMpeople atlas, connected through SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.
Gabriel Murphy
Google Gemini
Gabriel Murphy is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Geng Ji
Google Gemini
Geng Ji is listed as an author of the Google technical report Gemini: A Family of Highly Capable Multimodal Models.
Jason Choi
Google Gemini
Jason Choi is listed as an author of the Google technical report Gemini: A Family of Highly Capable Multimodal Models.
Jiaxuan Wang
Google Gemini
Jiaxuan Wang is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Jinguo Zhu
ByteDance Seed
Jinguo Zhu is a report-backed author in the LLMpeople atlas, connected through SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.
Junqi Zhao
Salesforce AI Research
Junqi Zhao is a report-backed author in the LLMpeople atlas, connected through InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning.
Kate Lee
Google Gemini
Kate Lee is listed as an author of the Google technical report Gemini: A Family of Highly Capable Multimodal Models.
Kun Yi
ByteDance Seed
Kun Yi is a report-backed author in the LLMpeople atlas, connected through SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.
Limin Wu
Google Gemini
Limin Wu is listed as an author of the Google technical report Gemini: A Family of Highly Capable Multimodal Models.
Lin Song
ByteDance Seed
Lin Song is a report-backed author in the LLMpeople atlas, connected through SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.
Lisa Luu
Google Gemini
Lisa Luu is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Mandy Guo
Google Gemini
Mandy Guo is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Pascale Fung
Salesforce AI Research
Pascale Fung is a report-backed author in the LLMpeople atlas, connected through InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning.
Rhomni St. John
Google Gemini
Rhomni St. John is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Robin Lester
Google Gemini
Robin Lester is listed as an author of the Google technical report Gemini: A Family of Highly Capable Multimodal Models.
Shibo Wang
Google Gemini
Shibo Wang is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Sijie Zhao
ByteDance Seed
Sijie Zhao is a report-backed author in the LLMpeople atlas, connected through SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.
Weisheng Wang
Salesforce AI Research
Weisheng Wang is a report-backed author in the LLMpeople atlas, connected through InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning.
Wenhu Chen
Microsoft
Wenhu Chen is a report-backed author in the LLMpeople atlas, connected through Kosmos-G: Generating Images in Context with Multimodal Large Language Models.
Wenjing Li
Google Gemini
Wenjing Li is listed as an author of the Google technical report Gemini: A Family of Highly Capable Multimodal Models.
Wenliang Dai
Salesforce AI Research
Wenliang Dai is a report-backed author in the LLMpeople atlas, connected through InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning.
Xiaohan Ding
ByteDance Seed
Xiaohan Ding is a report-backed author in the LLMpeople atlas, connected through SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.
Xichen Pan
Microsoft
Xichen Pan is a report-backed author in the LLMpeople atlas, connected through Kosmos-G: Generating Images in Context with Multimodal Large Language Models.
Yilin Wu
Google Gemini
Yilin Wu is listed as an author of the Google technical report Gemini: A Family of Highly Capable Multimodal Models.
Ying Shan
ByteDance Seed
Ying Shan is a report-backed author in the LLMpeople atlas, connected through SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.
Yixiao Ge
ByteDance Seed
Yixiao Ge is a report-backed author in the LLMpeople atlas, connected through SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.
Yu Sun
Google Gemini
Yu Sun is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Yuying Ge
ByteDance Seed
Yuying Ge is a report-backed author in the LLMpeople atlas, connected through SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.
Zhenkai Zhu
Google Gemini
Zhenkai Zhu is listed as an author of the Google technical report Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
Adam Lewkowycz
Google Gemini
Research scientist at Google DeepMind working on the theoretical foundations of machine learning.
Adrian Ibarz
Google Gemini
Adrian Ibarz is a Google DeepMind researcher whose public work spans machine learning, reasoning, and large multimodal models.
Alan Rabinovich
Google Gemini
Research scientist and founder of rabinovich.ai, with work spanning multimodal generative models, visual perception, and immersive experiences.
Anmol Kalra
Google Gemini
Anmol Kalra is a research scientist at Google DeepMind. His public homepage presents his work and publications in machine learning and AI systems.
Arun Narayanan
Google Gemini
Research scientist at Google DeepMind working on large language models and natural language processing.
Bhupendra Gupta
Google Gemini
Software engineer at Google DeepMind and PhD student at Cornell working on machine learning for social impact, with interests in LLMs, generative models, and optimization.
Chris McLeavey
Google Gemini
Chris McLeavey is a research scientist at Google DeepMind working on generalist multimodal models at the intersection of language and vision.
Danny Zhou
Google Gemini
Research scientist at Google DeepMind working on large language models and multimodal models. He earned a PhD in computer science from Stanford University.
David Krueger
Google Gemini
Assistant Professor in the Department of Computer Science and Technology at the University of Cambridge, with research focused on making AI systems safer, more efficient, and more robust.
David Uthus
Google Gemini
Research scientist at Google DeepMind focused on human-computer interaction, accessibility, and interfaces for AI systems.
Hao Ma
Google Gemini
Research scientist at Google DeepMind working on multimodal language models and long-context machine learning systems.
Jianfei Chen
Alibaba Qwen
Jianfei Chen is an assistant professor at Monash University. His research spans computer vision, machine learning, multimodality, and trustworthy AI.
Joaquin Ferrer
Google Gemini
Director of Product Management at Google DeepMind leading ML and AI platforms, model developer experiences, and workflows that power the Gemini app and API.
Jonathan Toulis
Google Gemini
Principal statistician at Google DeepMind whose work spans causal inference, statistics, and machine learning.
Joseph W. Demmel
Google Gemini
Distinguished professor emeritus of electrical engineering, computer science, and mathematics at UC Berkeley. His research focuses on numerical linear algebra, parallel computing, and communication-avoiding algorithms.
Julien Perolat
Google Gemini
Julien Perolat is a research scientist at Google DeepMind whose public homepage highlights work on game theory, multi-agent learning, reinforcement learning, and responsible AI.
Kelly Huang
Google Gemini
Research scientist at Google DeepMind working on large-scale multimodal models.
Kenrick Cato
Google Gemini
Google researcher whose publications include the Gemini technical report.
Kexuan Wei
Alibaba Qwen
Researcher working on multimodal foundation models, including Qwen3-Omni and related speech-language systems.
Lehou Cheng
Google Gemini
Postdoctoral researcher at UC Berkeley and Berkeley AI Research interested in natural language processing, machine learning, and human-computer interaction.
Linjun Yang
Alibaba Qwen
Research scientist in Tongyi Lab and technical lead of Qwen2.5-Omni, with public work on end-to-end speech understanding and generation.
Mahesh Shanmugam
Google Gemini
Mahesh Shanmugam is a research scientist at Google DeepMind whose public homepage highlights work on multimodal representation learning, self-supervised learning, and generative models.
Mark Bosma
Google Gemini
Mark Bosma is a senior research scientist at Google DeepMind. His public homepage highlights work in machine learning, reinforcement learning, and neural networks.
Mateusz Malinowski
Google Gemini
Research scientist at Google DeepMind in Switzerland working on large multimodal models and generative AI.
Matt Hoffman
Google Gemini
Researcher and engineer focused on machine learning, distributed systems, and applied algorithms; his personal site also highlights interests in psychology, neuroscience, and evolutionary biology.
Miao Du
Google Gemini
Research scientist at Google DeepMind and PhD student at Stanford University. His homepage highlights work on machine learning, reinforcement learning, language models, and recommendation systems.
Mina Lee
Google Gemini
Assistant Professor of Computer Science at the University of Southern California and incoming part-time Visiting Faculty Researcher at Google DeepMind; her research combines linguistic structure and machine learning for natural language processing.
Moe Drammeh
Google Gemini
Research scientist at Google DeepMind working on large language models, multimodal language models, and computer vision, according to his public OpenReview profile.
MohammadHassan Moghimi
Google Gemini
MohammadHassan Moghimi is a senior staff software engineer at Google DeepMind whose work focuses on multimodal models for vision and natural language, including parameter-efficient tuning, adaptation, and evaluation.
Natalie Bergas
Google Gemini
Research scientist at Google DeepMind working on multimodal machine learning, reinforcement learning, and mathematical optimization.
Paul Welbl
Google Gemini
Research scientist at Google DeepMind in London. He completed a PhD at the University of Oxford, where his work focused on natural language processing and computational argumentation.
Ravi Seethapathy
Google Gemini
Ravi Seethapathy is a research scientist at Google DeepMind. His public homepage presents work at the intersection of machine learning, science, and large-scale AI systems.
Rebecca Roelofs
Google Gemini
Senior research scientist at Google DeepMind with public work on machine learning evaluation, uncertainty, and reliability.
Rory Pilgrim
Google Gemini
Research scientist at Google DeepMind working on large language models and multimodal models. He completed a PhD in computer vision and machine learning at the University of Oxford.
Sasha Seneviratne
Google Gemini
Research scientist at Google DeepMind working on multimodal, multilingual, and efficient machine learning.
Siyao Guo
Google Gemini
Research scientist at Google DeepMind in New York working on vision-language and multimodal large language models. He is completing a PhD in computer science at Carnegie Mellon University.
Yong Cheng
Google Gemini
Research scientist at Google DeepMind in Mountain View working on large multimodal foundation models and agents. He received a PhD from the Chinese University of Hong Kong.