PaliGemma: A versatile 3B VLM for transfer
Vision-Language Models report from Google Gemini with 14 connected researchers in the LLMpeople atlas.
Connected researchers
Nan Ding
Google Gemini
Researcher at Google Research whose public work includes multimodal and vision-language modeling, with arXiv publications tied to PaliGemma and related transfer work.
Yonghui Wu
Google Gemini
Google researcher whose official profile says he joined Google in September 2008 and has been with Google Brain since January 2015, with research interests spanning information retrieval, machine learning, machine translation, and natural language processing.
Koray Kavukcuoglu
Google Gemini
Chief Technology Officer at Google DeepMind, with work spanning machine learning and reinforcement learning.
Jiahui Yu
Google Gemini
Jiahui Yu is a Research Lead at OpenAI leading the Perception team. His homepage notes prior co-leadership on Gemini Multimodal at Google DeepMind and work on deep learning and high-performance computing.
Radu Soricut
Google Gemini
Radu Soricut is a Distinguished Scientist at Google DeepMind working on natural language processing and machine learning, with earlier Google Research and Google Translate work.
Matthieu Devin
Google Gemini
Research scientist at Google DeepMind based in Paris, focused on deep learning and computer vision.
Nikolay Savinov
Google Gemini
Research scientist at Google DeepMind on the Gemini team, working on multimodal AI.
Leonardo Beyer
Google Gemini
Leonardo Beyer is a research scientist at Google DeepMind. His public homepage highlights work across representation learning, multimodal models, and large-scale machine learning systems.
Xiaohua Zhai
Google Gemini
Xiaohua Zhai is a researcher on the Google Research team in Zurich whose work focuses on large multimodal models and efficient deep learning.
William Kolesnikov
Google Gemini
Staff software engineer at Google DeepMind working on post-training, alignment, multimodal models, and data filtering. He previously worked on hardware and software co-design for machine learning.
Siyuan Li
Google Gemini / NVIDIA
Siyuan Li is a research scientist at NVIDIA working on large language models, multimodal foundation models, and reinforcement learning. His homepage says he received a PhD in computer science from the University of Toronto in 2024 and previously worked at Meta AI, Microsoft Research, and Mila.
Xinyi Chen
Google Gemini
Xinyi Chen is a PhD candidate in computer science at Princeton University and concurrently a research scientist at Google DeepMind. Her public homepage says she works at the intersection of machine learning, optimization, and dynamical systems, focusing on robust and efficient methods for sequential decision-making and control. It also says she completed her undergraduate studies in mathematics at Princeton.
Xiuye Gu
Google Gemini
Xiuye Gu is a researcher whose public work focuses on vision-language modeling and machine learning systems.
Maxwell Collins
Google Gemini
Maxwell Collins is a Research Scientist at Google DeepMind.