Researcher 13 reports

Daya Guo

DeepSeek / Moonshot AI

AI researcher at DeepSeek working on natural language processing, code intelligence, and large language model reasoning.

Researcher 4 reports

Radu Soricut

Google Gemini

Radu Soricut is a Distinguished Scientist at Google DeepMind working on natural language processing and machine learning, with earlier Google Research and Google Translate work.

Researcher 3 reports

Haoyu Lu

DeepSeek / Moonshot AI

Haoyu Lu is a PhD student at Renmin University of China working on multimodal foundation models and video understanding. His homepage highlights papers and code including DeepSeek-VL, UniAdapter, and VDT.

Researcher 2 reports

Wanli Ouyang

MiniMax

Wanli Ouyang is a professor at Shanghai AI Laboratory. His homepage says he is also with MMlab and the SIGMA lab, obtained a PhD from the Chinese University of Hong Kong, and works on AI4Science, computer vision, and pattern recognition.

Researcher 2 reports

Yue Cao

DeepSeek

CEO of Sand AI. His homepage describes prior work leading multimodal and vision research at BAAI and serving as a senior researcher at Microsoft Research Asia.

Researcher 1 report

Jifeng Dai

Shanghai AI Laboratory

Jifeng Dai is a tenured associate professor in electronic engineering at Tsinghua University and founder of Fundamental Vision. His research spans computer vision, deep learning, multimodal learning, and autonomous driving. He previously worked at Microsoft Research Asia and SenseTime Research, and he received both his bachelor's and PhD degrees from Tsinghua University.

Researcher 13 reports

Junyang Lin

Alibaba Qwen

Junyang Lin (Justin Lin) is a researcher and open-source maintainer known for the Qwen family of models. His public profiles list interests in LLMs, AI agents, multimodal learning, long-horizon reasoning, world models, and reinforcement learning; multiple March 2026 news reports said he stepped down from the Qwen tech lead role.

Researcher 3 reports

Jifeng Dai

DeepSeek / MiniMax

Jifeng Dai is a tenured associate professor in the Department of Electronic Engineering at Tsinghua University. His homepage says his current research focuses on agentic AI and continual learning, and lists prior roles at Shanghai AI Lab, SenseTime Research, and Microsoft Research Asia.

Researcher 1 report

Xingcheng Yao

Moonshot AI

Xingcheng Yao is a research scientist at Moonshot AI. His public profile notes prior work as a research engineer at Tencent AI Lab, a PhD in computer science from the University of Southern California, and research interests spanning NLP, multimodal systems, and AI agents.

Researcher 1 report

Xinyi Chen

Google Gemini

Xinyi Chen is a PhD candidate in computer science at Princeton University and concurrently a research scientist at Google DeepMind. Her public homepage says she works at the intersection of machine learning, optimization, and dynamical systems, focusing on robust and efficient methods for sequential decision-making and control, and that she previously completed undergraduate studies in mathematics at Princeton.

Researcher 3 reports

Hao Zhang

Moonshot AI / NVIDIA

Researcher at NVIDIA Research. Previously a PhD student in Computer Science and Engineering at HKUST, with earlier internships at International Digital Economy Academy and Microsoft Research.

Researcher 6 reports

Luke Zettlemoyer

Ai2

Luke Zettlemoyer works on empirical methods for natural language semantics, machine learning, new tasks and datasets, and self-supervision for pre-training.

Researcher 1 report

Pengyu Cheng

Moonshot AI

Pengyu Cheng is a researcher at Alibaba Group leading reinforcement-learning training for the Qwen large-model application team. His homepage also lists prior work with Moonshot AI and Tencent's Hunyuan large-model team.

Researcher 5 reports

Hao Yang

DeepSeek / Moonshot AI

Hao Yang works on multimodal data infrastructure at Moonshot.ai. He previously worked at ByteDance ICVG and Microsoft Research Asia, and received BS and PhD degrees from Tsinghua University.

Researcher 1 report

Yiheng Xu

Alibaba Qwen

Researcher at OpenAI whose homepage highlights work on document understanding, coding agents, and computer-use agents.

Researcher 6 reports

Runxin Xu

DeepSeek

Researcher at DeepSeek whose public homepage describes work on DeepSeek R1, V1, V2, V3, Math, Coder, and mixture-of-experts systems.

Researcher 8 reports

Jiahui Yu

Google Gemini

Jiahui Yu is a Research Lead at OpenAI leading the Perception team. His homepage notes prior co-leadership on Gemini Multimodal at Google DeepMind and work on deep learning and high-performance computing.

Researcher 6 reports

Shuai Bai

Alibaba Qwen

Senior algorithm expert at Alibaba Group working on large language models, multimodal large language models, and diffusion models.

Researcher 23 reports

Jingren Zhou

MiniMax / Moonshot AI

Jingren Zhou is Chief Technology Officer of Alibaba Cloud. Public speaker biographies describe him as a computer scientist and entrepreneur whose work includes large-scale AI and cloud systems.

Researcher 4 reports

Jian Yang

Alibaba Qwen

Jian Yang is an Associate Professor at Beihang University whose research focuses on code intelligence, large language models, and AI agents. He worked with Alibaba Qwen from 2023 to July 2025.

Researcher 7 reports

Huazuo Gao

DeepSeek

Researcher at DeepSeek AI working on decision-making and post-training for large language models.

Researcher 3 reports

Jiabo Ye

Alibaba Qwen

Research scientist in Tongyi Lab whose public homepage and OpenReview profile describe work on large language models, multimodal learning, and visual grounding. His public profiles also list affiliations with Alibaba Group and East China Normal University.

Researcher 3 reports

Jinze Bai

Alibaba Qwen

PhD student at The Hong Kong University of Science and Technology (Guangzhou) whose research interests include large language models, vision-language models, AI agents, and multimodal retrieval.

Researcher 2 reports

Yulun Du

Moonshot AI

Yulun Du is a Moonshot AI-affiliated researcher. Public profiles also show prior work and study at Carnegie Mellon University, including a Master of Language Technologies completed in 2020.

Researcher 2 reports

Siyuan Li

Google Gemini / NVIDIA

Siyuan Li is a research scientist at NVIDIA working on large language models, multimodal foundation models, and reinforcement learning. His homepage says he received a PhD in computer science from the University of Toronto in 2024 and previously worked at Meta AI, Microsoft Research, and Mila.

Researcher 2 reports

Flood Sung

Moonshot AI

Researcher and engineer focused on reinforcement learning and embodied intelligence; his public profile lists work spanning Huawei Noah's Ark Lab, Momenta, Moonshot AI, and XVI Robotics, and he is credited on Moonshot AI technical reports.

Researcher 2 reports

Liang Chen

Moonshot AI

Research scientist at Moonshot AI working on foundation models, multimodal large language models, and agents; previously worked at Huawei Noah's Ark Lab and studied at the Chinese University of Hong Kong.

Researcher 2 reports

Wei Ding

Alibaba Qwen

Research scientist at Alibaba working on multimodal learning and generation; previously a postdoctoral researcher at Carnegie Mellon University.

Researcher 2 reports

Yixiao Ge

Shanghai AI Laboratory

Yixiao Ge is a Research Scientist at Shanghai AI Laboratory and OpenGVLab. His work focuses on multimodal large language models, computer vision, efficient deep learning, and vision-language understanding.

Researcher 2 reports

Congcong Wang

Moonshot AI

Research scientist at Moonshot AI focused on large multimodal models and large language model post-training.

Researcher 2 reports

Deyao Zhu

DeepSeek

Researcher focused on AGI, multimodal models, and reasoning. Coauthor of Janus and JanusFlow.

Researcher 2 reports

Dongliang Wang

Moonshot AI

Dongliang Wang is a research scientist at Moonshot AI whose public profiles highlight multimodal large language models. His homepage also notes earlier PhD work at Shanghai AI Lab and Shanghai Jiao Tong University.

Researcher 2 reports

Huabin Zheng

Moonshot AI

Huabin Zheng is a research scientist at Moonshot AI. His homepage says he works on large language models, multi-agent systems, code generation, and game agents.

Researcher 2 reports

Jun Tang

Alibaba Qwen

Jun Tang works on multimodal foundation models, open-source language models, and agent systems. His personal site highlights work on Qwen and Qwen3-VL alongside related multimodal research.

Researcher 2 reports

Junzhe Pan

DeepSeek

PhD student at Tsinghua University focusing on multimodal large language models, reasoning, and reinforcement learning.

Researcher 2 reports

Keqin Chen

Alibaba Qwen

Researcher focused on large language models and multimodal learning, with public profiles linking Keqin Chen to Beihang University and to Qwen vision-language model work.

Researcher 2 reports

Tao Yu

Moonshot AI

Assistant Professor of Computer Science at the University of Hong Kong and director of XLANG Lab, focusing on natural language processing and embodied AI agents.

Researcher 2 reports

Xiaoqian Shen

DeepSeek

PhD student at Tsinghua University focusing on LLM reasoning, RLHF, and multimodal large language models; research intern at DeepSeek.

Researcher 2 reports

Zesen Cheng

Alibaba Qwen

Qwen researcher and author on the Qwen2-VL and Qwen2.5-VL technical reports, with public profiles linking his work to multimodal and vision-language systems.

Researcher 1 report

Andrea Steiner

Google Gemini

Research scientist at Google DeepMind working on multimodal generative models, visual generation, and image editing; previously completed a PhD at TU Munich.

Researcher 1 report

Jiahao Liu

Alibaba Qwen

Jiahao Liu works on multimodal large language models, reasoning systems, and continual learning. His public profiles connect him to the Qwen2.5-VL technical report and related open research work.

Researcher 1 report

Sangho Lee

Ai2

Researcher at the Allen Institute for AI (Ai2) working on vision-language and multimodal AI, with a focus on reliable reasoning and understanding beyond text.

Researcher 1 report

Xi Zhang

Alibaba Qwen

Xi Zhang works on multimodal and vision-language model research. Public profiles connect him to Qwen2-VL and related open research projects.

Researcher 7 reports

Noah A. Smith

Ai2

Noah A. Smith is a computer scientist and professor at the University of Washington, where he serves as Vice Provost for Artificial Intelligence and co-directs the OLMo open language modeling effort with Ai2. His research focuses on natural language processing, machine learning, and evaluation methodology.

Researcher 5 reports

Caiming Xiong

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of the xLAM-2 Technical Report.

Researcher 3 reports

Mingkun Yang

Alibaba Qwen

Mingkun Yang works on multimodal large language models, embodied AI, and robotics. His public profile says he is a postdoc at Zhejiang University and a research scientist at Qwen.

Researcher 2 reports

Angang Du

Moonshot AI

Research Scientist at Moonshot AI whose public work focuses on large language models, multimodal models, and embodied AI; he previously earned a PhD from Zhejiang University and was a visiting student at Oxford.

Researcher 2 reports

Jianlin Su

Moonshot AI

Research scientist and writer behind Scientific Spaces whose public profile lists work on large language models and service on the Kimi team at Moonshot AI.

Researcher 2 reports

Jianqiang Wan

Alibaba Qwen

Research scientist in Alibaba DAMO Academy's Tongyi Lab working on multimodal learning, vision-language models, and embodied AI; author on the Qwen2-VL and Qwen2.5-VL technical reports.

Researcher 2 reports

Lucas Beyer

Google Gemini

Lucas Beyer is an ML researcher at Google DeepMind in Zurich. His public homepage highlights prior work at Google Brain and a PhD at ETH Zurich.

Researcher 2 reports

Maarten Sap

Ai2

Maarten Sap is an assistant professor at the University of Washington and a senior research scientist at the Allen Institute for AI. His work focuses on human-centered language technologies and social NLP.

Researcher 2 reports

Nikolay Savinov

Google Gemini

Research scientist at Google DeepMind on the Gemini team, working on multimodal AI.

Researcher 2 reports

Qing Yu

DeepSeek

Researcher at DeepSeek and a first-year computer science PhD student at the University of Science and Technology of China; works on multimodal reasoning and world models; coauthor of Janus.

Researcher 2 reports

Shaowei Liu

Moonshot AI

Researcher working on multimodal learning and vision-language systems, with public academic work on visual question answering and related topics.

Researcher 2 reports

Y. Charles

Moonshot AI

Research scientist at Moonshot AI focused on multimodal large language models.

Researcher 2 reports

Yunfei Chu

Alibaba Qwen

Algorithm expert at Alibaba Group working on computer vision, multimodal learning, and large language models.

Researcher 2 reports

Yuqi Wang

DeepSeek

Research scientist at DeepSeek and PhD student at the University of Illinois Urbana-Champaign working on multimodal foundation models, large language models, and embodied AI.

Researcher 2 reports

Zaida Zhou

Moonshot AI

Associate research scientist at Moonshot AI based in Beijing, China; previously worked as a postdoctoral researcher.

Researcher 2 reports

Zhibo Yang

Alibaba Qwen

Zhibo Yang works on multimodal and vision-language systems. Public profiles connect him to the Qwen2.5-VL technical report and to an individual GitHub account that links back to his personal site.

Researcher 2 reports

Zhiqi Huang

Moonshot AI

Machine learning researcher at Moonshot AI and incoming assistant professor at Shanghai Jiao Tong University.

Researcher 8 reports

Junxiao Song

DeepSeek

DeepSeek report author whose DBLP record includes DeepSeek LLM, DeepSeekMath, DeepSeek-Coder-V2, DeepSeek-V3, DeepSeek-R1, Janus, and JanusFlow work.

Researcher 7 reports

Haowei Zhang

DeepSeek

DeepSeek report author whose DBLP-linked publication record includes DeepSeek LLM, DeepSeek-Coder-V2, Janus, DeepSeek-V3, and DeepSeek-R1 work.

Researcher 7 reports

Dejian Yang

DeepSeek

DeepSeek team member and co-author of the DeepSeek-V3, DeepSeek-V2, and DeepSeek LLM technical reports.

Researcher 5 reports

Peng Wang

Alibaba Qwen

Alibaba Qwen report author whose DBLP profile identifies an Alibaba Group affiliation and Qwen technical report authorship.

Researcher 5 reports

Wenbin Ge

Alibaba Qwen

Alibaba Qwen report author whose DBLP record includes Qwen2.5-VL and Qwen technical report work on multimodal and large language models.

Researcher 5 reports

Yonghui Wu

Google Gemini

Google researcher whose official profile says he joined Google in September 2008 and has been with Google Brain since January 2015, with research interests spanning information retrieval, machine learning, machine translation, and natural language processing.

Researcher 2 reports

Yejin Choi

Ai2

Dieter Schwarz Foundation Professor and Senior Fellow in Stanford Computer Science and HAI. Her public homepage notes previous roles as professor at the University of Washington and senior director at Ai2.

Researcher 1 report

Leonardo Beyer

Google Gemini

Leonardo Beyer is a research scientist at Google DeepMind. His public homepage highlights work across representation learning, multimodal models, and large-scale machine learning systems.

Researcher 1 report

Nuo Xu

Moonshot AI

Multimodal and omni-model engineer whose public profile lists Moonshot AI experience and Kimi-VL among recent projects.

Researcher 1 report

Yufei Zhang

DeepSeek

Researcher at the University of Illinois Urbana-Champaign focused on vision-language models, multimodal large language models, and physical AI.

Researcher 1 report

Yuqing Wang

DeepSeek

Research intern at DeepSeek and PhD student at Princeton University whose research interests include large language models and multimodal foundation models.

Researcher 1 report

Zhengyang Wang

DeepSeek

Research intern at DeepSeek and master's student at Renmin University of China working on multimodal large language models and AI agents.

Researcher 1 report

Zheren Fu

Alibaba Qwen

Tongyi Lab researcher working on large language models, vision-language models, and reinforcement learning; public profiles connect Zheren Fu to the Qwen2-VL technical report.

Researcher 7 reports

Ali Farhadi

Ai2

CEO of the Allen Institute for AI and professor of computer science at the University of Washington. His work spans computer vision, multimodal learning, reasoning, and embodied AI.

Researcher 9 reports

An Yang

Alibaba Qwen

Alibaba researcher working on large language models and multimodal pretraining; public research profiles connect An Yang to Qwen-related work and earlier study at Peking University.

Researcher 8 reports

Kai Dang

Alibaba Qwen

Researcher on Alibaba's Qwen team focused on large language models and NLP, with public research profiles listing a Nankai University background.

Researcher 3 reports

Koray Kavukcuoglu

Google Gemini

Chief Technology Officer at Google DeepMind, with work spanning machine learning and reinforcement learning.

Researcher 3 reports

Xinlong Wang

DeepSeek

Xinlong Wang is a researcher working across computer vision, embodied AI, robotics, and machine learning. Public profiles link him to OpenGVLab and Shanghai AI Laboratory, and he is a coauthor of DeepSeek-VL2.

Researcher 2 reports

Chenzhuang Du

Moonshot AI

Technical staff member at Moonshot AI whose public profile highlights work on web and app agents, multimodal systems, reinforcement learning, and LLMs.

Researcher 2 reports

Dikang Du

Moonshot AI

Dikang Du is a research scientist at Moonshot AI. His homepage says he received a PhD from Cornell University and works on natural language processing, machine learning, and multimodal learning.

Researcher 2 reports

Hao Hu

Moonshot AI

Technical staff member at Moonshot AI working on general AI agents, reinforcement learning, and multimodal foundation models.

Researcher 2 reports

Haoning Wu

Moonshot AI

PhD student in computer science at the University of Hong Kong working in vision and machine intelligence.

Researcher 2 reports

Lin Sui

Moonshot AI

Researcher in computer vision and multimodal learning. Public profile lists PhD study in computer science and engineering at HKUST under Qifeng Chen.

Researcher 2 reports

Christopher Clark

Ai2

Christopher Clark is a researcher working on language models, efficient inference, and trustworthy NLP systems. His public profile highlights work at the intersection of NLP, efficiency, and model evaluation.

Researcher 2 reports

Tianbao Xie

Alibaba Qwen

Research scientist on the Qwen team at Alibaba Group, focusing on foundation models and language agents. He received a PhD in computer science from the University of Illinois Urbana-Champaign.

Researcher 8 reports

Wenfeng Liang

DeepSeek

Wenfeng Liang, also known as Liang Wenfeng, is linked to DeepSeek technical reports in LLMpeople and is identified in public references as the founder and CEO of DeepSeek.

Researcher 7 reports

Fei Huang

Alibaba Qwen

Alibaba Qwen report author listed on Qwen, Qwen2.5, Qwen2.5-1M, Qwen3, Qwen3 Embedding, QwQ-32B, and Qwen-VL reports, with report-backed work on large language models, embeddings, reranking, and multimodal models.

Researcher 5 reports

Zeyu Cui

Alibaba Qwen

Zeyu Cui is listed as an author of the Qwen3 Technical Report.

Researcher 5 reports

Yang Fan

Alibaba Qwen

Alibaba Qwen report author listed on Qwen, Qwen2.5, Qwen3, Qwen-VL, and Qwen-Image technical reports, with report-backed work on large language models, vision-language models, and image generation.

Researcher 4 reports

Xiaodong Deng

Alibaba Qwen

Research scientist in Tongyi Lab whose official profile highlights post-training and multimodal large language models.

Researcher 1 report

Antonio Torralba

Google Gemini

Antonio Torralba is the Delta Electronics Professor in the EECS Department at MIT and a member of CSAIL whose research focuses on computer vision, visual learning, and scene understanding.

Researcher 1 report

Jinbo Zhao

Alibaba Qwen

PhD student in CSLT at Tsinghua University working on large language models, multimodal large language models, and speech-language models; publication context connects Jinbo Zhao to the Qwen2.5-VL technical report.

Researcher 1 report

Wenhai Wang

DeepSeek

Wenhai Wang is a researcher working on visual perception foundation models, efficient learning, and multimodal large models. Public profiles list him with OpenGVLab and Shanghai AI Laboratory, and he is a coauthor of DeepSeek-VL2.

Researcher 3 reports

Yuanzhi Zhu

Alibaba Qwen

Yuanzhi Zhu is a Qwen researcher whose public work includes multimodal and audio-language models.

Researcher 3 reports

Jena D. Hwang

Ai2

Research scientist at the Allen Institute for AI (Ai2) whose work focuses on natural language understanding and commonsense reasoning.

Researcher 3 reports

Pradeep Dasigi

Ai2

Research scientist on the AllenNLP team at the Allen Institute for AI, focused on post-training language models.

Researcher 2 reports

Bowen Wang

Moonshot AI

PhD student at the University of Hong Kong who worked as a research intern at Moonshot AI in 2025 and studies digital agents, computer-use agents, and multimodal intelligence.

Researcher 2 reports

Cheng Chen

Moonshot AI

Research scientist at Moonshot AI with public profiles covering large language models, diffusion models, and generative AI.

Researcher 2 reports

Jiaqi Deng

Moonshot AI

Computer science graduate from the University of Hong Kong who worked as a research intern at Moonshot AI on general-purpose computer-use agents.

Researcher 2 reports

Jin Xie

Moonshot AI

Researcher at Moonshot AI with public homepage and GitHub profiles under the name Xixia Zhong.

Researcher 2 reports

Kun Ouyang

Moonshot AI

Technical staff at Moonshot AI working on large language model reasoning, agents, and multimodal large models.

Researcher 2 reports

Matthieu Devin

Google Gemini

Research scientist at Google DeepMind based in Paris, focused on deep learning and computer vision.

Researcher 2 reports

Weixin Xu

Moonshot AI

Research scientist at Moonshot AI with public GitHub and Google Scholar profiles covering efficient inference and multimodal systems.

Researcher 2 reports

Xiaodong Zhu

DeepSeek

Research intern at DeepSeek and master's student at Tsinghua University working on large language models, multimodal models, and reinforcement learning.

Researcher 2 reports

Xiaohua Zhai

Google Gemini

Xiaohua Zhai is a researcher on the Google Research team in Zurich whose work focuses on large multimodal models and efficient deep learning.

Researcher 2 reports

Xiaokun Yuan

Moonshot AI

AI researcher at Moonshot AI with a public homepage and Google Scholar profile spanning robust AI, computer vision, and multimodal systems.

Researcher 2 reports

Yibo Miao

Moonshot AI

Moonshot AI researcher working on large language models, coding agents, and multimodal safety; his public homepage also documents earlier study at Shanghai Jiao Tong University and Huazhong University of Science and Technology.

Researcher 2 reports

Yiqin Wang

Moonshot AI

Researcher at Moonshot AI with a personal homepage and GitHub profile covering machine learning research.

Researcher 2 reports

Yuzhi Wang

Moonshot AI

Researcher at Moonshot AI focused on large language models, computational photography, and low-level computer vision; previously worked at Megvii and completed a PhD and postdoc at Tsinghua University.

Researcher 2 reports

Zhaowei Li

Moonshot AI

Research scientist at Moonshot AI working on multimodal AI agents, large multimodal models, video generation, speech, machine learning systems, and AI for science.

Researcher 2 reports

Yuzi Yan

Moonshot AI

PhD student in Computer Science and Technology at Tsinghua University with public research interests in machine learning, natural language processing, and large language models.

Researcher 1 report

Sifan Zhou

DeepSeek

DeepSeek report author listed on DeepSeek-VL2, with report-backed work on mixture-of-experts vision-language models and multimodal understanding.

Researcher 1 report

Han Zhu

Moonshot AI

Machine learning researcher with a public homepage and GitHub profile covering AI research and engineering projects.

Researcher 1 report

Hao Fei

Shanghai AI Laboratory

Research scientist at Shanghai AI Laboratory working on NLP and multimodal AI, and a co-author of InternLM-XComposer2.5.

Researcher 1 report

Jiaming Guo

DeepSeek

PhD student at The Chinese University of Hong Kong focused on multimodal reasoning, optical character recognition, and document parsing; coauthor of DeepSeek-VL.

Researcher 1 report

Liangtao Shi

Ai2

Research scientist at the Allen Institute for AI working on multimodal large language models, embodied agents, and reasoning for robots and games.

Researcher 1 report

Matt Deitke

Ai2

Matt Deitke is a researcher at Ai2 whose public homepage and Google Scholar profile highlight work on multimodal learning, vision-language models, embodied AI, and open models.

Researcher 1 report

Molly S. Lewis

Ai2

Molly S. Lewis is an Assistant Professor of Psychology at Princeton University whose research examines how language is shaped by social and cultural structure.

Researcher 1 report

Roi Reichart

Ai2

Professor at the Technion and head of the CHIA Lab, with research spanning natural language processing, machine learning, and social-good applications.

Researcher 1 report

Xiuye Gu

Google Gemini

Xiuye Gu is a researcher whose public work focuses on vision-language modeling and machine learning systems.

Researcher 1 report

Yan Zhong

Moonshot AI

Research scientist at Kimi AI (Moonshot AI). Previously completed a PhD in computer science at the University of Wisconsin-Madison.

Researcher 3 reports

Yao Lu

DeepSeek / Google Gemini

Yao Lu is listed as an author of the Google technical report Gemini Robotics: Bringing AI into the Physical World.

Researcher 3 reports

Zheng Zhang

Moonshot AI

Publicly available Moonshot AI technical reports list Zheng Zhang as a coauthor of Kimi-VL and Kimi K2. The available public evidence supports research authorship on language and multimodal systems but not a separately verified individual employer profile.

Researcher 3 reports

Chang Zhou

Alibaba Qwen

Qwen researcher and co-lead whose work focuses on pretraining and post-training, multimodal models, agent systems, and large-scale model infrastructure.

Researcher 3 reports

Shijie Wang

Alibaba Qwen

Senior research scientist in Tongyi Lab whose official profile highlights post-training, AI for science, evaluation and alignment, multimodal reasoning, and large language model reasoning.

Researcher 1 report

Maxwell Collins

Google Gemini

Maxwell Collins is a Research Scientist at Google DeepMind.

Researcher 3 reports

Dahua Lin

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of the InternLM2 Technical Report.

Researcher 3 reports

Jiaqi Wang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of the InternLM2 Technical Report.

Researcher 3 reports

Yu Qiao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of the InternLM2 Technical Report.

Researcher 3 reports

Shujie Wang

DeepSeek

First-year PhD student at Shanghai Jiao Tong University focused on multimodal large language models, text-to-image generation, and image/video generation; coauthor of DeepSeek-VL2.

Researcher 3 reports

Yonggang Zhang

DeepSeek

Yonggang Zhang is a researcher whose public OpenReview profile includes the paper DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding.

Researcher 2 reports

Jiaqi Gao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of the InternLM2 Technical Report.

Researcher 2 reports

Niket Tandon

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of the XGen-7B Technical Report.

Researcher 2 reports

Prasad Reddy Yadati

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of the XGen-7B Technical Report.

Researcher 2 reports

Shan Lu

DeepSeek

Shan Lu is listed as an author of the DeepSeek technical report JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation.

Researcher 2 reports

Xiaogang Wang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Researcher 2 reports

Xinxing Zu

Moonshot AI

Xinxing Zu is listed as an author of the Moonshot AI technical report Kimi K2.5: Visual Agentic Intelligence.

Yafei Wen portrait
Researcher 2 reports

Yafei Wen

MiniMax

Yafei Wen is listed as an author of MiniMax-Text-01, a MiniMax technical report in the LLMpeople catalog.

Yijia Shao portrait
Researcher 2 reports

Yijia Shao

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of the XGen-7B Technical Report.

Yuhang Zang portrait
Researcher 2 reports

Yuhang Zang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Zhenguo Li portrait
Researcher 2 reports

Zhenguo Li

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Zhe Wang portrait
Researcher 2 reports

Zhe Wang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of the InternLM2 Technical Report.

Zhongyue Zhang portrait
Researcher 2 reports

Zhongyue Zhang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Ziyu Shao portrait
Researcher 2 reports

Ziyu Shao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Adam Koepke portrait
Researcher 1 reports

Adam Koepke

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Ailin Qiu portrait
Researcher 1 reports

Ailin Qiu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Akshay Gupta portrait
Researcher 1 reports

Akshay Gupta

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Alon Albalak portrait
Researcher 1 reports

Alon Albalak

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Ankush Garg portrait
Researcher 1 reports

Ankush Garg

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Chien-Sheng Wu portrait
Researcher 1 reports

Chien-Sheng Wu

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Chris Alberti portrait
Researcher 1 reports

Chris Alberti

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Chun-Liang Li portrait
Researcher 1 reports

Chun-Liang Li

Apple

Chun-Liang Li is listed as a core author of the FastVLM paper, with Apple affiliation and an @apple.com contact address in the report HTML.

Chunping Li portrait
Researcher 1 reports

Chunping Li

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Conghui He portrait
Researcher 1 reports

Conghui He

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

David Yang portrait
Researcher 1 reports

David Yang

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Dian Shen portrait
Researcher 1 reports

Dian Shen

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Fengyun Rao portrait
Researcher 1 reports

Fengyun Rao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Haidong Duan portrait
Researcher 1 reports

Haidong Duan

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Han Zhou portrait
Researcher 1 reports

Han Zhou

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Hongshan Yu portrait
Researcher 1 reports

Hongshan Yu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Hu Xu portrait
Researcher 1 reports

Hu Xu

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Jaeson Jang portrait
Researcher 1 reports

Jaeson Jang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Jiahao Huang portrait
Researcher 1 reports

Jiahao Huang

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Jiajin Wu portrait
Researcher 1 reports

Jiajin Wu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Jianbin Jiao portrait
Researcher 1 reports

Jianbin Jiao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Jiangning Zhang portrait
Researcher 1 reports

Jiangning Zhang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Jian Guo portrait
Researcher 1 reports

Jian Guo

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Jiarui Wang portrait
Researcher 1 reports

Jiarui Wang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Jie Tang portrait
Researcher 1 reports

Jie Tang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Jingren Zhou portrait
Researcher 1 reports

Jingren Zhou

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Jingyao Ye portrait
Researcher 1 reports

Jingyao Ye

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Jinpeng Wang portrait
Researcher 1 reports

Jinpeng Wang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Jonathan Young portrait
Researcher 1 reports

Jonathan Young

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Jun Liu portrait
Researcher 1 reports

Jun Liu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Justin Wang portrait
Researcher 1 reports

Justin Wang

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Kai Chen portrait
Researcher 1 reports

Kai Chen

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Kai Chen portrait
Researcher 1 reports

Kai Chen

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Kaipeng Zhang portrait
Researcher 1 reports

Kaipeng Zhang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Kaizheng Wang portrait
Researcher 1 reports

Kaizheng Wang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Kiyoung Song portrait
Researcher 1 reports

Kiyoung Song

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Likun Wang portrait
Researcher 1 reports

Likun Wang

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Lintao Zhang portrait
Researcher 1 reports

Lintao Zhang

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Mauro Caccia portrait
Researcher 1 reports

Mauro Caccia

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Meng Liao portrait
Researcher 1 reports

Meng Liao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Mingqi Gao portrait
Researcher 1 reports

Mingqi Gao

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Minlie Huang portrait
Researcher 1 reports

Minlie Huang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Nuan Wen portrait
Researcher 1 reports

Nuan Wen

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Pan Zhang portrait
Researcher 1 reports

Pan Zhang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Peng Gao portrait
Researcher 1 reports

Peng Gao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Peter Henderson portrait
Researcher 1 reports

Peter Henderson

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Ping Luo portrait
Researcher 1 reports

Ping Luo

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Qinglin Lu portrait
Researcher 1 reports

Qinglin Lu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Sebastian Borgeaud portrait
Researcher 1 reports

Sebastian Borgeaud

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Shaodong Wang portrait
Researcher 1 reports

Shaodong Wang

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Shijie Cao portrait
Researcher 1 reports

Shijie Cao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Shu Liu portrait
Researcher 1 reports

Shu Liu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Siyuan Yang portrait
Researcher 1 reports

Siyuan Yang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Tianning Zhao portrait
Researcher 1 reports

Tianning Zhao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Wangtianyu Luo portrait
Researcher 1 reports

Wangtianyu Luo

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Weijie Liu portrait
Researcher 1 reports

Weijie Liu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Wei Liu portrait
Researcher 1 reports

Wei Liu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Weizhu Chen portrait
Researcher 1 reports

Weizhu Chen

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Wenbo Chen portrait
Researcher 1 reports

Wenbo Chen

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Wenhai Wang portrait
Researcher 1 reports

Wenhai Wang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Wenyan Cong portrait
Researcher 1 reports

Wenyan Cong

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Xiangyu Yue portrait
Researcher 1 reports

Xiangyu Yue

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Xiaohan Ding portrait
Researcher 1 reports

Xiaohan Ding

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Xiaowen Zhang portrait
Researcher 1 reports

Xiaowen Zhang

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Xiaoyi Dong portrait
Researcher 1 reports

Xiaoyi Dong

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Xinyu Gao portrait
Researcher 1 reports

Xinyu Gao

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Xin Zhang portrait
Researcher 1 reports

Xin Zhang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Xunliang Cai portrait
Researcher 1 reports

Xunliang Cai

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Yabin Zhang portrait
Researcher 1 reports

Yabin Zhang

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Yao Zhu portrait
Researcher 1 reports

Yao Zhu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Yimeng Zhu portrait
Researcher 1 reports

Yimeng Zhu

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Yingjie Chen portrait
Researcher 1 reports

Yingjie Chen

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Yiping Wang portrait
Researcher 1 reports

Yiping Wang

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Yiwen Lu portrait
Researcher 1 reports

Yiwen Lu

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Yiwen Luo portrait
Researcher 1 reports

Yiwen Luo

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Yongqiang Ma portrait
Researcher 1 reports

Yongqiang Ma

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Yucheng Zou portrait
Researcher 1 reports

Yucheng Zou

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Yuchen Zhou portrait
Researcher 1 reports

Yuchen Zhou

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Yuchong Xiao portrait
Researcher 1 reports

Yuchong Xiao

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Yujia Qin portrait
Researcher 1 reports

Yujia Qin

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

Yulong Chen portrait
Researcher 1 reports

Yulong Chen

Salesforce AI Research

Researcher at Salesforce AI Research and coauthor of XGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

Yuntao Liu portrait
Researcher 1 reports

Yuntao Liu

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Yutaka Matsuo portrait
Researcher 1 reports

Yutaka Matsuo

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Yutao Yue portrait
Researcher 1 reports

Yutao Yue

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory and coauthor of InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

Zehuan Yuan portrait
Researcher 1 reports

Zehuan Yuan

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Zhaoye Yang portrait
Researcher 1 reports

Zhaoye Yang

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Zihan Dong portrait
Researcher 1 reports

Zihan Dong

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Zihan Wei portrait
Researcher 1 reports

Zihan Wei

ByteDance Seed

Researcher at ByteDance Seed and coauthor of the Seed1.5-VL Technical Report.

Yu Qiao portrait
Researcher 2 reports

Yu Qiao

MiniMax

Yu Qiao is listed as an author of the MiniMax technical report MiniMax-01: Scaling Foundation Models with Lightning Attention.

Zihao Huang portrait
Researcher 2 reports

Zihao Huang

Moonshot AI

Zihao Huang is listed as an author of Moonshot AI's Kimi-VL Technical Report.

Aman Singh portrait
Researcher 2 reports

Aman Singh

DeepSeek

Research intern at DeepSeek and PhD student at Stanford University working on generative vision-language models, large language models, and large-scale training.

Bohong Yin portrait
Researcher 2 reports

Bohong Yin

Moonshot AI

Research scientist at Moonshot AI focused on machine learning systems; public profiles note prior PhD study at the Max Planck Institute and Technical University of Munich.

Bowei Xing portrait
Researcher 2 reports

Bowei Xing

Moonshot AI

Technical Staff at Moonshot AI.

Bowen Qu portrait
Researcher 2 reports

Bowen Qu

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report and Kimi K2.5: Visual Agentic Intelligence.

Chu Wei portrait
Researcher 2 reports

Chu Wei

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report and Kimi K2.5: Visual Agentic Intelligence.

Dehao Zhang portrait
Researcher 2 reports

Dehao Zhang

Moonshot AI

Technical staff member at Moonshot AI and machine learning researcher; public profiles note prior study at the Gaoling School of AI at Renmin University of China.

Enming Yuan portrait
Researcher 2 reports

Enming Yuan

Moonshot AI

Research scientist at Moonshot AI with public scholarly work on multimodal learning and generative models.

Enzhe Lu portrait
Researcher 2 reports

Enzhe Lu

Moonshot AI

PhD student in Computer Science at the University of Hong Kong. His research interests include multimodal large language models and embodied AI, and he co-authored the Kimi-VL technical report.

Fang Li portrait
Researcher 2 reports

Fang Li

Moonshot AI

Research Scientist at Moonshot AI.

Guokun Lai portrait
Researcher 2 reports

Guokun Lai

Moonshot AI

Research scientist at Moonshot AI whose work focuses on large foundation models and multimodal models.

Haiyang Xu portrait
Researcher 2 reports

Haiyang Xu

Alibaba Qwen

Independent researcher focused on multimodal learning, document intelligence, and efficient training; coauthor of Qwen2.5-VL and mPLUG-related vision-language systems.

Hang Zhang portrait
Researcher 2 reports

Hang Zhang

Alibaba Qwen

Researcher at Alibaba Group working on multimodal large language models; public profile and publication context connect Hang Zhang to the Qwen2-VL technical report.

Hao Ding portrait
Researcher 2 reports

Hao Ding

Moonshot AI

Research scientist at Moonshot AI with public scholarly work on multimodal learning and computer vision.

Haotian Yao portrait
Researcher 2 reports

Haotian Yao

Moonshot AI

Research scientist at Moonshot AI who previously studied at Tsinghua University and works on large foundation models.

Hongcheng Gao portrait
Researcher 2 reports

Hongcheng Gao

Moonshot AI

Generative AI researcher at Moonshot AI with public work spanning computational imaging and AI systems.

Jialin Wang portrait
Researcher 2 reports

Jialin Wang

Alibaba Qwen

Research scientist in Tongyi Lab and contributor to Qwen2-VL, with public work on multimodal large language models.

Jiezhong Qiu portrait
Researcher 2 reports

Jiezhong Qiu

Moonshot AI

Researcher at Moonshot AI with public GitHub and scholarly profiles covering machine learning and AI systems.

Jinhong Wang portrait
Researcher 2 reports

Jinhong Wang

Moonshot AI

Technical Staff at Moonshot AI.

Junjie Yan portrait
Researcher 2 reports

Junjie Yan

Moonshot AI

Research scientist at Moonshot AI with public work on computer vision and multimodal models.

Longhui Yu portrait
Researcher 2 reports

Longhui Yu

Moonshot AI

Research Scientist at Moonshot AI.

Mengfan Dong portrait
Researcher 2 reports

Mengfan Dong

Moonshot AI

Technical Staff at Moonshot AI.

Mengnan Dong portrait
Researcher 2 reports

Mengnan Dong

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report and Kimi K2.5: Visual Agentic Intelligence.

Nan Ding portrait
Researcher 2 reports

Nan Ding

Google Gemini

Researcher at Google Research whose public work includes multimodal and vision-language modeling, with arXiv publications tied to PaliGemma and related transfer work.

Pengfei Wang portrait
Researcher 2 reports

Pengfei Wang

Alibaba Qwen

Research scientist in Alibaba DAMO Academy's Tongyi Lab working on machine learning, computer vision, and multimodal large language models; author on the Qwen2-VL and Qwen2.5-VL technical reports.

Qizheng Gu portrait
Researcher 2 reports

Qizheng Gu

Moonshot AI

Research scientist at Moonshot AI with public work on language models and reasoning.

Rui Hu portrait
Researcher 2 reports

Rui Hu

DeepSeek

PhD student at the University of Science and Technology of China focused on machine learning and multimodal understanding and generation; coauthor of Janus.

Runjie Zhou portrait
Researcher 2 reports

Runjie Zhou

Moonshot AI

Research scientist at Moonshot AI and PhD student at Shanghai Jiao Tong University whose homepage highlights multimodal understanding, generation, large language models, and agents.

Sibo Song portrait
Researcher 2 reports

Sibo Song

Alibaba Qwen

Research scientist in Tongyi Lab and maintainer of Qwen-VL, with public work on vision-language models.

Tianhui Song portrait
Researcher 2 reports

Tianhui Song

Moonshot AI

Research Scientist at Moonshot AI.

Tongtong Bai portrait
Researcher 2 reports

Tongtong Bai

Moonshot AI

Research scientist at Moonshot AI and the University of Wisconsin-Madison with public work on large language models and reasoning.

Weiran He portrait
Researcher 2 reports

Weiran He

Moonshot AI

Research scientist at Moonshot AI whose public GitHub profile highlights work on multimodal large language models and agents.

Weixiao Huang portrait
Researcher 2 reports

Weixiao Huang

Moonshot AI

Research Scientist at Moonshot AI.

Xinhao Li portrait
Researcher 2 reports

Xinhao Li

Moonshot AI

Research scientist at Moonshot AI with public scholarly work on multimodal and long-context model research.

Xinyuan Wang portrait
Researcher 2 reports

Xinyuan Wang

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report and Kimi K2.5: Visual Agentic Intelligence.

Xinyu Luo portrait
Researcher 2 reports

Xinyu Luo

DeepSeek

PhD student at Shanghai Jiao Tong University working on multimodal large language models and image understanding and generation; coauthor of Janus.

Xinyu Zhou portrait
Researcher 2 reports

Xinyu Zhou

Moonshot AI

Technical Staff at Moonshot AI.

Xuejing Liu portrait
Researcher 2 reports

Xuejing Liu

Alibaba Qwen

Xuejing Liu is a researcher whose public OpenReview profile includes the Qwen2-VL and Qwen2.5-VL technical reports.

Yang Li portrait
Researcher 2 reports

Yang Li

Moonshot AI

Co-founder and chief executive officer of Moonshot AI.

Yangyang Hu portrait
Researcher 2 reports

Yangyang Hu

Moonshot AI

Researcher at Moonshot AI with a public GitHub profile and work spanning machine learning systems.

Yanru Chen portrait
Researcher 2 reports

Yanru Chen

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report and Kimi K2.5: Visual Agentic Intelligence.

Yejie Wang portrait
Researcher 2 reports

Yejie Wang

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report and Kimi K2.5: Visual Agentic Intelligence.

Yibo Liu portrait
Researcher 2 reports

Yibo Liu

Moonshot AI

Research scientist at Moonshot AI whose public profile highlights work on multimodal generation, multimodal large language models, and efficient LLMs.

Yimin Chen portrait
Researcher 2 reports

Yimin Chen

Moonshot AI

Research scientist at Moonshot AI with public GitHub projects spanning language models and multimodal systems.

Yiping Bao portrait
Researcher 2 reports

Yiping Bao

Moonshot AI

Researcher at Moonshot AI with a public GitHub profile covering AI systems work.

Yuanxin Liu portrait
Researcher 2 reports

Yuanxin Liu

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report and Kimi K2.5: Visual Agentic Intelligence.

Yu Han portrait
Researcher 2 reports

Yu Han

Alibaba Qwen

Researcher whose Google Scholar profile lists an Alibaba Group affiliation; coauthor of the Qwen technical report.

Yuhao Dong portrait
Researcher 2 reports

Yuhao Dong

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report and Kimi K2.5: Visual Agentic Intelligence.

Yukang Chen portrait
Researcher 2 reports

Yukang Chen

DeepSeek

PhD student at The University of Hong Kong focused on large multimodal models and data-centric AI, especially multimodal understanding and generation; coauthor of Janus.

Yuxin Wu portrait
Researcher 2 reports

Yuxin Wu

Moonshot AI

Researcher at Moonshot AI with public GitHub projects spanning AI systems.

Yuxuan Cao portrait
Researcher 2 reports

Yuxuan Cao

DeepSeek

Research assistant at The University of Hong Kong focused on multimodal reasoning and generation, large language models, and embodied AI; coauthor of Janus.

Zhejun Jiang portrait
Researcher 2 reports

Zhejun Jiang

Moonshot AI

Research scientist at Moonshot AI with public scholarly work on multimodal learning and generative models.

Zhilin Yang portrait
Researcher 2 reports

Zhilin Yang

Moonshot AI

Co-founder and CEO of Moonshot AI, and co-author of the Kimi-VL and Kimi K2.5 technical reports.

Zhiyuan Ruan portrait
Researcher 2 reports

Zhiyuan Ruan

DeepSeek

PhD student at The University of Hong Kong focused on multimodal large language models, image and video understanding, generation, and editing; coauthor of Janus.

Zijia Zhao portrait
Researcher 2 reports

Zijia Zhao

Moonshot AI

Research scientist at Moonshot AI with public scholarly work on language and multimodal models.

Ziwei Chen portrait
Researcher 2 reports

Ziwei Chen

Moonshot AI

Research scientist at Moonshot AI with public work on multimodal learning and language models.

Andrew Shen portrait
Researcher 1 reports

Andrew Shen

Google Gemini

Andrew Shen is listed as an author of the Google technical report PaliGemma 2: A Family of Versatile VLMs for Transfer.

Jiangxin Wang portrait
Researcher 1 reports

Jiangxin Wang

Ai2

Jiangxin Wang is listed as an author of the Ai2 technical report Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models.

Matias Mazzocconi portrait
Researcher 1 reports

Matias Mazzocconi

Google Gemini

Matias Mazzocconi is listed as an author of the Google technical report PaliGemma 2: A Family of Versatile VLMs for Transfer.

Mikhail Ryabinin portrait
Researcher 1 reports

Mikhail Ryabinin

Google Gemini

Mikhail Ryabinin is listed as an author of the Google technical report PaliGemma 2: A Family of Versatile VLMs for Transfer.

Siddhartha Srinivasa portrait
Researcher 1 reports

Siddhartha Srinivasa

Ai2

Siddhartha Srinivasa is listed as an author of the Ai2 technical report Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models.

Wei Xiong portrait
Researcher 1 reports

Wei Xiong

DeepSeek

Wei Xiong is listed as an author of the DeepSeek technical report DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding.

Yuxuan Ren portrait
Researcher 1 reports

Yuxuan Ren

DeepSeek

Yuxuan Ren is listed as an author of the DeepSeek technical report DeepSeek-VL: Towards Real-World Vision-Language Understanding.

Dieter Fox portrait
Researcher 3 reports

Dieter Fox

Ai2

Senior director of embodied AI at Ai2 and professor at the University of Washington working in robotics, computer vision, and machine learning.

Alexander Kolesnikov portrait
Researcher 1 reports

Alexander Kolesnikov

Google Gemini

Alexander Kolesnikov is a Research Scientist at Google DeepMind exploring multimodal general intelligence.

Alyssa Sellitto portrait
Researcher 1 reports

Alyssa Sellitto

Ai2

Research scientist at Ai2 focused on multimodal machine learning, vision-language models, and understanding human-centered image variation.

Andrea Dafoe portrait
Researcher 1 reports

Andrea Dafoe

Google Gemini

Andrea Dafoe is a senior research scientist at Google DeepMind whose work focuses on frontier AI risks, international governance, and the societal impacts of advanced AI.

Bilal Mustafa portrait
Researcher 1 reports

Bilal Mustafa

Google Gemini

Senior research scientist at Google DeepMind.

Chenlin Zhang portrait
Researcher 1 reports

Chenlin Zhang

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report.

Guangda Wei portrait
Researcher 1 reports

Guangda Wei

Moonshot AI

Research scientist at Moonshot AI with public publications on multimodal learning and efficient large-model systems.

Heng Wang portrait
Researcher 1 reports

Heng Wang

Moonshot AI

Research Scientist at Moonshot AI.

Humen Zhong portrait
Researcher 1 reports

Humen Zhong

Alibaba Qwen

Research scientist in Tongyi Lab and a major contributor to Qwen2-VL, with public work on multimodal foundation models.

Jiaming Li portrait
Researcher 1 reports

Jiaming Li

Moonshot AI

Research Scientist at Moonshot AI.

Jianzhou Wang portrait
Researcher 1 reports

Jianzhou Wang

Moonshot AI

Research scientist at Moonshot AI working on multimodal large models and point-cloud perception and generation.

Jingyuan Liu portrait
Researcher 1 reports

Jingyuan Liu

Moonshot AI

Research scientist at Moonshot AI with a public homepage covering prior academic work and research projects.

Olivier Henaff portrait
Researcher 1 reports

Olivier Henaff

Google Gemini

Research scientist at Google DeepMind working on deep learning, reinforcement learning, self-supervised learning, and robotics.

Rohit Saxena portrait
Researcher 1 reports

Rohit Saxena

Google Gemini

Rohit Saxena is a Research Scientist at Google DeepMind working on visual perception, multimodal learning, and language understanding.

Roman Shapovalov portrait
Researcher 1 reports

Roman Shapovalov

Salesforce AI Research

Research scientist at Salesforce AI Research working on multimodal and vision-language models.

Ronak Mandlekar portrait
Researcher 1 reports

Ronak Mandlekar

Ai2

PhD student at Stanford and research scientist at the Allen Institute for AI working on robotics, multimodal models, and embodied AI.

Sihan Cao portrait
Researcher 1 reports

Sihan Cao

Moonshot AI

Researcher listed on Google Scholar as affiliated with Moonshot AI, and coauthor of the Kimi-VL technical report.

Sipeng Zhang portrait
Researcher 1 reports

Sipeng Zhang

DeepSeek

PhD student at The University of Hong Kong focused on large multimodal models, image and video generation, and multimodal understanding; coauthor of Janus.

Wei Song portrait
Researcher 1 reports

Wei Song

Moonshot AI

Researcher at Moonshot AI whose public profile notes prior PhD study in computer science at the Chinese University of Hong Kong.

Weiyi Su portrait
Researcher 1 reports

Weiyi Su

Shanghai AI Laboratory

Researcher at Shanghai AI Laboratory focused on multimodal large language models, with public publications including InternVL 1.5, Video-LLaVA, and VCD.

William Kolesnikov portrait
Researcher 1 reports

William Kolesnikov

Google Gemini

Staff software engineer at Google DeepMind working on post-training, alignment, multimodal models, and data filtering. He previously worked on hardware and software co-design for machine learning.

Xingzhe Wu portrait
Researcher 1 reports

Xingzhe Wu

Moonshot AI

Researcher and co-author of the Kimi-VL Technical Report.

Xinyu Chen portrait
Researcher 1 reports

Xinyu Chen

DeepSeek

Research intern at NUS and Nanjing University working on machine learning and multimodal large language models; coauthor of DeepSeek-VL2.

Yang Cao portrait
Researcher 1 reports

Yang Cao

Shanghai AI Laboratory

Researcher working on open multimodal models, including InternVL3.

Yanxia Cui portrait
Researcher 1 reports

Yanxia Cui

DeepSeek

Researcher working on multimodal and vision-language models, including DeepSeek-VL2 and related model optimization work.

Yidao Qin portrait
Researcher 1 reports

Yidao Qin

Moonshot AI

Research Scientist at Moonshot AI.

Yongsheng Kang portrait
Researcher 1 reports

Yongsheng Kang

Moonshot AI

Research Scientist at Moonshot AI.

Yuanhang Zhang portrait
Researcher 1 reports

Yuanhang Zhang

Alibaba Qwen

Research scientist in Tongyi Lab and major contributor to Qwen2.5-VL, with public work on multimodal large language models.

Yuhang Zheng portrait
Researcher 1 reports

Yuhang Zheng

ByteDance Seed

Researcher working on multimodal and embodied agents, including Seed1.5-VL and related planning work.

Yusuke Iwasawa portrait
Researcher 1 reports

Yusuke Iwasawa

Shanghai AI Laboratory

Project associate professor at the University of Tokyo whose work spans deep learning, artificial intelligence, and machine learning for medicine and healthcare.

Yuyi Wang portrait
Researcher 1 reports

Yuyi Wang

Alibaba Qwen

Research intern in Tongyi Lab whose public profile highlights work on multimodal large language models and video understanding.

Zhaohai Li portrait
Researcher 1 reports

Zhaohai Li

Alibaba Qwen

Research scientist in Tongyi Lab and technical lead of Qwen2-VL, with public work on vision-language models.

Zhenyu Yang portrait
Researcher 1 reports

Zhenyu Yang

Alibaba Qwen

PhD student at Nanjing University and research intern at Alibaba Tongyi Lab working on multimodal large language models and visual understanding; coauthor of Qwen2.5-VL.

Zihan Liu portrait
Researcher 1 reports

Zihan Liu

DeepSeek

Zihan Liu is a research scientist at DeepSeek. His public homepage highlights work in multimodal learning, vision-language models, and large-scale machine learning.

Zongyu Lin portrait
Researcher 1 reports

Zongyu Lin

Moonshot AI

Technical Staff at Moonshot AI.