Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities

Audio Language Models report from Stepfun with 10 connected researchers in the LLMpeople atlas.

Stepfun · 2025-07-22 · 10 researchers

Field: Audio Language Models
Organization: Stepfun
arXiv: 2507.16632

Canonical link

https://arxiv.org/abs/2507.16632

Connected researchers

Can Cui

Stepfun · Researcher · 1 report

Researcher at Stepfun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.

Xiao Ma

Stepfun · Researcher · 1 report

Researcher at Stepfun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.

Zeyi Yan

Stepfun · Researcher · 1 report

Researcher at Stepfun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.

Siyao Wang

Stepfun · Researcher · 1 report

Researcher at Stepfun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.

Jiale Zhuang

Stepfun · Researcher · 1 report

Researcher at Stepfun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.

Yu Guo

Stepfun · Researcher · 1 report

Researcher at Stepfun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.

Yizhou Zou

Stepfun · Researcher · 1 report

Researcher at StepFun AI working on speech, language, and multimodal learning, including Step-Audio 2.

Huan Yang

Stepfun · Researcher · 1 report

Researcher at Stepfun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.

Ruiqi Song

Stepfun · Researcher · 1 report

Researcher at Stepfun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.

Hui Yu

Stepfun · Researcher · 1 report

Researcher at Stepfun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.


