Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities

Researcher at StepFun and coauthor of Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities.

Researcher at StepFun AI working on speech, language, and multimodal learning, including Step-Audio 2.
