Qwen3-ASR
Qwen3-ASR is an open-source series of ASR models that supports multilingual speech, music, and song recognition. It also offers language detection and timestamp prediction for 52 languages and dialects.
About
Qwen3-ASR is an open-source series of Automatic Speech Recognition (ASR) models developed by the Qwen team at Alibaba Cloud. It includes two all-in-one speech recognition models (0.6B and 1.7B parameters) that support language identification and ASR for 52 languages and dialects: 30 languages and 22 Chinese dialects. The series also includes Qwen3-ForcedAligner-0.6B, a non-autoregressive forced-alignment model that aligns text-speech pairs and predicts timestamps in 11 languages. Qwen3-ASR maintains robust, high-quality recognition in complex acoustic environments and on challenging text patterns, and it supports both offline and streaming inference.
Capabilities
Pricing & Plans
Open Source
Free
FAQs