sherpa-onnx is an open-source toolkit for offline speech and audio processing. It covers speech-to-text (ASR), text-to-speech (TTS), speaker diarization, speaker identification, speaker verification, spoken language identification, audio tagging, voice activity detection (VAD), speech enhancement, keyword spotting, and source separation. It runs on platforms including Android, iOS, Windows, macOS, Linux, and HarmonyOS, across x64, x86, ARM, and RISC-V architectures. It also supports NPU platforms such as Rockchip, Qualcomm, Ascend, and Axera, and provides APIs for 12 programming languages, including C++, Python, Java, and Swift, along with WebAssembly support. This breadth makes it a good fit for developers building audio applications for embedded systems and other offline environments.
Best used for
Developers and engineers who need to implement offline speech recognition, text-to-speech, or other audio processing functions. It is especially valuable for applications on embedded systems, mobile devices, and other platforms that must work without an internet connection.
Common actions