Meta-Voicebox
Meta-voicebox is an Audio & Music tool that implements Voicebox, a generative AI model for speech. It offers text-guided multilingual universal speech generation with state-of-the-art performance.
About
Meta-voicebox is a PyTorch implementation of Voicebox, a generative AI model for speech designed to generalize across tasks with state-of-the-art performance. Voicebox is a non-autoregressive flow-matching model trained on over 50,000 hours of unfiltered speech, which lets it perform tasks it was not explicitly trained on through in-context learning. It supports text-guided multilingual speech generation, including monolingual and cross-lingual zero-shot text-to-speech synthesis, noise removal, content editing, style conversion, and diverse sample generation. In the results reported for Voicebox, it outperforms VALL-E on both intelligibility and audio similarity while being up to 20 times faster.
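To make "non-autoregressive flow matching" more concrete, here is a minimal sketch in plain Python. It is not the Meta-voicebox code or API. It assumes the straight-line (optimal-transport) probability path commonly used in flow matching, and every name in it (SIGMA_MIN, interpolate, target_velocity, flow_matching_loss, euler_sample) is hypothetical. Short lists of floats stand in for audio features, and an "oracle" function stands in for the trained network.

```python
import random

SIGMA_MIN = 1e-5  # assumed small constant; the value is illustrative only


def interpolate(x0, x1, t):
    """Point at time t on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1 - (1 - SIGMA_MIN) * t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of that path; this is the regression target during training."""
    return [b - (1 - SIGMA_MIN) * a for a, b in zip(x0, x1)]


def flow_matching_loss(predict, x0, x1, t):
    """Mean squared error between a predicted velocity and the target velocity."""
    xt = interpolate(x0, x1, t)
    v = predict(xt, t)
    u = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v, u)) / len(u)


def euler_sample(velocity, x0, steps=8):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps.

    All positions are updated together at each step, so the number of model
    calls depends on `steps`, not on the sequence length.
    """
    x, dt = list(x0), 1.0 / steps
    for i in range(steps):
        v = velocity(x, i * dt)
        x = [a + dt * b for a, b in zip(x, v)]
    return x


rng = random.Random(0)
x1 = [0.5, -1.0, 2.0, 0.25]          # stand-in for target audio features
x0 = [rng.gauss(0, 1) for _ in x1]   # Gaussian noise of the same shape


def oracle(x, t):
    # Stand-in for a trained network: it simply returns the true velocity.
    return target_velocity(x0, x1)


sample = euler_sample(oracle, x0)
loss = flow_matching_loss(oracle, x0, x1, t=0.3)
```

With the oracle field, the sampler carries the noise vector to within about SIGMA_MIN of the target in eight steps. A real model would learn the field from data and condition it on text and surrounding audio, and an autoregressive model would instead need one step per output token.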
Pricing & Plans
Open Source
Free