GigaSpeech
GigaSpeech is a large, modern dataset for speech recognition, offering over 10,000 hours of transcribed audio. It provides an evolving, multi-domain corpus for training and evaluating ASR models.
About
GigaSpeech is an open-source dataset built for speech recognition research and development. It contains over 10,000 hours of high-quality, human-transcribed audio, plus more than 33,000 additional hours suitable for unsupervised or semi-supervised learning. The audio is drawn from audiobooks, podcasts, and YouTube, so it covers a range of acoustic conditions, domains, speaker ages, and accents. Pre-processed versions are available through Hugging Face, and detailed metadata ships in a version-controlled JSON file from which users can extract the information they need for tasks such as speech recognition. GigaSpeech also provides data preparation scripts for the Kaldi, ESPnet, and Icefall toolkits, which makes it easier for researchers to integrate the dataset into existing pipelines.
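As a rough illustration of extracting information from the JSON metadata, the sketch below filters segments by subset and sums their durations. It runs on a tiny inline sample rather than the real file, and the field names (`audios`, `segments`, `sid`, `begin_time`, `end_time`, `text_tn`, `subsets`) and the `{XS}` style subset tags are assumptions made for illustration; check them against the actual metadata file before relying on them. The helper names `iter_segments` and `total_hours` are hypothetical.

```python
import json

# Tiny inline stand-in for the metadata file. All field names and values
# here are illustrative assumptions, not a copy of the real schema.
SAMPLE_METADATA = json.loads("""
{
  "audios": [
    {
      "aid": "POD_EXAMPLE_1",
      "source": "podcast",
      "segments": [
        {"sid": "POD_EXAMPLE_1_S1", "begin_time": 0.0, "end_time": 4.5,
         "text_tn": "HELLO AND WELCOME", "subsets": ["{XS}", "{XL}"]},
        {"sid": "POD_EXAMPLE_1_S2", "begin_time": 4.5, "end_time": 10.0,
         "text_tn": "TODAY WE TALK ABOUT SPEECH", "subsets": ["{XL}"]}
      ]
    },
    {
      "aid": "YOU_EXAMPLE_1",
      "source": "youtube",
      "segments": [
        {"sid": "YOU_EXAMPLE_1_S1", "begin_time": 1.0, "end_time": 3.0,
         "text_tn": "THANKS FOR WATCHING", "subsets": ["{XS}", "{XL}"]}
      ]
    }
  ]
}
""")


def iter_segments(metadata, subset):
    """Yield (segment id, duration in seconds, transcript) for one subset tag."""
    for audio in metadata["audios"]:
        for seg in audio["segments"]:
            if subset in seg["subsets"]:
                duration = seg["end_time"] - seg["begin_time"]
                yield seg["sid"], duration, seg["text_tn"]


def total_hours(metadata, subset):
    """Sum the segment durations of one subset and convert seconds to hours."""
    seconds = sum(duration for _, duration, _ in iter_segments(metadata, subset))
    return seconds / 3600.0


if __name__ == "__main__":
    for sid, duration, text in iter_segments(SAMPLE_METADATA, "{XS}"):
        print(sid, duration, text)
    print(total_hours(SAMPLE_METADATA, "{XL}"))
```

For the real file, the inline sample would be replaced by `json.load` on the downloaded metadata. Because that file is large, a streaming JSON parser may be a better fit than loading it in one call.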
Capabilities
Pricing & Plans
Open Source
Free
FAQs