Lhotse
Lhotse is an open-source Python library for handling multimodal data in machine learning projects, simplifying data preparation for speech, audio, video, image, and text.
About
Lhotse is an open-source Python library designed to make multimodal data preparation flexible and accessible for machine learning projects. It supports speech, audio, video, image, and text. Key features include data loading algorithms such as dataset blending and efficient on-the-fly bucketing, along with data randomization for distributed multi-node training. Lhotse provides standard data preparation recipes for common corpora and uses the concept of audio/video cuts to keep data preparation for model training flexible. It also supports efficient sequential I/O data formats such as Lhotse Shar and integrates with PyTorch through task-specific Dataset classes.
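To make the idea of cuts and on-the-fly bucketing more concrete, here is a minimal pure-Python sketch. It does not use Lhotse or reproduce its API: the names Cut, bucket_batches, bucket_edges, and max_duration are hypothetical, chosen only for illustration, and Lhotse's own samplers may work differently. The sketch streams cuts one at a time, sorts each into a duration bucket, and emits a batch once a bucket would exceed a total-duration budget.

```python
import bisect
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Cut:
    """Hypothetical stand-in for a cut: a span of a recording (not Lhotse's class)."""
    id: str
    start: float     # offset into the recording, in seconds
    duration: float  # length of the span, in seconds


def bucket_batches(
    cuts: Iterable[Cut],
    bucket_edges: List[float],
    max_duration: float,
) -> Iterator[List[Cut]]:
    """Group streamed cuts into batches of similar duration.

    bucket_edges must be sorted; a cut goes into the bucket whose range
    contains its duration. A bucket is emitted as a batch when adding the
    next cut would push its total duration above max_duration.
    """
    buckets: List[List[Cut]] = [[] for _ in range(len(bucket_edges) + 1)]
    totals = [0.0] * len(buckets)
    for cut in cuts:
        idx = bisect.bisect_right(bucket_edges, cut.duration)
        if buckets[idx] and totals[idx] + cut.duration > max_duration:
            yield buckets[idx]
            buckets[idx], totals[idx] = [], 0.0
        buckets[idx].append(cut)
        totals[idx] += cut.duration
    # Flush whatever is left once the stream ends.
    for leftover in buckets:
        if leftover:
            yield leftover


# Example: short cuts (< 5 s) and long cuts (>= 5 s) never share a batch.
example_cuts = [
    Cut("a", 0.0, 2.0),
    Cut("b", 0.0, 9.0),
    Cut("c", 2.0, 3.0),
    Cut("d", 9.0, 10.0),
    Cut("e", 5.0, 2.5),
    Cut("f", 19.0, 8.0),
]
batches = list(bucket_batches(example_cuts, bucket_edges=[5.0], max_duration=12.0))
```

Batching by similar duration keeps padding low, and because the input is consumed as a stream, the full dataset never has to be sorted or held in memory up front.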
Pricing & Plans
Open Source
Free