Shypd.ai
Data & Analytics · Data Labeling & Annotation · Open Source & Models · Free

DataDesigner

DataDesigner is an open-source library for generating synthetic data. It allows users to create high-quality datasets from scratch or based on existing seed...

Page created: Mar 2, 2026 · Last updated by Shypd: Mar 2, 2026

SHYPD CONFIDENCE SCORE

Likely Legit

PRICING

Model: Free

CHECK OTHER DATA LABELING & ANNOTATION AI TOOLS

RedPajama-Data

71%

RedPajama-Data is a repository containing code for preparing large datasets used in training large language models. It provides resources for creating and managing training data. The project is open-source and hosted on GitHub.

chatgpt-corpus

71%

chatgpt-corpus is a Chinese corpus for training large language models. It includes dialogue, novel, and customer service data. The corpus is designed to help improve the performance of AI models in Chinese language tasks and is available for free.

GigaSpeech

71%

GigaSpeech is a large, open-source dataset for speech recognition. It is designed for training and evaluating speech recognition models. The dataset contains 10,000 hours of transcribed audio. GigaSpeech is suitable for researchers and developers working on speech recognition technologies.

segmentation_models.pytorch

71%

segmentation_models.pytorch is a Python library providing neural networks for image semantic segmentation based on PyTorch. It includes over 500 pretrained convolutional and transformer-based backbones. The library facilitates the development and deployment of image segmentation models. It is designed for researchers and practitioners in computer vision.

semantic-segmentation-editor

71%

semantic-segmentation-editor is a web-based labeling tool for creating AI training datasets. It supports both 2D images and 3D point clouds. Developed for autonomous driving research, it is built with React, Paper.js, and three.js. The tool is available as a Meteor app.

3d-bat

71%

3D-BAT is a 3D Bounding Box Annotation Tool for point cloud and image labeling. It is an open-source toolbox available on GitHub. The tool supports custom data annotation. It is used for labeling 3D data for machine learning and computer vision applications.
