Coding & Development
garage
garage is an open-source toolkit for developing and evaluating reinforcement learning (RL) algorithms, with an emphasis on reproducible research. It offers modular tools, including composable neural network models, high-performance samplers, replay buffers, and an expressive experiment definition interface. The toolkit supports logging to outputs such as TensorBoard, checkpoints experiments so they can be resumed, and provides environment interfaces for popular benchmark suites. garage is compatible with Python 3.6+ and supports both PyTorch and TensorFlow for neural network implementations; algorithms that do not require neural networks live in the `garage.np` package. Its testing strategy combines continuous integration with benchmarks to check the performance and reliability of its algorithm implementations.
GPT-3-Encoder
GPT-3-Encoder is a JavaScript BPE (byte-pair encoding) encoder/decoder for GPT-2 and GPT-3 models. It converts human-readable text into a sequence of integers, the input format these language models require, and decodes such sequences back into text. It is a direct JavaScript implementation of OpenAI's original Python encoder/decoder, so its tokenization is intended to match the reference. Developers can install it with npm, and it is compatible with Node.js 12 and above. It is useful for anyone who needs to preprocess text for GPT-2 or GPT-3 training or inference.
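To make the text-to-integers step concrete, here is a minimal Python sketch of byte-pair encoding. The merge table and vocabulary are invented for illustration and are not the GPT-2/GPT-3 tables; the real encoder also works on bytes and pre-splits text with a regular expression, which this sketch omits.

```python
# Toy BPE sketch. MERGE_RANKS and VOCAB are hypothetical, made up for this
# example; they are NOT the tables shipped with GPT-2/GPT-3 encoders.

def bpe_merge(symbols, merge_ranks):
    """Repeatedly merge the adjacent pair with the lowest (earliest) rank."""
    symbols = list(symbols)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        ranked = [p for p in pairs if p in merge_ranks]
        if not ranked:
            break  # no known merge applies any more
        best = min(ranked, key=lambda p: merge_ranks[p])
        merged, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


MERGE_RANKS = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
VOCAB = {"low": 0, "er": 1, "l": 2, "o": 3, "w": 4,
         "e": 5, "r": 6, "s": 7, "lo": 8}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}


def encode(text):
    """Text -> list of integer token ids."""
    return [VOCAB[s] for s in bpe_merge(text, MERGE_RANKS)]


def decode(ids):
    """List of integer token ids -> text."""
    return "".join(ID_TO_TOKEN[i] for i in ids)
```

With these toy tables, `encode("lower")` merges `l`+`o`, then `lo`+`w`, then `e`+`r`, and `decode` reverses the mapping by concatenating the token strings.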
Deep-Learning-for-Tracking-and-Detection
Deep-Learning-for-Tracking-and-Detection is a comprehensive open-source repository on GitHub, offering a curated collection of papers, datasets, code, and other resources specifically focused on object tracking and detection using deep learning. This tool is invaluable for AI researchers, engineers, and students who are actively engaged in computer vision projects. It covers a wide array of topics including static detection (RCNN, YOLO, SSD, RetinaNet, Anchor Free), video detection (Tubelet, FGFA, RNN), and multi-object tracking (Joint-Detection, Identity Embedding, Association, Deep Learning, RNN, Unsupervised Learning, Reinforcement Learning, Network Flow, Graph Optimization). The repository also provides resources for single object tracking, various deep learning techniques, and a multitude of datasets, making it a central hub for cutting-edge research and development in this field.
DANN
DANN provides a PyTorch implementation of the paper Domain-Adversarial Training of Neural Networks (DANN), which performs unsupervised domain adaptation through backpropagation. This open-source tool is aimed at researchers and developers who need a model to perform well across different data distributions or domains without extensive labeled data for the target domain. It includes the network structure and training scripts, with setup instructions for PyTorch 1.0 and Python 2.7. Users can download the required mnist_m dataset from the provided links to begin training. The project also offers a separate version, DANN_py3, for Python 3 and Docker environments. Its primary utility lies in helping models trained on one domain generalize to another, reducing the need for costly data annotation in new environments.
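The core mechanism behind domain-adversarial training is a gradient reversal step between the feature extractor and the domain classifier. The following framework-free Python sketch illustrates only that idea; the class name and the `alpha` parameter are assumptions for this example, and this is not the repository's PyTorch code.

```python
# Framework-free sketch of gradient reversal. GradientReversal and alpha are
# hypothetical names for illustration, not taken from the DANN repository.

class GradientReversal:
    """Identity on the forward pass; scales gradients by -alpha going back."""

    def __init__(self, alpha):
        self.alpha = alpha

    def forward(self, features):
        # Features pass through unchanged to the domain classifier.
        return list(features)

    def backward(self, grad_output):
        # The gradient of the domain loss is flipped, so the feature
        # extractor is pushed to make domains harder to tell apart.
        return [-self.alpha * g for g in grad_output]


layer = GradientReversal(alpha=0.5)
features = layer.forward([1.0, -2.0])
grad_to_extractor = layer.backward([0.2, -0.4])
```

The domain classifier still descends its own loss, while the feature extractor receives the negated gradient, which is what lets a single backpropagation pass train both adversarially.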
efficient-dl-systems
efficient-dl-systems is an open-source GitHub repository offering comprehensive educational materials for the Efficient Deep Learning Systems course, taught at HSE University and Yandex School of Data Analysis. The repository includes a detailed syllabus, lecture notes, and seminar materials covering a wide range of topics, from foundational GPU architecture and CUDA API to advanced concepts like distributed training, large model optimization, and inference algorithms. It provides practical insights into performance measurement, mixed-precision training, data-parallel techniques, and deployment of deep learning models. The course content is structured week-by-week, making it an invaluable resource for students and researchers looking to deepen their understanding of efficient deep learning practices.
evalite
evalite is an open-source tool designed for developers to evaluate their LLM-powered applications using TypeScript. It provides a robust framework for testing and assessing the performance of AI applications, ensuring quality and reliability. Developers can use evalite to build, run, and analyze tests for their language model integrations. The tool supports a development workflow that includes building, running tests, and a UI dev server for real-time evaluation. It is particularly useful for identifying and fixing issues in LLM-based projects before deployment, contributing to more stable and effective AI solutions.
feature-engineering-book
feature-engineering-book is the official GitHub code repository accompanying the book "Feature Engineering for Machine Learning" by Alice Zheng and Amanda Casari, published by O'Reilly in 2018. This resource is invaluable for students, researchers, and practitioners looking to implement the feature engineering techniques discussed in the book. The repository contains various Jupyter Notebooks covering topics such as binning, count features, log and Box-Cox transformations, interaction features, text processing (TF-IDF, chunking), regression on categorical variables, feature hashing, PCA, K-means clustering for featurization, and HOG image features. It also includes end-to-end recommender system examples, providing practical code for a deeper understanding of machine learning concepts.
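As a rough illustration of three of the techniques listed above (binning, log transformation, and feature hashing), here is a small standard-library Python sketch. The function names and parameters are chosen for this example and are not taken from the book's notebooks.

```python
import hashlib
import math

# Illustrative helpers only; names and signatures are assumptions,
# not code from the book's repository.

def fixed_width_bin(value, width):
    """Map a non-negative count to the index of its fixed-width bin."""
    return int(value // width)


def log_transform(value):
    """log(1 + x): compresses heavy-tailed counts and is defined at zero."""
    return math.log1p(value)


def hash_features(tokens, n_buckets):
    """Feature hashing: count tokens into a fixed-length vector."""
    vector = [0] * n_buckets
    for token in tokens:
        # A stable hash keeps the mapping identical across runs.
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % n_buckets] += 1
    return vector
```

Feature hashing trades exactness for a bounded vector size: distinct tokens can collide in the same bucket, so `n_buckets` controls the balance between memory and collisions.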
ttach
ttach is an open-source PyTorch library for Test Time Augmentation (TTA) in image processing tasks. Similar to data augmentation during training, TTA applies modifications such as flips, rotations, and scaling to test images. Instead of feeding a model a single 'clean' image, ttach runs the model on several augmented versions of it and merges the predictions into a more robust final output. The library provides wrappers for segmentation, classification, and keypoint detection models, along with a flexible `Compose` function for custom transform pipelines. It supports several merge modes for predictions, including mean, geometric mean, sum, max, and min, so it can be adapted to different models and tasks at inference time.
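The augment-predict-merge loop can be sketched in plain Python as follows. This is not ttach's API: the function names, the mode strings, and the toy model are assumptions made for illustration, with images represented as lists of rows.

```python
import math

# Plain-Python TTA sketch; all names here are hypothetical, not ttach's API.

def identity(image):
    return image


def hflip(image):
    """Flip a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def merge(predictions, mode="mean"):
    """Combine per-augmentation scores into one prediction."""
    if mode == "mean":
        return sum(predictions) / len(predictions)
    if mode == "gmean":  # geometric mean; requires positive scores
        return math.exp(sum(math.log(p) for p in predictions) / len(predictions))
    if mode == "sum":
        return sum(predictions)
    if mode == "max":
        return max(predictions)
    if mode == "min":
        return min(predictions)
    raise ValueError(f"unknown merge mode: {mode}")


def predict_with_tta(model, image, transforms, mode="mean"):
    """Run the model on every augmented copy and merge the scores."""
    return merge([model(t(image)) for t in transforms], mode)


def toy_model(image):
    """Orientation-sensitive stand-in model: mean of the left column."""
    return sum(row[0] for row in image) / len(image)


image = [[1.0, 0.0], [1.0, 0.0]]
score = predict_with_tta(toy_model, image, [identity, hflip], mode="mean")
```

For classification, merging scores directly is enough; a segmentation wrapper would also need to undo each transform on the predicted mask before merging so that the pixels line up.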
kubedl
KubeDL is a CNCF sandbox project designed to simplify and optimize the execution of deep learning workloads on Kubernetes. It provides a unified controller for managing training and inference tasks across frameworks like TensorFlow, PyTorch, and Mars. Key features include advanced scheduling, acceleration through caching, metadata persistence, file synchronization, and service discovery for host network training. KubeDL also integrates with Morphling for automatic tuning of ML model deployment configurations and allows for native tracking of model lineage using Kubernetes CRDs. This tool aims to make the deployment and scaling of deep learning models within a Kubernetes environment more accessible and efficient for developers and data scientists.
linfa
linfa is a robust, open-source machine learning framework written in Rust, designed to provide a comprehensive toolkit for building various ML applications. It is conceptually similar to Python's scikit-learn, offering a wide array of common preprocessing tasks and classical machine learning algorithms. The framework includes implementations for algorithms such as Naive Bayes, K-Means, Gaussian-Mixture-Model, DBSCAN, OPTICS, ensemble methods like random forest, linear and logistic regression, support vector machines, decision trees, and dimensionality reduction techniques like PCA and t-SNE. linfa also supports various BLAS/LAPACK backends for optimized linear algebra routines, allowing developers to choose between pure-Rust implementations or external libraries like OpenBLAS, Netlib, or Intel MKL. This flexibility makes it suitable for developers looking to leverage Rust's performance and safety features in their ML projects.
MLJ.jl
MLJ.jl (Machine Learning in Julia) is an open-source machine learning framework designed for the Julia programming language. It offers a unified interface and a collection of meta-algorithms for various machine learning tasks, including model selection, hyperparameter tuning, evaluation, composition, and comparison. The framework integrates over 200 machine learning models, encompassing those developed in Julia and other languages, providing a comprehensive ecosystem for machine learning workflows. It serves as an umbrella package, distributing components across several other specialized packages, making it a versatile tool for developers and data scientists working with Julia.
MML-Book
MML-Book is an open-source repository offering comprehensive code and solutions for the "Mathematics for Machine Learning" (MML) book. This resource is specifically designed to aid self-study, providing Python code examples that help users better understand various machine learning concepts. It includes detailed solutions to exercises for each chapter, with notebooks that render LaTeX for clear mathematical explanations. The repository covers topics from Chapter 2 through Chapter 7, with a focus on practical application and conceptual clarity. It's a valuable asset for anyone looking to deepen their understanding of the mathematical foundations of machine learning through hands-on practice and guided solutions.
ncnn-android-yolov5
ncnn-android-yolov5 is an open-source project designed to demonstrate YOLOv5 object detection on Android devices. It serves as a practical example for developers looking to implement real-time object detection capabilities in their mobile applications. The project is built upon the ncnn deep learning inference framework, ensuring efficient performance on Android platforms. Developers can easily integrate this example by downloading the ncnn library, extracting it into the project's jni directory, and then building the project with Android Studio. This tool is ideal for those who need a ready-to-use, customizable foundation for adding computer vision features to their Android apps.
Senna
Senna is an open-source project designed to integrate large vision-language models (LVLMs) with end-to-end autonomous driving systems. Developed by researchers from Huazhong University of Science and Technology and Horizon Robotics, Senna aims to enhance planning safety, robustness, and generalization in autonomous vehicles. The project provides comprehensive resources including code, model weights for Senna-VLM, and scripts for training and evaluation. It supports data preparation by generating QA data using models like LLaVA-v1.6-34b for scene descriptions and planning explanations. Senna offers both full-parameter and LoRA fine-tuning options, with full-parameter fine-tuning recommended for optimal performance. Researchers and developers can utilize Senna to build and evaluate advanced AI-driven vehicle control systems, demonstrating strong cross-scenario generalization and transferability.
sig-mlops
sig-mlops is a Special Interest Group (SIG) within the Continuous Delivery Foundation (CDF) dedicated to Machine Learning Operations (MLOps). This open-source initiative aims to foster collaboration and drive standardization within the MLOps community. The group focuses on sharing best practices, developing documentation, and providing resources for professionals involved in the deployment, monitoring, and management of machine learning models. It serves as a hub for discussions, knowledge exchange, and contributions to the evolving field of MLOps, helping to streamline processes and improve efficiency in AI/ML development workflows.
pyRiemann
pyRiemann is an open-source Python machine learning package for processing and classifying real- or complex-valued multivariate data. It leverages the Riemannian geometry of symmetric or Hermitian positive definite matrices and offers a high-level interface that mimics the scikit-learn API. Although it is generic for multivariate data analysis, it is specifically tailored for biosignals such as EEG, MEG, or EMG in brain-computer interface (BCI) applications, including motor imagery, event-related potentials, and steady-state visually evoked potentials. It also supports multisource transfer learning and remote sensing applications, such as processing radar images. The package provides functionality for estimating covariance matrices and classifying them, and it can be integrated into scikit-learn pipelines.
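To show what estimating covariance matrices and comparing them on the manifold involves, here is a pure-Python sketch for the two-channel real-valued case. It computes a sample covariance and the affine-invariant Riemannian distance between two 2x2 symmetric positive definite matrices. The function names are assumptions for this example and this is not pyRiemann's API.

```python
import math

# Two-channel sketch; function names are hypothetical, not pyRiemann's API.

def covariance_2ch(x, y):
    """Unbiased sample covariance matrix of two equally long signals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    cyy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return [[cxx, cxy], [cxy, cyy]]


def riemann_distance_2x2(A, B):
    """Affine-invariant distance: sqrt(sum(log(l)^2)), l = eig(inv(A) @ B)."""
    det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    inv_a = [[A[1][1] / det_a, -A[0][1] / det_a],
             [-A[1][0] / det_a, A[0][0] / det_a]]
    m = [[sum(inv_a[i][k] * B[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    # Eigenvalues of a 2x2 matrix from its trace and determinant; the
    # discriminant is clamped at zero to absorb rounding error.
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    lam1, lam2 = (trace + disc) / 2.0, (trace - disc) / 2.0
    return math.sqrt(math.log(lam1) ** 2 + math.log(lam2) ** 2)


def nearest_reference(C, references):
    """Label of the reference matrix closest to C on the SPD manifold."""
    return min(references, key=lambda label: riemann_distance_2x2(C, references[label]))
```

A simple classifier along these lines keeps one reference covariance matrix per class and assigns each new trial to the nearest one; the matrices must be positive definite for the logarithms to be defined.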
resources
resources is an open-source repository dedicated to curating and organizing Go-based data science resources. It serves as a central hub for developers and data scientists working with the Go programming language, offering a comprehensive collection of links to various community resources such as events, conferences, and blogs. Additionally, it provides an extensive list of tooling resources, including essential packages, libraries, and development tools specifically designed for data analysis, visualization, and machine learning tasks within the Go ecosystem. This makes it an invaluable asset for anyone looking to explore or deepen their work in data science using Go.
RealMirror
RealMirror is a comprehensive, open-source embodied AI VLA (Vision-Language-Action) platform designed to address fundamental challenges in humanoid robotics, such as high data acquisition costs, lack of standardized benchmarks, and the simulation-to-real-world gap. It offers an efficient, low-cost system for data collection, model training, and inference, allowing researchers to conduct VLA studies without needing a physical robot. The platform includes a dedicated VLA benchmark with multiple scenarios and extensive trajectories to facilitate model evolution and fair comparison. RealMirror also integrates generative models and 3D Gaussian Splatting for realistic environment and robot model reconstruction, enabling zero-shot Sim2Real transfer where models trained in simulation can perform tasks on real robots seamlessly. Recent updates include the Seed2Scale scheme for automatic large-scale upper limb trajectory generation and MirrorLimb with gesture teleoperation functionality.
skylark
Skylark Editor is a high-performance, customizable text and hex editor written in C, designed for speed and efficiency, boasting startup times under a second. It includes a built-in file manager and SFTP remote manager, making file handling and remote access seamless. The editor supports binary/hex viewing for files of unlimited size and offers encryption/decryption for common key algorithms. It features Perl Compatible Regular Expression support, AI-Powered Chat Integration, and syntax highlighting for numerous languages. Skylark also supports SumatraPDF and clang-format plugins, code snippets, and a dark mode for enhanced user experience. With embedded Database-client, Redis-client, and Lua-engine, users can directly run Lua scripts and SQL files, making it a versatile tool for developers.
smartcore
smartcore is a comprehensive, fast, and ergonomic open-source library designed for machine learning and numerical computing in Rust. It enables developers to apply machine learning algorithms leveraging first principles, covering a broad range of methods including linear models, tree-based methods, ensembles, SVMs, neighbors, clustering, decomposition, and preprocessing. The library emphasizes production-friendly APIs, strong typing, and good defaults, while remaining flexible for research and experimentation. It features strong linear algebra traits with optional ndarray integration, WASM-first defaults for portability, and practical utilities for model selection, evaluation, and data access. smartcore is ideal for developers building AI applications in Rust who need robust and efficient ML capabilities.
wilds
wilds is an open-source machine learning benchmark designed to evaluate models under real-world distribution shifts. It offers a comprehensive package including data loaders that automate downloading, processing, and splitting of datasets, along with standardized evaluators for consistent model assessment. The benchmark covers a wide range of data modalities and applications, from medical imaging (tumor identification) to environmental monitoring (wildlife monitoring) and socio-economic analysis (poverty mapping). It also provides example scripts with default models, optimizers, and training/evaluation code, making it easy for researchers to integrate new algorithms and run experiments across its 10 included datasets. The package is installable via pip and supports optional integration with Weights & Biases for experiment tracking.
theMLbook
theMLbook is an open-source GitHub repository offering Python code designed to replicate the illustrations found in 'The Hundred-Page Machine Learning Book'. This resource is invaluable for students and professionals seeking to deepen their understanding of machine learning concepts through practical, visual examples. By providing the exact code used for the book's figures, theMLbook allows users to interact directly with the algorithms and models discussed, facilitating a hands-on learning experience. It covers a range of machine learning topics, from fundamental algorithms like linear regression and K-means to more advanced concepts such as autoencoders and UMAP, making it a comprehensive companion for the book's readers.
Theo-Docs
Theo-Docs is an open-source GitHub repository offering comprehensive guides for unlocking and utilizing various streaming services and AI tools. It provides detailed documentation for popular platforms such as Netflix, Disney+, Spotify, YouTube Premium, ChatGPT, and Gemini. Beyond streaming and AI, the repository also delves into practical topics like daily records, ESXI virtualization, OpenWrt router firmware, VPS guides, and information on various cloud service providers. This resource is ideal for users looking to optimize their digital experience across entertainment, AI applications, and personal server management.
V3D
V3D is an open-source implementation of the research paper "V3D: Video Diffusion Models are Effective 3D Generators." This tool leverages video diffusion models to create 3D content, offering capabilities such as generating dense multi-views from a single image and reconstructing 3D assets using techniques like 3D Gaussian Splatting or NeuS. It provides instructions for installation, downloading weights, and running scripts to generate and reconstruct 3D models. The project is under active development, with plans for more checkpoints and examples, making it a useful resource for researchers and developers interested in 3D generation with video diffusion models.