AI Agents & Automation
Osprey
Osprey extends multimodal large language models (MLLMs) by incorporating pixel-wise mask regions into language instructions. This enables fine-grained visual understanding: given a specific input mask region, Osprey generates a semantic description of it, either short or detailed. It works with the Segment Anything Model (SAM) in point-prompt, box-prompt, and segment-everything modes to extract and describe the semantics of particular parts or objects within an image. Osprey is built on the LLaVA-v1.5 codebase and is aimed at researchers and developers working on visual instruction tuning and pixel-level image analysis.
PyTorch-BayesianCNN
PyTorch-BayesianCNN implements Bayesian Convolutional Neural Networks (CNNs) with variational inference, specifically Bayes by Backprop, in PyTorch. It lets researchers and developers build CNNs that approximate the otherwise intractable posterior distribution over weights, which yields uncertainty estimates that traditional frequentist (point-estimate) networks do not provide. It includes two Bayesian layer implementations: BBB (Bayes by Backprop) and BBB_LRT (Bayes by Backprop with the Local Reparametrization Trick), the latter of which improves sampling efficiency. The repository supports standard datasets such as MNIST, CIFAR10, and CIFAR100 and includes implementations of common models such as AlexNet and LeNet, making it a useful resource for experimenting with Bayesian deep learning and model uncertainty.
pytorch_active_learning
pytorch_active_learning is an open-source PyTorch library designed for active learning, accompanying the "Human-in-the-Loop Machine Learning" book. It offers a range of active learning methods, including Least Confidence, Margin of Confidence, Ratio of Confidence, and Entropy sampling. The library also supports more advanced techniques like Model-based Outlier sampling, Cluster-based sampling, and various forms of Active Transfer Learning. It is suitable for researchers and practitioners looking to experiment with and apply active learning strategies in computer vision and natural language processing, with a focus on real-world diversity to avoid bias. The code is stand-alone and can be easily integrated with existing PyTorch installations.
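The four uncertainty sampling methods named above can be illustrated with a minimal sketch. It uses common textbook definitions, each normalized so that 0 means fully confident and 1 means maximally uncertain. The function names are illustrative assumptions and the code is not taken from the library itself.

```python
import math

def least_confidence(probs):
    """Gap between the top probability and 1, rescaled to [0, 1]."""
    n = len(probs)
    return (1.0 - max(probs)) * n / (n - 1)

def margin_of_confidence(probs):
    """1 minus the gap between the two most likely classes."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)

def ratio_of_confidence(probs):
    """Ratio of the second most likely class to the most likely one."""
    top, second = sorted(probs, reverse=True)[:2]
    return second / top

def entropy_score(probs):
    """Shannon entropy divided by its maximum, log2(number of classes)."""
    raw = -sum(p * math.log2(p) for p in probs if p > 0)
    return raw / math.log2(len(probs))

if __name__ == "__main__":
    prediction = [0.6, 0.3, 0.1]  # hypothetical softmax output for one item
    for score in (least_confidence, margin_of_confidence,
                  ratio_of_confidence, entropy_score):
        print(score.__name__, round(score(prediction), 3))
```

In an active learning loop, unlabeled items with the highest scores would typically be sent to human annotators first.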
tensorflow-federated
TensorFlow Federated (TFF) is an open-source framework for machine learning and other computations on decentralized data. It specifically supports Federated Learning (FL), an approach where a shared global model is trained across many participating clients while their sensitive training data remains local. Developers can apply the included federated learning algorithms to their existing TensorFlow models and data, or experiment with novel algorithms. TFF provides both a high-level FL API for applying federated training and evaluation, and a lower-level Federated Core (FC) API for expressing new federated algorithms. It includes a single-machine simulation runtime for experiments, making it suitable for researchers and developers exploring privacy-preserving machine learning.
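To make the idea of training a shared model while data stays local more concrete, here is a minimal sketch of the aggregation step in federated averaging, a widely used FL algorithm. It is plain Python and does not use the TFF API. The function name and the flat-list weight format are assumptions made for illustration.

```python
def federated_average(client_updates):
    """Combine client model weights into one global weight vector.

    client_updates: list of (weights, num_examples) pairs, where weights is
    a flat list of floats trained locally on that client's private data.
    Each client contributes in proportion to its number of examples.
    """
    total_examples = sum(n for _, n in client_updates)
    if total_examples == 0:
        raise ValueError("no training examples reported by any client")

    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        for i, w in enumerate(weights):
            averaged[i] += w * n / total_examples
    return averaged

if __name__ == "__main__":
    # Hypothetical updates: only weights and counts leave each client,
    # never the raw training data.
    updates = [([1.0, 2.0], 1), ([3.0, 4.0], 3)]
    print(federated_average(updates))
```

Only the weight vectors and example counts are shared with the server in this sketch, which is the property that keeps the raw training data on each client.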
Thesis
Thesis is an AI-native platform designed for data science and machine learning, offering an environment where researchers can build and deploy frontier models. The platform allows ML research scientists to run experiments and train models autonomously and at scale within its datacenters. Key features include an intuitive interface for managing datasets, experiments, and models, as well as tools for exploratory data analysis (EDA) and lineage tracking for model development. Thesis aims to accelerate AI R&D, making it easier for data scientists to turn curiosity into consequential discoveries. It offers both a free Spark plan and a 'Pay as you go Ultra' option for production workloads.
veles
Veles is a distributed platform designed for rapid deep learning application development, released under the Apache 2.0 license. It comprises several key components, including the core Veles platform, the Znicz Plugin which serves as a neural network engine, and Mastodon, a bridge facilitating integration between Veles and Java-based systems like Hadoop. Additionally, it features a SoundFeatureExtraction library for audio processing. This platform is ideal for developers and researchers looking to build and deploy deep learning applications in a distributed environment, offering tools for both model development and data processing.
TransmogrifAI
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an open-source AutoML library written in Scala, designed to run on Apache Spark. Developed by Salesforce, it focuses on enhancing machine learning developer productivity by automating various stages of the ML workflow, from feature engineering and validation to model selection. The library enforces compile-time type-safety, modularity, and reusability, enabling the creation of robust machine learning applications in a fraction of the time compared to traditional hand-tuned methods. It supports building models with minimal machine learning expertise, making advanced ML accessible to a broader range of developers. TransmogrifAI is particularly useful for structured data and offers flexibility for users who require more control over their ML pipelines.
TPVFormer
TPVFormer is an academic project offering a Tri-Perspective View (TPV) representation for vision-based 3D semantic occupancy prediction, serving as an alternative to Tesla's Occupancy Network for autonomous driving research. It addresses the limitations of traditional bird's-eye-view (BEV) representations by incorporating two additional perpendicular planes, allowing for a more fine-grained description of 3D scenes. The tool features a transformer-based TPV encoder (TPVFormer) to effectively obtain TPV features by aggregating image features. It demonstrates that camera inputs alone can achieve performance comparable to LiDAR-based methods on LiDAR segmentation tasks. The project also includes resources for semantic scene completion and comparisons with Tesla's Occupancy Network.
nlprule
Nlprule is a fast, low-resource Natural Language Processing and text correction library written in Rust. It takes a rule- and lookup-based approach, using resources from LanguageTool for its NLP tasks. Key features include rule-based grammatical error correction with thousands of rules and a text processing pipeline covering sentence segmentation, part-of-speech tagging, lemmatization, chunking, and disambiguation. The library supports English, German, and Spanish, with spellchecking currently in progress. Nlprule is designed for speed and efficiency, making it suitable for pre- and post-processing around more sophisticated AI approaches, background application tasks with low overhead, or client-side execution via WebAssembly.
Complexio
Complexio offers an intelligence layer designed for enterprise AI, connecting an organization's data, people, and systems into a unified operational view. It builds a live map of how work happens, called the Event Knowledge Graph (EKG), providing real-time insights. The Context Broker links this EKG to existing systems and teams, ensuring all insights and actions are grounded in a shared understanding of operational reality. Users can ask questions in natural language through Stevie and receive answers based on their real operations. Additionally, the Automated Automations Engine (AAE) identifies patterns and orchestrates executable workflows, turning observations into automated actions with traceability and control.
WeDLM
WeDLM is an open-source diffusion language model developed by Tencent, designed for high-speed inference. It reconciles diffusion language models with standard causal attention, which makes it natively compatible with KV caching and with technologies like FlashAttention and PagedAttention. This approach allows direct initialization from pre-trained autoregressive models such as Qwen2.5 and Qwen3 and delivers real speedups over vLLM-optimized baselines: 3-6x on tasks like math reasoning and up to 10x on sequential/counting tasks, while maintaining competitive accuracy. It includes an inference engine, an evaluation suite, and a fine-tuning framework for developers and researchers focused on efficient language model deployment.
brevitas
Brevitas is an open-source PyTorch library designed for neural network quantization, offering support for both post-training quantization (PTQ) and quantization-aware training (QAT). This tool enables developers and researchers to optimize and compress neural networks, making them more efficient for deployment on various hardware platforms. It provides quantized implementations of common PyTorch layers, such as QuantConv1d, QuantConv2d, and QuantLSTM, allowing individual tuning of quantization settings for different tensors. Brevitas is a research project from Xilinx, providing examples for ImageNet classification models to demonstrate PTQ under various configurations.
Genie-TTS
Genie-TTS is an open-source, lightweight inference engine and model converter designed for GPT-SoVITS ONNX models. It provides near-instantaneous speech synthesis on CPUs. The tool combines TTS inference, ONNX model conversion, and an API server. It supports GPT-SoVITS V2 and V2ProPlus models, with planned support for V3 and V4, and handles Japanese, English, Chinese, and Korean. Genie-TTS also offers performance advantages over the official PyTorch models, particularly in first-inference latency and runtime size, making it a good fit for developers and content creators who need high-performance, CPU-based speech synthesis.
evidential-deep-learning
evidential-deep-learning is an open-source Python package designed to help neural networks learn their own measures of uncertainty directly from data. It provides the code needed to reproduce the Deep Evidential Regression paper published at NeurIPS 2020 and offers a general framework for evidential learning. Users can integrate evidential layers and loss functions into existing `tf.keras` model pipelines, with support for both fully connected and convolutional layers. This enables models that provide fast, scalable, and calibrated measures of uncertainty. The package is compatible with Python (>=3.7) and TensorFlow (>=2.0), with PyTorch support planned.
DALI
The NVIDIA Data Loading Library (DALI) is a GPU-accelerated library designed to optimize data loading and pre-processing for deep learning applications. It offers a collection of highly optimized building blocks and an efficient execution engine, specifically tailored for processing image, video, and audio data. DALI addresses the common bottleneck of CPU-bound data pipelines by offloading these tasks to the GPU, significantly enhancing performance and scalability for training and inference. It supports various data formats and is portable across popular deep learning frameworks like TensorFlow, PyTorch, and PaddlePaddle. Key features include prefetching, parallel execution, batch processing, and extensibility for custom operators, making it a versatile solution for accelerating complex deep learning workflows.
deepnet
deepnet is an open-source project providing GPU-based Python implementations of several deep learning algorithms. It supports a range of models including feed-forward neural networks, Restricted Boltzmann Machines, Deep Belief Nets, Autoencoders, Deep Boltzmann Machines, and Convolutional Neural Nets. Built upon the cudamat library by Vlad Mnih and cuda-convnet library by Alex Krizhevsky, deepnet offers a foundational resource for developers and researchers working with deep learning. Its focus on core algorithm implementations makes it a valuable tool for understanding and experimenting with these fundamental AI architectures.
katib
Katib is a Kubernetes-native project designed for automated machine learning (AutoML), providing robust capabilities for hyperparameter tuning, early stopping, and neural architecture search. It is framework-agnostic, allowing users to tune hyperparameters for applications written in any language and supporting popular ML frameworks like TensorFlow, PyTorch, and XGBoost. Katib can execute training jobs using various Kubernetes Custom Resources, including Kubeflow Training Operator, Argo Workflows, and Tekton Pipelines. It offers a range of search algorithms such as Random Search, Bayesian Optimization, TPE, and CMA-ES, and integrates with frameworks like Goptuna, Hyperopt, and Optuna. A Python SDK is available to simplify the creation of hyperparameter tuning jobs for data scientists.
osaurus
Osaurus is an AI edge infrastructure solution specifically designed for macOS, allowing users to run both local and cloud-based AI models efficiently. This tool provides a native, always-on runtime environment, which is crucial for powering continuous AI workflows. It also facilitates the sharing of AI tools across various applications, enhancing productivity and integration within the Apple ecosystem. The project has recently moved to a new repository at osaurus-ai/osaurus, where all active development, issues, and releases are now managed. Users are encouraged to update their git remote to the new location to access the latest features and contributions.
Real-time-stock-market-prediction
Real-time-stock-market-prediction is an open-source project that offers a complete server-side architecture for real-time stock market prediction using Machine Learning. It leverages TensorFlow.js for building the ML model architecture and Kafka for efficient real-time data streaming and pipelining. The system integrates MongoDB for updating databases with incoming stock market logs, enabling analysis and model training, and storing model performance. Developed entirely with Node.js, this architecture supports parallel processing for real-time analysis, ML model training, and prediction, making it suitable for those interested in applying machine learning to financial market analysis and developing robust predictive models.
Yatai
Yatai (屋台, food cart) is a Kubernetes deployment operator specifically designed for BentoML, enabling model deployment at scale. It allows DevOps teams to seamlessly integrate BentoML services into their existing GitOps workflows, facilitating the deployment and scaling of machine learning models on any Kubernetes cluster. Yatai is cloud-native and DevOps-friendly, utilizing a Kubernetes-native workflow with its BentoDeployment CRD (Custom Resource Definition). This approach makes it easy to fit BentoML-powered services into existing operational pipelines. The tool provides documentation for installation and offers a quick tour to try it locally in a minikube cluster, along with components for image building and deployment.
tiny-dnn
tiny-dnn is a C++14 implementation of deep learning, designed for environments with limited computational resources, such as embedded systems and IoT devices. It stands out as a header-only and dependency-free framework, meaning there's nothing to install beyond a C++14 compiler. This makes it highly portable and easy to integrate into existing applications. The framework supports a variety of network layers, activation functions, loss functions, and optimization algorithms, allowing for the construction of diverse deep learning models. It offers reasonable speed without a GPU, leveraging TBB threading and SSE/AVX vectorization. Additionally, tiny-dnn can import models from Caffe and provides a simple, exception-free operational model, making it a good choice for learning neural networks.
AgentBench
AgentBench is a benchmark designed to evaluate Large Language Models (LLMs) as agents across a diverse spectrum of environments. It encompasses 8 distinct environments: 5 newly created domains, namely Operating System (OS), Database (DB), Knowledge Graph (KG), Digital Card Game (DCG), and Lateral Thinking Puzzles (LTP), alongside 3 recompiled from published datasets (House-Holding, Web Shopping, Web Browsing). The platform offers both Dev and Test splits for each dataset, and a thorough evaluation requires an LLM to generate responses thousands of times. AgentBench also introduces VisualAgentBench for evaluating and training visual foundation agents based on large multimodal models (LMMs), covering embodied, GUI, and visual design environments. It supports quick setup using Docker Compose and provides benchmarking results via a leaderboard.
Buyutech
Buyutech is a full-stack perception company specializing in camera-based sensing technologies for automotive, defense, and industrial mobility. They develop complete technology stacks, from photon to real-time perception, enabling safe and intelligent movement for vehicles, robots, and autonomous systems in various environments. Their offerings include core automotive products like analog and digital rear-view cameras, digital side mirror systems, occupant and driver monitoring systems, and surround-view camera systems. For defense and aerospace, they provide mission-critical terminal solutions, perception for aerial platforms, and situational awareness systems. Industrial mobility solutions include stereo depth cameras, 360° perception systems, AI-driven navigation modules, and blind-spot detection cameras. Buyutech integrates hardware, imaging pipelines, edge AI, fusion, and high-volume camera production to deliver highly reliable perception.
feathr
Feathr is a scalable, unified data and AI engineering platform widely used in production at LinkedIn and now an open-source project under the LF AI & Data Foundation. It allows users to define data and feature transformations using Pythonic APIs, register these transformations, and share them across teams. Particularly useful for AI modeling, Feathr automatically computes and joins feature transformations to training data with point-in-time correctness to prevent data leakage. It supports materializing and deploying features for online production use, offers native cloud integration with scalable architecture, and has been battle-tested for over six years. Feathr handles billions of rows and petabyte-scale data with built-in optimizations, providing rich transformation APIs including time-based aggregations and sliding window joins. It also features a built-in registry for feature reuse and an intuitive UI for searching and exploring features and their lineages.
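The point-in-time correctness mentioned above can be illustrated with a minimal sketch: each training row is joined only to the latest feature value observed at or before that row's timestamp, so no future information leaks into training. This is plain Python, not Feathr's API. The function name, the tuple layouts, and the choice to include a feature stamped at exactly the label time are assumptions made for illustration.

```python
from bisect import bisect_right

def point_in_time_join(label_rows, feature_rows):
    """Attach to each label row the latest feature value known at its time.

    label_rows:   list of (entity_id, timestamp, label)
    feature_rows: list of (entity_id, timestamp, value)
    Returns a list of (entity_id, timestamp, feature_value, label); the
    feature value is None when no feature was observed early enough.
    """
    # Build a time-sorted history of feature values per entity.
    history = {}
    for entity, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        times, values = history.setdefault(entity, ([], []))
        times.append(ts)
        values.append(value)

    joined = []
    for entity, ts, label in label_rows:
        times, values = history.get(entity, ([], []))
        # Count of feature rows with timestamp <= ts for this entity.
        idx = bisect_right(times, ts)
        feature = values[idx - 1] if idx > 0 else None
        joined.append((entity, ts, feature, label))
    return joined

if __name__ == "__main__":
    # Hypothetical data: integer timestamps and a single numeric feature.
    features = [("u1", 1, 10), ("u1", 5, 50), ("u2", 3, 30)]
    labels = [("u1", 4, 1), ("u1", 5, 0), ("u2", 2, 1)]
    for row in point_in_time_join(labels, features):
        print(row)
```

A naive join on entity alone would pair the label at time 4 with the feature value recorded at time 5, which is exactly the leakage a point-in-time join avoids.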