AI Agents & Automation
LongVideoBench
LongVideoBench is a benchmark leaderboard for evaluating long video models. Its leaderboard data can be viewed and sorted by different criteria, including accuracy by duration group and by question category. This lets researchers and developers compare how well various AI models understand and analyze long-form video content, assess model capabilities in a structured way, and identify areas for improvement. It is hosted on Hugging Face Spaces.
app-platform
AppPlatform is a cutting-edge, open-source AI application engineering platform designed to streamline the development process for large model training and inference applications. It achieves this through integrated declarative programming and low-code configuration tools, offering a powerful and scalable environment for software engineers and product managers. The platform supports the entire AI application development lifecycle, from concept to deployment. Its core architecture includes a backend based on the FIT framework for application management and functional extensions, and a React-based frontend with a visual interface for AI application development, an application marketplace, smart forms, and plugin management. Key features include a low-code graphical interface for intuitive AI app creation, a robust operator and scheduling platform supporting multiple programming languages, and a shared template store for collaboration and reuse of AI applications as functions, RAGs, or agents.
deep-learning-from-scratch-4
deep-learning-from-scratch-4 is an open-source GitHub repository that serves as the support site for the book "Deep Learning from Scratch 4: Reinforcement Learning Edition" (O'Reilly Japan, 2022). It provides all the source code used in the book, organized by chapter, along with common utility code. The repository also offers Jupyter Notebook versions of the code, which can be run directly on cloud services like Google Colab, Kaggle Notebook, and Studio Lab for interactive learning. It supports Python 3.x and requires libraries such as NumPy, Matplotlib, OpenAI Gym, and DeZero (or PyTorch). The project is licensed under the MIT License, allowing for free commercial and non-commercial use, making it an excellent resource for students and developers exploring reinforcement learning.
gemini-business2api
Gemini Business2API functions as an OpenAI-compatible API gateway, enabling users to leverage Gemini Business capabilities through a familiar interface. This tool is designed with multi-account load balancing, ensuring efficient distribution of requests across multiple Gemini Business accounts. It supports advanced multimodal features, including image and video generation, as well as comprehensive file parsing. The platform includes a built-in management panel for streamlined administration, allowing for unified management of account pools, system settings, and operational status. Key features like account import/export, batch operations, and status filtering enhance usability. It also offers flexible deployment options via Docker Compose or an interactive installation script, making it accessible for various technical setups.
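To make the idea of multi-account load balancing concrete, here is a minimal sketch of one possible strategy: round-robin selection that skips disabled accounts. The strategy itself, the `AccountPool` class, and its method names are assumptions made for illustration; the description above does not say how the project actually distributes requests.

```python
class AccountPool:
    """Hypothetical round-robin account pool (not the project's real code)."""

    def __init__(self, account_ids):
        self._ids = list(account_ids)
        self._disabled = set()
        self._next = 0  # index of the next account to try

    def disable(self, account_id):
        """Mark an account as unavailable, e.g. after repeated failures."""
        self._disabled.add(account_id)

    def pick(self):
        """Return the next active account, cycling through the pool."""
        for _ in range(len(self._ids)):
            candidate = self._ids[self._next]
            self._next = (self._next + 1) % len(self._ids)
            if candidate not in self._disabled:
                return candidate
        raise RuntimeError("no active accounts in the pool")


pool = AccountPool(["acct-a", "acct-b", "acct-c"])
first_round = [pool.pick() for _ in range(4)]
pool.disable("acct-b")
after_disable = [pool.pick() for _ in range(3)]
```

In this sketch a gateway would call `pick()` once per incoming request and forward the request with the selected account's credentials; a real pool would also need to handle concurrency and re-enabling accounts.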
LangChain-Chinese-Getting-Started-Guide
The LangChain-Chinese-Getting-Started-Guide is an open-source tutorial that helps Chinese speakers learn and use the LangChain framework. It covers essential concepts such as LLM invocation, prompt management, document loaders, text splitters, vector stores, chains, and agents. The guide provides practical examples, including Q&A with OpenAI models, internet search through SerpApi, and summarization of long texts. It also addresses common challenges such as API token limits and shows how LangChain's features work around them. The tutorial is actively maintained on GitHub, with updates and code examples available for hands-on learning.
KB2E
KB2E is a knowledge graph embedding tool developed as a subproject of THU-OpenSK. It provides implementations for several prominent knowledge graph embedding algorithms, including TransE, TransH, TransR, and PTransE. These algorithms are crucial for representing entities and relations in a knowledge graph as low-dimensional vectors, enabling various downstream tasks like link prediction and entity classification. While the project offers valuable resources for researchers and developers interested in knowledge graph embeddings, it is important to note that KB2E is no longer actively maintained. Users are advised to transition to the newer and actively supported OpenKE package for continued development and support in this domain.
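As an illustration of what such an embedding model computes, the sketch below shows the TransE idea in plain Python: a triple (head, relation, tail) is plausible when head + relation lands close to tail. The function names and the hand-picked two-dimensional vectors are hypothetical; this is not KB2E's code, and real embeddings are learned from data rather than written by hand.

```python
import math


def transe_score(head, relation, tail):
    """L2 distance ||head + relation - tail||; lower means more plausible."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def margin_loss(positive_score, negative_score, margin=1.0):
    """Margin ranking loss: zero once the true triple beats the corrupted
    one by at least `margin`."""
    return max(0.0, margin + positive_score - negative_score)


# Hypothetical 2-d embeddings chosen by hand for illustration.
paris = [1.0, 0.0]
capital_of = [0.0, 1.0]
france = [1.0, 1.0]
japan = [4.0, 5.0]

true_triple = transe_score(paris, capital_of, france)     # exact match
corrupted_triple = transe_score(paris, capital_of, japan)  # wrong tail
loss = margin_loss(true_triple, corrupted_triple)
```

Link prediction then amounts to ranking candidate tails by this score and taking the closest ones.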
Paddle3D
Paddle3D is an open-source, end-to-end deep learning 3D perception toolkit developed by PaddlePaddle. It provides a flexible framework for handling various 3D data formats and supports integration with PaddleDetection and PaddleSeg for 2D vision capabilities. The toolkit features a rich model library covering mainstream 3D perception algorithms across monocular, point cloud, and multi-camera modalities, including detection and segmentation tasks. It offers full-process support from data processing and model building to training, optimization, and deployment, with compatibility for major 3D datasets like KITTI, nuScenes, and Waymo. Paddle3D is optimized for performance on various autonomous driving chips and seamlessly integrates with the Apollo autonomous driving platform.
PyHealth
PyHealth is a comprehensive, open-source deep learning Python toolkit designed to support clinical predictive modeling for both ML researchers and medical practitioners. It aims to make healthcare AI applications easier to develop, test, and deploy, offering flexibility and customizability. Key features include a modular 5-stage pipeline, a healthcare-first approach with support for medical codes and clinical datasets like MIMIC and eICU, and over 33 pre-built models with production-ready trainers and metrics. The toolkit supports more than 10 healthcare tasks and datasets, providing fast data processing for quick experimentation. PyHealth also includes independent modules for medical code mapping (pyhealth.medcode) and medical code tokenization (pyhealth.tokenizer), enhancing its utility for complex healthcare data.
monk_v1
Monk is a low-code deep learning tool designed to simplify computer vision development by providing a unified wrapper for various deep learning libraries. It allows users to write less code and create end-to-end applications using a single syntax across frameworks like PyTorch, MXNet, and Keras. Monk helps manage entire projects with multiple experiments, making it ideal for students, researchers, developers, and competition participants. Key features include project management, hyper-parameter analysis, and a comprehensive study roadmap for learning computer vision. It supports real-world image classification applications across diverse domains such as medical, fashion, autonomous vehicles, and retail.
workflow
The Workflow SDK lets developers build durable, observable applications and AI agents in TypeScript. Apps built with it can suspend, resume, and maintain state, which keeps long-running work reliable across interruptions. This open-source project, built by engineers at Vercel and the open-source community, targets complex asynchronous JavaScript applications, in particular long-running processes and AI agents that require consistent state management and fault tolerance.
Shift-AI-models-to-real-world-products
Shift-AI-models-to-real-world-products is an open-source repository offering comprehensive guides and references for transitioning AI models from research and development into practical, real-world products and projects. It provides insights into various stages of AI product development, including machine learning project processes, team composition, product manager challenges, pre-sales solutions, data management, model training and deployment, and MLOps. The resource is particularly valuable for those looking to understand the engineering and productization aspects of AI, especially within B/G (Business/Government) markets and computer vision applications. It aims to bridge the gap between theoretical AI models and their successful implementation in commercial or governmental settings.
Avala AI
Avala AI is a comprehensive platform designed to eliminate data entropy in Physical AI and frontier model pipelines. It serves as a unified data engine, fusing sensors, labels, and feedback into traceable ground truth. The platform connects ingestion, labeling, and deployment, allowing users to trace any model behavior back to its originating data. Avala offers a Python SDK, REST API, and CLI for programmatic management of datasets, annotation triggering, and results export. It supports various data types including 4D Point Cloud & LiDAR, 4D Video, 2D Image, 2D Video, Text, and specialized formats like Medical Imaging. The tool emphasizes glass-box traceability from sensor to deployment, ensuring data quality and compliance with standards like SOC 2 Type II, GDPR, ISO 27001, and TISAX.
99AI
99AI is a commercial AI web platform that offers a comprehensive artificial intelligence service solution. It supports private deployment, allowing businesses, teams, or individuals to maintain control over their data and infrastructure. The platform includes built-in multi-user management, making it suitable for organizations that need to manage access and usage for multiple team members. With full Node.js packaging and Docker deployment support, 99AI is ready for immediate use. It integrates mainstream AI capabilities and offers deep thinking models, real-time internet search, and intelligent chart generation.
training_extensions
OpenVINO™ Training Extensions is a low-code transfer learning framework designed for computer vision tasks. It enables users to train, infer, optimize, and deploy models easily and quickly, even with limited deep learning expertise. The tool supports diverse combinations of model architectures, learning methods, and task types based on PyTorch and the OpenVINO™ toolkit. Supported tasks include classification, object detection, semantic segmentation, instance segmentation, and anomaly recognition. Usability features include native Intel GPU (XPU) support, the Datumaro data frontend for various dataset formats, distributed training, mixed-precision training, class incremental learning, and model deployment to the OpenVINO™ IR and ONNX formats. The framework offers both API-based and CLI-based training.
Salfati Group
Salfati Group's Organizational Intelligence (SOI) platform is a human-first AI solution designed to unify organizational data, encode expert wisdom, and free workforces from repetitive tasks. It acts as a Cognitive Operating System, connecting diverse data sources and mapping the "living context" of an organization through a Universal Data Fabric and Semantic Intelligence. SOI captures unwritten tribal knowledge and heuristics, transforming them into sentient agents that anticipate needs and handle mundane workflows. The platform features a self-improving memory, learning from every interaction and expert correction. It integrates natively with tools like Slack, Drive, SharePoint, and Outlook, ensuring disruption-free deployment within existing secure perimeters. Salfati Group offers two paths: a SaaS SOI Platform for scalable processes and an Embedded Platform + FDAE for complex transformations requiring hands-on engineering.
SF Tensor
SF Tensor, also known as The San Francisco Tensor Company, is dedicated to reinventing the software and infrastructure stack for modern AI and High-Performance Computing (HPC). The platform provides automatic kernel optimization and cross-cloud, cross-vendor compute capabilities, so that code runs faster and cheaper and stays portable across platforms. It supports a heterogeneous future where CPUs, GPUs, TPUs, and domain-specific accelerators are treated as first-class citizens. SF Tensor offers two main options: Tensor Cloud for experiments and medium-scale training jobs, and Forward-Deployed for scaling training runs with dedicated infrastructure support. Pricing is aligned with the savings delivered to customers.
easyAI
easyAI is a pure-Python artificial intelligence framework specifically designed for two-player abstract games like Tic Tac Toe, Connect 4, and Reversi. It simplifies the process of defining game rules and allows users to implement AI opponents or solve games. The framework utilizes a Negamax algorithm with alpha-beta pruning and transposition tables for efficient AI decision-making. Users can easily install it via pip and integrate it into their Python projects. Beyond basic gameplay, easyAI also supports solving games with iterative deepening, providing optimal moves and win/loss predictions. It is an open-source project, welcoming contributions for further development.
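The search technique named above can be sketched in a few lines. The following is a stand-alone illustration of Negamax with alpha-beta pruning on a hypothetical toy game (players alternately take 1 to 3 items from a pile, and whoever takes the last item wins). It does not use easyAI's API, and it omits transposition tables and iterative deepening for brevity.

```python
# Hypothetical toy game, not easyAI's API: take 1-3 items per turn;
# the player who takes the last item wins.
MOVES = (1, 2, 3)


def negamax(pile, alpha=-1, beta=1):
    """Value of `pile` for the player to move: +1 is a win, -1 a loss."""
    if pile == 0:
        # The previous player took the last item, so the mover has lost.
        return -1
    best = -1
    for take in MOVES:
        if take > pile:
            break
        # The opponent's value is negated; the search window is flipped.
        score = -negamax(pile - take, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # Prune: a winning move is found, no need to look further.
    return best


def best_move(pile):
    """Return the legal move with the highest Negamax value for the mover."""
    legal = [take for take in MOVES if take <= pile]
    return max(legal, key=lambda take: -negamax(pile - take))


value_of_4 = negamax(4)   # every move leaves the opponent a winning pile
move_from_7 = best_move(7)  # leave the opponent a multiple of 4
```

Solving the game this way shows that piles that are multiples of 4 are lost for the player to move, so the best move from any other pile is to leave a multiple of 4.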
reasoning-gym
reasoning-gym is a Python library designed for training reasoning models using reinforcement learning. It offers a comprehensive set of dataset generators and reasoning environments, allowing users to create and manage training data with adjustable complexity. The tool provides access to over 100 distinct tasks, covering a wide range of reasoning challenges. This makes it a valuable resource for researchers and developers focused on advancing AI's reasoning capabilities, particularly those working with reinforcement learning approaches.
rlcard
RLCard is a comprehensive, open-source toolkit designed for reinforcement learning (RL) in card games. Developed by DATA Lab at Rice and Texas A&M University, it offers a versatile platform for researchers and developers to implement and test various RL and search algorithms within popular card game environments such as Blackjack, Leduc Hold'em, Texas Hold'em, DouDizhu, Mahjong, UNO, Gin Rummy, and Bridge. The toolkit provides easy-to-use interfaces, supports local seeding of environments and multiprocessing, and includes a model zoo with pre-trained and rule-based models. It also integrates with PettingZoo, allowing for multi-agent reinforcement learning experiments.
RL-Factory
RL-Factory is an open-source framework designed for efficient reinforcement learning (RL) post-training in Agentic Learning. It significantly simplifies the process by decoupling the environment from RL post-training, allowing users to train agents with only a tool configuration and a reward function. A key differentiator is its support for asynchronous tool-calling, which makes RL post-training up to 2x faster than existing frameworks. The platform natively supports one-click DeepSearch training, multi-turn tool-calling, model judge reward mechanisms, and training for various models, including Qwen3. Future updates aim to introduce a WebUI for data processing, environment definition, and project management, alongside support for more models and multimodal agentic learning.
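The benefit of asynchronous tool-calling can be illustrated with a generic asyncio sketch: when several tool calls wait on I/O, issuing them concurrently takes roughly as long as the slowest call rather than the sum of all calls. The `call_tool` coroutine and the tool names below are hypothetical stand-ins; this is not RL-Factory's implementation and it does not reproduce the speedup figure quoted above.

```python
import asyncio
import time


async def call_tool(name, delay):
    """Hypothetical tool call; sleeps instead of doing real network I/O."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_sequential(calls):
    """Await each tool call before starting the next one."""
    return [await call_tool(name, delay) for name, delay in calls]


async def run_concurrent(calls):
    """Start all tool calls at once and wait for them together."""
    return await asyncio.gather(
        *(call_tool(name, delay) for name, delay in calls)
    )


def timed(runner, calls):
    """Run a coroutine function to completion and measure wall time."""
    start = time.perf_counter()
    results = asyncio.run(runner(calls))
    return list(results), time.perf_counter() - start


CALLS = [("search", 0.05), ("calculator", 0.05), ("lookup", 0.05)]
seq_results, seq_time = timed(run_sequential, CALLS)
con_results, con_time = timed(run_concurrent, CALLS)
```

Both runs return the same results in the same order, since `asyncio.gather` preserves the order of its arguments; only the wall-clock time differs.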
schnetpack
schnetpack is an open-source toolbox designed for researchers and developers working with atomistic systems. It provides a robust framework for developing and applying deep neural networks to predict various properties of molecules and materials, such as potential energy surfaces and quantum-chemical characteristics. The tool includes fundamental building blocks for atomistic neural networks, simplifying the process of conducting simulations and making accurate property predictions. Its open-source nature, hosted on GitHub, encourages community contributions and provides transparent access to its codebase, making it a valuable resource for academic and industrial research in computational chemistry and materials science.
SpatialLM
SpatialLM is a 3D large language model designed to process 3D point cloud data and generate structured 3D scene understanding outputs. It can identify architectural elements such as walls, doors, and windows, as well as oriented object bounding boxes with their semantic categories. A key differentiator is its ability to handle point clouds from diverse sources, including monocular video sequences, RGBD images, and LiDAR sensors, unlike previous methods that often required specialized equipment. This multimodal architecture bridges the gap between unstructured 3D geometric data and structured 3D representations, providing high-level semantic understanding. SpatialLM enhances spatial reasoning capabilities for applications in embodied robotics, autonomous navigation, and other complex 3D scene analysis tasks. It offers models like SpatialLM1.1-Llama-1B and SpatialLM1.1-Qwen-0.5B, available on Hugging Face, and supports detection with user-specified categories.
rl
TorchRL is an open-source Reinforcement Learning (RL) library built for PyTorch, emphasizing a modular, primitive-first, and Python-first design. It provides a comprehensive framework for developing and deploying RL agents, featuring a command-line training interface for state-of-the-art agents without extensive coding. The library also includes a revamped vLLM integration for scalable LLM inference and training, offering features like AsyncVLLM service, multiple load balancing strategies, and distributed data loading. Additionally, TorchRL offers an experimental PPOTrainer for configurable PPO training solutions and a complete LLM API for fine-tuning language models, supporting RLHF, supervised fine-tuning, and tool-augmented training. Its design principles align with the PyTorch ecosystem, ensuring efficiency, extensibility, and minimal dependencies.
streamlit-fastapi-model-serving
streamlit-fastapi-model-serving is an open-source project designed to simplify the deployment of machine learning models. It leverages FastAPI for creating a robust backend with automatic API documentation and Streamlit for building an interactive, user-friendly frontend. This combination allows developers to quickly serve PyTorch models, providing both a programmatic interface for other applications and a visual interface for direct user experimentation. The project uses Docker Compose to orchestrate these two services, ensuring seamless communication and easy setup. It's an ideal solution for developers looking to deploy ML models with a complete web application stack.