AI Agents & Automation
fairchem
fairchem is a comprehensive, open-source library developed by the FAIR Chemistry team, offering machine learning methods specifically tailored for chemistry. It serves as a centralized repository for data, models, demos, and applications in materials science and quantum chemistry. The library supports various tasks, including relaxing adsorbates on catalytic surfaces, optimizing inorganic crystals, running molecular dynamics simulations, and calculating spin gaps. It features pretrained models like UMA, which can be used with the ASE FAIRChemCalculator for a wide range of applications. fairchem also supports multi-GPU inference and LAMMPS integration for large-scale simulations, making it suitable for complex computational chemistry problems.
OpenAI Swarm
OpenAI Swarm is an experimental, open-source framework for exploring ergonomic, lightweight multi-agent orchestration. It coordinates multiple AI agents through two core abstractions: Agents and handoffs. An Agent encapsulates instructions and tools, and can hand a conversation off to another Agent. The framework supports direct Python function calling and context management, and runs client-side on top of the Chat Completions API. These primitives are expressive enough to model rich dynamics between tools and networks of agents. Swarm is intended as an educational resource for developers curious about multi-agent orchestration; for production use cases it has been superseded by the OpenAI Agents SDK.
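The Agent and handoff idea can be illustrated with a minimal, library-free sketch. This is not Swarm's API: the `Agent` dataclass, the `run` loop, and the tool name `transfer_to_refunds` are hypothetical names chosen for illustration, and no model is called. The only assumption carried over from the description is that a tool may return another agent, which transfers the conversation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Hypothetical stand-in for an agent: a name, instructions, and tools."""
    name: str
    instructions: str
    tools: Dict[str, Callable[[], object]] = field(default_factory=dict)


def run(agent: Agent, tool_calls: List[str]) -> Agent:
    """Execute tool calls in order; a tool that returns an Agent is a handoff."""
    for call in tool_calls:
        tool = agent.tools.get(call)
        if tool is None:
            continue  # the active agent does not expose this tool
        result = tool()
        if isinstance(result, Agent):
            agent = result  # handoff: the conversation moves to the new agent
    return agent


refunds = Agent("Refunds", "Process refund requests.")
triage = Agent(
    "Triage",
    "Route the user to the right agent.",
    tools={"transfer_to_refunds": lambda: refunds},
)

active = run(triage, ["transfer_to_refunds"])
print(active.name)  # prints "Refunds"
```

In a real orchestration loop the list of tool calls would come from the model's response rather than being passed in by hand.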
self-attention-cv
Self-attention-cv is an open-source repository offering implementations of diverse self-attention mechanisms specifically tailored for computer vision applications. Built in PyTorch, it leverages `einsum` and `einops` for efficient and flexible module creation. The repository serves as an ongoing collection of building blocks, enabling developers to integrate advanced attention models into their projects. It supports a range of computer vision tasks, including image recognition and segmentation, with examples for Multi-head attention, Axial attention, Vision Transformers (ViT), and TransUnet. It also includes various positional embedding implementations.
LLaMA-O1
LLaMA-O1 is an open-source framework designed for the development, deployment, and evaluation of large reasoning models. It is built on PyTorch and Hugging Face. The framework includes resources for supervised fine-tuning and base pretraining, with datasets like OpenLongCoT-SFT and OpenLongCoT-Pretrain-1202 available on Hugging Face. LLaMA-O1 also offers pre-trained models and a CPU-only online demo, making it accessible for experimentation. Planned work includes Reinforcement Learning With Self-Play and Inference-time Reasoning Enhancement Frameworks.
TurboTransformers
TurboTransformers is an open-source, fast, and user-friendly runtime environment designed for transformer inference on both CPU and GPU. Developed by WeChat AI, it supports various transformer models including BERT, ALBERT, GPT2, and Decoders. A key feature is its ability to handle variable length inputs without requiring time-consuming offline tuning, allowing for real-time changes in batch size and sequence length. It offers excellent CPU/GPU performance and includes smart batching to minimize zero-padding overhead for requests of different lengths. TurboTransformers provides both Python and C++ APIs, and can be integrated as a plugin for PyTorch, enabling end-to-end acceleration with just a few lines of code. It has been successfully applied in Tencent's online BERT service scenarios, demonstrating significant acceleration for services like WeChat FAQ and QQ recommendation systems.
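To see why batching requests of similar length reduces zero-padding, here is a small sketch. It uses a simple sort-by-length heuristic, which is an assumption made for illustration and not necessarily the scheduling strategy TurboTransformers implements; the request lengths and the helper names `padded_tokens` and `chunk` are hypothetical.

```python
from typing import List


def padded_tokens(batches: List[List[int]]) -> int:
    """Count zero-padding tokens when each batch is padded to its longest request."""
    return sum(max(b) * len(b) - sum(b) for b in batches if b)


def chunk(lengths: List[int], batch_size: int) -> List[List[int]]:
    """Split request lengths into consecutive batches of at most batch_size."""
    return [lengths[i:i + batch_size] for i in range(0, len(lengths), batch_size)]


# Hypothetical request lengths (in tokens), in arrival order.
lengths = [5, 120, 7, 118, 6, 125]

arrival_order = chunk(lengths, batch_size=3)
length_sorted = chunk(sorted(lengths), batch_size=3)

print(padded_tokens(arrival_order))  # prints 354
print(padded_tokens(length_sorted))  # prints 15
```

Grouping short requests with short ones and long with long cuts the wasted padding from 354 tokens to 15 in this example.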
Whisper
Whisper is a general-purpose speech recognition model developed by OpenAI, trained on an extensive and diverse audio dataset. It functions as a multitasking model capable of multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. The tool uses a Transformer sequence-to-sequence model and represents these tasks jointly as a sequence of tokens for the decoder to predict. This allows a single model to replace multiple stages of a traditional speech-processing pipeline. Whisper offers several model sizes, including English-only and multilingual versions, with varying speed and accuracy tradeoffs. It supports command-line and Python usage, making it versatile for developers and researchers.
nnabla
nnabla is a deep learning framework developed by Sony, designed for research, development, and production across diverse platforms including desktop PCs, HPC clusters, embedded devices, and production servers. It features a flexible Python API built on a C++11 core, enabling both static and dynamic computation graphs. The framework supports GPU acceleration via CUDA extensions and offers command-line utilities for tasks like training, evaluation, and file format conversion (e.g., ONNX, TensorFlow, TFLite). While currently in a maintenance phase with no active development, nnabla remains a robust tool for developers and researchers needing a portable and extensible deep learning solution.
software-agent-sdk
The OpenHands Software Agent SDK is a comprehensive toolkit featuring Python and REST APIs, designed for building AI agents that interact with code. It enables developers to create agents for a variety of tasks, from one-off actions like generating a README to routine maintenance such as updating dependencies, and even major refactors. A key differentiator is its flexibility, allowing agents to operate either on the local machine or within ephemeral workspaces like Docker or Kubernetes via the Agent Server. This SDK also powers the OpenHands CLI and OpenHands Cloud, providing a robust foundation for new developer experiences. It includes examples for standalone SDK usage, remote agent server interactions, and GitHub Workflows integration.
SqueezeSeg
SqueezeSeg is a TensorFlow-based implementation of convolutional neural networks designed for real-time road-object segmentation from 3D LiDAR point clouds. This repository provides the code for SqueezeSeg, a model that processes LiDAR data to identify and segment objects in a scene, crucial for applications like autonomous driving. The project also references SqueezeSegV2, a follow-up work with improved performance, and provides links to download converted datasets for training and validation. It includes instructions for installation, running a demo, and training/evaluating the model, making it a valuable resource for researchers and developers in the field of autonomous vehicles and computer vision.
Ava PLS
Ava PLS is an open-source desktop application designed to run language models directly on your computer, providing a local and private environment for AI experimentation. It features a batteries-included graphical user interface (GUI) for llama.cpp, simplifying the process of interacting with language models without needing cloud infrastructure. Users can easily download pre-built artifacts from GitHub Actions or compile the application themselves using Zig. The tool is built with a robust tech stack including Zig, C++, SQLite, Preact, Preact Signals, and Tailwind CSS, ensuring a stable and efficient local AI experience.
BotCircuits
BotCircuits is a platform designed to help businesses build and deploy reliable AI agents for customer operations. These agents can handle real business tasks across various functions like support, operations, and growth, delivering measurable results. The platform emphasizes ease of use, fast deployment, and reliability, addressing common challenges with complex and untrustworthy AI in critical customer interactions. Users can create AI agents using prompts or a visual builder, train them with their own data (URLs, PDFs, CSVs), and test their performance before integrating them with chat, voice, and messaging apps. BotCircuits is built for enterprise scale, offering always-on reliability, trusted security, advanced workflows, and rapid deployment capabilities.
susi_gassistantbot
susi_gassistantbot is an open-source project designed to integrate SUSI AI with Google Assistant, enabling developers to create custom voice-controlled applications and AI agents. The project provides a framework for building functionalities on Google Assistant using the SUSI AI platform. It requires setting up a project on Google's Actions console, configuring API.AI (now Dialogflow) with intents and webhooks, and deploying the application to a platform like Heroku. This tool is ideal for developers looking to extend Google Assistant's capabilities with custom AI logic from SUSI, offering a flexible way to build interactive voice experiences.
text-summarization-tensorflow
text-summarization-tensorflow is an open-source project providing a TensorFlow implementation of text summarization. It utilizes a seq2seq library with an encoder-decoder model, incorporating an attention mechanism for improved performance. The tool initializes word embeddings using GloVe pre-trained vectors and employs LSTM cells for both encoding and decoding processes. It supports training with custom datasets and offers options for configuring hyperparameters such as network size, depth, beam width, and learning rate. Users can also test the model with pre-trained weights and evaluate performance using ROUGE metrics. This tool is ideal for researchers and students looking to understand and experiment with text summarization techniques.
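As a rough illustration of what a ROUGE score measures, the sketch below computes ROUGE-1 (unigram overlap) between a generated summary and a reference. It is a simplified version that assumes lowercase whitespace tokenization with no stemming; it is not the evaluation code used by the project, and the function name `rouge_1` is hypothetical.

```python
from collections import Counter
from typing import Tuple


def rouge_1(candidate: str, reference: str) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) based on clipped unigram overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = rouge_1("the cat sat", "the cat sat on the mat")
print(precision, recall)  # prints 1.0 0.5
```

Every word of the candidate appears in the reference (precision 1.0), but the candidate covers only half of the reference's words (recall 0.5).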
tensorforce
Tensorforce is an open-source deep reinforcement learning framework built on TensorFlow, designed for both research and practical applications. It stands out for its modular, component-based design, allowing for highly configurable feature implementations. A key differentiator is the separation of the RL algorithm from the application, making algorithms agnostic to input and output structures. The entire reinforcement learning logic, including control flow, is implemented in TensorFlow, enabling portable computation graphs. It supports a wide range of features including various network layers, memory types, policy distributions, reward estimation, training objectives, and optimization algorithms. Tensorforce also offers extensive exploration techniques, preprocessing options, and regularization methods, making it a versatile tool for developing and training reinforcement learning agents.
trfl
TRFL (pronounced "truffle") is an open-source library developed by Google DeepMind, designed to simplify the implementation of Reinforcement Learning (RL) agents using TensorFlow. It offers a collection of essential building blocks and loss functions, such as Q-learning, that are crucial for developing and experimenting with various RL algorithms. The library integrates seamlessly with existing TensorFlow environments, allowing developers to leverage its powerful computational graph capabilities. TRFL does not list TensorFlow as a direct requirement, giving users flexibility to install specific CPU or GPU versions, along with TensorFlow Probability, separately. This modular approach makes it a valuable resource for researchers and practitioners in the field of AI and machine learning.
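For readers unfamiliar with what a Q-learning loss computes, here is a sketch in plain Python for a single transition. It does not use TRFL or TensorFlow, and the function name, argument names, and the half-squared-error form are assumptions made for illustration rather than the library's interface.

```python
from typing import List, Tuple


def q_learning_loss(
    q_prev: List[float],   # Q(s, .) for the state where the action was taken
    action: int,           # index of the action taken
    reward: float,         # reward observed after the action
    discount: float,       # discount factor (0 at episode end)
    q_next: List[float],   # Q(s', .) for the next state
) -> Tuple[float, float]:
    """Return (loss, td_error) for one transition."""
    # Bootstrapped target: reward plus discounted best next-state value.
    target = reward + discount * max(q_next)
    td_error = target - q_prev[action]
    return 0.5 * td_error ** 2, td_error


loss, td_error = q_learning_loss(
    q_prev=[1.0, 2.0], action=1, reward=1.0, discount=0.5, q_next=[0.0, 4.0]
)
print(loss, td_error)  # prints 0.5 1.0
```

Here the target is 1.0 + 0.5 * 4.0 = 3.0, so the TD error against the current estimate of 2.0 is 1.0. In a training setup the target would be treated as a constant so gradients flow only through the current estimate.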
torchlayers
torchlayers is a PyTorch-based library designed to simplify the definition of neural network layers by providing automatic shape and dimensionality inference, similar to the Keras API. It eliminates the need for manual specification of input dimensions for many `torch.nn` modules, including convolutional, recurrent, transformer, attention, and linear layers. The library also includes additional building blocks found in state-of-the-art architectures, such as EfficientNet, PolyNet, Squeeze-And-Excitation, and StochasticDepth. Users can define custom modules with shape inference capabilities and benefit from useful defaults like "same" padding and default dropout rates. It adds zero overhead and supports TorchScript, allowing seamless integration with existing PyTorch workflows.
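The idea behind shape inference can be sketched without PyTorch: a layer is declared with only its output size and allocates its weights the first time it sees an input. This is a conceptual illustration only. `LazyLinear` here is a hypothetical class, not torchlayers' API, and the library's actual mechanism may differ.

```python
import random
from typing import List, Optional


class LazyLinear:
    """Hypothetical linear layer that infers its input size on first use."""

    def __init__(self, out_features: int) -> None:
        self.out_features = out_features
        self.in_features: Optional[int] = None
        self.weight: List[List[float]] = []

    def __call__(self, x: List[float]) -> List[float]:
        if self.in_features is None:
            # First call: read the input width and allocate the weights.
            self.in_features = len(x)
            rng = random.Random(0)
            self.weight = [
                [rng.uniform(-0.1, 0.1) for _ in range(self.in_features)]
                for _ in range(self.out_features)
            ]
        if len(x) != self.in_features:
            raise ValueError(f"expected {self.in_features} inputs, got {len(x)}")
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]


layer = LazyLinear(out_features=3)   # no input size given
y = layer([0.5, -1.0, 2.0, 0.0])     # input width 4 is inferred here
print(layer.in_features, len(y))     # prints "4 3"
```

After the first call the layer behaves like an ordinary fixed-shape layer and rejects inputs of a different width.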
tkDNN
tkDNN is a specialized Deep Neural Network library engineered for high-performance inference on NVIDIA Jetson Boards, including TK1, TX1, TX2, AGX Xavier, and Nano. Built upon cuDNN and TensorRT primitives, its core objective is to maximize inference speed on NVIDIA hardware. The library supports various deep learning tasks such as 2D/3D object detection, tracking, semantic segmentation, and monocular depth estimation. While it excels at inference, tkDNN does not support model training. It provides detailed FPS and mAP results for popular models like YOLOv3/v4 and MobileNetV2 SSD across different NVIDIA platforms, showcasing its optimization capabilities for embedded systems.
TinyChatEngine
TinyChatEngine is an open-source library designed for efficient on-device inference of Large Language Models (LLMs) and Visual Language Models (VLMs). It allows users to run these models directly on edge devices such as laptops, cars, and robots, giving instant responses and better data privacy by keeping processing local. The engine relies on LLM compression techniques, including SmoothQuant and AWQ (Activation-aware Weight Quantization), to produce low-precision models that run efficiently. It is compatible with x86, ARM, and CUDA platforms, featuring a from-scratch C/C++ implementation with no external library dependencies. TinyChatEngine achieves real-time inference on various devices and is designed for ease of use, requiring only download, compilation, and deployment.
MARLlib
MARLlib is a comprehensive, open-source library designed for Multi-agent Reinforcement Learning (MARL), leveraging Ray and its RLlib toolkit. It offers a unified platform for researchers and developers to create, train, and evaluate MARL algorithms across a wide array of tasks and environments. Key features include support for all task modes (cooperative, collaborative, competitive, mixed), a Gym-like interface for multi-agent environments, and flexible parameter-sharing strategies. MARLlib provides 18 pre-built algorithms with an intuitive API, making it accessible even for those new to MARL. Users can customize model architectures, policy sharing, and access over a thousand released experiments. It is compatible with Linux operating systems and offers step-by-step installation or Docker-based usage.
MM-EUREKA
MM-EUREKA is a cutting-edge project exploring the frontiers of multimodal reasoning through rule-based reinforcement learning. It introduces powerful models such as MM-Eureka-Qwen-7B and MM-Eureka-Qwen-32B, which significantly advance performance in multidisciplinary K12 and mathematical reasoning tasks. The project has iterated on model architecture, algorithms, and data, moving from InternVL to the more robust Qwen2.5-VL base models. Key improvements include enhanced online filtering, adaptive online rollout adjustment (ADORA), and novel RL algorithms like Clipped Policy Gradient Optimization with Policy Drift (CPGD). MM-EUREKA also open-sources a comprehensive pipeline, including self-collected MMK12 datasets, to foster further research and development in multimodal AI.
ms-swift
ms-swift is a comprehensive, open-source framework developed by the ModelScope community, designed for fine-tuning and deploying large language models (LLMs) and multimodal large models (MLLMs). It supports over 600 text-only LLMs and 400 MLLMs, offering full-pipeline capabilities from training to inference, evaluation, quantization, and deployment. The framework integrates advanced training technologies, including Megatron parallelism (TP, PP, CP, EP) for acceleration and a rich family of GRPO reinforcement learning algorithms. ms-swift also supports various fine-tuning methods like LoRA, QLoRA, and DoRA, and provides memory optimization techniques such as Flash-Attention 2/3. It offers a Web-UI interface for simplified training, inference, evaluation, and quantization workflows, making it accessible for a wide range of users.
mlops-v2
The Azure MLOps (v2) solution accelerator offers enterprise-ready templates designed to streamline the deployment of machine learning models on the Azure Platform. This project serves as a foundational starting point for MLOps implementation within Azure, emphasizing repeatable, automated, and collaborative workflows. It empowers teams of ML professionals to efficiently get their machine learning models into production. The accelerator focuses on simplicity, modularity, repeatability, security, collaboration, and enterprise readiness, utilizing a template-based approach to enhance operational efficiency across the data science lifecycle. It supports both Azure DevOps and GitHub-based deployments, providing architectural patterns and quickstart guides for various project scenarios.
Barbara
Barbara is an Edge AI platform designed for industrial companies to deploy, run, and monitor Edge Applications and AI models directly on-site. It offers a simplified approach to managing industrial infrastructure compared to traditional cloud solutions. The platform provides container orchestration, industrial connectors for various assets, and ecosystem integration, allowing users to deploy Docker-based apps and integrate with existing development environments. For AI/ML developers, Barbara facilitates model deployment to Edge Nodes and offers an Apps Marketplace for off-the-shelf tools. Edge Infrastructure Managers benefit from effortless device lifecycle management, professional-grade network connectivity, and zero-touch provisioning for faster deployments. The platform emphasizes cybersecurity, IT/OT convergence, and MLOps capabilities to optimize and package trained models for efficient inference.
OpenML
OpenML is a collaborative online machine learning platform designed to facilitate the sharing and organization of data, machine learning algorithms, and experimental results. It aims to create a frictionless, networked ecosystem where scientists and practitioners can easily integrate their existing processes and tools to collaborate globally. The platform provides significant benefits for science by enabling rapid building upon others' results, answering complex questions quickly through prior experiments, and making larger studies feasible. For scientists, it saves time on routine duties, compares new experiments to the state of the art, and offers potential for new discoveries and publications. OpenML also serves as a valuable learning environment for students and citizen scientists, allowing them to explore state-of-the-art methods and contribute their own work.