AI Agents & Automation
RQ-VAE-Recommender
RQ-VAE-Recommender offers a PyTorch implementation of a generative retrieval model, specifically designed for recommender systems. The model operates in two stages: first, it maps items in a corpus to a tuple of semantic IDs by training an RQ-VAE. Second, it tokenizes sequences of these semantic IDs using a frozen RQ-VAE and then trains a transformer-based model to predict the next IDs in the sequence. This approach is based on the research presented in "Recommender Systems with Generative Retrieval." It supports various datasets, including Amazon Reviews (Beauty, Sports, Toys), MovieLens 1M, and MovieLens 32M, and provides both RQ-VAE and decoder-only retrieval model training scripts. Pre-trained checkpoints are available on Hugging Face for Amazon Beauty.
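The first stage, mapping an item to a tuple of semantic IDs, can be illustrated with plain residual quantization. The following is a minimal sketch and not the repository's code: it assumes the codebooks are already trained, and the names `nearest_code`, `semantic_id`, and `CODEBOOKS`, along with the toy values, are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_code(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def semantic_id(
    embedding: Vector, codebooks: Sequence[Sequence[Vector]]
) -> Tuple[Tuple[int, ...], List[float]]:
    """Quantize an embedding level by level.

    Each level picks the nearest code for the current residual, records its
    index, and passes the remaining residual on to the next level.
    """
    residual = list(embedding)
    ids = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        ids.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(ids), residual


# Hypothetical two-level codebooks for 2-D item embeddings (toy values).
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25]],
]

ids, residual = semantic_id([1.25, 1.0], CODEBOOKS)
print(ids, residual)
```

In a trained RQ-VAE the embedding would come from an encoder and the codebooks would be learned jointly with it; the sketch covers only the lookup step that turns an embedding into an ID tuple.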
pytriton
PyTriton is a Flask/FastAPI-like framework designed to streamline the use of NVIDIA's Triton Inference Server within Python environments. It allows developers to serve machine learning models with ease, supporting direct deployment from Python. Key features include native Python support for exposing any Python function as an HTTP/gRPC API, framework-agnostic operation compatible with PyTorch, TensorFlow, or JAX, and performance optimizations like dynamic batching, response caching, and model pipelining. The tool also provides decorators for handling batching and pre-processing, high-level model clients for HTTP/gRPC requests, and alpha support for streaming partial responses.
Data Wizards
Data Wizards is an AI consulting firm that helps corporations and ambitious SMEs unlock their business potential through AI. Its services cover AI strategy development, AI solution development, and AI education. Data Wizards builds AI solutions to overcome challenges, streamline operations, and identify new growth opportunities. Its expertise spans industries such as Automotive, Retail, Pharmaceutical, Manufacturing, Insurance, Financial, Logistics, Energy, Healthcare, Telecommunications, Media, SMEs, Security, Commodity, and Food, with tailored applications like predictive maintenance, sales forecasts, customer churn analysis, and fraud detection.
sonata
Sonata is the official project repository for "Sonata: Self-Supervised Learning of Reliable Point Representations," a CVPR'25 Highlight paper. This open-source tool provides self-supervised pre-trained Point Transformer V3 models for a range of downstream 3D point cloud tasks. Users can run quick inference and visualization, and can install Sonata in either standalone or package mode. The repository includes pre-trained models, inference code, and visualization demos. It supports custom data integration, offers a flexible data transformation pipeline, and can load models from Hugging Face or local paths, including in environments without FlashAttention.
streaming
Streaming is a data streaming library built by MosaicML to make training on large datasets from cloud storage as fast, cheap, and scalable as possible. It is optimized for multi-node, distributed training of large models, with a focus on correctness, performance, and ease of use. The library supports images, text, video, and multimodal data, and is compatible with major cloud storage providers such as AWS, OCI, GCS, and Azure, as well as any S3-compatible object store. It integrates into existing training workflows as a drop-in replacement for PyTorch IterableDataset. Key features include seamless data mixing, true determinism for reproducible training runs, instant mid-epoch resumption, high throughput, and equal convergence compared to local disk solutions.
InsightNext
InsightNext is a Google Cloud Partner specializing in AI/ML and Data Engineering. They offer deep expertise in Google Cloud Platform (GCP) and Google Workspace, helping organizations modernize their infrastructure and secure their workloads with robust governance. Their services focus on implementing AI/ML solutions and advanced data engineering practices to solve complex business challenges. InsightNext aims to drive enterprise data transformation through AI-driven cloud solutions and agentic AI systems, delivering measurable outcomes for their clients.
synaptic
Synaptic is an open-source JavaScript neural network library for both Node.js and web browsers. Its core strength is its architecture-free algorithm, which lets developers construct and train virtually any type of first-order or second-order neural network. The library ships with several built-in architectures, including multilayer perceptrons, multilayer long short-term memory (LSTM) networks, liquid state machines, and Hopfield networks. It also includes a trainer that can train any given network, with built-in tasks for testing and comparing architectures, such as solving XOR or completing a Distracted Sequence Recall task.
Exabits.ai
Exabits.ai provides a network of GPUs intended to accelerate AI development. Its offerings range from consumer-grade GPUs to high-end NVIDIA models such as GB200s, H100s, H200s, and RTX 5090s. Exabits sources raw GPU assets from leading manufacturers and aims to turn them into cost-effective compute. The platform emphasizes performance and sustainable computing infrastructure for AI applications and Web 3.0 initiatives.
Texygen
Texygen is an open-source benchmarking platform that supports research on open-domain text generation models. It offers a suite of implemented text generation models, along with a set of metrics for evaluating the diversity, quality, and consistency of generated texts. The platform aims to standardize research in text generation and to improve the reproducibility and reliability of future work. It also makes it easier for researchers to share fine-tuned open-source implementations. It supports Python 3.6+ and libraries such as TensorFlow, NumPy, SciPy, and NLTK.
Hilker Consulting
Hilker Consulting is a leading AI consulting firm and academy specializing in AI transformation for B2B decision-makers in the DACH region. They offer certified AI training programs, including AI Manager and AI Consultant courses, which are AZAV and ZFU-certified, ensuring state-recognized quality and maximum funding opportunities. Led by Dr. Claudia Hilker, a renowned AI expert, the academy combines academic rigor with practical experience from over 500 AI projects. The services include strategic AI implementation, AI-driven marketing and sales, and compliance with the EU AI Act. Their unique approach focuses on building AI competence within companies, offering group coaching, and ensuring measurable ROI for businesses looking to gain a competitive edge through AI.
tiny-cuda-nn
tiny-cuda-nn is a high-performance C++/CUDA neural network framework designed for speed and efficiency in training and querying neural networks. It incorporates a lightning-fast "fully fused" multi-layer perceptron and a versatile multiresolution hash encoding, as detailed in its technical papers. The framework supports various input encodings, losses, and optimizers, making it adaptable for diverse neural network applications. It also offers JIT fusion for significant performance boosts, particularly on newer NVIDIA GPUs, and provides PyTorch bindings for integration into Python workflows, though native CUDA performance remains superior for large batch sizes. The framework is ideal for developers and researchers working on demanding AI tasks requiring optimized computational performance.
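The indexing step behind a multiresolution hash encoding can be sketched in a few lines. This follows the commonly published formulation (per-level grid resolution growing geometrically, grid corners hashed by XOR-ing coordinates multiplied by fixed primes) and is not tiny-cuda-nn's actual CUDA code. The function names and parameters are hypothetical, the table size is assumed to be a power of two, coordinates are assumed non-negative, and feature lookup and interpolation are left out.

```python
import math
from typing import List, Sequence, Tuple

# Per-dimension multipliers as given in the published hash-encoding
# formulation (the first dimension is left unscaled).
PRIMES = (1, 2654435761, 805459861)


def level_resolution(level: int, base_resolution: int, growth: float) -> int:
    """Grid resolution at a level: floor(base_resolution * growth**level)."""
    return int(math.floor(base_resolution * growth ** level))


def spatial_hash(corner: Tuple[int, int, int], table_size: int) -> int:
    """XOR the prime-scaled integer coordinates, then wrap into the table."""
    h = 0
    for coord, prime in zip(corner, PRIMES):
        h ^= coord * prime
    return h % table_size


def corner_indices(
    point: Sequence[float], resolution: int, table_size: int
) -> List[int]:
    """Hash-table slots of the 8 grid corners surrounding a point in [0, 1]^3."""
    base = [int(math.floor(p * resolution)) for p in point]
    slots = []
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                corner = (base[0] + dx, base[1] + dy, base[2] + dz)
                slots.append(spatial_hash(corner, table_size))
    return slots


# Hypothetical settings: 4 levels, resolution doubling from 16, 2**14 slots.
TABLE_SIZE = 1 << 14
for level in range(4):
    res = level_resolution(level, base_resolution=16, growth=2.0)
    print(level, res, corner_indices((0.3, 0.6, 0.9), res, TABLE_SIZE))
```

A full encoder would read a small feature vector from each slot, blend the eight vectors trilinearly, and concatenate the results across levels before passing them to the MLP.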
Transformer-SSL
Transformer-SSL is an open-source project offering the official implementation of "Self-Supervised Learning with Swin Transformers." The codebase includes Swin Transformer as one of its backbones, which makes it possible to evaluate how well the learned representations transfer to downstream tasks such as object detection and semantic segmentation. It features MoBY, a self-supervised learning approach that combines MoCo v2 and BYOL and achieves high accuracy on ImageNet-1K linear evaluation with significantly fewer tricks than previous works. The project provides models and code for self-supervised learning and linear evaluation, and reports strong results when transferring to object detection and semantic segmentation.
MCP Registry
MCP Registry was a server registry developed by Mintlify, intended as a central platform for discovering and showcasing MCP (Model Context Protocol) servers. Launched after the success of Mintlify's MCP server generator, the registry aimed to solve the discoverability problem within the MCP ecosystem. Despite attracting over 3,000 unique visitors within 24 hours of launch and significant interest from developers, the project was sunset just five days later. Mintlify made the decision because building and supporting a marketplace would have diverted critical operational resources from its core developer tools product, and it did not consider marketplace building a core strength. The case illustrates the importance of strategic focus for companies, especially during periods of rapid growth.
timm Attention Visualization
timm Attention Visualization is an AI tool designed to help users understand how deep learning models, specifically those from the timm (PyTorch Image Models) library, process visual information. By uploading an image and selecting a timm model, users can generate detailed attention maps and rollout visualizations. These visualizations highlight the specific parts of an image that the model focuses on when making predictions, offering insights into its decision-making process. This tool is invaluable for researchers, developers, and data scientists working with computer vision models, aiding in debugging, improving model interpretability, and enhancing overall model performance. It is hosted on Hugging Face Spaces, making it easily accessible for experimentation.
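Attention rollout is commonly computed by mixing each layer's attention matrix with the identity (to account for residual connections), re-normalizing the rows, and multiplying the layers together. The sketch below shows that recipe in plain Python; it is not the Space's code, it assumes attention has already been averaged over heads, and the function names and toy matrices are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two square matrices."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def add_residual_and_normalize(attn: Matrix) -> Matrix:
    """Average the attention matrix with the identity, then row-normalize."""
    n = len(attn)
    mixed = [
        [0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return [[v / sum(row) for v in row] for row in mixed]


def attention_rollout(layers: List[Matrix]) -> Matrix:
    """Multiply the adjusted attention matrices from the first layer upward."""
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Later layers are applied on the left of the running product.
        rollout = matmul(add_residual_and_normalize(attn), rollout)
    return rollout


# Hypothetical head-averaged attention for 2 layers over 3 tokens.
LAYERS = [
    [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4]],
    [[0.5, 0.25, 0.25], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]],
]

for row in attention_rollout(LAYERS):
    print([round(v, 3) for v in row])
```

For a vision transformer, the row belonging to the class token would typically be reshaped to the patch grid and overlaid on the image as a heat map.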
OSR Enterprises AG
OSR Enterprises AG positions itself as a new-age Tier 1 supplier to the automotive industry, describing its role as a "speedboat" for development teams at car manufacturers. The core of its offering is the EVOLVER platform, described as a multi-domain AI brain designed specifically for cars. The platform aims to provide the foundational technology for smart, autonomous, and securely connected vehicles by processing the data those vehicles collect. While the website emphasizes the company's role in automotive innovation and cybersecurity, it does not detail specific features of the EVOLVER platform, beyond its general description as an "AI brain," on its publicly accessible pages.
Savantic AI Lab
Savantic AI Lab operates as a full-stack AI lab, combining deep scientific expertise with real-world application to develop scalable, sustainable, and transformative AI solutions. With over two decades of innovation, they focus on "Meaningful AI" to drive sustainable growth, measurable impact, and long-term value across various industries. Their services range from research to real-world implementation, helping organizations turn AI potential into business impact. Savantic emphasizes ethical and responsible AI, ensuring solutions prioritize sustainability and deliver tangible results. They work with diverse sectors including Retail & Logistics, Medtech & Life Sciences, Industry & Energy, and Public Transportation & Municipalities.
agentlabs
AgentLabs offers an open-source universal frontend solution for AI Agents, enabling developers to quickly deploy their AI agents to public users. The platform provides essential features such as an authentication portal for user management, a clean chat frontend interface for user interaction, and integrated analytics and payment functionalities. Developers can control their AI agents with a real-time bidirectional streaming SDK from their backend, available in Python and TypeScript. AgentLabs aims to simplify the deployment process for AI agents, allowing developers to concentrate on the core AI logic while it handles the user-facing aspects. It supports both cloud hosting and self-hosting via Docker Compose, with an active alpha release and ongoing development.
agent-protocol
agent-protocol offers a common interface for interacting with AI agents, addressing the challenge of diverse agent implementations. It provides an API specification, defined in OpenAPI, that agents can expose, making them interoperable regardless of their underlying framework. This protocol includes essential routes for creating tasks and executing steps, along with additional routes for managing tasks, steps, and artifacts. By adopting agent-protocol, developers can more easily benchmark agents, integrate them into other systems, and build general devtools for development, deployment, and monitoring. The project also provides an SDK for simplified implementation and a client library for users to interact with agents, fostering a more unified and efficient AI agent ecosystem.
agent-ui
Agent-ui is a modern chat interface for interacting with AI agents, built with Next.js, Tailwind CSS, and TypeScript. It integrates with local and live AgentOS instances through the Agno platform. Key features include a clean chat interface with real-time streaming, visualization of agent tool calls and their results, and display of agent reasoning steps when available. It also handles multimodal content such as images, video, and audio, and shows the references used by the agent. The UI is customizable with Tailwind CSS and is built on a modern stack that includes shadcn/ui and Framer Motion. Users can connect to their AgentOS instances, configure endpoints, and set up authentication.
AgentCPM
AgentCPM is an open-source infrastructure developed by THUNLP, Renmin University of China, ModelBest, and the OpenBMB community, designed for training and evaluating various LLM agents. It addresses challenges in real-world applications such as limited long-horizon capability, autonomy, and generalization. The platform features AgentCPM-Explore, a 4B parameter deep-search LLM agent that achieves state-of-the-art performance on long-horizon benchmarks, and AgentCPM-Report, an 8B parameter deep-research LLM agent built on MiniCPM4.1-8B, capable of generating comprehensive reports comparable to top commercial systems. AgentCPM provides end-to-end open-source code for training, inference, and evaluation, along with a unified tool sandbox environment (AgentDock) for collaborative multi-model and multi-tool setups.
ai-dial-core
AI DIAL Core is an open-source project designed to provide a unified API for various chat completion and embedding models, assistants, and applications. Built on Java 21 and Eclipse Vert.x, it offers a robust and scalable solution for integrating diverse AI functionalities. The tool supports HTTP proxy functionality and provides comprehensive configuration options for static and dynamic settings, identity providers, toolsets, security, and storage. Developers can deploy DIAL Core on Kubernetes using Helm charts, making it suitable for complex enterprise environments. Its modular design allows for flexible integration and management of AI resources, ensuring a consistent interface across different AI services.
AutoDL-Projects
AutoDL-Projects is an open-source, lightweight project offering automated deep learning algorithms implemented in PyTorch. It provides various neural architecture search (NAS) and hyper-parameter optimization (HPO) algorithms, making it suitable for beginners, engineers, and researchers. The project features simple library dependencies, a unified codebase for all algorithms, and active maintenance. Key capabilities include implementations of NAS algorithms like TAS, DARTS, GDAS, SETN, NAS-Bench-201, and NATS-Bench, as well as HPO-CG. It requires Python >= 3.6 and PyTorch >= 1.5.0, with options for knowledge distillation and pre-trained models.
DeepRec
DeepRec is a high-performance deep learning framework specifically designed for recommendation models, built upon TensorFlow 1.15, Intel-TensorFlow, and NVIDIA-TensorFlow. Developed since 2016, it powers core businesses like Taobao Search and advertising, offering robust features for training and inference. The framework excels in super large-scale distributed training, supporting models with trillions of samples and over ten trillion parameters. It includes in-depth performance optimizations for both CPU and GPU platforms, featuring advanced embedding variables, asynchronous and synchronous distributed training frameworks, and various runtime and graph-level optimizations. DeepRec also provides capabilities for delta checkpoint loading, super-scale distributed serving, and online deep learning with low latency.
DirectML
DirectML is a high-performance, hardware-accelerated DirectX 12 library designed for machine learning tasks. It offers GPU acceleration for common machine learning operations across a wide array of supported hardware and drivers, including all DirectX 12-capable GPUs from major vendors. While DirectML is currently in maintenance mode, it remains supported on previous Windows releases and continues to ship with future Windows versions, receiving security and compliance fixes. It is distributed as a system component of Windows 10 and is also available as a standalone redistributable package for applications requiring a fixed version or running on older Windows 10 versions. DirectML exposes a native C++ DirectX 12 API and integrates as a backend for frameworks like Windows ML, ONNX Runtime, PyTorch, and TensorFlow, making it suitable for high-performance, low-latency applications such as frameworks, games, and other real-time applications.