🤖

AI Agents & Automation

Browsing page 85 of AI Frameworks & Infra in AI Agents & Automation. Sorted by confidence score — our independent quality rating.

All AI Frameworks & Infra Browser & Web Agents Chatbots & Conversational AI General-Purpose Agents Multi-Agent Systems Personal Assistants RAG & Document AI RPA Scheduling & Task Agents Voice Agents Workflow Agents

A1Base (YC W25)

61%

A1Base (YC W25) offers an API designed to give AI agents a trusted identity, complete with phone numbers and email capabilities. This platform aims to free AI agents from traditional chat interfaces, enabling them to interact with the real world more autonomously and securely. By providing these essential communication tools, A1Base helps unlock the full potential of AI, allowing developers to build more sophisticated and independent AI applications. The service emphasizes a secure platform for integrating these real-world communication features into AI agents.

ten-framework

61%

TEN is an open-source framework designed for creating real-time multimodal conversational AI agents. It provides a comprehensive ecosystem including the TEN Framework itself, Agent Examples, VAD (Voice Activity Detector), Turn Detection, and a Portal. Developers can leverage TEN to build various voice AI applications, from low-latency multi-purpose voice assistants to specialized tools like Doodler for sketch generation, Speaker Diarization, Lip Sync Avatars, and SIP Call integration. The framework supports deployment via Docker or other cloud services, offering flexibility for self-hosting and customization. It also includes resources for quick starts, documentation, and community support through Discord, LinkedIn, and Hugging Face.

Purdue AI Racing

61%

Purdue AI Racing is a program at Purdue University dedicated to fostering research and education in artificial intelligence. The initiative provides a platform for students and faculty to delve into various AI applications, particularly within the fields of engineering and robotics. It offers essential resources for the research and development of autonomous vehicle technology, contributing significantly to the university's broader mission of innovation in science and technology. The program aims to push the boundaries of AI knowledge and practical implementation through academic exploration and hands-on projects.

uniem

61%

Uniem is an open-source project dedicated to developing and refining universal text embedding models, with a strong focus on the Chinese language. The project offers comprehensive code for training, fine-tuning, and evaluating these models, making it a valuable resource for researchers and developers. All models and associated datasets are made publicly available on the Hugging Face community, promoting accessibility and collaboration. Uniem supports fine-tuning for various models, including M3E, sentence_transformers, text2vec, and even GPT series models using SGPT methods and Prefix Tuning. It also features MTEB-zh, a standardized evaluation benchmark for Chinese embedding models, allowing for rigorous comparison across different models and tasks.

traceml

61%

traceml is an open-source engine designed for comprehensive ML/Data tracking, visualization, explainability, drift detection, and dashboards, specifically integrated with Polyaxon. It enables machine learning engineers and data scientists to effectively monitor their experiments, visualize key metrics, understand model behavior, and detect data drift. The tool supports offline usage and offers integrations with popular deep learning and machine learning libraries such as Keras, PyTorch, TensorFlow, Fastai, PyTorch Lightning, and HuggingFace. Additionally, traceml provides robust artifact tracking for various chart types (Matplotlib, Bokeh, Altair, Plotly) and detailed DataFrame summaries for data profiling and quality checks.

Featurestore.org

61%

Featurestore.org serves as a comprehensive hub for all things related to feature stores in machine learning. It curates content, including blog posts and videos, to inform and educate professionals on the evolving landscape of feature stores and their surrounding data and AI environments. The platform fosters a global community of data science professionals, researchers, and engineers, facilitating the sharing of ideas and collaborative learning through monthly meetups with industry experts. It also hosts annual Feature Store Summits, providing a forum for in-depth discussions and insights into the latest advancements and best practices in the field. The site features detailed comparisons of various feature store solutions, including open-source, vendor, and in-house options, covering aspects like ingestion APIs, supported platforms, and training data handling.

Daft

61%

Daft is a high-performance data engine specifically designed for AI and multimodal workloads, enabling the processing of images, audio, video, and structured data at any scale. It features native multimodal processing, allowing users to handle various data types within a single framework. The tool also includes built-in AI operations, facilitating tasks like LLM prompts, embedding generation, and data classification using models such as OpenAI, Transformers, or custom solutions. Built with Python at its core and Rust under the hood, Daft offers blazing performance without the complexity of JVM. It supports seamless scaling from local environments to distributed clusters on Ray and Kubernetes, and provides universal connectivity to data sources like S3, GCS, Iceberg, Delta Lake, Hugging Face, and Unity Catalog. Daft ensures out-of-box reliability through intelligent memory management and sensible defaults.

DiffusionKit

61%

DiffusionKit is an open-source project designed for on-device image generation using diffusion models on Apple Silicon. It offers both Python and Swift packages, facilitating the conversion of PyTorch models to the Core ML format and enabling efficient inference with MLX. Developers can leverage DiffusionKit to run models like Stable Diffusion 3 and FLUX.1-dev directly on Apple devices, optimizing performance and reducing reliance on cloud resources. The tool supports various functionalities including text-to-image generation, image-to-image transformations, and fine-grained control over generation parameters such as seed, height, and width. Its architecture is built to support both Core ML and MLX backends, providing flexibility for integration into different application environments.

deeplearning4j

61%

Deeplearning4j is a comprehensive ecosystem designed for deploying and training deep learning models within the Java Virtual Machine (JVM) environment. It offers a high-level API for building MultiLayerNetworks and ComputationGraphs, supporting various layers including custom ones. A key feature is its ability to import models from popular frameworks like Keras, TensorFlow, ONNX, and PyTorch. The suite includes ND4J, a general-purpose linear algebra library with over 500 operations, and SameDiff, an automatic differentiation/deep learning framework similar to TensorFlow's graph mode. DataVec provides ETL capabilities for machine learning data, handling diverse formats and sources. The underlying C++ library, LibND4J, ensures high performance with CPU and GPU acceleration. Deeplearning4j supports Windows, Linux, and macOS, with broad hardware compatibility.

LeFlow

61%

LeFlow is an open-source tool-flow designed to bridge the gap between TensorFlow deep neural networks and synthesizable hardware, specifically FPGAs. It achieves this by integrating Google's XLA compiler with the LegUp high-level synthesis tool, enabling the automatic generation of Verilog code from TensorFlow specifications. This facilitates the deployment of deep neural networks on FPGAs, offering a flexible approach to hardware acceleration. The tool includes a testing framework with 15 building blocks to verify installation and functionality, ensuring that generated circuits match original TensorFlow results. It also provides examples ranging from simple tests to more complex applications, making it a comprehensive solution for hardware synthesis of AI models.

mistral.rs

61%

mistral.rs is an open-source, high-performance framework designed for fast and flexible Large Language Model (LLM) inference. It boasts zero-configuration support for any Hugging Face model, automatically detecting architecture, quantization format, and chat template. The tool offers true multimodality, handling text, vision, video, audio input, speech generation, image generation, and embeddings within a single engine. Key features include comprehensive quantization control (ISQ, GGUF, GPTQ, AWQ, HQQ, FP8, BNB), hardware-aware tuning for optimal performance, and flexible SDKs for both Python and Rust. It also provides advanced agentic features like integrated tool calling, server-side agentic loops, web search integration, and an MCP client for external tool connections. A built-in web UI simplifies interaction, making it a versatile solution for developers building AI applications.

ml-agents

61%

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project designed to transform games and simulations into dynamic environments for training intelligent agents. It leverages deep reinforcement learning and imitation learning, offering PyTorch-based implementations for easy integration. The toolkit supports various training scenarios, including single-agent, multi-agent cooperative, and competitive setups, using algorithms like PPO, SAC, MA-POCA, and self-play. It also facilitates learning from demonstrations with BC and GAIL algorithms. ML-Agents provides a flexible Unity SDK, allowing developers to integrate it into custom scenes and add their own training algorithms. It's ideal for controlling NPC behavior, automated game testing, and evaluating game design decisions.

neptune-client

61%

neptune-client is a Python client designed for the Neptune app, serving as an experiment tracker specifically for foundation model training. It enables data scientists and developers to monitor, log, and manage their machine learning experiments effectively. The tool supports various ML frameworks including TensorFlow, Keras, PyTorch, XGBoost, LightGBM, and Optuna, making it versatile for different project needs. It offers features for experiment versioning, comparison, and visualization, which are crucial for iterating on models and understanding performance. This client is essential for MLOps workflows, providing a centralized system for tracking metrics, parameters, and artifacts.

recommenders-addons

61%

TensorFlow Recommenders Addons (TFRA) is an open-source collection of projects designed to enhance TensorFlow's capabilities for building large-scale recommendation systems. It primarily introduces Dynamic Embedding Technology, which allows for trainable key-value data structures within TensorFlow, leading to better recommendation effects compared to static embedding mechanisms by avoiding hash conflicts. TFRA is compatible with native TensorFlow optimizers, initializers, CheckPoint, and SavedModel formats. It fully supports training and inference of recommender models on GPUs, including integration with TF Serving and Triton Inference Server. The project also offers support for various Key-Value implementations as dynamic embedding storage, such as cuckoohash_map and HierarchicalKV, and supports both half-synchronous and asynchronous training methods.

parrots

61%

Parrots is an open-source toolkit designed for Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) functionalities. It supports multiple languages, including Chinese, English, and Japanese, and provides multi-speaker voice synthesis with high accuracy. Key features include a Chinese ASR model based on distilwhisper, and TTS models like GPT-SoVITS and IndexTTS2. IndexTTS2 is particularly notable for its advanced capabilities, offering zero-shot speech synthesis with emotional expression and duration control, independent control over timbre and emotion, and support for various emotion control methods including audio reference, emotion vectors, and text descriptions. The tool also supports streaming TTS for low-latency real-time audio output and command-line interface (CLI) for both ASR and TTS tasks, making it suitable for developers and researchers.

chatglm-openai-api

61%

chatglm-openai-api is an open-source project that offers an OpenAI-compatible API for various large language models, specifically ChatGLM-6B, ChatGLM2-6B, and Chinese Embeddings Models. This tool simplifies the integration of these powerful models into existing applications by providing a standardized API interface, similar to what developers are accustomed to with OpenAI. It supports loading models from Hugging Face and running inference on GPUs, with options for local loading and multi-GPU inference. The project also includes advanced features like ngrok and Cloudflare tunnel integration for exposing the API, making it accessible for development and deployment. It's designed for developers looking to leverage these specific models with ease.

graph-rag-agent

61%

Graph-rag-agent is an open-source project focused on building explainable and inferential intelligent question-answering systems by combining GraphRAG and private Deep Search. It integrates various RAG technologies like GraphRAG, LightRAG, and Neo4j-llm-graph-builder for comprehensive knowledge graph construction and search capabilities. The tool features a multi-agent collaborative architecture, allowing different types of agents to work together for complex problem-solving. A key differentiator is its robust evaluation system, offering over 20 metrics to assess system performance. It also supports incremental updates for dynamic knowledge graph building and includes mechanisms for entity disambiguation and alignment to improve data quality. The project provides a full-stack solution with a backend service (FastAPI) and a frontend interface for interactive knowledge graph visualization and real-time streaming responses.

petercat

61%

PeterCat is a comprehensive solution for creating intelligent Q&A bots specifically designed for GitHub repositories. It offers a conversational Q&A agent configuration system, self-hosted deployment options, and an all-in-one application SDK. Users can easily create Q&A bots by simply providing their repository address or name, with PeterCat automating the entire setup process. The system automatically ingests GitHub documentation and issues to build a knowledge base for the bot. PeterCat supports multi-platform integration, including an SDK for websites and a GitHub App for direct repository integration. Beyond basic Q&A, the bots can handle project information queries, discussion and PR summaries, code reviews, and issue management.

qiskit-machine-learning

61%

Qiskit Machine Learning is an open-source library built on Qiskit, designed for quantum machine learning tasks at scale. It introduces fundamental computational building blocks like Quantum Kernels and Quantum Neural Networks, which are essential for applications such as classification and regression. The library aims to be user-friendly, allowing quick prototyping without extensive quantum computing knowledge, while also being flexible for proofs-of-concept and innovative research. It is extensible, facilitating the integration of new features leveraging Qiskit's architecture. Key features include kernel-based methods using FidelityQuantumKernel, generic interfaces for neural networks (EstimatorQNN, SamplerQNN), and integration with PyTorch for automatic differentiation in hybrid quantum-classical neural networks.

STT

61%

Coqui STT (🐸STT) is a fast, open-source, multi-platform, deep-learning toolkit designed for training and deploying speech-to-text models. It has been battle-tested in both production and research environments, offering a high-quality pre-trained STT model. Key features include an efficient training pipeline with multi-GPU support, streaming inference capabilities, and real-time inference. The toolkit can provide multiple possible transcripts, each with an associated confidence score, and boasts a small-footprint acoustic model. It also offers bindings for various programming languages, making it accessible for developers. However, it is important to note that this project is no longer actively maintained, with focus shifting to newer models like Whisper and Coqui's other projects.

yomo

61%

yomo is an open-source serverless AI Agent Framework designed for building scalable and ultra-fast AI agents, leveraging geo-distributed edge AI infrastructure. It empowers exceptional customer experiences by focusing on speed, reliability, and scalability of AI interactions. Key features include seamless deployment and management of serverless LLM tools, enhanced security with TLS v1.3 encryption for all data packets, and effortless Agents DevOps to streamline the entire lifecycle from development to deployment. The geo-distributed architecture brings AI inference and tools closer to users, resulting in significantly faster response times. yomo is built with Rust, ensuring high performance and efficiency for AI applications.

Modal

61%

Modal offers a serverless cloud platform specifically designed for compute-intensive AI and machine learning applications. It enables developers to define and run their code, including CPU, GPU, and data-intensive compute, at scale without managing underlying infrastructure. Key features include sub-second cold starts, instant autoscaling, and elastic GPU scaling with access to thousands of GPUs across various clouds. The platform provides a programmable infrastructure where everything is defined in code, eliminating the need for YAML or config files. It also boasts a built-in storage layer optimized for fast model loading and data processing, along with unified observability for integrated logging and full visibility into workloads. Modal supports various ML workloads like inference, training, sandboxes, batch processing, and notebooks, making it a comprehensive solution for AI and data teams.

api-for-open-llm

61%

api-for-open-llm is an open-source project that offers a unified backend interface for a wide range of open large language models, designed to mimic the OpenAI ChatGPT API. This allows developers to seamlessly integrate and utilize models such as LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, and ChatGLM into their applications. Key features include support for streaming responses, enabling printer-like effects, and the implementation of text embedding models crucial for document knowledge Q&A. It also integrates with LangChain for advanced LLM development and supports loading fine-tuned LoRA models. The project simplifies the process of using open models as ChatGPT alternatives by requiring only simple environment variable modifications, and it offers vLLM for inference acceleration and concurrent request handling.

Chainlit

61%

Chainlit is an open-source Python framework designed to accelerate the development of production-ready conversational AI applications. It allows developers to build interactive chat user interfaces in minutes, not weeks, by providing a streamlined environment for integrating AI agents and automated workflows. The framework supports popular AI tools and services such as OpenAI, Anthropic, LangChain, LlamaIndex, ChromaDB, and Pinecone, making it versatile for various AI projects. Chainlit emphasizes ease of use for Python developers, enabling them to quickly prototype and deploy AI applications. While the original team has stepped back from active development, it is now community-maintained, ensuring ongoing support and evolution.

EXPLORE OTHER CATEGORIES

🎨 Content & Design 📊 Productivity & Business 💻 Coding & Development 📚 Research & Education 🧘 Wellness & Lifestyle 💼 Career Development 📈 Marketing & Growth 📉 Data & Analytics 💬 Customer Support & CX 💰 Finance 🛒 E-commerce