AI Agents & Automation

Page 130 of AI Frameworks & Infra in AI Agents & Automation, sorted by confidence score (our independent quality rating).


Tengine

59%

Tengine, developed by OPEN AI LAB, is a modular, high-performance inference engine for embedded devices. It deploys deep learning neural network models across AIoT applications. The core modules are written in C, and the framework is heavily trimmed to fit the limited resources of embedded systems. The front end and back end are completely separated, which simplifies porting and deployment to heterogeneous computing units such as CPUs, GPUs, and NPUs and reduces evaluation and migration costs. It supports a range of models and provides conversion and quantization tools for deploying AI on edge devices.

vulcan-sql

59%

VulcanSQL is an open-source Analytical Data API Framework designed to simplify the creation of RESTful APIs from various data sources like databases, data warehouses, and data lakes. It addresses common pain points in traditional API development, such as time-consuming custom coding, integration complexity, security concerns, and scalability issues. By allowing users to insert variables into templated SQL, VulcanSQL generates SQL statements on the fly, making data accessible for AI agents and data applications. It utilizes DuckDB as a caching layer to boost query speed and reduce API response times. The framework supports flexible deployment options, including Docker, and offers features like OpenAPI document generation for standardization, ensuring easier integration and maintenance.
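
The templated-SQL idea can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module and named placeholders rather than VulcanSQL's own template syntax, which is not shown in this listing; the table, the rows, and the `run_templated_query` helper are all hypothetical.

```python
import sqlite3

# Hypothetical template: named placeholders stand in for request parameters.
# This is NOT VulcanSQL's template syntax, only the general idea.
QUERY_TEMPLATE = """
SELECT id, name, department
FROM employees
WHERE department = :department AND salary >= :min_salary
ORDER BY id
"""


def run_templated_query(conn, template, params):
    """Bind request parameters into the SQL template and return rows as dicts."""
    cursor = conn.execute(template, params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def make_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?, ?)",
        [(1, "Ada", "eng", 120), (2, "Bo", "eng", 90), (3, "Cy", "ops", 100)],
    )
    return conn


conn = make_demo_db()
rows = run_templated_query(
    conn, QUERY_TEMPLATE, {"department": "eng", "min_salary": 100}
)
print(rows)  # [{'id': 1, 'name': 'Ada', 'department': 'eng'}]
```

Binding values through placeholders, instead of pasting them into the SQL string, is one common way to keep a parameter-driven endpoint safe from SQL injection.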

LatticeWork

59%

LatticeWork is a cloud and AI innovations company dedicated to making cutting-edge technology accessible to everyone. Through its Amber brand, LatticeWork provides consumer-focused solutions that offer the convenience of cloud services while prioritizing privacy and freedom. Amber products, such as Amber X and AmberPRO, enable individuals, families, and small businesses to host their own private cloud for media, photo storage, and data management, freeing up space on mobile devices. For businesses, the VAISense line offers hardware, software, and cloud infrastructure to deploy AI at the edge, processing data where it's gathered for faster, more reliable results and enhanced privacy protection. VAISense solutions cater to various industries, including public safety, healthcare, construction, and retail, providing powerful insights through visual AI processing and security tools like OptiView, Security, and Track.

Transformer-TTS

59%

Transformer-TTS is a PyTorch implementation of the paper "Neural Speech Synthesis with Transformer Network" for speech synthesis. The model trains 3 to 4 times faster than well-known seq2seq models such as Tacotron, with comparable synthesized speech quality. It uses a post-network based on the CBHG model from Tacotron and converts spectrograms into raw audio waveforms with the Griffin-Lim algorithm. The project includes detailed instructions for data preparation, training the autoregressive attention network and post-network, and generating TTS samples.

WhisperS2T

59%

WhisperS2T is an optimized open-source speech-to-text (ASR) pipeline for the Whisper model. It reports a 2.3X speed improvement over WhisperX and a 3X speed boost over the HuggingFace Pipeline with FlashAttention 2. It supports multiple inference backends: the original OpenAI model, the HuggingFace model with FlashAttention2, and the CTranslate2 model. Other features include easy integration of custom VAD models, efficient handling of small or large audio files, batching with multiple language/task decoding, and reduced hallucination. WhisperS2T is aimed at developers and researchers who need high-throughput speech-to-text.

whisperX

59%

WhisperX is an advanced automatic speech recognition (ASR) tool that significantly enhances OpenAI's Whisper model by providing accurate word-level timestamps and speaker diarization. It achieves impressive speeds, offering 70x real-time transcription using the large-v2 model with batched inference and a faster-whisper backend, requiring less than 8GB GPU memory. The tool utilizes wav2vec2 alignment for precise word timings and pyannote-audio for multispeaker ASR with speaker ID labels. Additionally, VAD preprocessing reduces hallucination and improves batching without degrading Word Error Rate (WER). WhisperX is ideal for transcribing long-form audio, particularly meetings, where accurate speaker identification and precise timing are crucial. It supports various languages and offers both command-line and Python usage for flexible integration.

vlmrun-hub

59%

vlmrun-hub is a comprehensive, open-source repository offering pre-defined Pydantic schemas specifically designed for extracting structured data from unstructured visual domains like images, videos, and documents. It is built for Vision Language Models (VLMs) and optimized for real-world use cases, simplifying the integration of visual ETL into various workflows. The hub addresses the common challenge of VLMs lacking strongly-typed, validated outputs for automation by providing schemas that ensure data conforms to expected types and structures, eliminating complex parsing and validation. Key benefits include ease of use, automatic data validation, type-safety, model-agnostic compatibility, and optimization for visual ETL across industries such as healthcare, finance, and retail.
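
What "strongly-typed, validated output" means can be shown with a small sketch. vlmrun-hub's schemas are Pydantic models; to stay dependency-free, this example swaps in a standard-library dataclass with hand-written checks. The `Invoice` schema and the `parse_invoice` helper are hypothetical and are not taken from the hub.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    """Hypothetical schema; not one of the schemas shipped in vlmrun-hub."""
    invoice_id: str
    total: float
    currency: str


# Expected type for each field of the hypothetical schema.
EXPECTED_TYPES = {"invoice_id": str, "total": float, "currency": str}


def parse_invoice(raw):
    """Parse a model's JSON reply and reject missing or mistyped fields."""
    data = json.loads(raw)
    for name, expected in EXPECTED_TYPES.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected):
            raise ValueError(
                f"{name} should be {expected.__name__}, "
                f"got {type(data[name]).__name__}"
            )
    return Invoice(**{name: data[name] for name in EXPECTED_TYPES})


# A well-formed reply becomes a typed object.
good = parse_invoice('{"invoice_id": "INV-1", "total": 42.5, "currency": "USD"}')
print(good)

# A reply with the wrong type is rejected instead of flowing downstream.
try:
    parse_invoice('{"invoice_id": "INV-2", "total": "forty", "currency": "USD"}')
except ValueError as err:
    print("rejected:", err)
```

With Pydantic, the per-field checks above would come from the model definition itself rather than from hand-written code.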

TurboDiffusion

59%

TurboDiffusion is an open-source video generation acceleration framework that reduces the time required for end-to-end diffusion generation. It reports a 100-200x speedup on a single RTX 5090 GPU while preserving video quality. The speedup comes from SageAttention and SLA (Sparse-Linear Attention) for attention acceleration, combined with rCM for timestep distillation. It supports both text-to-video (T2V) and image-to-video (I2V) models, with checkpoints optimized for different resolutions and GPU memory configurations. Users can install it via pip or compile from source, with detailed instructions provided for both quantized and unquantized model inference.

APIPark

59%

APIPark is an open-source, cloud-native AI gateway and API developer portal designed to simplify the management, integration, and deployment of AI services for developers and enterprises. It offers ultra-high performance and supports over 100 mainstream AI models, including OpenAI, Azure, Anthropic Claude, Google Gemini, and many others, unifying API requests and responses. Key functionalities include combining AI models and prompt templates into custom APIs, standardizing data formats to reduce switching costs, and providing a developer portal for team collaboration. APIPark also features robust security with application and API key management, detailed usage monitoring, and advanced capabilities like load balancing and multi-model disaster recovery. It is designed for easy, one-command deployment, making it accessible for quickly building AI products and agents.

alluxio

59%

Alluxio Open Source is a Distributed Caching Platform designed for large-scale data, specifically for analytics workloads. It acts as a data orchestration layer, allowing computation applications to connect to various storage systems through a common interface. Originating from UC Berkeley's AMPLab, Alluxio accelerates structured data analytics and is widely adopted with engines like Presto, Spark, and Trino. While the open-source edition is suitable for testing and small-scale production, the Enterprise Edition offers a decentralized metadata service for AI/ML workloads, supporting billions of files and providing FUSE-based POSIX integration for frameworks like PyTorch and TensorFlow.

conformer

59%

Conformer is an unofficial PyTorch implementation of the "Conformer: Convolution-augmented Transformer for Speech Recognition" model, originally presented at INTERSPEECH 2020. This tool is designed to leverage both Convolutional Neural Networks (CNNs) for local feature extraction and Transformers for capturing global interactions within audio sequences. By combining these architectures, Conformer achieves state-of-the-art accuracies in speech recognition tasks while maintaining parameter efficiency. The repository provides the core model code, allowing developers and researchers to integrate and train Conformer within their own speech processing pipelines. It requires Python 3.7 or higher, along with Numpy and PyTorch, and can be installed from the source code.

Cold-Diffusion-Models

59%

Cold-Diffusion-Models offers the official PyTorch implementation of Cold-Diffusion, a novel approach for inverting arbitrary image transformations without the need for traditional noise. Developed by researchers at the University of Maryland, this repository provides comprehensive code to train and test cold diffusion models. It supports a range of image degradations, including Gaussian blur, animorphosis, Gaussian mask, resolution downsampling, image snow, and color desaturation. The implementation is based on lucidrains' denoising diffusion repository and includes pretrained models for CelebA and AFHQ generation. Users can explore both conditional and unconditional generation schedules, with detailed scripts and arguments for training and testing different models and degradation types.

cml

59%

CML (Continuous Machine Learning) is an open-source command-line interface (CLI) tool designed for continuous integration and continuous delivery (CI/CD) within Machine Learning Operations (MLOps). It automates various development workflows, such as machine provisioning, model training, and evaluation. CML enables users to compare ML experiments across project history and monitor changing datasets. It can automatically train and evaluate models, then generate visual reports with results and metrics on every pull request. CML supports GitFlow for data science, allowing management of ML experiments and tracking of model training or data modifications using GitLab or GitHub. It integrates with DVC for codifying data and models and offers functions to package ML workflow outputs into markdown reports for CI systems.

csghub-server

59%

csghub-server is the open-source backend server for CSGHub, a platform designed for managing large model assets. It facilitates the management of models, datasets, and other LLM assets through a robust REST API. Key features include the creation and management of users and organizations, automatic tagging of models and datasets, and comprehensive search functionalities. Users can also preview dataset files online, download individual files including LFS files, and track activity data like downloads and likes. The server supports extensible and customizable architectures, allowing integration with various Git servers and flexible configuration of LFS storage systems. It also enables on-demand content moderation and has a roadmap for supporting more Git servers, Git LFS, dataset online viewers, and model/dataset auto-tagging.

AI FARM ROBOTICS

59%

AI FARM ROBOTICS is a pioneering company dedicated to advancing Cambodia's technological landscape by focusing on robotics and AI. The company specializes in the research and development of core robotic technologies and products, aiming to establish Cambodia as a leader in the robotic industry. Beyond R&D, AI FARM ROBOTICS provides comprehensive system integration and management services tailored for Micro, Small, and Medium Enterprises (MSMEs), facilitating their automation processes. They also offer advanced AI solutions specifically designed for robotics applications and provide Robotics-as-a-Service (RaaS) offerings, making sophisticated robotic capabilities accessible to a wider range of businesses.

DeepBrain Chain

59%

DeepBrain Chain is positioned as the world's first public artificial intelligence chain, aiming to create a decentralized AI infrastructure. The platform leverages blockchain technology to address the computational demands of AI by utilizing idle computing resources globally. This approach is designed to offer a more cost-effective solution for AI development while simultaneously enhancing data privacy through the implementation of smart contracts. By decentralizing AI computing, DeepBrain Chain seeks to provide a robust and secure environment for developers and organizations working on AI projects, ensuring both efficiency and data protection.

DeepGBM

59%

DeepGBM is a deep learning framework for online prediction tasks that distills Gradient Boosting Decision Trees (GBDT) into neural networks. Presented at KDD 2019, it aims to improve prediction accuracy in real-time scenarios. It integrates GBDT-based models, specifically LightGBM, with PyTorch-based neural networks. The project includes code for data preprocessing, baseline model implementations, and the proposed DeepGBM model. Users prepare their data in CSV format, process it through encoders, and then load numerical and categorical data for training. The framework supports training either GBDT2NN or the full DeepGBM model.

Airia - Enterprise AI Simplified

59%

Airia is an enterprise AI platform designed to simplify and secure the deployment and management of AI solutions at scale. It offers a unified solution for orchestration, security, and governance, enabling organizations to build, deploy, and manage AI agents with confidence. Key capabilities include AI discovery, agent constraints, routing engine, and security posture management. The platform also provides robust governance features such as AI inventory management, risk classifications, and compliance reporting, ensuring responsible AI use and regulatory alignment. Airia integrates with thousands of enterprise systems and data sources, allowing AI agents to operate with real-time context and accuracy, making it ideal for large organizations seeking to expand AI adoption without chaos or tool sprawl.

holodeck

59%

Holodeck is a high-fidelity simulator for reinforcement learning and robotics research, built on Unreal Engine 4. It ships with over seven rich worlds and numerous scenarios for training AI agents. It runs on Linux and Windows, and training scenarios can be extended and modified. A key feature is its ability to train and control multiple agents simultaneously. It provides a simple, OpenAI Gym-like Python interface and runs simulations at up to 2x real time. Holodeck can run headless or with visual feedback.

GraphWaveletNeuralNetwork

59%

GraphWaveletNeuralNetwork is an open-source PyTorch implementation of the "Graph Wavelet Neural Network" (GWNN) as presented at ICLR 2019. This novel graph convolutional neural network addresses limitations of previous spectral graph CNN methods by utilizing graph wavelet transform, which avoids computationally expensive matrix eigendecomposition. The graph wavelets are sparse and localized, enhancing efficiency and interpretability for graph convolution tasks. The tool is designed for researchers and machine learning engineers working with graph-based semi-supervised classification, demonstrating superior performance on benchmark datasets like Cora, Citeseer, and Pubmed. It includes command-line arguments for easy configuration of training parameters and model options.

Keras-Project-Template

59%

Keras-Project-Template is an open-source project template designed to streamline the development and training of deep learning models with Keras. It offers a clear, structured architecture, including predefined folders for models, trainers, data loaders, and configurations, simplifying project organization. The template supports checkpointing and TensorBoard visualization for monitoring training progress. A key feature is its integration with Comet.ml, enabling comprehensive experiment tracking, including hyper-parameters, metrics, and graphs, with real-time updates. This allows developers to easily manage and compare different model iterations and configurations, enhancing the efficiency of deep learning research and development.

kedro

59%

Kedro is an open-source Python framework designed for building production-ready data engineering and data science pipelines. It emphasizes software engineering best practices to ensure pipelines are reproducible, maintainable, and modular. Key features include a project template based on Cookiecutter Data Science, a Data Catalog for connecting to various data sources and versioning, and pipeline abstraction for automatic dependency resolution and visualization with Kedro-Viz. Kedro also supports coding standards like test-driven development with pytest and flexible deployment strategies, including integration with Argo, Prefect, Kubeflow, AWS Batch, and Databricks. It aims to address the shortcomings of one-off scripts and Jupyter notebooks by promoting team collaboration and efficiency through modular, reusable analytics code.
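
Automatic dependency resolution can be sketched in a few lines. The example below does not use Kedro's API: it is a standard-library illustration, using `graphlib`, of how a run order can be derived from each step's declared inputs and outputs. The step and dataset names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step declares the datasets it reads and writes.
# The steps are listed out of order on purpose.
STEPS = {
    "train_model": {"inputs": ["features"], "outputs": ["model"]},
    "clean": {"inputs": ["raw"], "outputs": ["cleaned"]},
    "evaluate": {"inputs": ["model", "features"], "outputs": ["metrics"]},
    "build_features": {"inputs": ["cleaned"], "outputs": ["features"]},
}


def resolve_order(steps):
    """Return a run order in which every step follows the steps it depends on."""
    # Map each dataset to the step that produces it.
    producers = {}
    for name, spec in steps.items():
        for dataset in spec["outputs"]:
            producers[dataset] = name
    # A step depends on the producers of its inputs; inputs that no step
    # produces (here "raw") are treated as external data.
    graph = {
        name: {producers[d] for d in spec["inputs"] if d in producers}
        for name, spec in steps.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = resolve_order(STEPS)
print(order)  # ['clean', 'build_features', 'train_model', 'evaluate']
```

Because the order is derived from data dependencies, steps can be declared in any sequence, and a circular dependency raises `graphlib.CycleError` instead of running silently in the wrong order.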

kornia

59%

Kornia is a differentiable computer vision library built on PyTorch, designed for spatial AI applications. It offers a comprehensive suite of differentiable image processing and geometric vision algorithms, allowing users to leverage powerful batch transformations, auto-differentiation, and GPU acceleration. Key features include a wide range of image processing operators like filters, transformations, and enhancements, as well as advanced augmentation pipelines for training AI models. Kornia also provides access to pre-trained AI models for tasks such as face detection, feature matching, segmentation, and classification. The library is expanding its focus towards end-to-end vision models, with a particular emphasis on integrating state-of-the-art Vision Language Models (VLM) and Vision Language Agents (VLA). It supports multi-framework usage, including TensorFlow, JAX, and NumPy, making it a versatile tool for developers and researchers in the AI and computer vision fields.

lmnr

59%

Laminar is an open-source observability platform for AI agents, offering tools for tracing, evaluations, and AI monitoring. Its OpenTelemetry-native tracing SDK requires only a single line of code to automatically trace popular AI frameworks and SDKs such as Vercel AI SDK, LangChain, OpenAI, Anthropic, and Gemini. The platform also includes an unopinionated, extensible SDK and CLI for running evaluations locally or in CI/CD pipelines, with a UI for visualizing and comparing results. For AI monitoring, users can define events with natural language descriptions to track issues, logical errors, and custom agent behavior. All data is accessible via SQL, which supports querying traces, metrics, and events, bulk dataset creation, and custom dashboards. Laminar is built in Rust for performance and includes a custom real-time engine for trace viewing and fast full-text search over span data.
