🤖

AI Agents & Automation

Sorted by confidence score, our independent quality rating.

rag-from-scratch

61%

rag-from-scratch is an open-source project designed to demystify Retrieval-Augmented Generation (RAG) by guiding developers through building it from scratch. It emphasizes local LLMs and avoids black boxes or cloud APIs, fostering a deep understanding of core RAG concepts. The project covers essential components such as embeddings, local vector database construction, retrieval strategies, and context-augmented generation. It offers step-by-step code walkthroughs, explaining every function and concept, making advanced AI approachable. Key learning areas include how embeddings work, building in-memory and LanceDB/Qdrant vector stores, basic and hybrid retrieval, query preprocessing, multi-query retrieval, and query rewriting. The project aims to provide a clear, practical, and comprehensive learning path for developers interested in RAG.
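As an illustration of the in-memory vector store and context-augmented generation steps described above, here is a minimal sketch. It is not taken from the rag-from-scratch repository: `embed`, `InMemoryVectorStore`, and `build_prompt` are hypothetical names, and the bag-of-words "embedding" is a toy stand-in for a real local embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: a real pipeline would call a local embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical minimal store: keeps (text, vector) pairs in a list."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        scored = sorted(((cosine(q, vec), text) for text, vec in self.items), reverse=True)
        return [text for _, text in scored[:k]]


def build_prompt(question: str, store: InMemoryVectorStore, k: int = 2) -> str:
    """Context-augmented generation: prepend retrieved passages to the question."""
    context = "\n".join(store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


store = InMemoryVectorStore()
for doc in [
    "embeddings map text to vectors",
    "a vector store holds embeddings for similarity search",
    "bananas are yellow",
]:
    store.add(doc)

prompt = build_prompt("how do embeddings map text", store, k=1)
```

The resulting `prompt` string would then be passed to a local LLM. Swapping `embed` for a real model and the list for LanceDB or Qdrant changes the storage layer but not the overall flow.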

search

61%

search is an open-source Go library designed for embedded vector search and semantic embeddings, utilizing llama.cpp. It offers an efficient solution for projects requiring semantic power without the complexities of traditional search systems. The library supports GGUF BERT models and provides GPU acceleration for quicker computations. It's particularly well-suited for datasets with fewer than 100,000 entries, offering features like llama.cpp integration without cgo, support for various BERT models in GGUF format, and precompiled binaries with Vulkan GPU support. Users can create and save search indexes from computed embeddings, enabling basic vector-based searches in Go applications.

SiT

61%

SiT (Scalable Interpolant Transformers) offers an official PyTorch implementation for exploring advanced generative models. Built on the foundation of Diffusion Transformers (DiT), SiT introduces an interpolant framework that allows for flexible connections between distributions, surpassing DiT's performance on the conditional ImageNet 256x256 benchmark with identical backbones and parameters. This repository includes pre-trained class-conditional SiT models, a training script utilizing PyTorch DDP, and sampling code with various configurable options for ODE and SDE samplers. Researchers and developers can leverage SiT to experiment with discrete vs. continuous time learning, different model predictions, interpolant choices, and deterministic or stochastic sampling strategies.

scikit-llm

61%

Scikit-LLM provides a seamless integration of powerful large language models (LLMs) such as ChatGPT into the scikit-learn ecosystem, enabling enhanced text analysis tasks. This tool is designed for data scientists and machine learning engineers who wish to leverage advanced natural language processing capabilities directly within their familiar scikit-learn workflows. It simplifies the process of incorporating LLMs for tasks like zero-shot text classification, as demonstrated by its quick start example. Scikit-LLM is an open-source project available on GitHub, fostering community contributions and support. It aims to bridge the gap between traditional machine learning frameworks and the latest advancements in large language models, making sophisticated NLP more accessible for practical applications.

Evolup

61%

Evolup is an AI-powered platform designed for creating and managing affiliate stores. It simplifies the process of setting up an online store by automating tasks like niche selection, brand name generation, and content creation. The platform integrates hosting, tools, and support, allowing users to focus on promoting products without managing inventory or customer service. Evolup offers advanced features such as Amazon product synchronization, AI-driven content writing for descriptions and articles, and over 50 unique SEO optimizations. It supports multiple affiliate programs and provides a comprehensive dashboard for managing sites and maximizing commissions.

soprano

61%

Soprano is an ultra-lightweight, on-device text-to-speech (TTS) model designed for fast, expressive, high-fidelity speech synthesis. It reports up to 20x real-time generation on CPU and 2000x real-time on GPU, lossless streaming with low latency, and minimal memory usage with a compact 80M parameter architecture. Soprano supports infinite generation length with automatic text splitting and generates crystal-clear audio at 32kHz. It supports CUDA, CPU, and MPS devices on Windows, Linux, and Mac, and provides an OpenAI-compatible endpoint, ONNX, WebUI, CLI, and Python script for easy and production-ready inference.

RWKV-Runner

61%

RWKV-Runner is a comprehensive tool designed to eliminate barriers to using large language models by automating their management and startup. Weighing in at only 8MB, it provides a lightweight executable program that handles everything from model management and one-click startup to automatic dependency installation. A key feature is its compatibility with the OpenAI API, effectively turning any ChatGPT client into an RWKV client. It supports various configurations, including pre-set multi-level VRAM configs and WebGPU for broader graphics card compatibility (AMD, Intel). The tool also includes a user-friendly chat, completion, and composition interface, along with features like chat presets, attachment uploads, MIDI hardware input, and track editing. It offers built-in model conversion, download management, remote model inspection, and one-click LoRA Finetune (Windows Only). Additionally, it can function as a client for OpenAI ChatGPT, GPT-Playground, and Ollama, supporting multilingual localization and automatic updates.

Text Generation Inference (TGI)

61%

Text Generation Inference (TGI) is an open-source toolkit designed for deploying and serving Large Language Models (LLMs) with high performance. Developed by Hugging Face, it's used in production for services like Hugging Chat and the Inference API. TGI supports popular open-source LLMs including Llama, Falcon, and BLOOM, offering features such as tensor parallelism for faster inference on multiple GPUs, token streaming, and continuous batching for increased throughput. It also includes optimized transformers code with Flash Attention and Paged Attention, various quantization methods (bitsandbytes, GPTQ, AWQ, Marlin, fp8), and hardware support for Nvidia, AMD, Inferentia, Intel GPU, Gaudi, and Google TPU. While TGI is now in maintenance mode, it has influenced the development of other optimized inference engines like vLLM and SGLang, which Hugging Face now recommends.

Factri.Ai

61%

Factri.Ai specializes in delivering AI-powered plug-and-play solutions tailored for manufacturing companies. Leveraging deep domain expertise in industrial engineering, digital transformation, and AI research, the platform builds practical and easy-to-implement solutions for complex manufacturing challenges. Factri.Ai aims to make advanced technology accessible, affordable, and scalable for factories, enabling them to benefit from digital transformation. Their solutions are designed for rapid deployment, often remotely, in a matter of days, ensuring quick integration and tangible results for industrial operations.

stable_diffusion.openvino

61%

stable_diffusion.openvino is an open-source implementation of text-to-image generation using Stable Diffusion, specifically designed for efficient performance on Intel CPUs or GPUs. This tool allows users to generate images from text descriptions, offering capabilities like text-to-image, image-to-image, and inpainting. It supports various parameters for fine-tuning image generation, including model selection, inference device, random seed, guidance scale, and initial image strength. The project provides clear instructions for installation on Linux, Windows, and macOS, requiring Python <= 3.9.0 and OpenVINO™ Development Tools. Performance benchmarks are included, showcasing its efficiency across different Intel processors.

HappyRobot

61%

HappyRobot is an AI-native operating system designed to power autonomous operations by deploying AI workers that understand your business, make intelligent decisions, and act in real-time. The platform allows users to build custom AI workers with access to various systems and tools, integrating via API, webhook, or AI browser agents. These AI workers can execute tasks across all channels, including conversation and document parsing, with features like smart escalation, collaboration, data extraction, and analysis. HappyRobot emphasizes robust auditing, performance reporting, and AI auditor supervision, ensuring guaranteed uptime and scalability for enterprise-level deployments. It's built for complex environments, offering rapid implementation and optimization through embedded engineers.

Acrely

61%

Acrely specializes in developing enterprise-grade voice agents tailored for the specific needs of innovative companies. The platform provides flexible deployment options, including both cloud-based and on-premise solutions, to accommodate diverse organizational requirements. Acrely empowers businesses to leverage advanced AI capabilities across various functions, such as enhancing customer service interactions, streamlining sales processes, and optimizing operational workflows. This allows organizations to integrate sophisticated voice AI into their existing infrastructure, driving efficiency and improving engagement in critical business areas.

taranis-ai

61%

Taranis AI is an advanced open-source intelligence (OSINT) tool designed to streamline information gathering and situational analysis through the power of Artificial Intelligence. It efficiently navigates various data sources, including websites, to collect unstructured news articles. The tool then employs Natural Language Processing (NLP) and AI to automatically enhance and enrich the collected content, ensuring higher quality and relevance. Analysts can utilize Taranis AI's streamlined workflow to convert these AI-augmented articles into structured reports, which serve as the foundation for various deliverables like PDF files. It also supports collaborative threat intelligence through MISP integration and offers a robust REST API for flexible integration.

lhotse

61%

Lhotse is an open-source Python library designed to make multimodal data preparation flexible and accessible for machine learning projects. It supports various modalities including speech, audio, video, image, and text. Key features include state-of-the-art data loading algorithms like dataset blending and efficient on-the-fly bucketing, as well as handling data randomization for distributed multi-node training. Lhotse provides standard data preparation recipes for common corpora and offers flexible data preparation for model training with the concept of audio/video cuts. It also supports efficient sequential I/O data formats like Lhotse Shar and integrates seamlessly with PyTorch through task-specific Dataset classes.

LlamaEdge

61%

LlamaEdge is an open-source project designed to simplify the deployment and execution of Large Language Models (LLMs) locally or on edge devices. It enables users to run customized and fine-tuned LLMs with ease and speed, offering a robust solution for local inference. A key feature is its ability to create OpenAI-compatible API services for various open-source LLMs, supporting text generation, embeddings, speech-to-text, text-to-speech, and text-to-image models. Built on a Rust+Wasm stack, LlamaEdge offers a lightweight, fast, portable, and secure environment for AI inference, compatible with multiple operating systems, CPUs, and GPUs. It supports GGUF-formatted LLMs based on the Llama2 framework and provides a command-line interface for interaction.

llamafile

61%

llamafile is a project by Mozilla.ai designed to make open LLMs more accessible to developers and end-users. It achieves this by combining llama.cpp with Cosmopolitan Libc, creating a single-file executable, known as a "llamafile," that runs locally on most operating systems and CPU architectures without requiring installation. This framework collapses the complexity of LLMs into an easily distributable format. Additionally, llamafile integrates whisperfile, a single-file speech-to-text tool built on whisper.cpp, offering transcription and translation of audio files across the same platforms without installation. The project is actively developed, with versions like v0.10.0 using a new build system for better alignment with the latest llama.cpp functionalities.

tribuo

61%

Tribuo is an open-source Java machine learning library developed by Oracle Labs' Machine Learning Research Group. It supports a wide range of prediction tasks including multi-class classification, regression, clustering, anomaly detection, and multi-label classification. The library provides its own implementations of various ML algorithms and also integrates with external tools like TensorFlow, ONNX Runtime, and XGBoost. A key feature is its use of the OLCUT configuration system, allowing repeatable model building from XML or JSON files. Tribuo emphasizes reproducibility with serializable provenance objects for models and evaluations, tracking data, transformations, and hyperparameters. It also supports exporting many models in ONNX format for deployment across different platforms.

DeciLM 7B Instruct

61%

DeciLM 7B Instruct is a large language model designed for short-form instruction following. It is derived from the DeciLM-7B language model by LoRA fine-tuning on the SlimOrca dataset. The model is engineered for generative text-based tasks, producing text in response to instructions, and it offers a foundation for developers and researchers looking to integrate or experiment with language models. It is currently available for free, making it accessible for a wide range of projects and explorations in AI.

Ivo

61%

Ivo is an AI-powered contract intelligence platform designed for enterprise legal teams, offering advanced capabilities for contract review, redlining, and insight extraction. The platform enables users to review and redline agreements against playbooks, previously negotiated contracts, and external benchmarks directly within Microsoft Word. Ivo Intelligence provides an AI-native repository to analyze entire contract libraries without manual meta-tagging, uncovering relationships between contracts and unifying amendments. The Ivo Assistant acts as a single AI agent for contract review, intelligence, and research, allowing users to draft, redline, and explain clauses using plain-language prompts. It also facilitates complex legal research with trusted sources like EDGAR/SEC Filings and Congress.gov, ensuring secure and compliant operations with SOC 2 and ISO 27001 certifications.

magnitude

61%

Magnitude is a powerful open-source Python package and vector storage file format designed for efficient utilization of vector embeddings in machine learning models. Developed by Plasticity, it serves as a faster and simpler alternative to tools like Gensim, supporting a wide range of applications beyond natural language processing. Key features include lazy-loading for faster cold starts, LRU memory caching for production performance, and support for large models that may not fit in memory. It also offers unique capabilities like out-of-vocabulary lookups, handling misspellings, and streaming large models over HTTP. Magnitude uses SQLite as its underlying data store, leveraging indexes, memory mapping, SIMD instructions, and spatial indexing for fast key lookups and similarity searches.
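To make the combination of lazy loading, LRU caching, and a SQLite-backed store more concrete, here is a minimal sketch using only the Python standard library. It does not use Magnitude's API or file format: `LazyVectorStore`, its table layout, and the `db_reads` counter are hypothetical and exist only to show the idea.

```python
import json
import sqlite3
from functools import lru_cache


class LazyVectorStore:
    """Hypothetical store: vectors live in SQLite and are read only on first lookup."""

    def __init__(self, path: str = ":memory:", cache_size: int = 1024) -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS vectors (key TEXT PRIMARY KEY, vec TEXT NOT NULL)"
        )
        self.db_reads = 0  # counts how often a lookup actually reaches the database
        # Wrap the loader in an LRU cache so repeated lookups skip the database.
        self._cached_load = lru_cache(maxsize=cache_size)(self._load)

    def put(self, key: str, vec: list[float]) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO vectors (key, vec) VALUES (?, ?)",
            (key, json.dumps(vec)),
        )
        self.conn.commit()
        self._cached_load.cache_clear()  # drop cached entries that may now be stale

    def _load(self, key: str) -> tuple[float, ...]:
        self.db_reads += 1
        row = self.conn.execute(
            "SELECT vec FROM vectors WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return tuple(json.loads(row[0]))

    def get(self, key: str) -> tuple[float, ...]:
        """Return the vector for key, hitting SQLite only on a cache miss."""
        return self._cached_load(key)


vectors = LazyVectorStore()
vectors.put("cat", [0.1, 0.2])
first = vectors.get("cat")   # cache miss: read from SQLite
second = vectors.get("cat")  # cache hit: served from memory
```

Because nothing is loaded at construction time, opening the store stays cheap even for a large file, and the primary-key index keeps each cache miss to a single indexed lookup.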

vllm-omni

61%

vllm-omni is a framework designed for efficient model inference and serving of omni-modality models, building upon the foundation of vLLM. It expands support beyond text-based autoregressive generation to include text, image, video, and audio data processing. The framework also accommodates non-autoregressive architectures like Diffusion Transformers (DiT) and other parallel generation models, enabling heterogeneous outputs. Key features include state-of-the-art autoregressive support through efficient KV cache management, pipelined stage execution for high throughput, and fully disaggregated architecture with dynamic resource allocation. It offers flexibility with heterogeneous pipeline abstraction, seamless integration with Hugging Face models, and support for various parallelism techniques for distributed inference. vllm-omni also provides streaming outputs and an OpenAI-compatible API server.

WebRover

61%

WebRover is an AI-powered web agent designed for autonomous browsing and advanced research. It combines task automation with sophisticated research workflows, including multi-source analysis, academic paper generation, and deep topic exploration. The system intelligently routes queries between task automation and research modes, offering a versatile tool for quick actions and comprehensive research. It features three specialized agents (Task, Research, Deep Research) with dynamic selection, real-time state visualization, and streaming actions. WebRover integrates with a local browser instance for privacy, multi-tab management, and PDF handling, providing a modern chat interface with real-time updates and interactive selections. Output options include direct chat responses, Google Docs export, PDF download, and copy to clipboard.

workflow-builder-template

61%

Workflow-builder-template is an open-source template designed for developers to build their own visual AI workflow automation platforms. Built on top of Workflow DevKit, it offers a comprehensive drag-and-drop interface powered by React Flow, enabling users to design complex workflows with ease. The template includes real integrations with popular services like Resend (emails), Linear (tickets), Slack, PostgreSQL, and external APIs. A key feature is its code generation capability, converting visual workflows into executable TypeScript code with the "use workflow" directive. It also supports AI-powered workflow generation from natural language descriptions using OpenAI, secure user authentication with Better Auth, and detailed execution tracking with logs. The modern UI is built with shadcn/ui and Tailwind CSS, and it uses PostgreSQL with Drizzle ORM for type-safe database access.

MemOS

61%

MemOS is a Memory Operating System designed for Large Language Models (LLMs) and AI agents, unifying storage, retrieval, and management of long-term memory. It facilitates context-aware and personalized interactions by integrating knowledge bases, multi-modal data, and tool memory with enterprise-grade optimizations. Key features include a unified memory API structured as a graph, native support for text, images, and tool traces, and multi-cube knowledge base management for isolation and dynamic composition. MemOS also offers asynchronous ingestion via MemScheduler for production stability and allows memory refinement through natural-language feedback. It boasts significant accuracy improvements and token savings over OpenAI Memory, making it a robust solution for advanced AI agent development.
