AI Agents & Automation
AI Frameworks & Infra in AI Agents & Automation. Sorted by confidence score, our independent quality rating.
langchain-ui
langchain-ui is an open-source chat-AI toolkit designed to simplify the creation and hosting of chatbots with a no-code interface, built on top of LangChain. It enables users to develop custom ChatGPT-like chatbots, providing them with context from external data sources, ChatGPT plugins, and prompts. Each chatbot comes with a dedicated API endpoint, and the platform supports bringing your own database and authentication provider. While the repository is currently unmaintained, with development shifting to Superagent for more granular access to LLM-powered Agents, it still offers features like usage quotas, chatbot embedding, and themes.
mastra
Mastra is a comprehensive framework designed for building AI-powered applications and agents with a modern TypeScript stack. Developed by the team behind Gatsby, it offers everything needed to go from early prototypes to production-ready applications. Key features include model routing, which connects to over 40 providers like OpenAI, Anthropic, and Gemini through a single interface. It enables the creation of autonomous agents that use LLMs and tools to solve open-ended tasks, and a graph-based workflow engine for orchestrating complex multi-step processes. Mastra also supports human-in-the-loop interactions, context management for agents, and integrations with existing React, Next.js, or Node.js apps. It emphasizes production essentials with built-in evals and observability for continuous refinement of AI products.
Snowflake
Snowflake is a comprehensive AI Data Cloud platform designed to mobilize data, applications, and AI. It offers a fully managed platform for developing AI products, apps, and more, securely connecting businesses across any type or scale of data. Key capabilities include Cortex Code for AI coding, Cortex AI for instant access to LLMs, and a Marketplace for third-party data. Snowflake supports analytics, AI, data engineering, and applications & collaboration, featuring tools like Snowpark, Streamlit, and Snowflake ML for streamlined model development. It also provides Snowflake Intelligence for enterprise agents and Postgres for running open-source databases. The platform aims to simplify enterprise data and AI strategies, offering a connected ecosystem and trusted security, governance, and disaster recovery.
Lumina-T2X
Lumina-T2X is a unified framework designed for Text to Any Modality Generation, utilizing advanced Flow-based Large Diffusion Transformers (Flag-DiT). This open-source tool allows users to transform textual descriptions into vivid images, dynamic videos, detailed multi-view 3D images, and synthesized speech or music. A key feature is its ability to encode various modalities into a unified 1-D token sequence, supporting generation at any resolution, aspect ratio, and temporal duration, including resolution extrapolation for out-of-domain outputs. The framework is noted for its faster training convergence and stable dynamics, requiring significantly fewer computational resources compared to similar models. It supports multilingual prompts and even emojis, making it versatile for diverse creative applications.
DeepQA
DeepQA is an open-source project that provides a TensorFlow implementation of "A neural conversational model," also known as the Google chatbot. This tool enables developers and researchers to build and experiment with deep learning-based conversational agents using a recurrent neural network (RNN) seq2seq model for sentence predictions. It supports various dialogue corpora, including Cornell Movie Dialogs, OpenSubtitles, Supreme Court Conversation Data, and Ubuntu Dialogue Corpus, with options to integrate custom datasets. DeepQA offers functionalities for training models, testing predictions, and visualizing computational graphs with TensorBoard. It also includes a web interface for user interaction, making it suitable for both development and demonstration purposes.
MMaDA
MMaDA is an open-source family of multimodal diffusion foundation models designed for superior performance across diverse domains including textual reasoning, multimodal understanding, and text-to-image generation. It introduces a unified diffusion architecture with a shared probabilistic formulation and modality-agnostic design, eliminating the need for modality-specific components. MMaDA also features a mixed long chain-of-thought (CoT) fine-tuning strategy for a unified CoT format across modalities, and a unified policy-gradient-based RL algorithm called UniGRPO for consistent performance improvements in both reasoning and generation tasks. The project provides various checkpoints like MMaDA-8B-Base and MMaDA-8B-MixCoT, supporting capabilities from basic text and image generation to complex textual and multimodal reasoning.
VoiceAIWrapper
VoiceAIWrapper is a white-label voice AI platform specifically designed for agencies, enabling them to rebrand and resell leading voice AI tools such as Vapi, ElevenLabs, Retell, Bolna, and Ultravox under their own brand. It provides a unified dashboard to connect multiple voice AI providers, automate client onboarding, and manage billing. Agencies can create fully branded client portals with custom domains and logos, allowing clients to manage their usage and billing. The platform supports various billing models, including subscription and usage-based, with payments directly to the agency's Stripe account. It also offers API and webhook integrations for syncing data with CRMs like HubSpot and GoHighLevel, ensuring seamless operation and 100% margin retention for agencies.
SoyHuCe, a JAKALA company
SoyHuCe, a JAKALA company, operates as a Center of Excellence in France, focusing on data, artificial intelligence, and the development of applications powered by algorithms. They provide comprehensive services including data factory solutions, algorithm laboratories for machine learning and R&D, and a digital factory for UX/UI, e-commerce, and custom development. SoyHuCe aims to support clients through every stage of their digital projects, from data storage and analysis to security and value creation, ensuring technological excellence and strategic digital transformation. They emphasize a global vision for digital development, combining strategic, technical, and controlled approaches.
Visnet
Visnet is an AI-powered framework designed for the research, development, and deployment of off-the-shelf AI models. At its core, the VISNET Framework is a comprehensive headless, multi-compatible, and universal neural networks interface. It features a universal ASGI gateway with DDoS protection and IP filtering, along with an Auth Protocol Layer supporting OAuth 2.0 and RSA encryption. Visnet provides core AI models for tasks such as translation, license plate recognition, and face feature matching. The platform specializes in Deep Vision Systems, offering solutions for surveillance, autonomous drone inspections, and advanced image and video analysis, including facial feature recognition, drone structural inspection, audio transcription, and license plate recognition.
EasyRAG
EasyRAG is a simple, lightweight, and efficient open-source framework for retrieval-augmented generation (RAG) specifically designed for automated network operations. It features an accurate question-answering scheme based on a specific data processing workflow, dual-route sparse retrieval for coarse ranking, an LLM Reranker, and LLM answer generation and optimization. The framework is easy to deploy, primarily consisting of BM25 retrieval and BGE-reranker reranking, requiring no model fine-tuning and occupying minimal VRAM. It also boasts efficient inference acceleration for the entire RAG process, significantly reducing latency while maintaining accuracy. EasyRAG provides a flexible code library with various search and generation strategies, facilitating custom process implementation.
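The two-stage retrieval described above (sparse BM25 for coarse ranking, then a reranker over the shortlist) can be sketched in plain Python. This is a minimal illustration, not EasyRAG's own code: the function names, the `k1` and `b` defaults, and the term-overlap `overlap_reranker` are all assumptions made for the example, with the overlap scorer standing in for a neural reranker such as BGE-reranker.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercased whitespace split; a real system would use a proper tokenizer.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def overlap_reranker(query, doc):
    # Hypothetical stand-in for a neural reranker:
    # the fraction of query terms that appear in the document.
    q_terms = set(tokenize(query))
    return len(q_terms & set(tokenize(doc))) / (len(q_terms) or 1)


def retrieve_then_rerank(query, docs, rerank_fn, coarse_k=3, final_k=1):
    """Coarse BM25 ranking, then rerank only the shortlisted documents."""
    scores = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    shortlist = order[:coarse_k]
    reranked = sorted(shortlist, key=lambda i: rerank_fn(query, docs[i]), reverse=True)
    return [docs[i] for i in reranked[:final_k]]


docs = [
    "restart the router to clear the routing table",
    "the firewall blocks inbound traffic on port 22",
    "bgp session flaps when the router cpu is overloaded",
    "use snmp to poll interface counters",
]
top = retrieve_then_rerank("router bgp session", docs, overlap_reranker)
```

Because the expensive reranker only sees the `coarse_k` shortlisted documents, its cost stays fixed as the corpus grows, which is the usual motivation for this kind of pipeline.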
EnsoAI
EnsoAI is a desktop application designed to enhance developer workflows by enabling multiple AI agents to operate in parallel within a single Git project. It treats each Git branch as a first-class workspace with its own dedicated AI context, allowing seamless switching between agents like Claude, Codex, Gemini, Cursor, Droid, and Auggie. The tool integrates a visual source control interface for reviewing diffs, staging changes, and managing commits, alongside a built-in Monaco editor for quick edits. Key features include AI-powered code review, auto-generated commit messages, a 3-way merge tool, and robust Git worktree management. EnsoAI is not a full IDE replacement but rather a lightweight workspace manager that can bridge to other IDEs like VS Code or Cursor for deeper development.
FindAPIs
FindAPIs serves as a comprehensive directory for developers seeking to integrate APIs into their projects. It offers a vast collection of over 15,000 APIs spanning 53 different categories, making it easy to find the perfect fit for any development need. Users can efficiently search and filter APIs based on criteria such as category, authentication type, CORS support, protocol, and pricing model. The platform highlights popular and trending APIs, including those for conversational AI like ChatGPT, and provides details for each API. FindAPIs aims to simplify the API discovery process, enabling developers to quickly identify and utilize the right resources to build and enhance their applications.
ELITE
ELITE (Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation) is a method presented at ICCV 2023 that allows users to encode visual concepts from images into textual embeddings. These embeddings can then be flexibly composed into new scenes using text-to-image generation models like Stable Diffusion. The tool features a two-module architecture: a global mapping network for encoding concept images into multiple textual word embeddings, and a local mapping network that projects foreground objects into the textual feature space for detailed local control. ELITE is built on the diffusers version of Stable Diffusion and provides scripts for environment setup, customized generation, and training, including a Gradio demo for interactive testing.
Ovis
Ovis (Open VISion) is an innovative Multimodal Large Language Model (MLLM) architecture available as an open-source project on GitHub. It is specifically designed to structurally align visual and textual embeddings, enabling advanced multimodal understanding and generation. Key features include native-resolution visual perception, enhanced reflective reasoning (thinking mode), and leading performance across STEM, chart analysis, grounding, and video understanding. Ovis supports various model sizes, from 2B to 34B parameters, and offers quantized versions for optimized deployment. It provides comprehensive installation and inference instructions, including examples for both transformers and vLLM, and supports fine-tuning with in-repo code or ms-swift.
Flowise
Flowise is an open-source, low-code platform designed for building AI agents and LLM applications visually. It offers a drag-and-drop user interface, making it accessible for both developers and non-developers to create sophisticated AI-powered workflows. Based on LangChain, Flowise simplifies the development of applications like Retrieval-Augmented Generation (RAG) systems. The platform supports various deployment options, including Docker, and can be self-hosted on major cloud providers like AWS, Azure, and DigitalOcean, or through Flowise Cloud. It features a modular architecture with separate backend, frontend, and components for third-party integrations, ensuring flexibility and scalability for AI development.
MetaGPT
MetaGPT is an open-source, multi-agent framework designed to simulate a software company, taking a one-line requirement and outputting user stories, competitive analysis, requirements, data structures, APIs, and documents. It internally includes roles like product managers, architects, project managers, and engineers, orchestrating their collaboration through carefully defined Standard Operating Procedures (SOPs). This approach materializes SOPs and applies them to teams composed of Large Language Models (LLMs), enabling natural language programming. The framework supports various LLM types and offers functionalities like a Data Interpreter for analysis and plotting, making it a powerful tool for developers and AI enthusiasts looking to build and manage complex AI-driven projects.
peinture
Peinture is a general-purpose AI image generation framework designed for creating high-quality images from text prompts. Built with React, TypeScript, and Tailwind CSS, it offers a sleek, dark-themed interface. The tool supports a multi-provider architecture, allowing users to seamlessly switch between generative models from Hugging Face, Gitee AI, ModelScope, and A4F, with the option to add custom OpenAI-compatible providers. Key features include a professional image editor with AI-assisted prompt optimization, live motion video generation, and flexible storage options (local OPFS or cloud S3/WebDAV). It also provides advanced controls for fine-tuning creations and a privacy-focused approach with local storage of history and credentials.
Artian AI
Artian AI offers reliable autonomous AI agents and multi-agent solutions specifically designed for business-critical processes within the financial services industry. The platform enables enterprise teams to transform complex workflows into autonomous operations, ensuring human control and oversight. Artian AI is built to integrate with existing platforms and workflows, making it suitable for regulated enterprises like banks and insurers. Key applications include break remediation, payment integrity, compliance approvals, batch operations, and client operations. The system emphasizes reliability, governance with built-in data lineage and model risk management, and scalability across various functions and systems.
Roadmap-To-Learn-Agentic-AI
Roadmap-To-Learn-Agentic-AI is an open-source GitHub repository offering a comprehensive guide to mastering agentic AI systems. It begins with foundational knowledge in Python programming and essential machine learning concepts, including Natural Language Processing (NLP) techniques like TF-IDF and Word2vec. The roadmap then progresses to in-depth Deep Learning for NLP, transformer explanations, and extensive Generative AI tutorials with end-to-end projects. A significant portion is dedicated to Agentic AI tutorials, exploring various frameworks such as LangChain, LangGraph, Agno, Phidata, CrewAI, and AutoGen. This resource is ideal for individuals looking to build a strong understanding and practical skills in the rapidly evolving field of agentic AI.
Embedl
Embedl provides a comprehensive platform for developing and deploying efficient Edge AI. It offers both on-premise and cloud solutions tailored for Edge AI developers, focusing on optimizing performance and reducing costs. The platform includes Embedl Hub, a secure MLOps solution for compliant edge AI workflows, and Embedl Models, which provides popular models optimized for specific edge hardware. Embedl Deploy facilitates Edge AI conversion, compilation, and quantization to get models running on hardware easily. It supports a wide range of hardware platforms including Xilinx FPGAs, Nvidia GPUs, Texas Instruments DSPs, ARM CPUs, NXP NPUs, and Intel CPUs, GPUs, and FPGAs, and is compatible with any inference engine. The Embedl Model Optimization SDK helps developers prune, quantize, and compress models, significantly reducing model size and speeding up inference times.
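As a rough illustration of what pruning and quantizing a model involve, the sketch below applies magnitude pruning and symmetric int8 quantization to a flat list of weights. It is not the Embedl SDK and does not use its API; the function names and the per-tensor symmetric scheme are assumptions chosen to keep the example short.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction of the weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices of the n_prune weights closest to zero.
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_prune]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats to integers in [-127, 127] plus a scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate float weights."""
    return [v * scale for v in quantized]


weights = [0.5, -1.27, 0.01, 0.9]
pruned = prune_by_magnitude(weights, sparsity=0.5)
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

Each restored weight differs from the original by at most half of `scale`, which is the accuracy traded for storing one byte per weight instead of four.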
RoleLLM-public
RoleLLM-public is a comprehensive framework designed to benchmark, elicit, and enhance the role-playing capabilities of Large Language Models (LLMs). It introduces RoleLLM, a four-stage process encompassing role profile construction, Context-Based Instruction Generation (Context-Instruct) for knowledge extraction, Role Prompting using GPT (RoleGPT) for style imitation, and Role-Conditioned Instruction Tuning (RoCIT) for fine-tuning open-source models. The framework includes RoleBench, a systematic and fine-grained character-level benchmark dataset with over 168,000 samples. RoCIT on RoleBench has led to the development of RoleLLaMA (English) and RoleGLM (Chinese), significantly improving role-playing performance to levels comparable with GPT-4.
Sharpwiz Technologies
Sharpwiz Technologies offers comprehensive AI and digital transformation solutions, focusing on Vision AI and Generative AI. Their services range from AI/ML and data engineering to custom software development and cloud-native solutions. They help businesses unlock innovation through offerings like real-time video analytics, facial recognition, chatbots, and process automation. Sharpwiz also provides digital transformation services to connect devices, people, and processes, enhancing performance and customer satisfaction. They emphasize a customer-centric approach, excellence in delivery, and long-term relationships, aiming to be a leading global technology partner for industries.
spaCy
spaCy is a powerful, open-source library for advanced Natural Language Processing (NLP) in Python and Cython. Designed for production use, it incorporates the latest research, supports tokenization and training for over 70 languages, and ships with pre-trained pipelines. Key features include state-of-the-art speed, neural network models for tasks like tagging, parsing, named entity recognition, and text classification, as well as multi-task learning with transformers like BERT. It boasts a robust training system, easy model packaging, deployment, and workflow management, making it suitable for industrial-strength applications. spaCy is released under the MIT license, offering a comprehensive solution for developers and researchers working with NLP.
lagent
Lagent is a lightweight framework designed for building sophisticated LLM-based agents, inspired by the design philosophy of PyTorch. It simplifies the process of creating multi-agent applications by allowing users to focus on defining layers and message passing between them in a Pythonic way. Key features include agent-to-agent communication via AgentMessage, memory management for conversational context, and custom message aggregation. The framework also supports flexible response formatting and consistent tool calling through ActionExecutor. Lagent offers dual interfaces (synchronous and asynchronous) for debugging and large-scale inference, making it suitable for various development needs.