AI Agents & Automation
MLCode
MLCode provides an Agentic AI platform designed for enterprises, focusing on secure and precise Large Language Model (LLM) integration. It addresses critical challenges such as security gaps, privacy risks, and precision pitfalls by leveraging Retrieval-Augmented Generation (RAG) architecture. This platform enables real-time observability, intelligent data retrieval, and the deployment of fully autonomous agents. MLCode helps businesses integrate LLMs into their workflows, ensuring data protection, compliance, and accurate responses tailored to unique internal knowledge and business context. By offering comprehensive monitoring, streamlined management, and automation, MLCode allows for the creation of powerful AI Agents that align with corporate policies, enhancing productivity and accelerating time-to-market.
Lumenova AI
Lumenova AI is an end-to-end Responsible AI platform designed to help organizations build trust and accelerate AI adoption by making it ethical, transparent, and compliant. The platform supports comprehensive AI governance from ML models to GenAI and agentic AI, implementing guardrails to ensure safe and compliant adoption while managing risk. It proactively identifies control gaps, evaluates models continuously for drift and data integrity, and provides instant alerts for deviations. Lumenova AI simplifies compliance for regulated industries by embedding industry-specific frameworks, automating documentation for AI explainability, and continuously monitoring AI systems. Key capabilities include a unified AI inventory, robust governance tools, end-to-end visibility for model monitoring and evaluation (including bias, robustness, and hallucinations), security guardrails against prompt injection attacks, and tools to analyze AI program costs and ROI.
infinity
Infinity is an AI-native database designed for large language model (LLM) applications. It offers fast hybrid search that combines dense vector, sparse vector, tensor (multi-vector), and full-text search. The database supports a wide range of data types and is optimized for high performance, with a reported query latency of 0.1 ms and 15K+ QPS on million-scale vector datasets. Infinity suits a range of RAG (Retrieval-Augmented Generation) applications, including search, recommenders, question answering, conversational AI, and content generation. It provides a Python API and a single-binary architecture for easy deployment, making it friendly to AI developers.
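To make the idea of hybrid search concrete, here is a minimal sketch of one common way to merge several ranked result lists: reciprocal rank fusion. This is a generic illustration, not Infinity's API, and the listing above does not say which fusion method Infinity uses. The function name, the constant `k=60`, and the document IDs are all assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. The constant k is an assumed smoothing value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from two different retrievers.
dense_hits = ["doc3", "doc1", "doc2"]
full_text_hits = ["doc1", "doc4", "doc3"]

fused = reciprocal_rank_fusion([dense_hits, full_text_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents seen by only one retriever still appear further down.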
Odyssey Solutions
Odyssey Solutions is an offshore development company focused on making innovative technologies accessible, particularly in the energy and commodity sectors. They provide software consulting, digital transformation, and technology modernization services. The company also acts as an AI-focused investment firm, empowering startups that leverage AI and machine learning to solve challenges. Odyssey Solutions emphasizes a 'humans-first' approach, investing in ventures that prioritize humanity. Their offerings include Odyssey Analytics for energy and commodity consulting, and Odyx yHat for time series forecasting, designed for accuracy and ease of use in predicting prices and demand.
llama-models
llama-models offers a comprehensive suite of utilities for working with Llama large language models. It provides easy accessibility to cutting-edge LLMs, fostering collaboration and advancements among developers, researchers, and organizations. Users can download model weights and tokenizers, list available models, describe model details, and run inference with various quantization modes like FP8 and Int4 to optimize memory footprint. The platform supports both Meta's direct downloads and Hugging Face access, ensuring broad ecosystem compatibility. It emphasizes responsible use with dedicated guides and reporting mechanisms for issues and risky content, promoting ethical AI development.
llama-stack
OGX, previously known as llama-stack, is an open-source agentic API server designed for building AI applications with maximum flexibility. It serves as a drop-in replacement for the OpenAI API, enabling developers to use any OpenAI-compatible client or agentic framework. OGX supports various models like Llama, GPT, Gemini, and Mistral, and can be deployed on diverse infrastructures, from local development with Ollama to production with vLLM or managed services. Key features include Chat Completions & Embeddings, a Responses API for server-side agentic orchestration with tool calling and file search, and support for Vector Stores & Files. It also offers multi-SDK compatibility, working natively with Anthropic and Google GenAI SDKs alongside OpenAI.
llm-graph-builder
llm-graph-builder is an open-source tool designed to convert various forms of unstructured data, such as PDFs, DOCs, TXTs, YouTube videos, and web pages, into structured knowledge graphs. It utilizes Large Language Models (LLMs) and the LangChain framework to extract nodes, relationships, and properties, storing them in a Neo4j database. Users can upload files from local machines, GCS, S3 buckets, or web sources, select their preferred LLM model, and define custom or existing schemas for graph generation. Key features include graph visualization in Neo4j Bloom, conversational querying of data, and token usage tracking. It supports a wide range of LLMs including OpenAI, Gemini, Anthropic, and Ollama, and offers various embedding models for data vectorization.
Khorus
Khorus serves as a universal communication layer for intelligent systems, specifically designed to make AI agents interoperable on-chain. It provides the fastest way to deploy A2A (Agent-to-Agent) agents, powered by ERC-8004 identity and x402 payments. The platform allows users to create agent workforces, assign tasks, and run or sync operations. A key feature is the ability to tokenize creations and list them on a marketplace or launch them through Genesis with DAO Pools. Khorus integrates with various agent APIs and data tools, routing calls through x402 for automated signals, metered usage, and trustless on-chain settlement. It supports the design and deployment of complex dApps through coordinated agent workspaces, ensuring each agent is verified on-chain and can communicate across different chains and environments.
Liger-Kernel
Liger-Kernel is an open-source collection of Triton kernels engineered to optimize Large Language Model (LLM) training. Developed by LinkedIn, it reports a 20% increase in multi-GPU training throughput and a 60% reduction in memory usage, enabling longer context lengths, larger batch sizes, and massive vocabularies. It offers optimized post-training kernels, including DPO, ORPO, CPO, and SimPO, which can deliver up to 80% memory savings for alignment and distillation tasks. Liger-Kernel is designed for ease of use, allowing users to patch Hugging Face models with a single line of code or compose custom models using its modules. It is compatible with multi-GPU setups like PyTorch FSDP, DeepSpeed, and DDP, and integrates with popular trainer frameworks such as Axolotl and Hugging Face Trainer. The kernels are exact, with computational accuracy checked by unit tests and convergence testing.
llm-functions
llm-functions lets developers build LLM tools and agents in familiar programming languages such as Bash, JavaScript, and Python. The project simplifies the integration of Large Language Models with custom code through function calling, without a complex setup. Users can execute system commands, process data, and interact with APIs directly from their LLMs. The platform automatically generates JSON declarations for tools based on comments within the code, streamlining the development process. It integrates with AIChat and supports the Model Context Protocol (MCP) for external tool usage, making it a versatile way to extend LLM capabilities.
LLMTornado
LLMTornado is a provider-agnostic .NET SDK for building, orchestrating, and deploying AI agents and workflows. It features built-in connectors to over 30 API providers, including Alibaba, Anthropic, Azure, Google, OpenAI, and many more, ensuring broad compatibility without dependencies on first-party SDKs. The library supports first-class local deployments with vLLM, Ollama, or LocalAI, and offers advanced agent orchestration capabilities with concepts like Orchestrator, Runner, and Advancer, including handoffs and parallel execution. LLMTornado lets developers write pipelines once and execute them with any provider, and supports fully multimodal inputs and outputs (text, images, videos, documents, URLs, audio). It also integrates protocols such as MCP and A2A, and connects to popular vector databases such as Chroma, PgVector, and Pinecone, with guardrails and OpenTelemetry support for enterprise use.
DAILA
DAILA, the Decompiler Artificially Intelligent Language Assistant, provides a unified interface for AI systems within decompilers. This decompiler-agnostic plugin supports a wide range of AI models, including remote LLMs like GPT-4, Claude, and Gemini via LiteLLM, as well as local models such as VarBERT for variable renaming. It integrates with popular decompilers like IDA Pro, Ghidra, Binary Ninja, and angr-management, abstracting interactions through the LibBS library. DAILA offers both a GUI for interactive use and a scripting library for programmatic access, enabling tasks like function summarization, variable renaming, vulnerability identification, and free-form prompting. It can be installed via pip or used within a Docker container for offline environments.
MemoryOS
MemoryOS is designed to provide a robust memory operating system for personalized AI agents, drawing inspiration from memory management principles in traditional operating systems. It features a hierarchical storage architecture with four core modules: Storage, Updating, Retrieval, and Generation, ensuring comprehensive and efficient memory management. The tool boasts top performance in memory management, achieving significant improvements on long-term memory benchmarks. It offers a plug-and-play architecture for seamless integration of memory modules, including storage engines, update strategies, and retrieval algorithms. MemoryOS also supports universal LLM integration, working with a wide range of models like OpenAI, Deepseek, and Qwen, and provides an Agent Workflow Creation tool (MemoryOS-MCP) to inject long-term memory capabilities into various AI applications.
Wealize
Wealize, operating under the Izertis brand, offers technology consulting services focused on digital transformation, artificial intelligence, and cybersecurity. The company supports the technological evolution of businesses by combining strategic vision with current technology. Its services include software engineering, cloud and infrastructure management, and the development of AI and data solutions. Izertis aims to help organizations lead their industries, stay ahead of change, and use data for strategic advantage.
DeepSeek-MoE
DeepSeek-MoE is a Mixture-of-Experts (MoE) language model with 16.4 billion parameters. Its architecture uses fine-grained expert segmentation and shared expert isolation, allowing it to achieve performance comparable to DeepSeek 7B and LLaMA2 7B while requiring only about 40% of the computation. Trained from scratch on 2 trillion English and Chinese tokens, DeepSeek-MoE provides both base and chat model checkpoints for research and commercial use. It can be deployed on a single GPU with 40GB of memory without quantization, and offers quick start guides for installation and inference using Hugging Face Transformers. The project also provides scripts for fine-tuning the models on downstream tasks, supporting both DeepSpeed and QLoRA configurations.
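The combination of shared experts and routed experts can be illustrated with a toy routing function. This is a minimal sketch under simplifying assumptions, not the project's code: inputs are scalars rather than hidden-state vectors, the experts are plain Python callables, and the names `moe_layer`, `shared_experts`, `routed_experts`, and `gate_logits` are hypothetical. It assumes the gate is a softmax over routed experts and that only the `top_k` highest-scoring routed experts are evaluated, which is where the compute saving comes from.

```python
import math


def moe_layer(x, shared_experts, routed_experts, gate_logits, top_k):
    """Combine always-on shared experts with the top_k routed experts.

    Returns the layer output and the sorted indices of the routed experts
    that were actually evaluated.
    """
    # Softmax over the routed experts' gate logits (shifted for stability).
    peak = max(gate_logits)
    exps = [math.exp(g - peak) for g in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep only the top_k routed experts; the rest are never called.
    chosen = sorted(range(len(probs)), key=lambda i: (-probs[i], i))[:top_k]

    # Shared experts always run; routed experts are weighted by their gate.
    output = sum(expert(x) for expert in shared_experts)
    output += sum(probs[i] * routed_experts[i](x) for i in chosen)
    return output, sorted(chosen)


# Toy experts: each is just a scalar function.
shared = [lambda x: x]
routed = [lambda x: 2 * x, lambda x: 3 * x, lambda x: -x, lambda x: 10 * x]
logits = [0.0, 2.0, 1.0, -1.0]

output, chosen = moe_layer(1.0, shared, routed, logits, top_k=2)
print(chosen, round(output, 3))
```

With four routed experts and `top_k=2`, only half of the routed experts run for this input, while the shared expert contributes every time.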
EmbedAPI
EmbedAPI serves as a comprehensive AI integration platform designed to simplify the process of connecting to various AI models. It offers a unified API that allows developers to integrate leading AI models such as OpenAI, Anthropic, and Vertex AI quickly and efficiently. The platform aims to streamline AI development by providing a single point of access, reducing the complexity typically associated with managing multiple AI service providers. This enables faster deployment of AI capabilities into applications and services, making it an essential tool for developers looking to leverage diverse AI technologies without extensive setup.
DeepSeek-671B-SFT-Guide
DeepSeek-671B-SFT-Guide offers an open-source solution for the full parameter fine-tuning of DeepSeek-V3/R1 671B models. Developed by the Institute of Automation of the Chinese Academy of Sciences and Beijing Wenge Technology Co. Ltd., this guide includes comprehensive code and scripts covering the entire process from training to inference. It also shares practical experiences, common pitfalls, and solutions encountered during model training and deployment. Key features include implemented modeling files for DeepSeek-V3/R1 training logic, support for full parameter fine-tuning using data parallelism (DeepSpeed ZeRO) and sequence parallelism, and detailed instructions for environment setup, data preparation, training, model weight conversion, and inference deployment. The guide is designed for technical users looking to fine-tune large language models efficiently.
DeepMCPAgent
Promptise Foundry, formerly DeepMCPAgent, is an open-source framework designed for building full-stack agentic systems. It provides a comprehensive suite of tools for developers to create production-ready, secure, and scalable AI agents. The framework includes a powerful reasoning engine with 20 node types and 7 prebuilt patterns, an MCP Server SDK for multi-user, secure tool access, and an autonomous agent runtime with crash recovery and budget enforcement. It also features advanced prompt engineering capabilities, allowing prompts to be built like software with various block types, strategies, and guards. Promptise Foundry aims to simplify the development of complex AI agents by offering a unified framework that replaces multiple individual libraries.
nunchaku
Nunchaku is a high-performance inference engine specifically designed for 4-bit neural networks, implementing the SVDQuant post-training quantization technique. This technology allows for 4-bit weights and activations while maintaining visual fidelity, as detailed in the accompanying ICLR 2025 Spotlight paper. The engine achieves significant memory reduction, up to 3.6x for 12B FLUX.1-dev models, and offers substantial speedups, such as 8.7x over 16-bit models on a 16GB laptop 4090 GPU by eliminating CPU offloading. Nunchaku also supports various features like LoRA, ControlNet, asynchronous offloading, and compatibility with ComfyUI, making it a versatile tool for accelerating diffusion models and other AI applications.
embabel-agent
embabel-agent is an open-source agent framework designed for the JVM, allowing developers to author agentic flows that combine LLM-prompted interactions with custom code and domain models. It features sophisticated planning capabilities, going beyond simple state machines to dynamically formulate and re-plan action sequences to achieve goals. The framework supports strong typing and object-oriented benefits, ensuring clean interaction between prompts and authored code. Key differentiators include superior extensibility, platform abstraction for consistent QoS, and design for effective LLM mixing, enabling the use of various models for different tasks, including local models for cost and privacy. Built on Spring, it integrates easily with existing enterprise functionality and offers robust testing capabilities for both unit and end-to-end agent flows. It supports both annotation-based and Kotlin DSL approaches for flow authoring.
nboost
NBoost is a scalable, open-source platform designed to improve the relevance of search results by deploying state-of-the-art transformer models. It acts as a neural proxy, sitting between a client and a search engine like Elasticsearch, and reranks results using fine-tuned models. This enables domain-specific neural search engines and can improve other tasks that take ranked input, such as question answering. NBoost can be installed via Docker, PyPI, or Kubernetes, and provides benchmarks showing its search boost over traditional methods. It supports both PyTorch and TensorFlow dependencies, making it flexible for various deployment environments.
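The reranking step that such a proxy performs can be sketched in a few lines. This is an illustration of the general pattern only, not NBoost's API: the function names are hypothetical, and `overlap_score` is a toy word-overlap scorer standing in for a fine-tuned transformer model.

```python
def rerank(query, hits, score_fn, top_k):
    """Re-order search hits by a relevance score and keep the best top_k.

    Ties keep the search engine's original order.
    """
    scored = [(score_fn(query, hit), index, hit) for index, hit in enumerate(hits)]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [hit for _, _, hit in scored[:top_k]]


def overlap_score(query, text):
    """Toy stand-in for a model: fraction of query words found in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


# Hypothetical hits as returned by the underlying search engine.
hits = [
    "how to bake bread at home",
    "vector search with transformers",
    "neural search reranking with transformers",
]

best = rerank("neural search transformers", hits, overlap_score, top_k=2)
print(best)
```

In a proxy setup, the client would typically ask the search engine for more hits than it needs, and the reranker would trim that larger candidate set down to `top_k`.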
OSWorld
OSWorld is an open-source benchmark designed to evaluate multimodal AI agents performing open-ended tasks in real computer environments. It offers a robust framework for researchers and developers to test and compare the capabilities of their AI agents. The platform supports various virtualization technologies like VMware, VirtualBox, and Docker, with ongoing support for cloud platforms such as AWS. Key features include parallel execution of experiments, detailed result logging with screenshots and video recordings, and tools for manual task examination. OSWorld aims to standardize the benchmarking process for AI agents, providing clear metrics for success rates across different domains like Office, Daily, and Professional tasks.
filter-pruning-geometric-median
filter-pruning-geometric-median is an open-source implementation of the Filter Pruning via Geometric Median method for accelerating deep convolutional neural networks. Developed in PyTorch, this tool enables researchers and developers to reduce the computational cost and memory footprint of their models without significant loss in accuracy. It supports both network-level and layer-level sparsity configurations, offering flexibility in how pruning is applied. The repository provides detailed usage instructions for integration with PyTorch and NNI, along with scripts for reproducing results on datasets like ImageNet and CIFAR-10, making it a valuable resource for model compression research and application.
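The selection rule behind this kind of pruning can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes the criterion is the total Euclidean distance from each filter to every other filter in the same layer, so that filters closest to the geometric median are treated as the most replaceable. Filters are flattened to plain lists of floats, and the function name and sample values are hypothetical.

```python
import math


def select_filters_to_prune(filters, num_to_prune):
    """Return indices of the filters with the smallest total distance to the others.

    A small total distance means the filter lies near the geometric median
    of the layer, so the remaining filters can approximately represent it.
    """
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Total distance from each filter to all filters (self-distance is 0).
    totals = [sum(distance(f, g) for g in filters) for f in filters]

    # Pick the num_to_prune smallest totals, breaking ties by index.
    order = sorted(range(len(filters)), key=lambda i: (totals[i], i))
    return sorted(order[:num_to_prune])


# Toy layer with five flattened 2-value filters.
filters = [
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.4, 0.4],  # sits near the centre of the others
    [5.0, 5.0],  # outlier, far from the rest
]

pruned = select_filters_to_prune(filters, 1)
print(pruned)
```

Unlike norm-based pruning, this rule does not simply drop the smallest filters: the outlier with the largest norm and the zero filter are both kept here, and the redundant central filter is removed.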
ext-apps
ext-apps is the official repository for the specification and SDK of the Model Context Protocol (MCP) Apps protocol, offering a standardized way to deliver interactive UIs from MCP servers directly within AI chatbots. This tool allows developers to create dynamic user interfaces such as charts, forms, and dashboards that render inline in compliant chat clients like Claude and ChatGPT. It extends the core MCP specification by enabling tools to declare UI resources, which the host then fetches and displays in a sandboxed iframe, facilitating bidirectional communication. The SDK supports app developers building interactive Views, host developers embedding these Views, and MCP server authors registering tools with UI metadata. It includes agent skills to scaffold new apps, migrate existing OpenAI apps, and convert web apps to hybrid web + MCP Apps.