AI Agents & Automation
MiniMax-M1
MiniMax-M1 is described by its developers as the world's first open-weight, large-scale hybrid-attention reasoning model. It combines a hybrid Mixture-of-Experts (MoE) architecture with a lightning attention mechanism. Built on the MiniMax-Text-01 model, it has 456 billion parameters, of which 45.9 billion are activated per token. A key differentiator is native support for a context length of 1 million tokens, substantially more than competing models offer. Lightning attention lets test-time compute scale efficiently: at a generation length of 100K tokens, MiniMax-M1 consumes 25% of the FLOPs that DeepSeek R1 does. The model is trained with large-scale reinforcement learning (RL) on diverse problems, including mathematical reasoning and software engineering. It comes in two versions, with 40K and 80K thinking budgets, and is reported to outperform other strong open-weight models on complex software engineering, tool-use, and long-context tasks. It also supports function calling and is available through a chatbot and an API for general use and evaluation.
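To put the MoE figures in perspective, the short sketch below works out what share of the model's parameters is used for any single token. It relies only on the two numbers quoted in the description; the function name is illustrative and not part of any MiniMax tooling.

```python
# Figures quoted in the description above.
TOTAL_PARAMS_B = 456.0   # total parameters, in billions
ACTIVE_PARAMS_B = 45.9   # parameters activated per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that take part in processing one token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%}")  # prints "Active per token: 10.1%"
```

Roughly one parameter in ten is active for a given token, which is how an MoE model can have a very large total size while keeping per-token compute closer to that of a much smaller dense model.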
mmrazor
mmrazor is a comprehensive model compression toolkit and benchmark developed as part of the OpenMMLab project. It offers four mainstream technologies: Neural Architecture Search (NAS), Pruning, Knowledge Distillation (KD), and Quantization. Designed for flexibility and compatibility, mmrazor can be easily integrated with various OpenMMLab projects and allows for plug-n-play incorporation of different algorithms. Its modular design enables developers to implement new model compression algorithms with minimal code or by modifying configuration files. The toolbox supports a wide range of algorithms within each category, including DARTS, DetNAS, SPOS for NAS; AutoSlim, L1-norm, Group Fisher, DMCP for pruning; and various methods like CWD, WSLD, ABLoss for KD. It also includes PTQ, QAT, and LSQ for quantization, making it a versatile tool for optimizing deep learning models.
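As an illustration of one technique named above, L1-norm pruning ranks the filters of a layer by the sum of the absolute values of their weights and drops the smallest ones. The following is a minimal, dependency-free sketch of that idea; it does not use mmrazor's API, and the function names and `keep_ratio` parameter are hypothetical.

```python
from typing import List, Sequence

Filter = Sequence[float]  # one filter's weights, flattened


def l1_norm(weights: Filter) -> float:
    """Sum of absolute weight values for a single filter."""
    return sum(abs(w) for w in weights)


def select_filters_to_keep(filters: Sequence[Filter], keep_ratio: float) -> List[int]:
    """Return indices of the filters with the largest L1 norms, in original order."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(filters) * keep_ratio))
    # Rank filter indices from largest to smallest L1 norm.
    ranked = sorted(range(len(filters)), key=lambda i: l1_norm(filters[i]), reverse=True)
    return sorted(ranked[:n_keep])


# Toy layer with four filters of three weights each (made-up values).
filters = [
    [0.5, -0.4, 0.3],
    [0.01, 0.02, -0.01],
    [-0.9, 0.3, 0.2],
    [0.05, -0.05, 0.0],
]
kept = select_filters_to_keep(filters, keep_ratio=0.5)
print(kept)  # prints [0, 2]
```

In practice the surviving filters are usually fine-tuned afterwards to recover accuracy; that step is outside the scope of this sketch.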
motion_imitation
motion_imitation is a code repository accompanying the paper "Learning Agile Robotic Locomotion Skills by Imitating Animals." It provides a Gym environment for training a simulated quadruped robot to imitate various reference motions, along with example training code for learning policies. The tool supports Python 3.7 or 3.8 on Ubuntu, macOS, and Windows, and can be installed as a pip package. It includes features for training and testing imitation models, working with motion capture data, and implementing locomotion using Model Predictive Control (MPC). The repository also details how to run MPC on real A1 robots, making it a comprehensive resource for researchers and developers in robotic locomotion.
MotioNet
MotioNet is a deep neural network designed to reconstruct 3D human skeletal motion directly from monocular video. This library provides the source code for the network, which is based on a common motion representation. A key feature is its ability to output BVH files directly, eliminating the need for additional post-processing steps. The tool supports evaluation on both Human3.6M and in-the-wild videos, with integration for 2D pose detection tools like OpenPose. Users can train models from scratch with customizable parameters or start quickly from the provided pre-trained models. It offers visualization through TensorBoardX for tracking training progress and includes detailed instructions for data preparation and testing. Users should be aware of two limitations: it does not handle moving cameras well, and its results depend on the accuracy of the 2D detections.
neurodiffeq
neurodiffeq is an open-source Python library built on PyTorch, designed for solving ordinary and partial differential equations (ODEs and PDEs) using neural networks. It provides a flexible framework for implementing existing techniques of using artificial neural networks (ANNs) to approximate solutions. Unlike traditional numerical methods, neurodiffeq aims to compute continuous and differentiable solutions. The library supports various features including solving systems of ODEs and PDEs, handling initial and boundary conditions, and customizing network architectures. It also offers tools for monitoring training progress, implementing transfer learning, and defining custom sampling strategies for training points. Additionally, neurodiffeq supports solving solution bundles and inverse problems, making it suitable for complex scientific and engineering applications.
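One common way neural ODE solvers handle an initial condition is to wrap the network in a trial solution that satisfies the condition by construction, so training only has to minimize the equation residual. The sketch below shows that reparametrization in plain Python. It illustrates the general idea rather than neurodiffeq's API: `make_trial_solution` and `toy_net` are hypothetical names, and `toy_net` merely stands in for an untrained network.

```python
import math
from typing import Callable


def make_trial_solution(
    net: Callable[[float], float], t0: float, u0: float
) -> Callable[[float], float]:
    """Wrap `net` so that the returned function equals u0 at t = t0 for any net output."""
    def trial(t: float) -> float:
        # The factor (1 - exp(-(t - t0))) vanishes at t = t0, cancelling the network term.
        return u0 + (1.0 - math.exp(-(t - t0))) * net(t)
    return trial


def toy_net(t: float) -> float:
    """Hypothetical stand-in for an untrained neural network."""
    return 3.0 * math.sin(t) + 0.7


u = make_trial_solution(toy_net, t0=0.0, u0=1.0)
print(u(0.0))  # prints 1.0, the prescribed initial value
```

Because the trial solution is a smooth function of `t`, it can be differentiated and evaluated anywhere in the domain, which is what makes the resulting approximation continuous rather than tied to a grid.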
Neuton TinyML
Neuton TinyML, part of the Nordic Edge AI Lab, is a platform designed for building and deploying ultra-compact AI models specifically optimized for Nordic System-on-Chips (SoCs). It caters to both CPU-run edge AI, using Neuton's self-growing models, and NPU-enabled devices, using LiteRT models, with no coding required for wake word models or LiteRT configuration. The platform simplifies the AI development process into three steps: data upload, automated or configured model training, and deployment. It supports various intelligent applications like gesture recognition, anomaly detection, and health monitoring, focusing on low power consumption, balanced memory and performance, and extended battery life for always-on sensing. It also includes data preprocessing tools like windowing, feature extraction, and selection, alongside model analysis features such as quality diagrams and confusion matrices.
object-detection-opencv
object-detection-opencv provides a Python-based solution for object detection using the YOLO (You Only Look Once) framework, integrated with OpenCV's dnn module. This tool allows developers to perform inference on pre-trained deep learning models from popular frameworks like Caffe, Torch, and TensorFlow. Specifically, it leverages YOLOv3 weights for efficient object detection in images. The project is open-source and available on GitHub, offering a practical example for computer vision tasks. It's particularly useful for those looking to implement object recognition capabilities in their applications using Python and OpenCV, providing a foundation for further development in areas like real-time video analysis or image processing.
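YOLO-style detectors typically emit many overlapping candidate boxes for the same object, so a non-maximum suppression (NMS) pass is commonly applied to keep only the best box per object. The sketch below shows that logic in pure Python. It is an illustration only and is not taken from this project; OpenCV ships its own routine for this step (`cv2.dnn.NMSBoxes`), and the boxes and scores here are made up.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) corner coordinates


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5
) -> List[int]:
    """Return indices of boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 260, 260)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores, iou_threshold=0.5)
print(kept)  # prints [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed; the third box does not overlap either and is kept.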
OpenChem
OpenChem is a deep learning toolkit specifically designed for computational chemistry and drug design research, built with a PyTorch backend. Its primary goal is to simplify the application of deep learning models for researchers in these fields. Key features include a modular design with a unified API, allowing for easy combination of different modules, and the ability to build new models using only a configuration file. The toolkit supports fast training with multi-GPU capabilities and provides utilities for data preprocessing. It also integrates with TensorBoard for visualization. OpenChem handles various tasks such as classification, regression, multi-task learning, and generative models, supporting data types like character sequences (SMILES, amino acids) and molecular graphs, with automatic conversion of SMILES to graphs.
RecommenderSystem-Paper
RecommenderSystem-Paper is an open-source GitHub repository that serves as a curated collection of significant papers, tools, and frameworks within the domain of recommender systems. It is designed to assist researchers and academics by providing a centralized resource for reading and exploring key advancements. The repository categorizes papers by conference (e.g., KDD, ICDM, AAAI, WWW, NIPS, ICML, CIKM, SIGIR, RecSys, WSDM) and by topic, such as Cold Start and Deep Learning. Beyond academic papers, it also lists recommender system engines like Mosaic and Crab, and algorithm frameworks such as Surprise and LightFM, making it a useful starting point for understanding and implementing recommender technologies.
qlib
Qlib is an AI-oriented quantitative investment platform developed by Microsoft, designed to empower quantitative research using AI technology. It supports diverse machine learning modeling paradigms, including supervised learning and reinforcement learning, making it suitable for various financial analysis tasks. The platform is equipped with tools to automate the research and development process, streamlining the creation and testing of investment strategies. As an open-source project available on GitHub, Qlib provides a robust framework for developers and data scientists to build and experiment with advanced AI models in the finance domain, fostering innovation in quantitative investment.
Attri
Attri specializes in creating AI employees designed for enterprise teams, offering a robust platform for managing and deploying these agents. The system, known as EnterpriseOS, allows for flexible deployment either on the client's cloud infrastructure or Attri's own. These AI agents are engineered to be trustworthy and are aimed at transforming operational workflows within large organizations. By providing a scalable and manageable AI workforce, Attri helps enterprises automate complex tasks, enhance efficiency, and innovate their business processes. The focus is on delivering reliable AI solutions that integrate seamlessly into existing enterprise environments.
RLzoo
RLzoo is a comprehensive open-source reinforcement learning zoo designed for simple usage, implemented with TensorFlow 2.0 and the neural network layer APIs of TensorLayer 2.0+. It offers a hands-on approach to reinforcement learning practice and benchmarking, supporting basic toy environments such as OpenAI Gym and the DeepMind Control Suite with minimal configuration. RLzoo also supports robot learning environments such as RLBench. The platform provides both implicit and explicit configuration interfaces for running learning algorithms, making it flexible and convenient for users. It also supports distributed training across multiple computational nodes using the KungFu package, catering to more realistic and large-scale scenarios.
roboflow-python
Roboflow-python is an open-source Python package designed to streamline the development of computer vision applications. It provides a comprehensive set of tools for managing datasets, training models, and deploying them efficiently. The package supports a wide range of computer vision tasks, making it a versatile choice for developers working on object detection, image classification, and other related projects. Its open-source nature fosters community collaboration and allows for flexible integration into existing workflows, providing a robust foundation for building and experimenting with AI-powered vision systems.
relational-networks
relational-networks is an open-source PyTorch implementation of the paper "A simple neural network module for relational reasoning," which introduces Relation Networks (RN). This tool is designed for researchers and developers working on visual reasoning and relational AI tasks. It has been tested on the Sort-of-CLEVR task, a simplified version of CLEVR, which involves processing images with various colored shapes and answering both relational and non-relational questions. The implementation outperforms a baseline CNN + MLP model, particularly on relational questions, and includes modifications for improved computational efficiency.
rl-agents
rl-agents is an open-source project providing a comprehensive collection of Reinforcement Learning agent implementations. This tool is designed for researchers and developers working in the field of AI, offering a variety of planning and learning algorithms. It serves as a valuable resource for experimentation and building new RL applications. The project's open-source nature fosters community contributions and allows for flexible integration into diverse research and development environments, making it suitable for both academic and practical applications in reinforcement learning.
slime
slime is an advanced post-training framework designed for Reinforcement Learning (RL) scaling, specifically tailored for large language models. It achieves high-performance training by seamlessly integrating Megatron with SGLang, enabling efficient and scalable operations. The framework supports flexible data generation through custom data workflows, allowing users to adapt to various training requirements. slime facilitates efficient training across different modes, making it a versatile solution for developers and researchers working with large language models and RL applications. Its focus on performance and flexibility makes it suitable for complex AI development tasks.
sentiment
sentiment is a Node.js module designed for efficient sentiment analysis, leveraging the AFINN-165 wordlist and Emoji Sentiment Ranking. It provides a robust solution for analyzing arbitrary blocks of input text, offering features like the ability to append and overwrite word/value pairs from the AFINN wordlist. The module also supports adding new languages and defining custom scoring strategies for negation and emphasis on a per-language basis. Benchmarks indicate that sentiment is significantly faster than alternative implementations, making it a strong choice for performance-critical applications. It also includes validation against UCI datasets to ensure accuracy.
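Wordlist-based sentiment analysis of this kind boils down to looking up each token's valence and summing the results, often alongside a length-normalized "comparative" score. The sketch below illustrates that mechanism in Python, including the idea of overriding or extending word/value pairs. It is not the module's JavaScript API: the `analyze` function, its `extras` parameter, and the five-word lexicon are all hypothetical, and negation and emphasis handling are left out.

```python
import re
from typing import Dict, Optional

# Tiny hypothetical lexicon; a real AFINN-style list holds thousands of scored words.
TOY_LEXICON: Dict[str, int] = {"good": 3, "great": 3, "bad": -3, "terrible": -3, "love": 3}


def analyze(
    text: str, lexicon: Dict[str, int], extras: Optional[Dict[str, int]] = None
) -> Dict[str, float]:
    """Score text by summing the valence of each known token."""
    words = dict(lexicon)
    if extras:
        words.update(extras)  # caller-supplied pairs append to or overwrite the lexicon
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(words.get(tok, 0) for tok in tokens)
    # The comparative score normalizes by token count so texts of different length compare.
    comparative = score / len(tokens) if tokens else 0.0
    return {"score": score, "comparative": comparative, "tokens": len(tokens)}


result = analyze("Great food, bad parking, love it", TOY_LEXICON)
print(result)  # prints {'score': 3, 'comparative': 0.5, 'tokens': 6}

adjusted = analyze("Great food, bad parking, love it", TOY_LEXICON, extras={"parking": -1})
print(adjusted["score"])  # prints 2
```

Because scoring is a dictionary lookup per token, this style of analysis is fast, which is consistent with the performance focus described above.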
stable-diffusion-webui-model-toolkit
stable-diffusion-webui-model-toolkit is a comprehensive toolkit designed for managing, editing, and creating models within the Stable Diffusion WebUI environment. It offers essential features such as cleaning and pruning models to reduce bloat, converting models to and from safetensors format, and extracting or replacing individual model components like VAE, UNET, and CLIP. The toolkit also assists in identifying and debugging model architectures, providing detailed reports on matched and rejected architectures. A unique metric system helps identify model weights, even for renamed components. This tool is invaluable for developers looking to optimize, customize, and troubleshoot their Stable Diffusion models.
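To make the idea of model components concrete: in Stable Diffusion v1-style checkpoints, the UNET, VAE, and CLIP weights conventionally live under distinct key prefixes in the state dict. The sketch below groups key names by those prefixes. It is a simplified illustration, not the toolkit's method, which according to the description can identify components even when they have been renamed. The example keys are illustrative and no checkpoint file is loaded.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Conventional key prefixes in Stable Diffusion v1-style checkpoints (an assumption here;
# other model families may use different names).
COMPONENT_PREFIXES = {
    "UNET": "model.diffusion_model.",
    "VAE": "first_stage_model.",
    "CLIP": "cond_stage_model.",
}


def group_keys_by_component(keys: Iterable[str]) -> Dict[str, List[str]]:
    """Bucket state-dict key names by the component prefix they start with."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        for name, prefix in COMPONENT_PREFIXES.items():
            if key.startswith(prefix):
                groups[name].append(key)
                break
        else:
            # No known prefix matched, e.g. EMA bookkeeping or optimizer state.
            groups["OTHER"].append(key)
    return dict(groups)


example_keys = [
    "model.diffusion_model.input_blocks.0.0.weight",
    "first_stage_model.decoder.conv_in.weight",
    "cond_stage_model.transformer.text_model.embeddings.position_ids",
    "model_ema.decay",
]
groups = group_keys_by_component(example_keys)
for name, members in groups.items():
    print(name, len(members))
```

Keys that fall into the `OTHER` bucket are typical candidates for removal when pruning a checkpoint for inference, since they are not part of the three main components.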
Starter Template
Starter Template offers a foundational structure for initiating new projects within the CrewAI framework, designed to simplify the setup and development process. It provides fully functional CrewAI applications that serve as practical examples for building real-world AI agent orchestration solutions. This resource is part of a broader collection of examples, demonstrating end-to-end implementations and best practices for leveraging CrewAI's capabilities. Developers can utilize these templates to quickly prototype, learn, and deploy complex AI agent systems, accelerating their development cycles and ensuring adherence to effective architectural patterns within the CrewAI ecosystem.
TencentPretrain
TencentPretrain is a powerful PyTorch-based framework designed for pre-training and fine-tuning AI models, supporting various data modalities including text and vision. Its modular architecture facilitates the use of existing pre-training models and provides clear interfaces for users to further develop and customize their own models. This makes it an ideal solution for researchers and developers looking to experiment with or deploy advanced AI models. The framework emphasizes flexibility and extensibility, allowing for adaptation to diverse research and application needs in the AI domain.
UI-TARS-desktop
UI-TARS-desktop is an open-source multimodal AI agent stack designed to connect various AI models and agent infrastructure, enabling the creation of sophisticated GUI agents. This tool is particularly useful for integrating vision capabilities across different platforms, allowing for the development of AI-driven automated workflows. It provides a framework for developers to build and deploy intelligent applications that automate complex tasks through graphical user interfaces.
VideoLLaMA3
VideoLLaMA3 is an open-source project offering a series of multimodal foundation models designed for advanced image and video understanding. It provides models like VideoLLaMA3-7B and VideoLLaMA3-2B, which are capable of tasks ranging from general image and video comprehension to more specialized applications such as multi-image comparison, visual referring, and grounding. The project includes detailed instructions for inference, training, and evaluation, making it suitable for researchers and developers. It supports various benchmarks for performance assessment and offers a flexible framework for preparing custom training data. The models are available on Hugging Face, facilitating easy access and integration into AI development workflows.
SynRiva
SynRiva positions itself as an AI solution provider dedicated to creating a smarter future through next-generation AI solutions. Its specific offerings are not yet detailed: the company's website currently serves as a placeholder, stating that "Something Great Is On The Way." This suggests an upcoming launch of AI products or services. The site includes a contact form for inquiries and an option to sign up for email updates, indicating an intent to engage with potential users and keep them informed about developments. The site carries a 2025 copyright notice, hinting at a future release or significant update.
T-ROBOTICS
Trener Robotics develops Acteris, an AI software platform designed to transform standard industrial robots into intelligent, self-learning operators. Powered by Physical AI, Acteris enables robots to see, learn, and adapt to complex, high-mix, low-volume production scenarios where traditional automation often fails. The platform reduces deployment times, simplifies changeovers without reprogramming, and enhances robot autonomy with built-in recovery flows. It offers operational visibility through real-time production monitoring and is compatible with leading robot brands like ABB, Universal Robots, and FANUC, with plans for further expansion. Acteris addresses the challenges of rigid programming and high changeover costs in modern manufacturing.