On-Policy

on-policy is an open-source implementation of Multi-Agent PPO (MAPPO) for cooperative multi-agent games. It supports environments such as StarCraftII, Hanabi, and Google Research Football.

At a glance

Pricing

Open Source

Free tier

Yes

API

Skill level

Technical

About

What is on-policy?

on-policy is the official implementation of Multi-Agent PPO (MAPPO), a multi-agent variant of Proximal Policy Optimization. This open-source tool is heavily based on an existing PyTorch A2C-PPO-ACKTR-GAIL implementation and is used in the paper "The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games." It supports various environments, including StarCraftII (SMAC and SMAC v2), Hanabi, Multiagent Particle-World Environments (MPEs), and Google Research Football (GRF). The repository provides core code for algorithms, environment wrappers, training rollouts, and policy updates, with default hyperparameters available for replication.
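Because MAPPO builds on Proximal Policy Optimization, a short illustration of the clipped surrogate objective that PPO-style methods optimize may help. This is a minimal, dependency-free sketch and not code from the repository: the function name clipped_surrogate and the default clip_eps of 0.2 are assumptions chosen for illustration, and in a multi-agent setting the batch would simply pool samples collected from all agents.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    clip_eps is an illustrative default, not a value taken from the repository.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, a ratio above 1 + clip_eps earns no extra objective, so the update has no incentive to move the policy far from the one that collected the data.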

Best used for

on-policy suits developers and data scientists who need to implement and test multi-agent reinforcement learning algorithms, replicate research findings in cooperative multi-agent games, or benchmark agent performance across the supported environments. It is most useful for academic research and advanced AI development.

Common actions

implement multi-agent PPO

research reinforcement learning

benchmark AI agents

develop AI algorithms

Capabilities

Key features

Multi-Agent PPO implementation
StarCraftII environment support
Hanabi environment support
MPEs environment support
Google Research Football support
Shared policy by agents
Customizable hyperparameters

Target Audience

Developer
Data scientist

Integrations

Not yet documented

Pricing & Plans

Open Source

Free

FAQs

What environments does on-policy support for multi-agent reinforcement learning?

on-policy supports several popular multi-agent environments, including StarCraftII (SMAC and SMAC v2), Hanabi, Multiagent Particle-World Environments (MPEs), and Google Research Football (GRF). This allows for broad testing and application of the MAPPO algorithm.

Is on-policy suitable for replicating research results from the associated paper?

Yes. on-policy is the official implementation used in the paper "The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games." It provides training scripts and default hyperparameters intended to help users reproduce the reported results.

Does on-policy use a shared policy for agents?

Yes. By default, on-policy assumes a shared policy: all agents act through a single neural network, so they use the same parameters. Keep this assumption in mind when setting up experiments, since it is built into the default implementation.
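The idea of a shared policy can be sketched in a few lines. The example below is an illustration only, not the repository's code: the class SharedPolicy, the helper act_all, and the tiny linear-softmax model are all hypothetical stand-ins for the single neural network the source describes.

```python
import math
import random


class SharedPolicy:
    """A tiny linear-softmax policy whose weights are shared by every agent.

    Hypothetical stand-in for a neural network; not the repository's class.
    """

    def __init__(self, obs_dim, n_actions, seed=0):
        rng = random.Random(seed)
        # One weight row per action; every agent reads the same weights.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
            for _ in range(n_actions)
        ]

    def action_probs(self, obs):
        logits = [sum(w * x for w, x in zip(row, obs)) for row in self.weights]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        norm = sum(exps)
        return [e / norm for e in exps]


def act_all(policy, observations):
    """Map each agent's local observation through the same policy object."""
    return {agent: policy.action_probs(obs) for agent, obs in observations.items()}
```

Calling act_all(SharedPolicy(obs_dim=3, n_actions=4), {"agent_0": [1.0, 0.0, 0.5], "agent_1": [0.0, 1.0, 0.5]}) returns one action distribution per agent. The agents can still behave differently because their observations differ, even though only one set of parameters is stored and updated.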
