ddpo
At a glance
ddpo provides training code for Denoising Diffusion Policy Optimization, a method for training diffusion models with reinforcement learning. It includes a PyTorch implementation supporting GPUs and LoRA.
About
ddpo is the training code released with the Denoising Diffusion Policy Optimization (DDPO) paper, which trains diffusion models with reinforcement learning. The codebase was tested on Google Cloud TPUs (v3 for RWR, v4 for DDPO). A PyTorch implementation is also available; it runs on GPUs and supports LoRA for low-memory training. Prompt distributions and reward functions are set through the training configuration, so researchers can swap in their own. Besides DDPO, the code supports Reward Weighted Regression (RWR), including a sparse RWR variant. The project documents installation and the commands for running both DDPO and RWR.
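To make the difference between RWR and sparse RWR concrete, here is a minimal sketch of how per-sample weights could be computed from rewards. It is an illustration only, not ddpo's code: the function names, the `beta` temperature, and the `threshold` parameter are all hypothetical, and it assumes the usual formulation in which RWR weights are proportional to exp(beta * reward) and the sparse variant uses binary weights above a reward cutoff.

```python
import math


def rwr_weights(rewards, beta=1.0):
    """Softmax-style weights, w_i proportional to exp(beta * r_i) (assumed form)."""
    # Subtract the max reward before exponentiating for numerical stability.
    r_max = max(rewards)
    exps = [math.exp(beta * (r - r_max)) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_rwr_weights(rewards, threshold):
    """Binary weights: 1.0 for samples at or above the cutoff, else 0.0 (assumed form)."""
    return [1.0 if r >= threshold else 0.0 for r in rewards]


def weighted_loss(per_sample_losses, weights):
    """Weighted sum of per-sample denoising losses (hypothetical helper)."""
    return sum(w * loss for w, loss in zip(weights, per_sample_losses))


rewards = [0.1, 0.5, 0.9]
soft = rwr_weights(rewards, beta=2.0)
hard = sparse_rwr_weights(rewards, threshold=0.5)
```

In this sketch, higher-reward samples contribute more to the loss under `rwr_weights`, while `sparse_rwr_weights` drops low-reward samples entirely.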
Pricing & Plans
Open Source
Free