Shypd.ai

MOSS-RLHF

MOSS-RLHF is an open-source research codebase for studying Reinforcement Learning from Human Feedback (RLHF) in large language models. It implements the Proximal Policy Optimization (PPO) algorithm and provides code for training both reward models and policy models.
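
For readers unfamiliar with PPO, the sketch below illustrates the standard clipped surrogate objective that the algorithm is named for. It is a generic textbook illustration in plain Python, not code taken from MOSS-RLHF; the function name `ppo_clipped_objective`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for this example.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    logp_new / logp_old: log-probabilities of the sampled actions under the
    current policy and the policy that collected the data (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Taking the minimum of the clipped and unclipped terms removes any incentive to move the policy far from the one that generated the data in a single update, which is the main stabilizing idea behind PPO.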

At a glance

Pricing: Open Source
Free tier: Yes
API: No
Skill level: Technical
