Online-RLHF
Online-RLHF is an AI Agents & Automation tool that provides a recipe for online iterative Reinforcement Learning from Human Feedback (RLHF). It supports aligning large language models (LLMs), including through online iterative Direct Preference Optimization (DPO).