Fine-tuning · Free · Open Source

OPENRLHF

Scale RLHF and agentic RL training with Ray, vLLM, and DeepSpeed

Apache-2.0

ABOUT

Building RLHF pipelines usually means wiring together separate tools for rollout, reward modeling, distributed training, inference, and checkpoint management across many GPUs. OpenRLHF packages those pieces into one framework so teams can run scalable SFT, preference optimization, and reinforcement learning workflows with less glue code.

INSTALL
pip install "openrlhf[vllm]"

INTEGRATION GUIDE

1. Train PPO, GRPO, or REINFORCE-style alignment workflows for open LLMs
2. Fine-tune models with SFT, reward modeling, and DPO in one framework
3. Run distributed multi-node RLHF jobs with Ray, vLLM, and DeepSpeed
4. Train vision-language models with image inputs inside reinforcement learning loops
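To give a feel for the preference-optimization workflow in step 2, the standard DPO objective scores a (chosen, rejected) response pair by comparing the policy's log-probabilities against a frozen reference model. Below is a minimal sketch in plain Python; the function name, argument values, and `beta` setting are illustrative assumptions, not OpenRLHF's API.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for a single preference pair.

    Each argument is the total log-probability a model assigns to a
    response; beta scales how strongly the policy is pushed away from
    the frozen reference model. (Hypothetical helper, not OpenRLHF code.)
    """
    # Implicit reward margins relative to the reference model
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), written in a numerically stable form
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# When the policy prefers the chosen response more than the reference
# does, the loss drops below log(2); with no margin it equals log(2).
print(round(dpo_loss(-10.0, -12.0, -11.0, -11.0), 4))  # → 0.5981
```

In a real OpenRLHF run the per-response log-probabilities come from batched forward passes over the policy and reference models, and this pairwise loss is averaged over the preference dataset.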

TAGS

python · fine-tuning · rlhf · reinforcement-learning · llm-training · distributed-training · ray · vllm