Reinforcement Learning
Reinforcement Learning Center
Master reinforcement learning for LLMs, from RLHF fundamentals to advanced training techniques
Training Cost Tools
RL Fundamentals
Training Pipelines & Methods
Replicate DeepSeek R1 with RL: A Guide
Build a complete RL pipeline from scratch using GRPO for advanced LLM reasoning
GRPO Training Pipeline: SFT to RL for Better Reasoning
Complete guide covering SFT with cold-start data, CoT prompting, and GRPOTrainer
Training a 671B LLM with Reinforcement Learning
Insights into training large-scale models with reinforcement learning techniques
Reward Models & Optimization
Advanced Techniques
Fine-Tuning & SFT
Supervised Fine-Tuning: A Guide to LLM Reasoning
Complete pipeline for enhancing LLM reasoning, from SFT to knowledge distillation
SFT Flaw: A Learning Rate Tweak
Critical insights into learning rate optimization for supervised fine-tuning
Supervised Fine-Tuning (SFT) for LLMs: Practical Guide
Transform base models into chat assistants with datasets and best practices
Curated Resources
OpenAI Spinning Up in Deep RL
Educational resource for learning deep reinforcement learning fundamentals
Hugging Face TRL Documentation
Transformer Reinforcement Learning library for training LLMs with RL
DeepMind's RL Course
Comprehensive course on reinforcement learning from DeepMind