
RLHF

Reinforcement Learning from Human Feedback (RLHF)

Advanced · Models & Architecture

Reinforcement Learning from Human Feedback — a training technique where human preferences guide the model to produce better, safer outputs.
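
In a common formulation (the notation below is standard in the literature, not taken from this entry), the fine-tuning stage maximizes a learned reward while a KL penalty keeps the tuned model close to its reference version:

\max_\theta \; \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}\big[ r_\phi(x, y) \big] \;-\; \beta\, D_{\mathrm{KL}}\big( \pi_\theta(\cdot \mid x) \,\|\, \pi_{\mathrm{ref}}(\cdot \mid x) \big)

Here \pi_\theta is the model being tuned, \pi_{\mathrm{ref}} is the pre-RLHF reference model, r_\phi is a reward model trained on human preference data, and \beta sets the strength of the penalty.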

Why It Matters

RLHF is a key part of how models like ChatGPT and Claude were made helpful and safe: it aligns AI behavior with human values.

Example in Practice

Human raters compare two AI responses and pick the better one. These preference judgments are used to train a reward model, which in turn guides the language model to prefer helpful answers.
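
Below is a minimal sketch of the reward-model step just described, using PyTorch. The names RewardModel and preference_loss are hypothetical, not from any specific library, and a toy linear scorer over fixed-size response embeddings stands in for a real language-model backbone.

# A minimal sketch of reward-model training on pairwise preferences.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    # Toy reward model: maps a fixed-size response embedding to a scalar score.
    def __init__(self, dim: int = 16):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry pairwise loss: pushes the chosen response's score
    # above the rejected response's score.
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# Usage: random embeddings stand in for real encoded responses.
model = RewardModel()
chosen, rejected = torch.randn(4, 16), torch.randn(4, 16)
loss = preference_loss(model(chosen), model(rejected))
loss.backward()

In a full RLHF pipeline, the trained reward model then scores the language model's outputs during a reinforcement-learning stage, commonly PPO, steering it toward responses humans prefer.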
