Learn AI - Zero to Hero

#rlhf

Articles tagged with #rlhf

Pretraining, Finetuning, RLHF — The Three-Act Training Story
How a raw language model becomes ChatGPT. Three acts, three very different kinds of teacher, and one polite assistant who used to know how to swear.
Apr 12, 2026 · 11 min read
Reinforcement Learning — The Dog-Treat Model of Learning
Nobody tells the puppy the rules. It just gets a treat when it does the right thing. That's the whole idea behind every game-playing AI — and the final training step of every chatbot you talk to.
Apr 12, 2026 · 10 min read