Overfitting, Underfitting, and Why Your Model Lies to You

The single most important failure mode in all of machine learning, told through a student who memorised the practice test. Once you see it, you'll spot it in half the AI headlines you read.

Imagine a student who is, in one very specific sense, the best studier in her class. The teacher hands out a practice test before the real one. Our student takes it home and memorises it. Every question. Every answer. Every typo. If you ask her the third question from the top of page two, she can recite the answer without even reading the question.

The morning of the real test, she sits down, confident. She looks at the first question. It's slightly different from the practice version — same topic, same idea, but the numbers are new and the wording is rearranged. She panics. She doesn't know what to do. She has memorised an answer key, not a subject. By the end of the hour, her score is catastrophic.

That is overfitting. The whole concept. If you understand that student, you understand the single most important failure mode in machine learning — the one that every practitioner develops a twitchy paranoia about, for very good reason.

In the last three posts we met the three flavours of learning: supervised, unsupervised, reinforcement. Different examples, different grader, same fit-a-shape move. This post is about the one way that move can go spectacularly, invisibly wrong in any of them. The core idea needs no math and no code. One student and a practice test.


The move that breaks

Let's go back to the mental picture from M2.1: a scatter of dots on a page, and the machine's job is to fit a shape through them. The shape with the smallest error on those dots is what we call the "trained model."

Here's the trap. You can always drive the error on the training dots down to zero, if you're willing to make your shape wiggly enough. Imagine drawing a curve that passes exactly through every single dot — looping back on itself, jagging up and down, doing whatever it takes. On the training data, that shape is perfect. Zero errors. If you were graded only on the training dots, you would declare victory and go home.

The problem is the new dots. The ones you haven't seen yet. The whole point of a trained model was to predict those. And that wiggly shape, as it turns out, is usually much worse at predicting new dots than a simple straight line would have been. The wiggles fit the randomness of the training data, not the underlying pattern. A new dot lands, the wiggle sends the prediction wildly wrong, and the model you were so proud of is suddenly a liability.

That mismatch — looking great on training data, failing on anything new — is overfitting. And the one sentence to hold onto is this:

Overfitting is when a model memorises the examples instead of learning the pattern behind them.

The student memorised the practice test. The wiggly curve memorised the training dots. Same mistake. Same disaster.
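
If you'd like to watch the wiggly curve fail, here is a minimal Python sketch (entirely optional; the story stands without it). Everything in it is an assumption made for illustration: the "real pattern" is the line y = 2x + 1, the noise values are hand-written rather than random so the run is repeatable, and the wiggly shape is a degree-9 polynomial, which has just enough freedom to pass through all ten training dots.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical toy data: the real pattern is the straight line y = 2x + 1.
x_train = np.linspace(0.0, 1.0, 10)
# Hand-written "noise" (fixed, not random) so the run is repeatable.
noise = np.array([0.3, -0.2, 0.25, -0.3, 0.1, -0.15, 0.3, -0.25, 0.2, -0.3])
y_train = 2.0 * x_train + 1.0 + noise

# New dots the model never saw, sitting exactly on the real pattern.
x_new = np.linspace(0.0, 1.0, 201)
y_new = 2.0 * x_new + 1.0


def mse(model, x, y):
    """Mean squared error of a fitted shape on the dots (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


line = Polynomial.fit(x_train, y_train, deg=1)    # simple shape
wiggly = Polynomial.fit(x_train, y_train, deg=9)  # enough wiggle to hit all 10 dots

for name, model in [("line", line), ("wiggly", wiggly)]:
    print(f"{name:>6}: training error {mse(model, x_train, y_train):.4f}, "
          f"new-dot error {mse(model, x_new, y_new):.4f}")
```

The wiggly curve's training error comes out at essentially zero, yet its error on the new dots is far larger than the straight line's. That is the student and her practice test, in two fitted shapes.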


The seesaw: too wiggly vs too straight

Overfitting has an opposite, and the two sit on a seesaw.

The opposite is underfitting: a shape too simple to capture the actual pattern. The weather forecaster who always predicts "tomorrow will be like today" is underfitting: the shape (today repeats) is too blunt to catch a real storm moving in. A straight line drawn through a pile of dots that clearly have a curve to them is underfitting. The model isn't memorising anything; it just isn't flexible enough to follow the pattern.

Between those two extremes is a sweet spot: a shape complex enough to capture the real pattern, but not so complex that it starts chasing the random noise in the training data. That sweet spot is where every practitioner is trying to land. The whole craft of ML (regularisation, early stopping, dropout, data augmentation, model selection, a dozen tricks with scary names) is, at heart, just different ways of keeping the seesaw from tipping too far to either side.

The honest framing: a good model captures the signal; a bad model captures the signal plus the noise. Overfitting is what happens when you let it capture the noise. Underfitting is what happens when you don't let it capture enough of the signal. Every model is somewhere on that seesaw, and your job is to notice which way it's tipping.
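
For the curious, the seesaw can be sketched in a few lines of Python. This is a toy under stated assumptions, not a recipe: the "real pattern" is a single sine wave, the noise is hand-written so the run is repeatable, and the three candidate shapes are polynomials of degree 1 (too straight), 3 (in between) and 9 (too wiggly). All names here are made up for the example.

```python
import numpy as np
from numpy.polynomial import Polynomial


def pattern(x):
    """Hypothetical real pattern: one full wave."""
    return np.sin(2.0 * np.pi * x)


x_train = np.linspace(0.0, 1.0, 10)
# Hand-written "noise" (fixed, not random) so the run is repeatable.
noise = np.array([0.3, -0.2, 0.25, -0.3, 0.1, -0.15, 0.3, -0.25, 0.2, -0.3])
y_train = pattern(x_train) + noise

# New dots the model never saw, sitting exactly on the real pattern.
x_new = np.linspace(0.0, 1.0, 201)
y_new = pattern(x_new)


def mse(model, x, y):
    """Mean squared error of a fitted shape on the dots (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


errors = {}
for degree in (1, 3, 9):
    model = Polynomial.fit(x_train, y_train, deg=degree)
    errors[degree] = (mse(model, x_train, y_train), mse(model, x_new, y_new))

for degree, (train_error, new_error) in errors.items():
    print(f"degree {degree}: training error {train_error:.4f}, "
          f"new-dot error {new_error:.4f}")

# The sweet spot is the shape with the lowest error on the new dots.
best_degree = min(errors, key=lambda d: errors[d][1])
print("best on new dots: degree", best_degree)
```

Training error only ever goes down as the shape gets wigglier, which is exactly why it can't be trusted to find the sweet spot. The error on new dots is the one that turns back up, and in this toy it is lowest for the middle shape.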


Why overfitting is sneaky

Underfitting is easy to spot. Your model is bad on the training data. You can see it immediately — the shape clearly doesn't go through the dots. You fix it by making the model more expressive, and you move on.

Overfitting is the sneaky one. Because on the training data, an overfit model looks amazing. Zero error. Perfect score. Every example classified correctly. The scoreboard says you've won. If you stop there, you will ship a model that fails, often dramatically, the moment it meets a real user — and you won't find out until it's already embarrassing you.

This is why every ML practitioner alive has internalised the same rule: the score you care about is never the score on the training data. You have to check the model on examples it has never seen. Otherwise you are the student checking her own answers by looking at the practice test she memorised. You will always score 100%. You will always be wrong.

We'll get to exactly how you check (the train / validation / test discipline) in M2.6. For now, the paranoia is enough: anything that looks too good on training data is overfitting until proven otherwise.
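
Here is the student herself as a few lines of Python. It is a hypothetical sketch, not a real learning algorithm: the `Memoriser` class, the even-or-odd "subject" and the 60/40 split are all invented for illustration. The model stores every training example in a lookup table and falls back to its most common training answer for anything it hasn't seen.

```python
from collections import Counter


def label(n):
    """The hidden rule the model is supposed to learn: is n even?"""
    return n % 2 == 0


# Practice test (training data) and real test (held-out data).
train = [(n, label(n)) for n in range(0, 60)]
held_out = [(n, label(n)) for n in range(60, 100)]


class Memoriser:
    """Stores the answer key; knows nothing about the rule behind it."""

    def fit(self, examples):
        self.answers = dict(examples)
        # For unseen questions, guess the most common training answer.
        self.fallback = Counter(lbl for _, lbl in examples).most_common(1)[0][0]
        return self

    def predict(self, n):
        return self.answers.get(n, self.fallback)


def accuracy(model, examples):
    """Fraction of examples the model answers correctly."""
    return sum(model.predict(n) == lbl for n, lbl in examples) / len(examples)


model = Memoriser().fit(train)
train_accuracy = accuracy(model, train)
held_out_accuracy = accuracy(model, held_out)

print(f"Training accuracy: {train_accuracy:.0%}")   # 100%
print(f"Held-out accuracy: {held_out_accuracy:.0%}")  # 50%
```

A perfect score on the practice test, a coin flip on the real one. Only the second number says anything about what the model learned.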


The everyday faces of overfitting

Once you know the shape of this failure mode, you start seeing it everywhere, and not just in machine learning papers.

"It works great on our benchmark." Sometimes (not always, but sometimes) a paper will report dazzling numbers on a specific benchmark dataset, and the model then collapses completely on anything slightly different. That's overfitting to the benchmark. The model learned the idiosyncrasies of that particular test, not the underlying task. This is why the field has gotten increasingly strict about held-out test sets that researchers can't touch during training.

"I have a trading strategy that would have made a fortune over the last twenty years." This is the investment world's version, called backtest overfitting, and it ruins careers. If you try enough strategies on the same historical data, you will almost certainly find one that looks spectacular, but only because it has memorised the particular noise of that history. The moment you deploy it on the next month of real data, it falls apart. The graveyard of quantitative finance is full of curves that fit the past perfectly.

"Our model is 99% accurate." On what? If the answer is "on the training data," the number is meaningless and possibly a lie. If the answer is "on a held-out test set that nobody on the team has touched during development," the number means something. Most of the time, the claim doesn't specify. Most of the time, it isn't the held-out kind.

"We tuned our model until it worked." Tuned how? Against what? If you keep adjusting a model until it performs well on the same held-out data you keep checking it on, you've done something subtle and bad: you've slowly leaked that held-out set into your training, turning it into another practice test. The model overfits to the test set through the researcher's choices. This is one of the most common mistakes in professional ML, and it's the reason serious labs lock some data away and literally never look at it until the very end.

Hold onto the pattern: any time someone's performance number comes from the same data they've been optimising against, suspect overfitting. It's not always there. But often enough that you should always ask.
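
The trading-strategy version is easy to simulate, if you want to see it happen. The sketch below is a made-up toy, not a model of any real market: daily returns are pure random noise (so no strategy has a genuine edge), and each "strategy" is a hypothetical calendar rule of the form "hold only on day `offset` of every `cycle`-day cycle". We try over a thousand of them on the same history and keep the winner.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical market: daily returns are pure noise, so no rule has a real edge.
past = rng.normal(0.0, 0.01, size=250)     # the history we backtest on
future = rng.normal(0.0, 0.01, size=2500)  # what happens after we deploy


def average_return(returns, cycle, offset):
    """Average daily return of 'hold only on day `offset` of every `cycle`-day cycle'.

    Day counting restarts at the beginning of `returns`; for pure noise the
    alignment makes no difference.
    """
    return float(returns[offset::cycle].mean())


# Every calendar rule with a cycle of 2 to 50 days: 1274 "strategies".
rules = [(cycle, offset) for cycle in range(2, 51) for offset in range(cycle)]

# Keep whichever rule has the best backtest.
best_rule = max(rules, key=lambda rule: average_return(past, *rule))
backtest = average_return(past, *best_rule)
deployed = average_return(future, *best_rule)

print(f"strategies tried: {len(rules)}")
print(f"winner (cycle, offset): {best_rule}")
print(f"average daily return in the backtest: {backtest:+.4%}")
print(f"average daily return once deployed:  {deployed:+.4%}")
```

The winner's backtest looks like a real edge, but it was selected for fitting the noise of that particular history. On fresh data the same rule has nothing to stand on. Tuning a model against the same held-out set over and over has the same structure: many tries, one scoreboard, and a winner chosen partly by luck.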


Why bigger models can — sort of — break the seesaw

Here's a subtlety worth naming, because it's where modern AI gets interesting.

Classical ML wisdom said: the more parameters you give your model, the more it will overfit. A model with a million knobs can memorise far more noise than a model with a hundred knobs. For decades, this was a reliable rule, and "keep your model small enough to generalise" was core dogma.

Then transformers arrived, and people started training models with hundreds of billions of parameters on genuinely enormous datasets. By the old wisdom, these should have overfit catastrophically. But strangely, and for reasons not entirely understood even now, they often didn't. Giant models trained on giant data sometimes generalise better than smaller models, not worse. The seesaw got weirder. A whole line of research, built around a phenomenon called "double descent", exists to explain the pattern.

The short, honest version is: when you have an absurd amount of data, you can often get away with an absurd number of parameters. One common intuition, and it is only an intuition, is that with that much data the real signal swamps the noise, so memorising noise buys the model very little. But take any of those giant models and train it on a small dataset without enough care, and overfitting snaps right back. The rule hasn't been abolished. It's just been pushed into new territory.

So the modern mental model is: the seesaw still exists, but it moves with how much data you have, not just with how big your model is. More data lets you get away with bigger shapes. Less data demands smaller ones. A lot of "why this AI works" comes down to that trade.
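
A small Python sketch of that trade, with heavy caveats: it uses a toy polynomial rather than a neural network, it holds the shape's size fixed and only varies the amount of data, and it does not attempt to reproduce double descent. The sine-wave pattern, the noise level and the function names are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)


def pattern(x):
    """Hypothetical real pattern: one full wave."""
    return np.sin(2.0 * np.pi * x)


# New dots the model never saw, sitting exactly on the real pattern.
x_new = np.linspace(0.0, 1.0, 201)
y_new = pattern(x_new)


def new_dot_error(n_train, degree=9):
    """Fit one fixed-size shape to n_train noisy dots and score it on new dots."""
    x = np.linspace(0.0, 1.0, n_train)
    y = pattern(x) + rng.normal(0.0, 0.25, size=n_train)
    model = Polynomial.fit(x, y, deg=degree)
    return float(np.mean((model(x_new) - y_new) ** 2))


small_data_error = new_dot_error(10)    # as many knobs as dots
big_data_error = new_dot_error(1000)    # same shape, 100x the dots

print(f"degree-9 shape, 10 training dots:   new-dot error {small_data_error:.4f}")
print(f"degree-9 shape, 1000 training dots: new-dot error {big_data_error:.4f}")
```

With ten dots the degree-9 shape chases the noise. With a thousand, the very same shape has far more dots than knobs, and its error on new dots drops sharply.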


The one habit that protects you

Overfitting can't be eliminated. It can only be caught. And the habit that catches it — the one every competent ML practitioner develops and every careless one skips — is absurdly simple:

Never, ever judge a model on data it was trained on.

That's the rule. The whole discipline we'll meet in the next post (train, validate, test) exists to enforce it. Every piece of the modern ML workflow is some version of making sure that the number on the scoreboard came from data the model had never seen. When that discipline is honoured, overfitting gets caught early. When it's skipped, the model ships, looks great, and fails quietly on real users — which is somehow the worst version, because nobody even knows to be embarrassed.

If you take one thing from this whole module into any future conversation about AI, make it this. When someone tells you a model is great, the first question to ask is: great on what data? If the answer is "the data we trained it on," you've just learned nothing. If the answer is "a held-out test set we never touched," you might have learned something — and even then, you should check how they built that test set.


What just changed in your head

You started this post thinking of "overfitting" as a word you'd heard in passing — one of those technical gotchas that might matter someday. You're ending it seeing it as the one failure mode that haunts every flavour of machine learning, every flavour of statistical claim, every "we backtested this strategy" slide deck, every paper with suspiciously clean numbers. The student who memorised the practice test is everywhere, once you know to look.

Here's the sentence that might stick: overfitting is the ML version of mistaking familiarity for mastery. The shape knows the training dots too well, and because of that, it doesn't know the world well enough. You can feel the echo of this in plenty of places that have nothing to do with AI: the employee who can do the specific task but not the underlying job, the student who aces exams but can't hold a conversation about the subject, the strategy that worked in the past and doesn't know why. Same shape, same failure. The word just happens to have its most precise home in machine learning.

In the next and final post of this module, we zoom out one more time. Before a model is a shape, before it has a loss or a grader, before any of the flavours kick in — there's the data. What you feed a model determines almost everything about what it can do. "Garbage in, garbage out" isn't a slogan; it's very close to the entire story. We'll end Module 2 on the quiet truth that everyone in ML eventually learns: the data diet decides the model.


📚 AI Zero to Hero · Course Home — all 33 posts, six modules.


Cover photo via Unsplash. This post is part of the AI Zero to Hero series.
