The Line of Best Fit: A Simple Guide

Here you will be introduced to the Line of Best Fit, a fundamental concept you need to understand.

You have already been introduced to the Cartesian Plane and a line that represents the relationship between two sets of variables.

Here, we will discuss how this line, called the line of best fit, is plotted. This line comes as close as possible to all the dots on a graph, showing the relationship between two variables. We use it to see connections, predict future trends, and understand how one variable changes with another. Residuals, or “mistakes,” show how far each dot is from the line. Smaller mistakes (errors) mean a better fit. This line is usually more informative than just using the average, helping us understand real-life relationships.

What is the Line of Best Fit?

Imagine you have a bunch of dots on a graph. The line of best fit is like drawing a straight line that tries to get as close to all these dots as possible.
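
If you would like to see this on a computer, here is a minimal sketch in Python (assuming numpy is available; the dots are made up for illustration) that finds the slope and intercept of the line of best fit:

```python
# A minimal sketch: fit a straight line to some made-up dots.
import numpy as np

x = np.array([1, 2, 3, 4, 5])             # hypothetical x-values
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # hypothetical y-values

# deg=1 asks for a straight line; polyfit returns slope then intercept
slope, intercept = np.polyfit(x, y, deg=1)
print(f"line of best fit: y = {slope:.2f}x + {intercept:.2f}")
```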

Why do we use it?

  • See Trends: It helps us quickly see if there’s a positive, negative, or no relationship between two variables.
  • Make Predictions: We can use the line to guess what might happen for new data points.
  • Understand Relationships: It shows us how much one variable typically changes when the other changes.

How does it work?

Think of it like this:

If you had a piece of string and tried to lay it across all the dots on your graph so it’s close to as many as possible, that’s what the line of best fit does.

Residuals: The Building Blocks

Residuals are the differences between what our model predicts and what we actually observe. They’re like little “mistakes” our model makes.
  • If a residual is positive, our model predicted too low.
  • If a residual is negative, our model predicted too high.
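
As a quick illustration, the sketch below (same made-up dots as before, numpy assumed) computes each residual and shows its sign:

```python
# Sketch: residual = observed value - predicted value.
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

slope, intercept = np.polyfit(x, y, deg=1)
predicted = slope * x + intercept
residuals = y - predicted  # positive: model guessed too low; negative: too high
print(residuals)
```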

SST: Measuring Total Variation

SST stands for Total Sum of Squares. It measures how spread out our data is from the average.
  • It’s like measuring how much our data varies overall before we try to explain it with our model.

Why it’s important:

  • SST gives us a baseline to compare our model against.
  • It represents the total amount of variation we’re trying to explain.
[Image: a Cartesian plane showing the line of best fit, with SST, SSE, and residuals labelled]
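
To make this concrete, here is a small sketch (same hypothetical data) that computes SST as the sum of squared differences between each observation and the overall average:

```python
# Sketch: SST measures total spread around the mean.
import numpy as np

y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
sst = np.sum((y - y.mean()) ** 2)  # squared distances from the average
print(f"SST = {sst:.2f}")
```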

SSR: Sum of Squares due to Regression

The Sum of Squares due to Regression (SSR) quantifies how much of the total variation in the data is captured by the model’s predictions. It measures the improvement in prediction accuracy compared to just using the mean of the observed data.
  • What it does: SSR looks at the difference between the predicted values from our model and the average value of the data. The larger this difference, the better our model is at capturing relationships in the data.
  • Why it’s useful: SSR tells us how much of the variation in the outcome variable is explained by the independent variables in the model.

Why It’s Important:

  • SSR helps us assess how well our model is working. A larger SSR means the model explains more of the variation in the data.
  • It is used to compare the performance of different models and to see how well our predictors are explaining the changes in the outcome.
In summary, SSR highlights the strength of our model in explaining the data. The higher the SSR, the better our model is at capturing the relationships between the variables.
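
Here is a matching sketch (hypothetical data as before): SSR adds up the squared distances between the model’s predictions and the average of the observed values:

```python
# Sketch: SSR compares the model's predictions with the mean of y.
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

slope, intercept = np.polyfit(x, y, deg=1)
predicted = slope * x + intercept
ssr = np.sum((predicted - y.mean()) ** 2)  # variation explained by the line
print(f"SSR = {ssr:.2f}")
```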

SSE: Measuring Our Model’s Mistakes

SSE stands for Sum of Squared Errors (Residuals). It’s a way to add up all our model’s mistakes.
  • We square each residual (to make negatives positive) and add them all up.
  • A smaller SSE means our model is making smaller mistakes overall.

Why it’s important:

  • SSE helps us compare different models.
  • It tells us how much error is left in our model.
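
A small sketch of the same idea (hypothetical data again): square each residual and add them all up:

```python
# Sketch: SSE adds up the squared residuals.
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)
sse = np.sum(residuals ** 2)  # squaring makes negatives positive
print(f"SSE = {sse:.2f}")
```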

Tying It All Together

R-squared (R²) is a measure that uses both SSE and SST to tell us how good our model is:
R-squared = 1 − (SSE / SST)
  • If SSE is small compared to SST, R-squared will be close to 1 (or 100%).
  • This means our model explains most of the variation in the data.
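
Putting the pieces together, the sketch below (same made-up data) computes R² from SSE and SST, and also shows that SST splits into SSR plus SSE for an ordinary least-squares line:

```python
# Sketch: R-squared = 1 - SSE/SST, and SST = SSR + SSE.
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

slope, intercept = np.polyfit(x, y, deg=1)
predicted = slope * x + intercept

sst = np.sum((y - y.mean()) ** 2)        # total variation
sse = np.sum((y - predicted) ** 2)       # unexplained variation
ssr = np.sum((predicted - y.mean()) ** 2)  # explained variation

print(f"R-squared = {1 - sse / sst:.3f}")
print(f"SSR + SSE = {ssr + sse:.2f}, SST = {sst:.2f}")  # these match
```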

We will cover R² in more detail next week.

Why This Matters

  1. Model Quality: By looking at residuals, SSE, and SST, we can judge how well our model fits the data.
  2. Improvement: If our SSE is large compared to SST (low R-squared), we know we need to improve our model.
  3. Prediction Accuracy: Smaller residuals (and thus smaller SSE) mean our predictions are more accurate.
  4. Understanding Relationships: A good model (low SSE, high R-squared) helps us understand how variables are related.

In Simple Terms

Imagine you’re trying to guess people’s weights based on their heights:

  • Residuals are how far off each guess is.
  • SSE is the total of all your squared guessing errors.
  • SST is how much weights vary overall.
  • If your SSE is much smaller than SST, you’re doing a good job guessing!

By understanding residuals, SSE, and SST, we can tell how well our model explains the world around us, and how much we can trust its predictions.

The Mean Model vs. The Line of Best Fit

The mean model in statistics is a simple approach where we use the average value of a dataset as a baseline to predict outcomes. This model predicts the mean value for every observation, regardless of the input variables. On a graph it is a flat line at the average value, representing the idea that, without further information, the best guess for any new data point is the overall mean of the data.

Using the mean model is helpful for comparison purposes. It provides a benchmark to evaluate more complex models. For example, in linear regression, we can compare the performance of the regression line against the mean model to see how much better the regression line predicts the outcomes. The mean model’s simplicity makes it a good starting point, but it often lacks the nuance to capture the variability and trends in the data effectively.

In practice, while the mean model offers a straightforward understanding, more sophisticated models like the line of best fit (or linear regression) can provide deeper insights by showing how variables relate to one another, allowing for more accurate predictions and a better understanding of the underlying patterns in the data.
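
Here is a sketch of that comparison (hypothetical data, numpy assumed): the mean model’s squared errors add up to SST, while the line of best fit leaves much less error behind:

```python
# Sketch: mean model (flat line at the average) vs. line of best fit.
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

mean_errors = y - y.mean()                 # mean model's residuals
slope, intercept = np.polyfit(x, y, deg=1)
line_errors = y - (slope * x + intercept)  # regression residuals

print(f"mean model SSE: {np.sum(mean_errors ** 2):.2f}")    # this equals SST
print(f"best-fit line SSE: {np.sum(line_errors ** 2):.2f}")  # much smaller
```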

Why the Line of Best Fit is Usually Better

The line of best fit usually explains more about our dots than just using the average. It looks at how the dots change across the graph, not just up and down.

Real-Life Example

Let’s say we’re looking at how much time people spend studying and their test scores:

  • Each dot on our graph is one student.
  • The position left-to-right shows how long they studied.
  • The position up-and-down shows their test score.

Our line of best fit might show that, on average, studying more leads to better scores. But remember, it’s not perfect – some dots (students) will be above the line (did better than expected) and some below (did worse than expected).
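
For readers who want to try it, here is a sketch with made-up study data (the numbers are invented purely for illustration): it fits the line and flags which students fall above it.

```python
# Sketch: hours studied vs. test score, with hypothetical numbers.
import numpy as np

hours = np.array([1, 2, 3, 4, 5, 6])          # hypothetical study time
scores = np.array([52, 55, 61, 70, 74, 79])   # hypothetical test scores

slope, intercept = np.polyfit(hours, scores, deg=1)
predicted = slope * hours + intercept
above = scores > predicted  # True: student did better than expected
print(f"each extra hour adds about {slope:.1f} points on average")
print("above the line:", above)
```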

In Simple Terms

The line of best fit is our best guess at showing how two things are related. It’s not perfect, but it’s usually better than just using the average. By looking at how close our line is to all the dots, we can tell how good our guess is and how strongly the two things are connected.

This article is from the free online course Introduction to Statistics without Maths: Regressions, created by FutureLearn.
