From the course: Excel Data Analysis for Supply Chain: Forecasting

Simple linear regression fundamentals - Microsoft Excel Tutorial

Simple linear regression fundamentals

- [Presenter] Regression. It sounds scary, but what exactly is it? Simple linear regression analysis is simply a statistical attempt to understand the relationship between two variables. For example, what's the relationship between the miles on a certain type of car and the value of that car? In other words, how much value does my car lose every time I drive 500 miles? Well, suppose you are able to find relevant data for the type of car in question. Suppose all of these data points are for the same make and model of car, but each car has a different mileage and, therefore, a different value. With some real numbers, Excel would work to find the relationship between the two variables: car value and mileage. Excel would provide a line to fit the data, a regression line, and a formula for that regression line. Now, don't set your expectations too high. Sometimes the line and the data match up well. In other cases, the fit isn't very good.
So let's discuss the regression line and the formula for that line. The formula Excel gives us is in the form of the classic equation of a line: y = mx + b. Suppose this is the formula for the regression line: y = -0.10x + 21,600. First, let's consider b. b is called the intercept. If x were 0 miles, y would be 21,600. That's where the line intercepts, or crosses, the y axis. How about m? m is the slope. m is equal to -0.10. It tells us how much the value of the car changes with every mile. We lose 10 cents for every mile the car drives. In other words, the car loses $1,000 in value when the mileage goes up by 10,000 miles.
Our next term is R-squared, sometimes also called the coefficient of determination. R-squared is a measure of how well the line fits the data points. R-squared is a number between 0 and 1. An R-squared of 1.0 means that our regression line is a perfect fit. The closer the R-squared is to 1, the better the line fits the data. An R-squared of 0 means the line explains none of the variation in our data points, so it's an extremely poor fit.
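
To make the slope, intercept, and R-squared concrete, here is a minimal Python sketch of the same calculation outside Excel. The mileage and value figures are hypothetical, invented so that the fitted line comes out to the example formula y = -0.10x + 21,600; they are not data from the course. The helper names `fit_line` and `r_squared` are also assumptions for illustration. In Excel itself, the SLOPE, INTERCEPT, and RSQ worksheet functions return the corresponding numbers.

```python
# Hypothetical data: (mileage, value) pairs invented for illustration only.
mileage = [10_000, 20_000, 30_000, 40_000, 50_000]
value = [20_800, 19_300, 18_700, 17_500, 16_700]


def fit_line(xs, ys):
    """Return (slope m, intercept b) of the least-squares line y = m*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the least-squares line always passes through the mean point.
    b = y_mean - m * x_mean
    return m, b


def r_squared(xs, ys, m, b):
    """Share of the variation in ys that the line y = m*x + b explains."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))  # unexplained
    ss_tot = sum((y - y_mean) ** 2 for y in ys)  # total variation
    return 1 - ss_res / ss_tot


m, b = fit_line(mileage, value)
r2 = r_squared(mileage, value, m, b)

print(f"slope m = {m:.2f} dollars per mile")
print(f"intercept b = {b:,.0f} dollars")
print(f"value lost per 10,000 miles = {-m * 10_000:,.0f} dollars")
print(f"R-squared = {r2:.3f}")
```

With these made-up points the line fits closely but not perfectly, so the R-squared lands a little below 1.
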
As I said, the better the fit, the closer the R-squared is to 1.0. Next, let's look at standard error. Standard error is, roughly, the typical distance from the line to the actual data points. It's measured up and down, that is, vertically. Obviously, the lower the standard error, the better. In this chapter, we'll be calculating standard error, so when we do, remember, it's just a measure of how far off the actual sales are from the predicted sales. And in case you were wondering, these simple regression lines will be what we're using to create forecasts in this chapter. We'll get some datasets. We'll have Excel do the hard work: the regression line, the formula for the line, and the R-squared. Then we'll use that information to create a forecast. Okay, time to head over to Excel and explore some simple linear regressions.
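
As a rough illustration of standard error and of using the line to forecast, the sketch below reuses the car example with hypothetical numbers (invented here, not course data). It assumes the usual definition of the standard error of a regression, the square root of the sum of squared vertical misses divided by n - 2, which is the quantity Excel's STEYX function reports. The function names and the 60,000-mile forecast point are illustrative assumptions.

```python
import math

# Hypothetical data: (mileage, value) pairs invented for illustration only.
mileage = [10_000, 20_000, 30_000, 40_000, 50_000]
value = [20_800, 19_300, 18_700, 17_500, 16_700]


def fit_line(xs, ys):
    """Return (slope m, intercept b) of the least-squares line y = m*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    return m, y_mean - m * x_mean


def standard_error(xs, ys, m, b):
    """Typical vertical distance between the actual points and the line."""
    # Vertical miss for each point: actual y minus the y the line predicts.
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    # Divide by n - 2 because two values (m and b) were estimated from the data.
    return math.sqrt(ss_res / (len(xs) - 2))


m, b = fit_line(mileage, value)
se = standard_error(mileage, value, m, b)

# Forecast: plug a new x into the line. 60,000 miles is an arbitrary example.
forecast = m * 60_000 + b

print(f"standard error = {se:,.2f} dollars")
print(f"forecast value at 60,000 miles = {forecast:,.0f} dollars")
```

Read together, the two numbers say that the forecast is the line's best guess and the standard error is about how far off the actual values in the data have typically been from that line.
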
