Which of the following is the equation of the least-squares regression line?

Imagine you have a scatterplot full of points, and you want to draw the line which will best fit your data. This best line is the Least Squares Regression Line (abbreviated as LSRL).

 

General LSRL Formula

Formula: \(\widehat{y} = a + bx\) 

Here \(\widehat{y}\) is the predicted y-value for a given x, \(a\) is the y-intercept, and \(b\) is the slope.

 

The Least Squares Regression Line Predicts \(\widehat{y} \) 

For every x-value, the Least Squares Regression Line makes a predicted y-value that is close to the observed y-value, but usually slightly off. This predicted y-value is called "y-hat" and symbolized as \(\widehat{y} \). The observed y-value is merely called "y." 

 

Residuals

Let's take a moment to notice the little gap between the observed y-value (the scatter point labelled y) and the predicted y-value (the point on the line labelled \(\widehat{y} \)). This gap is called the residual. The full definition and formula are below: 

Definition:  The residual is the vertical distance between the observed point and your predicted y-value. 

Formula:  residual = \(y - \widehat{y}\)

** REMEMBER: the residual is observed minus predicted!!

 

The LSRL and Residuals

By now, you know that the Least Squares Regression Line goes through a scatterplot of points and predicts a y-value (\(\widehat{y}\)) for any given x. You also know that the goal here is to create the best-fitting line possible. This is where residuals come into play. The LSRL fits "best" because it makes the residuals, taken together, as small as possible.

The Least Squares Regression Line is the line that minimizes the sum of the squared residuals.

In other words, for any line other than the LSRL, the sum of the squared residuals will be greater. This is what makes the LSRL the sole best-fitting line.
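
To make the minimization concrete, the sketch below compares the sum of squared residuals for a least-squares line with that of a nearby line. The data points and the helper name `sum_squared_residuals` are made up for illustration; the slope and intercept are computed with the standard least-squares formulas.

```python
# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sum_squared_residuals(a, b, xs, ys):
    """Sum of (observed - predicted)^2 for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Least-squares slope and intercept from the standard formulas.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

ssr_lsrl = sum_squared_residuals(a, b, xs, ys)
ssr_other = sum_squared_residuals(a + 0.5, b - 0.1, xs, ys)  # a nearby line

print(ssr_lsrl < ssr_other)  # the LSRL has the smaller sum
```

Any other choice of intercept and slope in place of `a + 0.5` and `b - 0.1` should likewise give a larger sum than `ssr_lsrl`.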

 

Calculating the Least Squares Regression Line

When given all of the data points, you can use your calculator to find the LSRL.

Step 1: Go to STAT, and click EDIT. Then enter all of the data points into lists 1 and 2.

Step 2: Go to STAT, and move right to CALC. Then select LinReg. Pressing ENTER to run this function will give you the slope and y-intercept of your LSRL as well as the r and r² values.

When you do not have the data points, there is a way to calculate the LSRL by hand. There are two key facts you need to know:
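
A standard pair of facts that suffices for this calculation, stated here as an assumption, is that the slope equals r times the ratio of the standard deviations (b = r·s_y/s_x) and that the LSRL passes through the point of means (x̄, ȳ). A minimal sketch under that assumption, with a hypothetical function name and made-up summary statistics:

```python
def lsrl_from_summary(x_bar, y_bar, s_x, s_y, r):
    """Return (a, b) for y-hat = a + b*x from summary statistics.

    Assumes two standard facts: the slope is b = r * s_y / s_x, and the
    line passes through the point of means (x_bar, y_bar).
    """
    b = r * s_y / s_x
    a = y_bar - b * x_bar  # solve y_bar = a + b * x_bar for a
    return a, b


# Hypothetical summary statistics, for illustration only.
a, b = lsrl_from_summary(x_bar=10.0, y_bar=50.0, s_x=2.0, s_y=8.0, r=0.5)
print(a, b)
```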

The least-squares regression method is a form of regression analysis that establishes the relationship between the dependent and independent variables along a straight line. This line is referred to as the “line of best fit.”

Regression analysis is a statistical method for estimating or predicting the unknown values of one variable from the known values of another variable. The variable used to predict the variable of interest is called the independent or explanatory variable, and the variable predicted is called the dependent or explained variable.

Let us consider two variables, x and y. These are plotted on a graph with values of x on the x-axis and values of y on the y-axis, and each dot represents one pair of values. A straight line is drawn through the dots, referred to as the line of best fit.

The objective of least squares regression is to ensure that the line drawn through the set of values provided establishes the closest relationship between the values.

Least Squares Regression Formula

One can calculate the regression line under the least squares method using the following formula:

ŷ = a + bx

Where,

  • ŷ = predicted value of the dependent variable
  • x = independent variable
  • a = y-intercept
  • b = slope of the line

One can calculate the slope of the line, b, using the following formula:

\(b = \dfrac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}\)

Or

\(b = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}\)

The y-intercept, a, is calculated using the following formula:

\(a = \dfrac{\sum y - b \sum x}{n}\)

Here n is the number of data points, and \(\bar{x}\) and \(\bar{y}\) are the means of x and y.
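
The slope and intercept calculations can be sketched as a short Python function. The function name `least_squares_line` and the sample points are hypothetical; the formulas in the comments are the standard sum-based ones, the same ones applied in the worked example further on.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using sum-based formulas.

    b = (sum(xy) - sum(x)*sum(y)/n) / (sum(x^2) - sum(x)^2/n)
    a = (sum(y) - b*sum(x)) / n
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical points lying exactly on y = 1 + 2x.
a, b = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

Because the sample points lie exactly on a line, the function should recover that line's intercept and slope, which makes a convenient sanity check.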

Line of Best Fit in the Least Square Regression

The line of best fit is a straight line drawn through a scatter of data points that best represents the relationship between them.

Consider a graph in which a data set is plotted along the x- and y-axes, with the data points shown as blue dots. Three lines are drawn through these points: a green, a red, and a blue line. The green line passes through a single point, and the red line passes through three data points. The blue line, however, passes through four data points, and the distances between the remaining points and the blue line are minimal compared with the other two lines.

In that graph, the blue line represents the line of best fit: it lies closest to all the values, and the vertical distances from the points to the line (the residuals, whose squares are summed to give the sum of squares of residuals) are minimal. For the other two lines, the residuals are larger than for the blue line.

The least-squares method provides the closest relationship between the dependent and independent variables by minimizing the sum of the squares of the residuals about the line of best fit. Hence the term “least squares.”

Examples of Least Squares Regression Line

Let us apply these formulae to the below question:

Example #1

The table below lists the experience (in years) of technicians in a company and their performance ratings. Using these values, estimate the performance rating for a technician with 20 years of experience.

Experience of Technician (in Years)    Performance Rating
16                                     87
12                                     88
18                                     89
4                                      68
3                                      78
10                                     80
5                                      75
12                                     83

Solution –

To find the least-squares line, we first calculate the slope of the line (b) and then the y-intercept (a). From the table, n = 8, Σx = 80, Σy = 648, Σxy = 6727, and Σx² = 1018.

The slope of Line (b)

  • b = [6727 – (80 × 648)/8] / [1018 – (80)²/8]
  • = 247/218
  • = 1.13

Y-intercept (a)

  • a = [648 – (1.13)(80)] / 8
  • = 69.7

The regression line is therefore ŷ = 69.7 + 1.13x.

Substituting 20 for the value of x in the formula,

  • ŷ = a + bx
  • ŷ = 69.7 + (1.13)(20)
  • ŷ = 92.3

The performance rating for a technician with 20 years of experience is estimated to be 92.3.
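
The arithmetic in Example #1 can be checked with a short Python sketch. The variable names are illustrative; the data pairs are the eight rows of the table. The sketch keeps the unrounded slope when computing the intercept, whereas the worked solution rounds b to 1.13 first; both routes agree after rounding.

```python
# Data from the Example #1 table: (years of experience, performance rating).
data = [(16, 87), (12, 88), (18, 89), (4, 68),
        (3, 78), (10, 80), (5, 75), (12, 83)]

n = len(data)
sum_x = sum(x for x, _ in data)       # 80
sum_y = sum(y for _, y in data)       # 648
sum_xy = sum(x * y for x, y in data)  # 6727
sum_x2 = sum(x * x for x, _ in data)  # 1018

b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)  # 247 / 218
a = (sum_y - b * sum_x) / n

prediction = a + b * 20  # estimated rating at 20 years of experience

print(round(b, 2), round(a, 1), round(prediction, 1))  # 1.13 69.7 92.3
```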

Example #2

Least Squares Regression Equation Using Excel

One can compute the least-squares regression equation using Excel by the following steps:

  • Insert the data table in Excel.
  • Insert a scatter graph using the data points.
  • Insert a trendline within the scatter graph.
  • Under trendline options, select linear trendline and select “Display Equation on chart.”
  • The least-squares regression equation for the given set of Excel data is displayed on the chart.

Thus, one can calculate the least-squares regression equation for the Excel data set. One may then use the equation to make predictions and perform trend analyses. Excel tools also provide detailed regression computations.

Advantages

  • The least-squares regression analysis method best suits prediction models and trend analysis. One may best use it in economics, finance, and stock markets, wherein the value of any future variable is predicted with the help of existing variables and the relationship between them.
  • The least-squares method provides the closest relationship between the variables. The sum of the squares of the residuals about the line of best fit is minimal under this method.
  • The computation mechanism is simple and easy to apply.

Disadvantages

  • The least-squares method establishes the closest relationship between a given set of variables. The computation is sensitive to the data, and any outliers (exceptional data points) may significantly affect the results.
  • This type of calculation is best suited for linear models. For nonlinear relationships, more exhaustive computation methods are applied.

Conclusion

The least-squares method is one of the most popular methods for prediction models and trend analysis. When calculated appropriately, it delivers the best results.

Which of the following is the equation of the regression line?

A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of y when x = 0).

What is a least squares regression line example?

Using the technician example with a = 69.7 and b = 1.13, substituting 20 for the value of x in the formula ŷ = a + bx gives ŷ = 69.7 + (1.13)(20) = 92.3.

Is the least squares line the regression line?

If the data show a linear relationship between two variables, the line that best fits this relationship is known as the least-squares regression line. It minimizes the sum of the squared vertical distances from the data points to the line.

What is the formula for the equation of the least squares regression line?

The equation of the least squares regression line is given by ŷ = a + bx, where a represents the y-intercept of the line and b the slope. The intercept of the regression line is simply the predicted value of y when x is 0.