Reading 7: Introduction to Linear Regression

50 questions available

Module 7.1: Linear Regression Introduction and Assumptions (10 min)
Simple linear regression explains the variation in a dependent variable using a single independent variable. The regression line, or line of best fit, is determined by minimizing the Sum of Squared Errors (SSE). The slope coefficient indicates the expected change in the dependent variable for a one-unit change in the independent variable. The intercept is the expected value of the dependent variable when the independent variable is zero. Key assumptions for the model include a linear relationship, constant variance of residuals (homoskedasticity), independently distributed residuals, and normally distributed residuals.

Key Points

  • Dependent variable (Y) is explained; Independent variable (X) is the predictor.
  • OLS minimizes the sum of squared vertical distances (residuals) between observed and predicted Y values.
  • Slope (b1) = Cov(X,Y) / Var(X).
  • Intercept (b0) = Mean(Y) - b1 * Mean(X).
  • Assumptions: Linearity, Homoskedasticity, Independence, Normality.
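The slope and intercept formulas in the key points can be sketched in a few lines of Python. The data below are hypothetical, chosen only to make the arithmetic easy to follow; the sketch uses sample (n - 1) moments throughout.

```python
# Minimal sketch of the OLS slope and intercept formulas.
# x and y are made-up illustrative data, not from the reading.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sample covariance of X and Y, and sample variance of X
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)

b1 = cov_xy / var_x        # slope: Cov(X, Y) / Var(X)
b0 = mean_y - b1 * mean_x  # intercept: Mean(Y) - b1 * Mean(X)
# For this sample, b1 ≈ 1.96 and b0 ≈ 0.14
```

Note that the (n - 1) divisors cancel in the slope ratio, so using population (n) moments in both numerator and denominator would give the same b1.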
Module 7.2: Goodness of Fit and Hypothesis Tests (15 min)
This section covers the Analysis of Variance (ANOVA) table, which breaks down Total Sum of Squares (SST) into Regression Sum of Squares (RSS) and Sum of Squared Errors (SSE). Key metrics include the Coefficient of Determination (R-squared), representing the percentage of explained variation, and the Standard Error of Estimate (SEE), which indicates the fit's precision. Hypothesis tests using the t-statistic determine if the slope coefficient is significantly different from zero, while the F-test assesses the overall model significance.

Key Points

  • SST = RSS + SSE.
  • R-squared = RSS / SST; measures explained variation.
  • SEE = square root of (SSE / (n - 2)).
  • F-statistic = MSR / MSE; tests if the model explains significant variation.
  • t-test determines if the slope is statistically significant.
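The ANOVA decomposition and fit statistics above can be sketched end to end. The data are again hypothetical; the point is that SST = RSS + SSE falls out of the fitted values, and that in simple regression the t-statistic squared equals the F-statistic.

```python
# Sketch of the ANOVA table quantities for a simple regression,
# using illustrative data. SST should equal RSS + SSE.
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

sxx = sum((xi - mean_x) ** 2 for xi in x)
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
b0 = mean_y - b1 * mean_x
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - mean_y) ** 2 for yi in y)              # total variation
rss = sum((yh - mean_y) ** 2 for yh in y_hat)          # explained (regression) variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation

r_squared = rss / sst                 # share of variation explained
see = math.sqrt(sse / (n - 2))        # standard error of estimate
f_stat = (rss / 1) / (sse / (n - 2))  # MSR / MSE, with df = (1, n - 2)
t_stat = b1 / (see / math.sqrt(sxx))  # t for H0: b1 = 0; here t^2 equals F
```

This also illustrates why, in simple linear regression, the F-test and the t-test on the slope always reach the same conclusion.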
Module 7.3: Prediction and Functional Forms (10 min)
The regression model allows for predicting the dependent variable given a value for the independent variable. Confidence intervals provide a range for these predictions, accounting for uncertainty. When variables do not have a linear relationship, transformations like natural logarithms can be used. The interpretation of the slope coefficient changes based on the functional form: Log-lin (relative change in Y for absolute change in X), Lin-log (absolute change in Y for relative change in X), and Log-log (relative change in Y for relative change in X).

Key Points

  • Predicted Y is calculated using estimated b0 and b1 with a given X.
  • Prediction intervals depend on the standard error of the forecast.
  • Log-lin model: Slope represents percentage change in Y for unit change in X.
  • Lin-log model: Slope represents unit change in Y for percentage change in X.
  • Log-log model: Slope represents percentage change in Y for percentage change in X (elasticity).
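The log-log interpretation can be checked with a small sketch. The data below are synthetic by construction, y = 2 * x ** 0.8, so a regression of ln(Y) on ln(X) should recover a slope (elasticity) of 0.8 and an intercept of ln(2).

```python
# Sketch: in a log-log model, the slope is an elasticity.
# Synthetic data: y = 2 * x ** 0.8 exactly.
import math

x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [2.0 * xi ** 0.8 for xi in x]

lx = [math.log(v) for v in x]
ly = [math.log(v) for v in y]
n = len(lx)
mx, my = sum(lx) / n, sum(ly) / n

b1 = sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / \
     sum((a - mx) ** 2 for a in lx)
b0 = my - b1 * mx

# b1 is the elasticity: a 1% change in X implies roughly a b1% change in Y.
# Here b1 recovers 0.8 and exp(b0) recovers the scale factor 2,
# up to floating-point error.
```

A predicted level of Y for a new X is then exp(b0 + b1 * ln(X)), which is why slope interpretation differs from the plain lin-lin model.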

Questions

Question 1

In a simple linear regression model, the variable that serves as the predictor is best described as the:

Question 2

Which of the following best describes the method of Ordinary Least Squares (OLS)?

Question 3

If the covariance between variables X and Y is 20 and the variance of X is 10, what is the estimated slope coefficient for the regression of Y on X?

Question 4

Given a regression equation Y = 0.5 + 1.2X, how is the slope coefficient interpreted?

Question 5

Calculate the intercept of a regression line where the mean of Y is 15, the mean of X is 5, and the estimated slope coefficient is 2.

Question 6

Which of the following is NOT a necessary assumption of simple linear regression?

Question 7

The condition where the variance of the error term is not constant across all observations is known as:

Question 8

In the context of regression analysis, the term 'residuals' refers to:

Question 9

If the Regression Sum of Squares (RSS) is 60 and the Sum of Squared Errors (SSE) is 40, what is the Total Sum of Squares (SST)?

Question 10

The coefficient of determination (R-squared) measures:

Question 11

If a regression model has an RSS of 45 and an SST of 100, what is the value of R-squared?

Question 12

The Standard Error of Estimate (SEE) is calculated as the square root of which value?

Question 13

In a simple linear regression with 32 observations, what are the degrees of freedom for the error term?

Question 14

Calculate the F-statistic if the Mean Regression Sum of Squares (MSR) is 100 and the Mean Squared Error (MSE) is 20.

Question 15

In a simple linear regression, the F-test is equivalent to determining the statistical significance of:

Question 16

The appropriate degrees of freedom for the numerator in the F-test for a simple linear regression is:

Question 17

If the calculated t-statistic for a slope coefficient is 2.5 and the critical t-value is 2.0, what is the appropriate conclusion?

Question 18

For a simple linear regression, the null hypothesis used to test the statistical significance of the slope coefficient is:

Question 19

If the standard error of the slope coefficient is 0.5 and the estimated slope is 1.5, what is the calculated t-statistic for testing if the slope is zero?

Question 20

Given a regression equation Y = 2 + 3X, what is the predicted value of Y when X is 4?

Question 21

In a log-lin model where ln(Y) = b0 + b1*X, how is the slope coefficient (b1) interpreted?

Question 22

In a lin-log model where Y = b0 + b1*ln(X), the slope coefficient indicates:

Question 23

Which functional form involves taking the natural logarithm of both the dependent and independent variables?

Question 24

If a stock's return is explained by the market return, the stock's return is the:

Question 25

If the correlation coefficient between two variables is 0.7, what is the coefficient of determination (R-squared)?

Question 26

What is the relationship between Total Sum of Squares (SST), Regression Sum of Squares (RSS), and Sum of Squared Errors (SSE)?

Question 27

In an ANOVA table, the Mean Squared Error (MSE) is calculated by dividing:

Question 28

A residual plot that shows residuals increasing in magnitude as the independent variable increases indicates:

Question 29

If the confidence interval for a slope coefficient includes zero, we can conclude that:

Question 30

Which of the following values can the coefficient of determination (R-squared) NOT take?

Question 31

Standard Error of Estimate (SEE) measures:

Question 32

If a regression has an SSE of 200 and n = 22, what is the MSE?

Question 33

In simple linear regression, the F-statistic is always a:

Question 34

The intercept term in a regression is best interpreted as:

Question 35

Using a 5 percent significance level with 34 degrees of freedom, the critical t-value is approximately 2.03. If the calculated t-statistic is 2.46, you should:

Question 36

Which of the following indicates a stronger linear relationship?

Question 37

If a regression has R-squared of 0.64, what is the correlation coefficient?

Question 38

The Sum of Squared Errors (SSE) measures:

Question 39

A confidence interval for a predicted Y-value relies on:

Question 40

In a regression model Y = b0 + b1*X + error, the term 'error' represents:

Question 41

If a slope coefficient is 0.64, it implies that:

Question 42

In the context of the F-test, Mean Regression Sum of Squares (MSR) is calculated as:

Question 43

If the correlation between X and Y is negative, the slope coefficient of the regression of Y on X must be:

Question 44

A confidence interval for the predicted value of Y becomes wider when:

Question 45

Which assumption is violated if the variance of the residuals increases as the predicted values increase?

Question 46

Calculate the t-statistic for a slope of 1.2 with a standard error of 0.4.

Question 47

In the ANOVA table for simple regression, the Total Sum of Squares (SST) degrees of freedom is:

Question 48

If RSS = 80 and SSE = 20, what is the value of R-squared?

Question 49

Prediction intervals are generally wider than confidence intervals because:

Question 50

If a company uses a regression model to predict sales based on advertising spend, 'Sales' is the:
