Basics of Simple Linear Regression (5 min)
Regression analysis builds a mathematical model that explains a dependent variable (Y) in terms of an independent variable (X). In simple linear regression the relationship is a straight line: Y = a + b*X + e, where e is the error term. The slope 'b' is the expected change in Y for a one-unit change in X and is estimated as Cov(X,Y)/Var(X). The intercept 'a' is the expected value of Y when X is zero and is estimated as mean(Y) - b*mean(X). Ordinary Least Squares (OLS) chooses a and b to minimize the Sum of Squared Errors (SSE), the sum of squared differences between the observed and fitted values of Y.

Key Points

  • Dependent Variable (Y) vs. Independent Variable (X).
  • Slope formula: Covariance(X,Y) / Variance(X).
  • Intercept formula: mean(Y) - slope * mean(X).
  • OLS minimizes squared residuals.
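
The slope and intercept formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a full regression routine: the function name `ols_fit` and the sample data are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
# Hypothetical sample data, used only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def ols_fit(xs, ys):
    """Estimate intercept and slope for Y = a + b*X + e by OLS."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample variance both divide by (n - 1).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    slope = cov_xy / var_x                # b = Cov(X,Y) / Var(X)
    intercept = mean_y - slope * mean_x   # a = mean(Y) - b * mean(X)
    return intercept, slope


intercept, slope = ols_fit(xs, ys)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With this data, Cov(X,Y) = 1.5 and Var(X) = 2.5, so the slope is 0.6 and the intercept is 4 - 0.6 * 3 = 2.2.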

Assumptions of the Model (5 min)
Valid inference relies on four assumptions: Linearity (the relationship between Y and X is linear in the parameters), Homoskedasticity (the error terms have constant variance across all observations), Independence (the residuals are uncorrelated across observations), and Normality (the residuals are normally distributed). Violations such as heteroskedasticity or non-normality can invalidate hypothesis tests and confidence intervals.

Key Points

  • Linearity in parameters.
  • Homoskedasticity vs. Heteroskedasticity.
  • Independence of observations.
  • Normality of residuals.
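
As a rough illustration of how two of these assumptions might be checked, the sketch below computes two informal diagnostics on a set of residuals: a variance ratio between the high-X and low-X halves (homoskedasticity) and a lag-1 autocorrelation (independence). These are not formal statistical tests, and the function names and residual values are hypothetical.

```python
def sample_variance(values):
    """Sample variance with the (n - 1) divisor."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def split_variance_ratio(xs, resid):
    """Variance of residuals in the high-X half divided by the low-X half.

    A ratio far from 1 hints at heteroskedasticity.
    """
    ordered = [r for _, r in sorted(zip(xs, resid))]
    half = len(ordered) // 2
    return sample_variance(ordered[-half:]) / sample_variance(ordered[:half])


def lag1_autocorrelation(resid):
    """Correlation between each residual and the one before it.

    A value far from 0 hints that residuals are not independent.
    """
    mean = sum(resid) / len(resid)
    num = sum((resid[i] - mean) * (resid[i - 1] - mean) for i in range(1, len(resid)))
    den = sum((r - mean) ** 2 for r in resid)
    return num / den


# Hypothetical residuals whose spread grows with X.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
resid = [0.1, -0.1, 0.2, -0.2, 1.0, -1.0, 1.5, -1.5]

ratio = split_variance_ratio(xs, resid)
rho = lag1_autocorrelation(resid)
print(f"variance ratio = {ratio:.1f}, lag-1 autocorrelation = {rho:.2f}")
```

In this made-up example the variance ratio is far above 1 and the autocorrelation is strongly negative, so both diagnostics would prompt a closer look before trusting the usual t-tests and confidence intervals.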

ANOVA and Measures of Fit (6 min)
The ANOVA table breaks the total variation in Y (SST) into an explained component (SSR, the regression sum of squares) and an unexplained component (SSE, the sum of squared errors). The Coefficient of Determination (R-squared) is SSR/SST and measures the proportion of the variation in Y explained by X. For simple regression, R-squared equals the square of the correlation coefficient between X and Y. The Standard Error of Estimate (SEE) is the square root of the Mean Squared Error (MSE = SSE/(n-2)); a smaller SEE means the observations lie closer to the fitted line.

Key Points

  • SST = SSR + SSE.
  • R-squared = SSR / SST = correlation^2.
  • SEE = Square Root of MSE.
  • F-statistic = MSR / MSE.
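
The decomposition and fit measures above can be worked through numerically. The sketch below is a minimal illustration on hypothetical data; the function name `anova_table` and the dictionary keys are assumptions made for this example.

```python
import math


def anova_table(xs, ys):
    """Return ANOVA quantities for a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]

    sst = sum((y - mean_y) ** 2 for y in ys)             # total variation
    ssr = sum((f - mean_y) ** 2 for f in fitted)         # explained variation
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained variation

    msr = ssr / 1        # one independent variable, so 1 degree of freedom
    mse = sse / (n - 2)  # n - 2 degrees of freedom
    return {
        "sst": sst,
        "ssr": ssr,
        "sse": sse,
        "r_squared": ssr / sst,
        "see": math.sqrt(mse),
        "f_stat": msr / mse,
    }


# Hypothetical sample data, used only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

table = anova_table(xs, ys)
for name, value in table.items():
    print(f"{name} = {value:.4f}")
```

For this data SST = 6.0 splits into SSR = 3.6 and SSE = 2.4, giving R-squared = 0.6, MSE = 0.8, and F = 3.6 / 0.8 = 4.5.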

Hypothesis Testing and Prediction (6 min)
We test whether the slope is statistically significant using a t-test with n-2 degrees of freedom. Rejecting the null hypothesis that the slope is zero indicates a statistically significant linear relationship between X and Y. In simple regression the F-test leads to the same conclusion for a zero-slope null, because F = t^2. The model is also used for prediction: the Standard Error of Forecast accounts for both the uncertainty in the estimated line and the error term of the individual observation, so prediction intervals are wider than confidence intervals for the same X.

Key Points

  • t-statistic = (estimated slope - hypothesized slope) / standard error of slope.
  • Prediction intervals account for SEE and estimation error.
  • F-test evaluates overall significance.
  • Standard error of forecast increases as X deviates from its mean.
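
The t-statistic and the standard error of forecast can be sketched as follows. This is an illustrative sketch on hypothetical data, using the usual textbook formula sf = SEE * sqrt(1 + 1/n + (x0 - mean(X))^2 / sum((X - mean(X))^2)). The function name is an assumption, and the critical value 3.182 is read from a standard t-table (two-tailed, 5 percent, 3 degrees of freedom) rather than computed.

```python
import math


def slope_test_and_forecast(xs, ys, x0, t_crit, hypothesized_slope=0.0):
    """t-statistic for the slope and a prediction interval for Y at X = x0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    see = math.sqrt(sse / (n - 2))

    se_slope = see / math.sqrt(sxx)
    t_stat = (slope - hypothesized_slope) / se_slope

    y_hat = intercept + slope * x0
    # Forecast error grows as x0 moves away from mean_x.
    sf = see * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
    margin = t_crit * sf
    return {"t_stat": t_stat, "y_hat": y_hat, "sf": sf,
            "interval": (y_hat - margin, y_hat + margin)}


# Hypothetical sample data, used only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
T_CRIT = 3.182  # assumed table value: two-tailed 5 percent, n - 2 = 3 df

at_mean = slope_test_and_forecast(xs, ys, x0=3.0, t_crit=T_CRIT)
far_out = slope_test_and_forecast(xs, ys, x0=6.0, t_crit=T_CRIT)
print(f"t = {at_mean['t_stat']:.4f}, t^2 = {at_mean['t_stat'] ** 2:.4f}")
print(f"sf at X=3: {at_mean['sf']:.4f}, sf at X=6: {far_out['sf']:.4f}")
```

Here t^2 equals the F-statistic of 4.5 for the same data, and the forecast standard error is larger at X = 6 than at the mean X = 3. Because |t| is below the critical value, the slope in this small made-up sample would not be judged significant at the 5 percent level.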

Functional Forms and Dummy Variables (4 min)
When the relationship is not linear, variables can be transformed using natural logs. A Log-Lin model takes the log of the dependent variable only; a Lin-Log model takes the log of the independent variable only. A Log-Log model takes the log of both, so its slope is a constant elasticity. Dummy variables are binary (0 or 1) indicators that bring qualitative independent variables into the regression.

Key Points

  • Log-Lin: ln(Y) = b0 + b1*X.
  • Lin-Log: Y = b0 + b1*ln(X).
  • Log-Log: ln(Y) = b0 + b1*ln(X) (Double Log).
  • Dummy variables represent categories like 'Pass/Fail'.
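
Two of these ideas are easy to see numerically. The sketch below, on hypothetical data, fits a Log-Log model to values generated from Y = 2 * X^1.5, so the recovered slope is the elasticity 1.5. It then regresses Y on a 0/1 dummy, where the slope equals the difference between the two group means. The helper `ols_fit` is an assumption made for this example.

```python
import math


def ols_fit(xs, ys):
    """Return (intercept, slope) for a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Log-Log: hypothetical data generated exactly from Y = 2 * X^1.5.
xs = [1.0, 2.0, 4.0, 8.0]
ys = [2.0 * x ** 1.5 for x in xs]
log_b0, log_b1 = ols_fit([math.log(x) for x in xs], [math.log(y) for y in ys])
print(f"Log-Log slope (elasticity) = {log_b1:.4f}")

# Dummy variable: 0 = one category, 1 = the other (hypothetical outcomes).
dummy = [0, 0, 0, 1, 1, 1]
outcome = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0]
dum_b0, dum_b1 = ols_fit(dummy, outcome)
print(f"mean of group 0 = {dum_b0:.1f}, difference in group means = {dum_b1:.1f}")
```

In the dummy regression the intercept is the mean of the category coded 0 (12.0) and the slope is the gap between the two category means (22.0 - 12.0 = 10.0).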

Questions

Question 1

In a simple linear regression equation Y = b0 + b1*X + epsilon, what does the term 'b1' represent?

Question 2

Which variable is the one you are seeking to explain in a regression analysis?

Question 3

If the Covariance between X and Y is 20 and the Variance of X is 10, what is the slope coefficient (b1)?

Question 4

Calculate the slope coefficient if the correlation between X and Y is 0.8, the standard deviation of Y is 10, and the standard deviation of X is 4.

Question 5

Ordinary Least Squares (OLS) regression minimizes which of the following?

Question 6

Which assumption states that the variance of the regression residuals is the same for all observations?

Question 7

What is the total variation of the dependent variable Y referred to as?

Question 8

In the ANOVA equation, SST equals which of the following?

Question 9

What is the formula for the Coefficient of Determination (R-squared)?

Question 10

If the correlation coefficient between X and Y is 0.9, what is the Coefficient of Determination (R-squared) for a simple linear regression?

Question 11

How is the Mean Sum of Squares Regression (MSR) calculated in simple linear regression?

Question 12

For a simple linear regression with n observations, what are the degrees of freedom for the Mean Squared Error (MSE)?

Question 13

Given SSE = 50 and n = 27 in a simple linear regression, what is the Mean Squared Error (MSE)?

Question 14

The Standard Error of Estimate (SEE) is calculated as:

Question 15

In a regression with SSR = 100 and SSE = 20, what is the value of the F-statistic if n=22?

Question 16

What is the relationship between the F-statistic and the t-statistic for the slope in a simple linear regression?

Question 17

A dummy variable (indicator variable) can take which values?

Question 18

If a regression equation is Y = 5 + 1.2*X, what is the predicted value of Y when X is 10?

Question 19

The standard error of the forecast (sf) is generally:

Question 20

In the regression equation ln(Y) = b0 + b1*X, what is this functional form called?

Question 21

Which regression model is most useful for calculating elasticities?

Question 22

If the p-value of a test statistic is 0.03 and the level of significance is 0.05, what is the decision?

Question 23

Calculate the degrees of freedom for the t-test on the slope coefficient if the sample size is 25.

Question 24

If the estimated slope is 1.2, the hypothesized value is 0, and the standard error of the slope is 0.2520, what is the t-statistic?

Question 25

What does a 95 percent prediction interval imply compared to a confidence interval for the same X?

Question 26

In a Lin-Log model (Y = b0 + b1*lnX), how is the slope coefficient interpreted?

Question 27

Which assumption implies that regression residuals are uncorrelated across observations?

Question 28

If the 95 percent confidence interval for a slope coefficient includes zero, what should you conclude?

Question 29

What term describes the condition where the variance of residuals differs across observations?

Question 30

For a regression with n=25, what is the critical t-value for a two-tailed test at the 5 percent significance level (approximate based on standard tables)?

Question 31

If Mean Sum of Squares Regression (MSR) is 151.42 and Mean Sum of Squares Error (MSE) is 1.05, what is the F-statistic?

Question 32

Calculate the Intercept if Mean Y = 9, Mean X = 2, and Slope = 3.05.

Question 33

In a cross-sectional regression, data involves:

Question 34

When sample size is large, which assumption can be relaxed by appealing to the Central Limit Theorem?

Question 35

What is the variation of X often referred to as?

Question 36

If the p-value is 0.001 and significance level is 0.05, this indicates:

Question 37

If a regression has SSR = 159.6 and SST = 162.8, what is the R-squared?

Question 38

The intercept in a regression model represents:

Question 39

Calculate the degrees of freedom for the F-statistic numerator in a simple linear regression.

Question 40

If the slope coefficient is 1.5 and the standard error of the slope is 0.5, what is the t-statistic for testing the hypothesis that the slope is zero?

Question 41

What does 'homoskedasticity' imply about the regression residuals?

Question 42

Given a sample size of 22 and SEE of 3, calculate the margin of error for a prediction interval if the t-critical value is 2.086 and the standard error of forecast (sf) is calculated to be 4.

Question 43

Which value indicates the strongest linear relationship between X and Y?

Question 44

In a Log-Log model, if the coefficient is 1.5, what is the interpretation?

Question 45

Which test statistic is used to test the overall fit of the simple linear regression model?

Question 46

If a slope coefficient is statistically significantly different from zero, what can we conclude about the correlation coefficient?

Question 47

Sum of Squares Error (SSE) is calculated as the sum of:

Question 48

If the confidence interval for the slope is [0.5, 1.5], what can be said about the hypothesis H0: slope = 0?

Question 49

What happens to the standard error of the forecast as the value of the independent variable (X) moves further from its mean?

Question 50

If n = 50 and k = 1, what are the degrees of freedom for the denominator of the F-test?
