Parametric and Non-Parametric Tests of Independence

50 questions available

Parametric Correlation Testing (Pearson) (5 min)
This section covers testing the significance of the linear relationship between two variables using the Pearson correlation coefficient (r). The test assumes the two variables are jointly (bivariate) normally distributed. The null hypothesis typically states that the population correlation is zero. Under that null hypothesis, the test statistic follows a t-distribution with n - 2 degrees of freedom.

Key Points

  • Measures linear relationship between continuous variables.
  • Relies on the assumption of bivariate normality.
  • Test statistic: t = r * sqrt(n-2) / sqrt(1-r^2).
  • Degrees of freedom: n - 2.
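
As a minimal sketch of the test statistic listed above, the following computes t from r and n using only the standard library. The function name pearson_t_statistic is hypothetical, and the inputs are the values from Question 3 (r = 0.50, n = 18).

```python
import math


def pearson_t_statistic(r: float, n: int) -> float:
    """Return t = r * sqrt(n - 2) / sqrt(1 - r^2) for H0: population correlation = 0."""
    if n <= 2:
        raise ValueError("n must be greater than 2 so that n - 2 is positive")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Values taken from Question 3: r = 0.50, n = 18.
t_stat = pearson_t_statistic(0.50, 18)
degrees_of_freedom = 18 - 2

print(f"t = {t_stat:.4f}, df = {degrees_of_freedom}")
```

The result is then compared with the critical t-value for n - 2 degrees of freedom at the chosen significance level.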

Non-Parametric Correlation (Spearman Rank) (7 min)
The Spearman rank correlation is used when parametric assumptions are not met, such as with non-normal distributions, outliers, or ordinal data. The raw data are converted to ranks, and the correlation is derived from the differences between each observation's two ranks. For large samples (n >= 30), significance is tested with a t-statistic; smaller samples require specialized tables of critical values.

Key Points

  • Suitable for non-normal, outlier-prone, or ordinal data.
  • Calculated using rank differences: r_s = 1 - [6 * sum(d^2) / (n * (n^2 - 1))].
  • Uses t-statistic for hypothesis testing when n >= 30.
  • Requires specialized tables for small samples (n < 30).
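
The rank-difference formula above can be sketched as follows. The function names are hypothetical, the ranks are assumed to be already assigned with no ties, and the inputs come from Question 7 (n = 5, sum of d^2 = 8) and Question 42 (ranks 1, 2, 3 against 3, 2, 1).

```python
def spearman_from_sum_d2(sum_d2: float, n: int) -> float:
    """Return r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return 1.0 - 6.0 * sum_d2 / (n * (n * n - 1))


def spearman_from_ranks(ranks_x, ranks_y) -> float:
    """Compute r_s from two equal-length lists of ranks (assumes no tied ranks)."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rank lists must have the same length")
    # d is the difference between the two ranks of each observation.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return spearman_from_sum_d2(sum_d2, len(ranks_x))


# Question 7: n = 5 and sum of squared rank differences = 8.
rs_q7 = spearman_from_sum_d2(8, 5)

# Question 42: perfectly reversed ranks.
rs_q42 = spearman_from_ranks([1, 2, 3], [3, 2, 1])

print(rs_q7, rs_q42)
```

Reversed ranks give the minimum possible value, while identical ranks (d = 0 for every pair) give r_s = 1.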

Chi-Square Test of Independence (10 min)
This test determines whether two categorical variables are independent. The data are arranged in a contingency table, and the test compares observed frequencies (O) with the frequencies expected (E) under the assumption of independence. A large Chi-Square statistic, one that exceeds the critical value for the table's degrees of freedom, is evidence against independence.

Key Points

  • Analyzes categorical or discrete data.
  • Expected Frequency = (Row Total * Column Total) / Grand Total.
  • Chi-Square Statistic = Sum of [(O - E)^2 / E].
  • Degrees of Freedom = (number of rows - 1) * (number of columns - 1).
  • Always a right-tailed test.
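
The formulas above can be combined into one short calculation. This is only a sketch: the function name chi_square_independence is hypothetical, and the observed counts are made up for illustration, chosen so that the row totals (10, 20) and column totals (15, 15) match Question 38. It assumes every expected frequency is greater than zero.

```python
def chi_square_independence(observed):
    """Return (statistic, degrees_of_freedom, expected) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected frequency = (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Chi-square statistic = sum over all cells of (O - E)^2 / E.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Hypothetical 2x2 counts with row totals 10 and 20, column totals 15 and 15.
observed = [[8, 2], [7, 13]]
statistic, dof, expected = chi_square_independence(observed)

print(f"chi-square = {statistic:.2f}, df = {dof}, E(2,2) = {expected[1][1]}")
```

With these illustrative counts the statistic is 5.4 on 1 degree of freedom, which exceeds the 5% critical value of about 3.84, so independence would be rejected.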

Visualization with Mosaic Plots (3 min)
Mosaic plots provide a visual representation of the data in a contingency table. The size of the tiles corresponds to cell frequencies. Shading indicates the magnitude of the standardized residual, highlighting cells that deviate significantly from independence.

Key Points

  • Visualizes contingency table data.
  • Tile area represents frequency.
  • Standardized residual = (Observed - Expected) / sqrt(Expected).
  • Shading highlights deviations from independence.
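
The standardized residual that drives the shading can be computed per cell, as in this small sketch. The function name standardized_residual is hypothetical, and the first example uses the values from Question 16 (Observed = 100, Expected = 81).

```python
import math


def standardized_residual(observed: float, expected: float) -> float:
    """Return (Observed - Expected) / sqrt(Expected) for one contingency-table cell."""
    if expected <= 0:
        raise ValueError("expected frequency must be positive")
    return (observed - expected) / math.sqrt(expected)


# Question 16: Observed = 100, Expected = 81.
residual_q16 = standardized_residual(100, 81)

# A cell with fewer observations than expected gives a negative residual.
residual_negative = standardized_residual(20, 25)

print(f"{residual_q16:.3f}, {residual_negative:.3f}")
```

The sign shows the direction of the deviation: positive when a cell holds more observations than independence predicts, negative when it holds fewer.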

Questions

Question 1

Which of the following conditions most strongly suggests the use of a non-parametric test rather than a parametric test?

Question 2

Calculate the degrees of freedom for a t-test of a Pearson correlation coefficient with a sample size of 25.

Question 3

Given a sample correlation coefficient (r) of 0.50 and a sample size (n) of 18, what is the calculated t-statistic for testing the null hypothesis that the population correlation is zero?

Question 4

Which of the following is an assumption required for the t-test of the Pearson correlation coefficient?

Question 5

The Spearman rank correlation coefficient is best described as:

Question 6

When calculating the Spearman rank correlation, if two observations have the same value, how should their ranks be assigned?

Question 7

Calculate the Spearman rank correlation coefficient given n=5 and the sum of squared differences in ranks is 8.

Question 8

For a sample size of 40, which test statistic is used to test the significance of the Spearman rank correlation?

Question 9

Which test is most appropriate for determining if 'Investment Strategy' (Growth vs. Value) is independent of 'Fund Size' (Small vs. Large)?

Question 10

In a contingency table, how is the expected frequency for a cell calculated?

Question 11

What are the degrees of freedom for a Chi-Square test of independence on a contingency table with 3 rows and 4 columns?

Question 12

Consider a contingency table cell with an Observed Frequency of 50 and an Expected Frequency of 40. What is the contribution of this cell to the Chi-Square statistic?

Question 13

The Chi-Square test of independence is typically:

Question 14

In a Mosaic plot, what does the size (area) of each tile represent?

Question 15

What does the standardized residual in a Mosaic plot analysis help identify?

Question 16

Calculate the standardized residual for a cell with Observed = 100 and Expected = 81.

Question 17

Under what circumstance would the Spearman rank correlation be preferred over Pearson even if the data is continuous?

Question 18

What is the critical t-value for a two-tailed Pearson correlation test at a 5% significance level with n=25? (Approximate using standard t-distribution knowledge)

Question 19

A Chi-Square test results in a statistic of 15.5 with 4 degrees of freedom. If the critical value at 5% significance is 9.49, what is the conclusion?

Question 20

If the Spearman rank correlation is 1.0, what does this indicate about the two variables?

Question 21

In the context of hypothesis testing for correlation, the null hypothesis H0 usually states:

Question 22

If a sample size is small (n < 30) for a Spearman rank correlation test, what is required to determine the critical value?

Question 23

Calculate the Chi-Square contribution for a cell where Observed = 20 and Expected = 25.

Question 24

If a contingency table has 2 rows and 2 columns, how many degrees of freedom does the Chi-Square test have?

Question 25

Which of the following describes a 'Type I Error' in the context of these tests?

Question 26

Given: Row 1 Total = 30, Column 1 Total = 20, Grand Total = 50. Calculate the Expected Frequency for cell (1,1).

Question 27

In a Pearson correlation t-test, if the calculated t-value is 1.5 and the critical value is 2.1, what is the decision?

Question 28

When transforming data to ranks for the Spearman correlation, the smallest value is typically assigned rank:

Question 29

What is the primary advantage of using a Mosaic plot over a simple contingency table?

Question 30

Given n = 50 and r = 0.3, calculate the t-statistic.

Question 31

If variables X and Y are independent, the population correlation coefficient rho is:

Question 32

Which distribution does the Chi-Square statistic follow?

Question 33

In a Spearman rank calculation, if d=0 for all pairs, then:

Question 34

What is the range of possible values for the Chi-Square statistic?

Question 35

If a Chi-Square test yields a p-value of 0.03 and we use a 5% significance level, we should:

Question 36

Calculating the Spearman rank correlation: if Sum(d^2) = 20 and n = 6, what is r_s?

Question 37

Which of the following is NOT a requirement for the Chi-Square test of independence?

Question 38

For a contingency table with Row Totals of 10 and 20, and Column Totals of 15 and 15 (Grand Total 30), calculate the Expected Frequency for cell (2, 2).

Question 39

What happens to the critical t-value for a Pearson correlation test as the sample size increases (holding significance level constant)?

Question 40

A standardized residual of -3.0 in a Mosaic plot indicates:

Question 41

Pearson correlation is sensitive to:

Question 42

A dataset has ranks for X of 1, 2, 3 and ranks for Y of 3, 2, 1. What is the Spearman correlation?

Question 43

The null hypothesis for a Chi-Square test of independence typically asserts:

Question 44

What is the denominator in the formula for the t-statistic of the Pearson correlation?

Question 45

Spearman rank correlation is a specific case of:

Question 46

Contingency table analysis is typically used for what kind of variables?

Question 47

If observed frequency equals expected frequency for all cells, the Chi-Square statistic is:

Question 48

In a 3x3 contingency table, what is the critical value of Chi-Square at 5% significance (approx)?

Question 49

Which correlation coefficient should be used if the data is ordinal (e.g., credit ratings AAA, AA, A)?

Question 50

The term 'contingency table' is synonymous with:
