Ed 602 - Lesson 15. Testing the significance of correlation coefficients, choosing the proper statistical test

Lesson 15 will consist of the following topics

Text Assignment for Lesson 15

For Lesson 15, read pages 187-188 in Practical Statistics for Educators, Third Edition, by Ruth Ravid (2005, University Press of America). You may also wish to look at the example and scenarios with answers on pages 189-193.
or read pages 478-489 in Basic Statistics for Behavioral Science Research, 2nd ed., by Mary B. Harris (1998, Allyn and Bacon).
or read pages 295-298 in Practical Statistics for Educators, 2nd Edition, by Ruth Ravid (2000, University Press of America). You may also wish to look at the simulated problems with answers on pages 299-334.
or read pages 134 and 267-274 in Practical Statistics for Educators by Ruth Ravid (1994, University Press of America). You may also wish to look at the simulated problems with answers on pages 274-307.

Using Statistical Tests Involving Correlation

In Lesson 8 we discussed correlation, or measures of association. At that time we discussed how to calculate the Pearson Product-Moment Correlation Coefficient, r, which is used to show the degree of relationship between two variables when the dependent variable is at the interval or ratio level. The Pearson r is thus a parametric statistic. In this lesson we will show how to use the Pearson r as an inferential statistic to test a statistical hypothesis in a research design.

In Lesson 8 we also discussed the calculation of the Spearman Rank-Difference Correlation Coefficient, rS, which is used to show the relationship between two variables that are expressed as ranks (the ordinal level of measurement). The Spearman rS is thus a non-parametric statistic. In this course we have thus far discussed two non-parametric statistics, the Spearman Rank-Difference Correlation Coefficient and Chi-Square. All of the other statistics we have discussed in the course are parametric statistics. In this lesson we will show how to use the Spearman rS as an inferential statistic to test a statistical hypothesis.

Example of a Statistical Test using the Pearson Product-Moment Correlation Coefficient

A researcher wishes to establish the concurrent validity of the Perceived Stress Checklist by correlating it with a stress test of known validity, the Teacher's Stress Test. To measure the degree of association between the two measures of stress, the researcher has a group of pre-service teachers complete each instrument. The researcher wishes to know if there is a significant positive correlation between the two measures of stress. The group of 26 subjects obtained the following scores on the two measures.

Scores on Teacher's Stress Test and Stress Checklist for 26 Pre-Service Teachers

Subject      Score on Teacher's    Score on Perceived
ID Number    Stress Test           Stress Checklist
10           16                    5
11           24                    3
20           11                    0
27           13                    1
28           17                    5
30           17                    1
34           12                    3
35           14                    5
43           31                    2
44           15                    2
46           18                    7
50           17                    5
54           29                    1
58           20                    9
59           25                    4
60           18                    2
64           46                    8
66           34                    8
68           23                    4
75           17                    8
88           18                    0
85           11                    5
90           19                    3
91           9                     0
92           46                    8
95           13                    9

We can enter these data into an Excel spreadsheet and then select Data Analysis from the Tools menu. In the Data Analysis window, select Correlation to find the value of r, 0.379389094, which we can round to 0.38.
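
For readers without Excel, the same coefficient can be checked with a few lines of Python. This is only a sketch using the standard deviation-score formula for the Pearson r; the names `scores` and `pearson_r` are illustrative and not part of the lesson.

```python
import math

# (Teacher's Stress Test, Perceived Stress Checklist) pairs copied from the table above.
scores = [
    (16, 5), (24, 3), (11, 0), (13, 1), (17, 5), (17, 1), (12, 3),
    (14, 5), (31, 2), (15, 2), (18, 7), (17, 5), (29, 1), (20, 9),
    (25, 4), (18, 2), (46, 8), (34, 8), (23, 4), (17, 8), (18, 0),
    (11, 5), (19, 3), (9, 0), (46, 8), (13, 9),
]

def pearson_r(pairs):
    """Pearson product-moment correlation from a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of squared deviations and of cross-products about the means.
    ss_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    ss_y = sum((y - mean_y) ** 2 for _, y in pairs)
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sp_xy / math.sqrt(ss_x * ss_y)

r = pearson_r(scores)
print(len(scores), round(r, 4))
```

The printed value should agree with the Excel result to the displayed precision.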

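The null hypothesis for a correlation problem is r = 0 (the hypothesis of no relationship), and the alternative hypothesis can take one of three forms depending on the problem: r > 0 (a positive relationship), r < 0 (a negative relationship), or r not equal to 0 (a relationship in either direction).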

Since the problem is concerned only with a significant positive relationship between the two variables, we would use the first variant of the alternative hypothesis.

We can use the table in Appendix B (Values of the Correlation Coefficient (Pearson's r) for Different Levels of Significance) on page 317 of the text to find the value that r must reach to be significant.

The degrees of freedom for the Pearson r is the number of subjects (pairs of scores) minus 2 or for our problem:

df = N - 2 = 26 - 2 = 24

If we look in the .05 column of the table on page 317 and the row for 24 df, we find that r must reach .388 to be significant.

However, this result is for a two-tailed test. The text table for Values of the Correlation Coefficient for Different Levels of Significance is for two-tailed tests. To use this table for one-tailed tests, proceed as follows:

  1. Use the .10 column for alpha = .10 two-tailed or alpha = .05 one-tailed
  2. Use the .05 column for alpha = .05 two-tailed or alpha = .025 one-tailed
  3. Use the .02 column for alpha = .02 two-tailed or alpha = .01 one-tailed
  4. Use the .01 column for alpha = .01 two-tailed or alpha = .005 one-tailed

So for our one-tailed test at alpha = .05 with 24 degrees of freedom, we use the .10 column of the table and find that an r of .330 is significant at the .05 level for a one-tailed test.
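
The tabled values of r can be cross-checked against a t table, because a critical r is equivalent to a critical t through r = t / sqrt(t^2 + df). The sketch below assumes the critical t values for 24 df from a standard t table (1.711 and 2.064); those values and the function name `critical_r` are not taken from the lesson text.

```python
import math

def critical_r(t_crit, df):
    """Convert a critical t value into the equivalent critical r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Critical t values for df = 24, taken from a standard t table (assumed here).
T_ONE_TAILED_05 = 1.711   # alpha = .05 one-tailed (= .10 two-tailed)
T_TWO_TAILED_05 = 2.064   # alpha = .05 two-tailed (= .025 one-tailed)

df = 24
r_one_tailed = critical_r(T_ONE_TAILED_05, df)
r_two_tailed = critical_r(T_TWO_TAILED_05, df)
print(round(r_one_tailed, 3), round(r_two_tailed, 3))
```

To three decimals the two results should match the .10 and .05 columns of the table for 24 df.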

We now have the information we need to complete the six step process for testing statistical hypotheses for our research problem.

  1. State the null hypothesis and the alternative hypothesis based on your research question.

    H0: r = 0

    H1: r > 0

    Note: Our null hypothesis, for the Pearson r, states that r is 0. The alternative hypothesis states that r has a significant positive value.

  2. Set the alpha level.

    Note: As usual we will set our alpha level at .05, so we have 5 chances in 100 of making a Type I error.

  3. Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for the statistical test if necessary.

    r = .38

    df = N - 2 = 26 - 2 = 24

  4. Write the decision rule for rejecting the null hypothesis.

    Reject H0 if r >= .330

    Note: To write the decision rule we had to know the critical value for r, with an alpha level of .05 (one-tailed test), and 24 degrees of freedom. We can do this by looking at Appendix Table B and noting the tabled value for the column for the .10 level and the row for 24 df.

  5. Write a summary statement based on the decision.
    Reject H0, p < .05, one-tailed
    Note: Since our calculated value of r (.38, or more precisely .3794) is greater than .330, we reject the null hypothesis and accept the alternative hypothesis.

  6. Write a statement of results in standard English.
    There is a significant positive correlation between the Teacher's Stress Test and the Perceived Stress Checklist.
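
The decision in steps 4 and 5 can be written out as a short Python sketch. The function name `reject_null_one_tailed` is illustrative only. The critical values are the tabled ones quoted in this lesson (.330 one-tailed and .388 two-tailed at alpha = .05, 24 df).

```python
def reject_null_one_tailed(r, critical_value):
    """Decision rule for H1: r > 0. Reject H0 when r reaches the critical value."""
    return r >= critical_value

r = 0.3794               # calculated Pearson r (step 3)
df = 26 - 2              # N - 2 degrees of freedom
critical_one_tailed = 0.330   # .10 column of the table, 24 df
critical_two_tailed = 0.388   # .05 column of the table, 24 df

decision = reject_null_one_tailed(r, critical_one_tailed)
summary = "Reject H0, p < .05, one-tailed" if decision else "Fail to reject H0"
print(df, summary)

# For comparison only: a two-tailed test at the same alpha uses the larger critical value.
two_tailed_decision = abs(r) >= critical_two_tailed
print("Two-tailed test rejects H0:", two_tailed_decision)
```

The comparison at the end illustrates why the choice between a one-tailed and a two-tailed test should be made from the research question before the data are examined: the same r of .3794 reaches the one-tailed critical value but not the two-tailed one.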

Another way of looking at a correlation coefficient is to estimate the amount of common variance between the two variables that is accounted for by the relationship. This quantity (the proportion of common variance) is the square of the correlation coefficient.

For our problem, the proportion of common variance = r^2 = (.3794)^2 = .1439; that is, the two variables are approximately 14% the same but 86% (100 - 14) different.
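
As a quick arithmetic check, the proportion of common variance and the remainder, 1 - r^2, can be computed directly. The variable names below are illustrative.

```python
r = 0.3794
common_variance = r ** 2                 # proportion of variance the two measures share
unshared_variance = 1 - common_variance  # proportion not accounted for by the relationship
print(f"{common_variance:.4f} {unshared_variance:.4f}")
```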

Example of Statistical Test using the Spearman Rank-Difference Correlation Coefficient

Is there a significant positive correlation between the rankings of 10 children on a reading test and their teacher's ranking of their reading ability? In this problem we are relating a set of scores (interval level of measurement) with the teacher's ranking of the children in reading (ordinal level of measurement). To do this we first convert the reading test scores to ranks by assigning the highest score a rank of 1, the next highest a rank of 2, and so on. Now we are looking at rankings on two variables and can use the Spearman Rank-Difference Correlation Coefficient to test the significance of the relationship. The two sets of ranks, as well as the difference between the pairs of ranks (D) and the differences squared (D^2), are shown in the following table.

Worksheet to Calculate the Correlation between Students' Ranks on a Reading Test and Teacher's Ranking of the Students on Reading

Reading Test    Teacher's Ranking    D     D^2
Score Rank      on Reading
1               3                    -2    4
2               2                    0     0
3               1                    2     4
4               4                    0     0
5               5                    0     0
6               6                    0     0
7               8                    -1    1
8               7                    1     1
9               10                   -1    1
10              9                    1     1
Total                                      12

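From the table we can see that the sum of the squared differences is 12 for N = 10 pairs of ranks, so that:

rS = 1 - 6(sum of D^2) / [N(N^2 - 1)] = 1 - 6(12) / [10(100 - 1)] = 1 - 72/990 = .93

df = N - 2 = 10 - 2 = 8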
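
The worksheet calculation can be reproduced with the standard rank-difference formula, rS = 1 - 6(sum of D^2) / [N(N^2 - 1)]. The sketch below assumes there are no tied ranks, which is true of this worksheet. The names `ranks` and `spearman_rs` are illustrative.

```python
# (reading test rank, teacher's rank) pairs from the worksheet above.
ranks = [(1, 3), (2, 2), (3, 1), (4, 4), (5, 5),
         (6, 6), (7, 8), (8, 7), (9, 10), (10, 9)]

def spearman_rs(rank_pairs):
    """Spearman rank-difference coefficient: 1 - 6*sum(D^2) / (N*(N^2 - 1))."""
    n = len(rank_pairs)
    sum_d_squared = sum((a - b) ** 2 for a, b in rank_pairs)
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

sum_d2 = sum((a - b) ** 2 for a, b in ranks)   # the worksheet total
rs = spearman_rs(ranks)
df = len(ranks) - 2
print(sum_d2, round(rs, 2), df)
```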

We now have the information we need to complete the six step process for testing statistical hypotheses for our research problem.

  1. State the null hypothesis and the alternative hypothesis based on your research question.

    H0: rS = 0

    H1: rS > 0

    Note: Our null hypothesis states that there is no significant relationship between the two variables. The alternative hypothesis states that there is a significant positive correlation between the two variables.

  2. Set the alpha level.

    Note: As usual we will set our alpha level at .05, so we have 5 chances in 100 of making a Type I error.

  3. Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for the statistical test if necessary.

    rS = .93

    df = N - 2 = 10 - 2 = 8

  4. Write the decision rule for rejecting the null hypothesis.

    Reject H0 if rS >= .549

    Note: To write the decision rule we had to know the critical value for rS with an alpha level of .05 and 8 degrees of freedom. We can find it in Appendix Table B, the same table we used for the Pearson r, by noting the tabled value in the column for the .10 level and the row for 8 df (.549).

    Note: We used the .10 column because we are doing a one-tailed test with an alpha of .05. As noted in our problem above for the Pearson r, in the table of critical values for r the .10 column is used for alpha = .10 (two-tailed test) and for alpha = .05 (one-tailed test).

  5. Write a summary statement based on the decision.
    Reject H0, p < .05, one-tailed
    Note: Since our calculated value of rS (.93) is greater than .549, we reject the null hypothesis and accept the alternative hypothesis.

  6. Write a statement of results in standard English.
    There is a significant positive correlation between the children's ranks on a reading test and their teacher's ranking of them on reading.

Choosing the Proper Statistical Test

Let's finish our discussion of inferential statistics with a summary of all the inferential statistics we have discussed and look at the conditions under which we would use each of these statistics. Generally, if we know the number of groups or samples in our research design and the level of measurement of the dependent variable, we will know which inferential statistic to use.

First let us look at statistical hypotheses in research designs where the dependent variable is at the interval or ratio level. These statistics are known as parametric statistics and we have used the following:

We also looked at two other statistics, Chi-Square and the Spearman rS, that we could use with data that were not at the interval or ratio level of measurement. These statistics are called non-parametric statistics.

The information we have discussed above can be put into the following table. The table also includes other statistics that we have not included in this course. If you think you may need one of the statistics we did not cover in your research design, please send e-mail to the instructor and I will give you a reference to the calculation and interpretation of that statistic. I wish you the best as you complete the final examination for this course and as you apply the information from this course to your own research design.

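Selecting a Statistical Test

For each level of measurement, the tests are listed by sample characteristics.

Nominal or Categorical (frequencies)
  One-sample statistical tests: Chi-Square
  Two-sample statistical tests, independent samples: Chi-Square
  Two-sample statistical tests, non-independent samples: McNemar Change Test
  Multiple-sample statistical tests: Chi-Square
  Measures of association (one sample, more than one variable): Phi Coefficient

Ordinal (ranks)
  One-sample statistical tests: Kolmogorov-Smirnov One-Sample Test
  Two-sample statistical tests, independent samples: Mann-Whitney U-Test
  Two-sample statistical tests, non-independent samples: Wilcoxon Matched Pairs Signed-Rank Test
  Multiple-sample statistical tests: Kruskal-Wallis One-Way Analysis of Variance
  Measures of association (one sample, more than one variable): Spearman rho (rS)

Interval or Ratio
  One-sample statistical tests: Z test; One-Sample t-Test
  Two-sample statistical tests, independent samples: Independent t-test
  Two-sample statistical tests, non-independent samples: Dependent t-test
  Multiple-sample statistical tests: Simple Analysis of Variance; Factorial Analysis of Variance; Scheffe Tests; Analysis of Covariance
  Measures of association (one sample, more than one variable): Pearson r; Multiple Regression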
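
The table above can also be treated as a simple lookup keyed by level of measurement and sample characteristics. The following Python sketch is one hypothetical way to encode it; the key labels (such as "two-sample independent") and the function name `select_tests` are illustrative choices, while the test names come from the table.

```python
# Keys: (level of measurement, sample characteristics) -> tests listed in the table.
TEST_TABLE = {
    ("nominal", "one-sample"): ["Chi-Square"],
    ("nominal", "two-sample independent"): ["Chi-Square"],
    ("nominal", "two-sample non-independent"): ["McNemar Change Test"],
    ("nominal", "multiple-sample"): ["Chi-Square"],
    ("nominal", "association"): ["Phi Coefficient"],
    ("ordinal", "one-sample"): ["Kolmogorov-Smirnov One-Sample Test"],
    ("ordinal", "two-sample independent"): ["Mann-Whitney U-Test"],
    ("ordinal", "two-sample non-independent"): ["Wilcoxon Matched Pairs Signed-Rank Test"],
    ("ordinal", "multiple-sample"): ["Kruskal-Wallis One-Way Analysis of Variance"],
    ("ordinal", "association"): ["Spearman rho"],
    ("interval/ratio", "one-sample"): ["Z test", "One-Sample t-Test"],
    ("interval/ratio", "two-sample independent"): ["Independent t-test"],
    ("interval/ratio", "two-sample non-independent"): ["Dependent t-test"],
    ("interval/ratio", "multiple-sample"): [
        "Simple Analysis of Variance", "Factorial Analysis of Variance",
        "Scheffe Tests", "Analysis of Covariance",
    ],
    ("interval/ratio", "association"): ["Pearson r", "Multiple Regression"],
}

def select_tests(level, design):
    """Return the candidate tests for a level of measurement and sample design."""
    return TEST_TABLE[(level.lower(), design.lower())]

# The two examples in this lesson are both measures of association.
print(select_tests("Interval/Ratio", "association"))
print(select_tests("Ordinal", "association"))
```

A lookup like this only narrows the choice. Where a cell lists several tests, the research question still determines which one applies.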

Lesson 15 Assignment

Lesson 15 Quiz

Please send electronic mail to the course instructor if you have any questions about this lesson or other concerns.
