Ed 602 - Lesson 13 - Analysis of Variance

Lesson 13 will consist of the following topics

Text Assignment for Lesson 13

For lesson 13, read pages 135-152 in Practical Statistics for Educators, Third Edition by Ruth Ravid (2005, University Press of America)
or read pages 334-357 in Basic Statistics for Behavioral Science Research 2nd ed by Mary B. Harris (1998, Allyn and Bacon)
or read pages 203-228 in Practical Statistics for Educators, 2nd Edition by Ruth Ravid (2000, University Press of America)
or read pages 191-219 in Practical Statistics for Educators by Ruth Ravid (1994, University Press of America).

Analysis of Variance (ANOVA)

In our last three lessons we discussed setting up statistical tests in a variety of situations:

  1. If we are making inferences about a single score we can use the z-score test.
  2. If we are making inferences about a single sample we can use the z-test (if we know the population variance) or the single sample t-test (if we do not know the population variance).
  3. If we are making inferences about two samples we can use the independent t-test (if the two samples are independent of one another) or the dependent t-test (if the two samples are related to one another as in matched samples or a pre-test post-test situation).

When we wish to look at differences among three or more sample means, we use a statistical test called analysis of variance, or ANOVA. Analysis of variance yields a statistic, F, which indicates whether there is a significant difference among three or more sample means. When conducting an analysis of variance, we divide the variance (or spread) of the scores into two components:

  1. The variance between groups, that is the variability among the three or more group means.
  2. The variance within the groups, or how the individual scores within each group vary around the mean of the group.

We measure these variances by calculating SSB, the sum of squares between groups, and SSW, the sum of squares within groups.

Each of these sums of squares is divided by its degrees of freedom (dfB, degrees of freedom between, and dfW, degrees of freedom within) to calculate the mean square between groups, MSB, and the mean square within groups, MSW.

Finally, we calculate F, the F ratio, which is the ratio of the mean square between groups to the mean square within groups. We then test the significance of F to complete our analysis of variance.
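The chain from sums of squares to mean squares to F can be sketched in a few lines of Python. This is only an illustration: it uses the deviation (definitional) form of the sums of squares, the function name one_way_anova is made up for this sketch, and the toy scores are hypothetical rather than taken from the lesson.

```python
def one_way_anova(groups):
    """Return (SSB, SSW, dfB, dfW, MSB, MSW, F) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]

    # Between groups: how far each group mean lies from the grand mean,
    # weighted by the number of scores in that group.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: how far each score lies from its own group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_b = len(groups) - 1                 # number of groups minus 1
    df_w = len(all_scores) - len(groups)   # total subjects minus number of groups
    msb = ssb / df_b
    msw = ssw / df_w
    return ssb, ssw, df_b, df_w, msb, msw, msb / msw


# Hypothetical toy data: three groups of three scores each.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ssb, ssw, df_b, df_w, msb, msw, f_ratio = one_way_anova(toy_groups)
print(ssb, ssw, df_b, df_w, msb, msw, f_ratio)
```

For these toy scores the group means are 2, 3, and 6, so SSB is about 26, SSW is 6, and F is about 13.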

Let's look at the formulas for, and the calculation of, each of these quantities in the context of a sample problem.

Example problem using One-way Analysis of Variance

Three groups of students, 5 in each group, were receiving therapy for severe test anxiety. Group 1 received 5 hours of therapy, Group 2 received 10 hours, and Group 3 received 15 hours. At the end of therapy each subject completed an evaluation of test anxiety (the dependent variable in the study). Did the amount of therapy have an effect on the level of test anxiety?

The three groups of students received the following scores on the Test Anxiety Index (TAI) at the end of treatment.

TAI Scores for Three Groups of Students

Group 1 - 5 hours    Group 2 - 10 hours    Group 3 - 15 hours
       48                    55                    51
       50                    52                    52
       53                    53                    50
       52                    55                    53
       50                    53                    50

The following table contains the quantities we need to calculate the means for the three groups, the sum of squares, and the degrees of freedom:

Worksheet for Test Anxiety Study

 Group 1 - 5 hours     Group 2 - 10 hours    Group 3 - 15 hours
  X1      (X1)^2        X2      (X2)^2        X3      (X3)^2
  48       2304         55       3025         51       2601
  50       2500         52       2704         52       2704
  53       2809         53       2809         50       2500
  52       2704         55       3025         53       2809
  50       2500         53       2809         50       2500
 ----     ------       ----     ------       ----     ------
 253      12817        268      14372        256      13114

The mean for Group 1 is 253/5 = 50.6, the mean for Group 2 is 268/5 = 53.6, and the mean for Group 3 is 256/5 = 51.2.
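The worksheet totals and the three means can be checked with a short Python sketch. The scores are the ones in the table; the dictionary layout and variable names are just one convenient choice.

```python
groups = {
    "Group 1 (5 hours)": [48, 50, 53, 52, 50],
    "Group 2 (10 hours)": [55, 52, 53, 55, 53],
    "Group 3 (15 hours)": [51, 52, 50, 53, 50],
}

worksheet = {}
for name, scores in groups.items():
    total = sum(scores)                     # sum of the X column
    total_sq = sum(x * x for x in scores)   # sum of the squared-score column
    mean = total / len(scores)
    worksheet[name] = (total, total_sq, mean)
    print(f"{name}: sum = {total}, sum of squares = {total_sq}, mean = {mean:.1f}")
```

The printed sums (253, 268, 256), sums of squares (12817, 14372, 13114), and means (50.6, 53.6, 51.2) should match the worksheet.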

Are the differences among these three means significant? We can use analysis of variance to answer that question. Since we have only one independent variable, amount of therapy, we will use one-way analysis of variance. If we were concerned with the effect of two independent variables on the dependent variable, then we would use two-way analysis of variance.

First we will calculate SSB, the sum of squares between groups:

SSB = (ΣX1)^2/n1 + (ΣX2)^2/n2 + (ΣX3)^2/n3 - (ΣXT)^2/NT
    = (253)^2/5 + (268)^2/5 + (256)^2/5 - (777)^2/15
    = 40273.8 - 40248.6
    = 25.2

Here X1 is a score from Group 1, X2 is a score from Group 2, X3 is a score from Group 3, n1 is the number of subjects in Group 1, n2 is the number of subjects in Group 2, n3 is the number of subjects in Group 3, XT is a score from any subject in the total group of subjects, and NT is the total number of subjects in all groups.

The degrees of freedom between groups is:

dfB = K - 1 = 3 - 1 = 2

Where K is the number of groups.

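Next we calculate SSW, the sum of squares within groups:

SSW = [Σ(X1)^2 - (ΣX1)^2/n1] + [Σ(X2)^2 - (ΣX2)^2/n2] + [Σ(X3)^2 - (ΣX3)^2/n3]
    = (12817 - 12801.8) + (14372 - 14364.8) + (13114 - 13107.2)
    = 15.2 + 7.2 + 6.8
    = 29.2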

The degrees of freedom within groups is:

dfW = NT - K = 15 - 3 = 12

Where NT is the total number of subjects.

Finally, we will calculate SST, the total sum of squares:

SST = Σ(XT)^2 - (ΣXT)^2/NT
    = 40303 - (777)^2/15
    = 40303 - 40248.6
    = 54.4

As a check, SST should equal SSB + SSW:

54.4 = 25.2 + 29.2
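The three sums of squares and the check can be reproduced from the raw scores. The sketch below uses the raw-score (computational) form built from the worksheet totals; the variable names are illustrative only.

```python
g1 = [48, 50, 53, 52, 50]
g2 = [55, 52, 53, 55, 53]
g3 = [51, 52, 50, 53, 50]
groups = [g1, g2, g3]
all_scores = g1 + g2 + g3

# (sum of all scores)^2 / NT = 777^2 / 15
correction = sum(all_scores) ** 2 / len(all_scores)
# (sum of X1)^2/n1 + (sum of X2)^2/n2 + (sum of X3)^2/n3
between_term = sum(sum(g) ** 2 / len(g) for g in groups)
# sum of every squared score = 12817 + 14372 + 13114
total_squares = sum(x * x for x in all_scores)

ssb = between_term - correction
ssw = total_squares - between_term
sst = total_squares - correction

print(round(ssb, 2), round(ssw, 2), round(sst, 2))  # 25.2 29.2 54.4
```

Because SSW and SST share the same building blocks, SSB + SSW equals SST by construction, which is why the check works as a test of the arithmetic.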

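We can now calculate MSB, the mean square between groups, MSW, the mean square within groups, and F, the F ratio:

MSB = SSB / dfB = 25.2 / 2 = 12.60
MSW = SSW / dfW = 29.2 / 12 = 2.43
F = MSB / MSW = 12.60 / 2.4333 = 5.178 (using the unrounded value of MSW)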
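As a quick check of this step, the mean squares and the F ratio follow directly from the sums of squares and degrees of freedom found above. The variable names in this sketch are illustrative.

```python
ssb, ssw = 25.2, 29.2      # sums of squares from the worksheet
k, n_total = 3, 15         # number of groups, total number of subjects

df_b = k - 1               # degrees of freedom between groups
df_w = n_total - k         # degrees of freedom within groups

msb = ssb / df_b
msw = ssw / df_w
f_ratio = msb / msw        # uses the unrounded MSW

print(f"MSB = {msb:.2f}, MSW = {msw:.2f}, F = {f_ratio:.3f}")
# MSB = 12.60, MSW = 2.43, F = 5.178
```

Dividing by the rounded MSW of 2.43 instead would give about 5.185, so the 5.178 reported in the lesson reflects carrying MSW at full precision.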

To test the significance of the F value we obtained, we need to compare it with the critical F value for an alpha level of .05, 2 degrees of freedom between groups (the degrees of freedom in the numerator of the F ratio), and 12 degrees of freedom within groups (the degrees of freedom in the denominator of the F ratio). We can look up the critical value of F in Appendix Table D of the textbook (The 5 percent (Lightface Type) and 1 percent (Boldface Type) Points for the Distribution of F), pages 319-326. Look in the table under column 2 (2 degrees of freedom for the numerator) and row 12 (12 degrees of freedom for the denominator) and read the non-boldfaced entry (for the .05 level) of 3.88. This is the critical value for F.

One way of indicating this critical value of F at the .05 level, with 2 degrees of freedom between groups and 12 degrees of freedom within groups is

F.05(2,12) = 3.88
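If the table is not at hand, the critical value can also be computed. The sketch below relies on a special case: when the numerator has exactly 2 degrees of freedom, the upper-tail probability of F has the closed form (1 + 2F/dfW)^(-dfW/2). It does not apply to other numerator degrees of freedom, and the function names are made up for this illustration.

```python
def f_upper_tail_df1_2(f_value, df_w):
    """P(F >= f_value) for an F distribution with 2 and df_w degrees of freedom."""
    return (1 + 2 * f_value / df_w) ** (-df_w / 2)


def f_critical_df1_2(alpha, df_w):
    """Critical F for 2 and df_w degrees of freedom (inverse of the tail formula)."""
    return (df_w / 2) * (alpha ** (-2 / df_w) - 1)


crit = f_critical_df1_2(0.05, 12)
p_value = f_upper_tail_df1_2(5.178, 12)
print(f"critical F = {crit:.3f}, p = {p_value:.3f}")
# critical F = 3.885, p = 0.024
```

The computed 3.885 agrees with the tabled 3.88 up to rounding, and the probability of about .024 for our F of 5.178 is below .05.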

When using analysis of variance, it is a common practice to present the results of the analysis in an analysis of variance table. This table, which shows the source of variation, the sum of squares, the degrees of freedom, the mean squares, the F ratio, and the probability, is sometimes presented in a research article. The analysis of variance table for our problem would appear as follows:

Analysis of Variance Table

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F Ratio    p
Between Groups              25.20                 2                12.60        5.178    <.05
Within Groups               29.20                12                 2.43
Total                       54.40                14


We now have the information we need to complete the six step process for testing statistical hypotheses for our research problem. We will also be adding another analysis of the individual means.

  1. State the null hypothesis and the alternative hypothesis based on your research question.
    H0: μ1 = μ2 = μ3
    HA: not H0
    Note: Our null hypothesis for the F test states that there are no differences among the three means. The alternative hypothesis states that there are significant differences among some or all of the individual means. An unequivocal way of stating this is "not H0."
  2. Set the alpha level.
    α = .05
    Note: As usual, we will set our alpha level at .05, which means we have 5 chances in 100 of making a Type I error.
  3. Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for the statistical test if necessary.
    F(2,12) = 5.178
    Note: We have indicated the value of F from our analysis of variance table. We have also indicated by (2,12) that there are 2 degrees of freedom between groups and 12 degrees of freedom within groups.
  4. Write the decision rule for rejecting the null hypothesis.
    Reject H0 if F is >= 3.88
    Note: To write the decision rule we had to know the critical value for F, with an alpha level of .05, 2 degrees of freedom in the numerator (df between groups) and 12 degrees of freedom in the denominator (df within groups). We can do this by looking at Appendix Table D and noting the tabled value for the .05 level in the column for 2 df and the row for 12 df.
  5. Write a summary statement based on the decision.
    Reject H0, p < .05
    Note: Since our calculated value of F (5.178) is greater than 3.88, we reject the null hypothesis and accept the alternative hypothesis.
  6. Write a statement of results in standard English.
    There is a significant difference among the scores the three groups of students received on the Test Anxiety Index.
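The decision rule in step 4 and the summary statement in step 5 amount to a single comparison. A minimal sketch, with a hypothetical function name and wording chosen to mirror the steps above:

```python
def f_test_decision(f_obtained, f_critical):
    """Apply the decision rule: reject H0 when the obtained F reaches the critical F."""
    if f_obtained >= f_critical:
        return "Reject H0, p < .05"
    return "Do not reject H0 (not significant at the .05 level)"


# Values from steps 3 and 4: F(2,12) = 5.178 and F.05(2,12) = 3.88.
print(f_test_decision(5.178, 3.88))  # Reject H0, p < .05
```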

In the problem above, we rejected the null hypothesis and found that there is indeed a significant difference among the three group means. Group 1 had the lowest mean (50.6), Group 3 had a higher mean (51.2), and Group 2 had the highest mean of all (53.6). We would like to know which of these differences in means are significant. We can analyze the significance of the difference between pairs of means in analysis of variance by the use of post hoc (after the fact) comparisons. We only do these post hoc comparisons when there is a significant F ratio. It would make no sense to look for differences with a post hoc test if the overall F test found none. In the next section we will make these comparisons by a method known as the Scheffe Test.

Making Post Hoc Comparisons Among Pairs of Means with the Scheffe Test

In analysis of variance, if F is significant, we can use the Scheffe test to see which specific cell mean differs from which other specific cell mean. To do this we calculate an F ratio for the difference between the means of two cells and then test the significance of this F value.

We calculate F12 to see if there is a significant difference between the means of groups 1 and 2.

We calculate F13 to see if there is a significant difference between the means of groups 1 and 3.

We calculate F23 to see if there is a significant difference between the means of groups 2 and 3.

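The formulas for these tests and their application to the ANOVA problem we just finished are as follows, where M1, M2, and M3 are the three group means, K is the number of groups, and MSW = 29.2/12 = 2.4333 is carried unrounded:

F12 = (M1 - M2)^2 / [MSW (1/n1 + 1/n2)(K - 1)]
    = (50.6 - 53.6)^2 / [2.4333 (1/5 + 1/5)(3 - 1)]
    = 9.00 / 1.9467 = 4.62

F13 = (M1 - M3)^2 / [MSW (1/n1 + 1/n3)(K - 1)]
    = (50.6 - 51.2)^2 / [2.4333 (1/5 + 1/5)(3 - 1)]
    = 0.36 / 1.9467 = 0.18

F23 = (M2 - M3)^2 / [MSW (1/n2 + 1/n3)(K - 1)]
    = (53.6 - 51.2)^2 / [2.4333 (1/5 + 1/5)(3 - 1)]
    = 5.76 / 1.9467 = 2.96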

Summary of Scheffe Test Results

Comparison                        F
Group One versus Group Two       4.62
Group One versus Group Three     0.18
Group Two versus Group Three     2.96

We compare these values with the critical value F.05(2,12) = 3.88 and note that the only significant difference is between Group 1 and Group 2 (4.62 is greater than 3.88).
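The three pairwise comparisons can be scripted as well. The sketch below assumes the form of the Scheffe F used in this lesson, namely the squared difference between two means divided by MSW times (1/n_a + 1/n_b) times (K - 1). The function and variable names are illustrative.

```python
from itertools import combinations

means = {1: 50.6, 2: 53.6, 3: 51.2}   # group means
sizes = {1: 5, 2: 5, 3: 5}            # subjects per group
msw = 29.2 / 12                       # mean square within, unrounded
k = 3                                 # number of groups
f_crit = 3.88                         # F.05(2,12) from the table


def scheffe_f(mean_a, mean_b, n_a, n_b, msw, k):
    """Scheffe F for the difference between two group means."""
    return (mean_a - mean_b) ** 2 / (msw * (1 / n_a + 1 / n_b) * (k - 1))


results = {}
for a, b in combinations(sorted(means), 2):
    f_ab = scheffe_f(means[a], means[b], sizes[a], sizes[b], msw, k)
    results[(a, b)] = f_ab
    verdict = "significant" if f_ab >= f_crit else "not significant"
    print(f"F{a}{b} = {f_ab:.3f} ({verdict})")
```

With MSW carried unrounded, the values come out near 4.62, 0.18, and 2.96. Rounding MSW to 2.43 first shifts them slightly (to about 4.630, 0.185, and 2.963) without changing which comparison is significant.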

Now that we know how to conduct a post hoc analysis of the significance of the differences between pairs of group means, we should modify steps 3 and 6 of our hypothesis testing strategy to include the results of the post hoc analysis.

The amended steps 3 and 6 are as follows:

Step 3: Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for the statistical test if necessary.
F(2,12) = 5.178, value of the F ratio
F.05(2,12) = 3.88, critical value of F
F12 = 4.623, Scheffe test value for comparing means 1 and 2
F13 = 0.185, Scheffe test value for comparing means 1 and 3
F23 = 2.959, Scheffe test value for comparing means 2 and 3

Step 6: Write a statement of results in standard English.
There is a significant difference among the scores the three groups of students received on the Test Anxiety Index.
Group 1 (the five hour therapy group) has a significantly lower score on the TAI than does Group 2 (the ten hour therapy group).

