Ed 602 - Lesson 3 - Frequency Distributions

Lesson 3 will consist of the following topics

Text Assignment for Lesson 3

For lesson 3, read pages 43-46 in Practical Statistics for Educators, Third Edition by Ruth Ravid (2005, University Press of America)
or read pages 73-81 in Basic Statistics for Behavioral Science Research 2nd ed by Mary B. Harris (1998, Allyn and Bacon)
or read pages 63-69 in Practical Statistics for Educators, 2nd Edition by Ruth Ravid (2000, University Press of America)
or read pages 35-37 in Practical Statistics for Educators by Ruth Ravid (1994, University Press of America).

Overview of descriptive statistics

In lessons 3 through 6 and 8 we will be discussing descriptive statistics. Before we start that discussion, with frequency distributions, let's consider an overview of descriptive statistics.

Descriptive statistics are a way of summarizing data or letting one number stand for a group of numbers. There are three ways we can summarize data.

  1. Tabular representation of data - we can summarize data by making a table of the data. In statistics we call these tables frequency distributions and in this lesson we will look at four different types of frequency distributions.
  2. Graphical representation of data - we can make a graph of the data. In lesson 4 we will consider four types of graphs.
  3. Numerical representation of data - we can use a single number to represent many numbers. We will discuss three types of numerical representation of data in lessons 5, 6, and 8.

Frequency distribution

Consider the following set of data, which are the high temperatures recorded for 30 consecutive days. We wish to summarize these data by creating a frequency distribution of the temperatures.

Data Set - High Temperatures for 30 Days
50 45 49 50 43
49 50 49 45 49
47 47 44 51 51
44 47 46 50 44
51 49 43 43 49
45 46 45 51 46

To create a frequency distribution from these data we proceed as follows:

  1. Identify the highest and lowest values in the data set. For our temperatures the highest temperature is 51 and the lowest temperature is 43.
  2. Create a column with the title of the variable we are using, in this case temperature. Enter the highest score at the top, and include all values within the range from the highest score to the lowest score.
  3. Create a tally column to keep track of the scores as you enter them into the frequency distribution. Once the frequency distribution is completed you can omit this column. Most printed frequency distributions do not retain the tally column in their final form.
  4. Create a frequency column and record in it the frequency of each value, as shown in the tally column.
  5. At the bottom of the frequency column record the total frequency for the distribution, preceded by N =
  6. Enter the name of the frequency distribution at the top of the table.

If we applied these steps to the temperature data we would have the following frequency distribution.

Frequency Distribution for High Temperatures
Temperature   Tally    Frequency
51            ////     4
50            ////     4
49            //////   6
48                     0
47            ///      3
46            ///      3
45            ////     4
44            ///      3
43            ///      3

                       N = 30
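
The tallying steps above can also be carried out by a short program. What follows is a minimal Python sketch, not part of the course materials; the function name frequency_distribution and the variable names are illustrative assumptions. It counts each temperature and lists every value from the highest down to the lowest, including a value such as 48 that never occurs.

```python
from collections import Counter

# High temperatures for 30 days, copied from the data set above.
temps = [
    50, 45, 49, 50, 43,
    49, 50, 49, 45, 49,
    47, 47, 44, 51, 51,
    44, 47, 46, 50, 44,
    51, 49, 43, 43, 49,
    45, 46, 45, 51, 46,
]

def frequency_distribution(values):
    """Return (value, frequency) rows from the highest value down to the lowest."""
    counts = Counter(values)
    highest, lowest = max(values), min(values)
    # Include every value in the range, even one that never occurs (frequency 0).
    return [(v, counts[v]) for v in range(highest, lowest - 1, -1)]

rows = frequency_distribution(temps)

print("Temperature  Frequency")
for value, freq in rows:
    print(f"{value:>11}  {freq:>9}")
# The total of the frequency column is N.
print(f"N = {sum(freq for _, freq in rows)}")
```

The tally column is omitted here because the program counts directly; as noted in step 3, it is only a bookkeeping aid.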

Cumulative Frequency Distribution

A cumulative frequency distribution can be created from a frequency distribution by adding a column called "Cumulative Frequency." The cumulative frequency for a score value is the sum of the frequencies of all values up to and including that value. In the cumulative frequency distribution for the high temperatures data below, notice that the cumulative frequency for the lowest temperature (43) is 3, and that the cumulative frequency for the temperature 44 is 3 + 3, or 6. The cumulative frequency for a given value can also be obtained by adding the frequency for that value to the cumulative frequency for the value just below it. For example, the cumulative frequency for 45 is 10, which is the cumulative frequency for 44 (6) plus the frequency for 45 (4). Finally, notice that the cumulative frequency for the highest value (51 in this case) should equal the total of the frequency column (30 for the temperature data).

Cumulative Frequency Distribution for High Temperatures
Temperature   Tally    Frequency   Cumulative Frequency
51            ////     4           30
50            ////     4           26
49            //////   6           22
48                     0           16
47            ///      3           16
46            ///      3           13
45            ////     4           10
44            ///      3           6
43            ///      3           3

                       N = 30

In summary then, to create a cumulative frequency distribution:

  1. Create a frequency distribution
  2. Add a column entitled cumulative frequency
  3. The cumulative frequency for each score is the sum of the frequencies up to and including the frequency for that score
  4. The highest cumulative frequency should equal N (the total of the frequency column)
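
These four steps amount to keeping a running total from the lowest value upward. The following Python sketch illustrates that idea under stated assumptions: the function name cumulative_distribution is hypothetical, and the (temperature, frequency) rows are typed in from the frequency distribution for the 30-day temperature data.

```python
from itertools import accumulate

# (temperature, frequency) rows for the 30-day data,
# listed from the highest temperature to the lowest.
freq_rows = [(51, 4), (50, 4), (49, 6), (48, 0), (47, 3),
             (46, 3), (45, 4), (44, 3), (43, 3)]

def cumulative_distribution(rows):
    """Add a cumulative frequency to each (value, frequency) row.

    Rows are expected from highest value to lowest. The running total is
    built from the lowest value upward, then the original order is restored.
    """
    ascending = rows[::-1]
    running = list(accumulate(freq for _, freq in ascending))
    combined = [(v, f, c) for (v, f), c in zip(ascending, running)]
    return combined[::-1]

cum_rows = cumulative_distribution(freq_rows)

print("Temperature  Frequency  Cumulative Frequency")
for value, freq, cum in cum_rows:
    print(f"{value:>11}  {freq:>9}  {cum:>20}")

# Step 4: the highest cumulative frequency should equal N.
n = sum(freq for _, freq in freq_rows)
assert cum_rows[0][2] == n
```

Note that the row for 48 repeats the cumulative frequency of 47, because a frequency of 0 adds nothing to the running total.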

Grouped frequency distribution

In some cases it is necessary to group the values of the data to summarize the data properly. For example, suppose you wish to create a frequency distribution for the IQ scores in your class of 30 pupils. The IQ scores in your class range from 73 to 139. To include these scores in a frequency distribution you would need 67 different score values (139 down to 73). A table that long would do little to summarize the data. To solve this problem we would group scores together and create a grouped frequency distribution.

If your data has more than 20 score values, you should create a grouped frequency distribution by grouping score values together into class intervals. To create a grouped frequency distribution:

  1. select an interval size so that you have 7-20 class intervals
  2. create a class interval column and list each of the class intervals
  3. make sure every interval is the same size, the intervals do not overlap, and there are no gaps within the range of class intervals
  4. create a tally column (optional)
  5. create a midpoint column for interval midpoints
  6. create a frequency column
  7. enter N = some value at the bottom of the frequency column

Look at the following data of high temperatures for 50 days. The highest temperature is 59 and the lowest temperature is 39. If we were to create a simple frequency distribution of this data we would have 21 temperature values. This is greater than 20 values so we should create a grouped frequency distribution.

Data Set - High Temperatures for 50 Days
57 39 52 52 43
50 53 42 58 55
58 50 53 50 49
45 49 51 44 54
49 57 55 59 45
50 45 51 54 58
53 49 52 51 41
52 40 44 49 45
43 47 47 43 51
55 55 46 54 41

If we use these data and follow the steps for creating a grouped frequency distribution, we obtain the following grouped frequency distribution. Note that we use an interval size of three, so that each class interval includes three score values. Also note that we have included an interval midpoint column, which gives the middle value of each interval.

For further information on creating class intervals, see Guidelines for Creating Class Intervals (optional).

Grouped Frequency Distribution for High Temperatures
Class Interval   Tally         Interval Midpoint   Frequency
57-59            //////        58                  6
54-56            ///////       55                  7
51-53            ///////////   52                  11
48-50            /////////     49                  9
45-47            ///////       46                  7
42-44            //////        43                  6
39-41            ////          40                  4

                                                   N = 50
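
The grouping can be sketched in Python as well. This is an illustration only, with some assumptions: the function name grouped_distribution is hypothetical, and the intervals are started at the lowest observed temperature (39), which reproduces the table above but is just one possible convention for choosing interval limits.

```python
# High temperatures for 50 days, copied from the data set above.
temps_50 = [
    57, 39, 52, 52, 43,
    50, 53, 42, 58, 55,
    58, 50, 53, 50, 49,
    45, 49, 51, 44, 54,
    49, 57, 55, 59, 45,
    50, 45, 51, 54, 58,
    53, 49, 52, 51, 41,
    52, 40, 44, 49, 45,
    43, 47, 47, 43, 51,
    55, 55, 46, 54, 41,
]

def grouped_distribution(values, size):
    """Return (low, high, midpoint, frequency) rows, highest interval first.

    Assumption: the first interval starts at the lowest observed value.
    Equal-sized intervals are then laid end to end, so they cannot
    overlap and leave no gaps.
    """
    lowest, highest = min(values), max(values)
    rows = []
    low = lowest
    while low <= highest:
        high = low + size - 1
        freq = sum(1 for v in values if low <= v <= high)
        midpoint = (low + high) / 2
        rows.append((low, high, midpoint, freq))
        low += size
    return rows[::-1]

grouped = grouped_distribution(temps_50, size=3)

# Guideline from step 1: aim for 7-20 class intervals.
assert 7 <= len(grouped) <= 20

print("Class Interval  Midpoint  Frequency")
for low, high, midpoint, freq in grouped:
    print(f"{low}-{high:<11}  {midpoint:>8g}  {freq:>9}")
print(f"N = {sum(row[3] for row in grouped)}")
```

With an interval size of three, the 21 temperature values from 39 to 59 fall into exactly 7 intervals, the low end of the recommended 7-20.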

Cumulative grouped frequency distribution

It is a simple matter to create a cumulative grouped frequency distribution. We just add a cumulative frequency column to the grouped frequency distribution and we have a cumulative grouped frequency distribution. The cumulative grouped frequency distribution below was created by adding a cumulative frequency column.

Cumulative Grouped Frequency Distribution for High Temperatures
Class Interval   Tally         Interval Midpoint   Frequency   Cumulative Frequency
57-59            //////        58                  6           50
54-56            ///////       55                  7           44
51-53            ///////////   52                  11          37
48-50            /////////     49                  9           26
45-47            ///////       46                  7           17
42-44            //////        43                  6           10
39-41            ////          40                  4           4

                                                   N = 50
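
One way to check a cumulative frequency column is to remember what it means: the cumulative frequency of a class interval is the number of scores at or below that interval's upper limit. The Python sketch below, offered only as an illustrative check, recounts the raw 50-day data that way; the helper name count_at_or_below is an assumption, and the interval limits are typed in from the table above.

```python
# High temperatures for 50 days, copied from the data set above.
temps_50 = [
    57, 39, 52, 52, 43,
    50, 53, 42, 58, 55,
    58, 50, 53, 50, 49,
    45, 49, 51, 44, 54,
    49, 57, 55, 59, 45,
    50, 45, 51, 54, 58,
    53, 49, 52, 51, 41,
    52, 40, 44, 49, 45,
    43, 47, 47, 43, 51,
    55, 55, 46, 54, 41,
]

# (lower limit, upper limit) of each class interval, highest first.
intervals = [(57, 59), (54, 56), (51, 53), (48, 50),
             (45, 47), (42, 44), (39, 41)]

def count_at_or_below(values, upper_limit):
    """Count the scores that are less than or equal to upper_limit."""
    return sum(1 for v in values if v <= upper_limit)

cumulative = [count_at_or_below(temps_50, high) for _, high in intervals]

for (low, high), cum in zip(intervals, cumulative):
    print(f"{low}-{high}: {cum}")

# The top interval's cumulative frequency must equal N.
assert cumulative[0] == len(temps_50)
```

If the counts produced this way agree with the running totals in the table, the cumulative column has been built correctly.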

Lesson 3 Assignment

Lesson 3 Quiz

Please send electronic mail to the course instructor if you have any questions about this lesson or other concerns.
