KNR 445
Statistical Applications in Science & Technology
Frequency Tables & Distributions
The purpose of this lab is to acquaint you with more of the powerful data handling
features available in SPSS for Windows, and to introduce the procedures for creating
frequency tables and frequency distributions. You will use the data file created in
assignment 1, when you added the variable "Region" to the CDC data file.
The Statistics Menu
One of the most exciting aspects of this stats course is that SPSS has far more features than we can adequately cover. The Statistics menu lists the wide variety of analyses available, many more than we can cover in class time. Feel free to explore these on your own.
Frequency distributions and frequency tables are graphic and tabular methods, respectively, for summarizing data. This output is used to reduce a large, hard-to-manage data set to a format that is easier to interpret. Both frequency distributions and frequency tables are created with the Descriptive Statistics procedure.
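To make the idea of a frequency table concrete outside of SPSS, here is a minimal sketch in Python. It is not SPSS syntax and does not reproduce the SPSS output exactly. The function name frequency_table and the region labels are made up for illustration and are not taken from the CDC data file.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (category, count, percent, cumulative percent)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0
    # Categories are listed in sorted order so the cumulative column is stable.
    for category in sorted(counts):
        count = counts[category]
        cumulative += count
        rows.append((category, count, 100 * count / total, 100 * cumulative / total))
    return rows


# Made-up region labels for eight hypothetical cases (NOT the CDC data).
regions = ["West", "South", "South", "Midwest",
           "West", "South", "Northeast", "Midwest"]

table = frequency_table(regions)
print(f"{'Category':<10} {'Count':>5} {'Pct':>6} {'CumPct':>6}")
for category, count, pct, cum_pct in table:
    print(f"{category:<10} {count:>5} {pct:6.1f} {cum_pct:6.1f}")
```

Each row reduces many individual cases to one line per category, which is the sense in which a frequency table makes a large data set easier to interpret.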
Homework:
1. Create a stem-and-leaf plot of the variable Smoker Death. What descriptive statistics
are output when the stem-and-leaf plot is requested?
SPSS Output
2. Create a frequency table, bar chart and histogram of the variable Region (with six
categories).
a. Which region has the most states in it?
b. Which region has the fewest states in it?
c. Are your frequency counts the same as those of other people in the class? If not, explain
why not.
d. What differences are evident between the bar chart and the histogram?
e. Explain why (or why not) the table and graphs have made the data easier to interpret.
SPSS Output
3. Create a frequency table, bar chart and histogram of the variable Smoker Death.
a. What differences are evident between the bar chart and the histogram?
b. How do the bar chart and histogram compare to the stem-and-leaf plot in question 1?
c. Edit the histogram to increase the number of categories on the horizontal axis to 20.
Then reduce the number of categories to 5 and create another histogram. Explain why (or why
not) the edited histograms have made the data easier to interpret.
4. Recode the variable Smoker Death into a new variable called "Death" with 10 categories
of width 30. The lowest category should enclose 201 to 230 deaths, and the highest category
should enclose 471 to 500 deaths. After completing this procedure, look carefully at the
data set. Were all states assigned to one of the new categories? If not, why not? Correct
this.
a. Create a frequency table, bar chart and histogram of the variable Death. What
differences are evident between the bar chart and the histogram?
b. Compare the new table, chart and histogram to those in Q3. Explain why (or why not) the
recoded table and graphs have made the data easier to interpret.
c. Explain whether 10 categories are adequate for this data set. If you do not think they are
adequate, create and justify a new histogram with a different number of horizontal categories.
5. Complete Exercises 1 to 8 in Interpreting Basic Statistics.
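The recoding described in question 4 can be sketched in Python as a check on your category limits. This is an illustration only, not SPSS syntax. The function names make_categories and recode are hypothetical, and the sample values are made up rather than taken from the CDC data file. The sketch assumes each category is an inclusive range with whole-number limits, as stated in the question.

```python
def make_categories(lowest=201, width=30, n_categories=10):
    """Build inclusive (low, high) limits: 201-230, 231-260, ..., 471-500."""
    return [(lowest + i * width, lowest + (i + 1) * width - 1)
            for i in range(n_categories)]


def recode(value, categories):
    """Return the 1-based category number, or None if no range encloses value."""
    for number, (low, high) in enumerate(categories, start=1):
        if low <= value <= high:
            return number
    return None  # the value was not assigned to any category


categories = make_categories()

# Made-up values for illustration only (NOT values from the CDC file).
sample = [201, 230, 231, 345.0, 500, 230.5, 512]
recoded = [recode(v, categories) for v in sample]

print(categories[0], categories[-1])
print(recoded)
```

Any value that comes back as None was not enclosed by one of the ten ranges, which is the kind of case question 4 asks you to look for in your own recoded data.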