KNR 445
Statistical Applications in Science & Technology
Correlation
The purpose of this assignment is to demonstrate how the Pearson Product-Moment
Correlation Coefficient (r) is used to quantify the degree of association
(relationship) between two variables. You will calculate r by hand to
emphasize that the measure accounts for the position of each score in a pair
relative to the mean of its own variable. You will also use SPSS to create a
correlation matrix, to create scattergrams, and to fit a regression line through the
data points plotted in a scattergram. The data used are the CDC smoking data and the rain data from the Pantagraph.
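As a rough illustration of the hand calculation described above, the sketch below computes r from the deviations of each score about the mean of its own variable. The function name pearson_r and the numbers in x and y are made up for illustration; they are not the CDC or Pantagraph data, and this is only one of several equivalent ways to lay out the calculation.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from deviations about each variable's mean (hypothetical helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two cases")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviation of every score from the mean of its own variable.
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    # Sum of cross-products and the two sums of squares.
    sp = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sp / sqrt(ss_x * ss_y)

# Made-up illustrative values, NOT the assignment data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3))  # 0.775
```

Each cross-product is positive when both scores in a pair fall on the same side of their means and negative when they fall on opposite sides, which is why the sign of r follows the pattern seen in a scattergram.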
Homework
Textbook Questions: you should be able to answer them all.
1. Identify all the procedures in SPSS that can be used to calculate the Pearson
product-moment correlation coefficient (r).
2. Using the data file rain.sav (created in the last assignment),
a. create a scattergram and calculate the Pearson product-moment
correlation coefficient between:
i. Year and Year-to-date rainfall.
ii. Summer rainfall and Year-to-date rainfall.
b. From the scattergram, explain whether the relationship is positive (one
variable increases as the other increases) or negative (one variable increases as
the other decreases).
c. Is the interpretation of the relationship between Year
and Year-to-date rainfall meaningful? Explain.
d. Is the interpretation of the relationship between Summer
rainfall and Year-to-date rainfall meaningful? Explain.
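One informal way to judge direction from a scattergram, as asked in part b, is to count how many points fall on the same side of both means versus on opposite sides. The sketch below does that count. The function name quadrant_counts and the numbers are made up for illustration and are not the rain.sav data. The count is only a visual guide; r also weights each point by its distance from the means.

```python
def quadrant_counts(x, y):
    """Count pairs on the same side of both means vs. on opposite sides.

    Hypothetical helper: points lying exactly on a mean line count in neither group.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    same = opposite = 0
    for xi, yi in zip(x, y):
        product = (xi - mean_x) * (yi - mean_y)
        if product > 0:
            same += 1       # upper-right or lower-left of the mean lines
        elif product < 0:
            opposite += 1   # upper-left or lower-right of the mean lines
    return same, opposite

# Made-up illustrative values: y falls as x rises.
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 5, 2]
same, opposite = quadrant_counts(x, y)
direction = "positive" if same > opposite else "negative" if opposite > same else "unclear"
print(same, opposite, direction)  # 0 4 negative
```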
SPSS Output for Question 2.
3. In the editorial and the letter to the editor of the Indianapolis Star,
questions are raised regarding the relationship among tax per pack of cigarettes, the
number of persons of legal age who smoke, and the death rate related to cigarettes. Use
the CDC smoking data for the following questions investigating the hypothesized
relationships:
a. Using hand calculations ONLY (even for mean values), calculate r
between TaxRate and SmokerDeath for the Northwest and Southeast regions. Show all of your
calculations.
b. Using SPSS, calculate a matrix of correlation coefficients (r
values) for the variables TaxRate, SmokerDeath and Smkr18.
i. What value is on the diagonal (from top left
to bottom right)? Why is this value along the diagonal?
ii. What do you notice about the values below
the diagonal and those above the diagonal?
iii. To present a matrix of r values
in a report, what changes would you make to the SPSS printout?
iv. One of the options when creating a scattergram is "Matrix". Select this
option, and enter the variables TaxRate, SmokerDeath and Smkr18 in the same order as you
did for the Correlate procedure. Comment on any similarity between the matrix of scattergrams
created and the matrix of correlation coefficients created in step b. above.
c. Create z-scores for TaxRate, SmokerDeath and %smokers.
d. Create a scattergram using the z-scores for the variables
TaxRate and SmokerDeath.
e. Create a scattergram for the variables TaxRate and SmokerDeath.
f. Are the scattergrams in 3.d and 3.e the same or different? Explain why
this happens.
g. Create a scattergram using the z-scores for the variables
TaxRate and %smokers.
h. Create a scattergram for the variables TaxRate and %smokers.
i. Edit the scattergrams created in 3.g and 3.h to
add lines representing the mean values of each variable. (In the Chart
Editor, select Chart, choose Reference Line, and experiment with this
option.)
i. Does the position of the individual data points (pairs of scores)
relative to the mean lines agree with r?
j. Edit axis titles on scattergram 3.h, and provide a more descriptive
title for the entire scattergram.
k. Edit the scattergram created in 3.h to draw in the regression line.
(In the Chart Editor, select Chart, choose Chart Options,
toggle Fit Line, and experiment with this option.)
i. Describe the slope of the regression line in
each scattergram. Does the slope concur with the calculated r value?
l. Using the r and r² values, interpret the relationship between:
i. TaxRate and SmokerDeath
ii. TaxRate and %smokers
iii. %smokers and SmokerDeath
m. How do your interpretations compare to those of the editorial writer
and the writer of the letter to the editor?
i. What advice would you give to someone
wanting to draw a conclusion regarding a relationship between two or more variables?
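The ideas behind parts c through f, k and l of Question 3 can be checked numerically with a small sketch. It standardizes two variables, compares r before and after standardizing, compares the regression slopes, and computes r². All function names and numbers are made up for illustration and are not the CDC data. The z-scores here assume a standard deviation with n - 1 in the denominator; check which form your SPSS procedure uses.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    # Assumption: standard deviation with n - 1 in the denominator.
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def z_scores(values):
    # Each score expressed as a distance from the mean in SD units.
    m, s = mean(values), sample_sd(values)
    return [(v - m) / s for v in values]

def sum_of_products(x, y):
    # Sum of cross-products of deviations; with x == y it is a sum of squares.
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def pearson_r(x, y):
    return sum_of_products(x, y) / sqrt(sum_of_products(x, x) * sum_of_products(y, y))

def slope(x, y):
    """Least-squares slope of the line predicting y from x."""
    return sum_of_products(x, y) / sum_of_products(x, x)

# Made-up illustrative values, NOT the assignment data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
zx, zy = z_scores(x), z_scores(y)

r_raw = pearson_r(x, y)    # r from the raw scores
r_z = pearson_r(zx, zy)    # r from the z-scores
b_raw = slope(x, y)        # slope in raw units
b_z = slope(zx, zy)        # slope in z-score units
r_squared = r_raw ** 2     # proportion of variance shared

print(round(r_raw, 3), round(r_z, 3))
print(round(b_raw, 3), round(b_z, 3))
print(round(r_squared, 3))
```

With these made-up values, r is the same for raw scores and z-scores, the slope fitted to the z-scores equals r, and the raw-score slope equals r multiplied by the ratio of the standard deviation of y to that of x. Only the axis units change when scores are standardized, not the pattern of points.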
SPSS Output for Question 3