KNR 445
Statistical Applications in Science & Technology

Assignment
Standard Deviation, Normal Curve & z-scores

The purpose of this lab is to demonstrate how the mathematical properties of the normal curve make it useful for looking at the relative position of a single score in a distribution of scores. The concept of z-scores will be exhibited here, providing a foundation for our consideration of relationship in statistics. You will use the data file created earlier when you added the variable "Region" to the CDC data file, and will create a new data file using the rain data from the Pantagraph.

Standard Deviation

A measure of variability, or dispersion of scores around the measure of central tendency. Calculated using deviation scores about the mean, the SD has characteristics that make it useful in descriptive and inferential statistics.
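
As a minimal sketch of the calculation described above (not part of the assignment itself), the standard deviation can be built directly from deviation scores about the mean. The function name `standard_deviation`, the `sample` switch, and the scores are hypothetical, chosen only for illustration; whether to divide by n or n - 1 depends on the convention your text uses for S.

```python
import math

def standard_deviation(scores, sample=True):
    """Standard deviation computed from deviation scores about the mean."""
    n = len(scores)
    mean = sum(scores) / n
    # Deviation score: how far each score lies from the mean.
    deviations = [x - mean for x in scores]
    sum_of_squares = sum(d ** 2 for d in deviations)
    # Assumption: divide by n - 1 for a sample estimate, by n for a population.
    divisor = n - 1 if sample else n
    return math.sqrt(sum_of_squares / divisor)

# Hypothetical scores, for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
population_sd = standard_deviation(scores, sample=False)
sample_sd = standard_deviation(scores)
print(population_sd, sample_sd)
```

The two results differ only in the divisor, which is why the gap between them shrinks as the number of scores grows.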

Normal Curve
A mathematically described curve that is a foundation of statistics. The frequency distributions of many naturally occurring phenomena are believed to exhibit the shape of the normal curve.

Z-score
A standardized score expressing the value of an individual case in units of the standard deviation. Use the SPSS procedure Descriptives to calculate and save z-scores within a data file.
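
The definition above amounts to z = (score - mean) / SD, and the conversion can be reversed as score = mean + z × SD. The sketch below illustrates both directions; the helper names `to_z` and `from_z` and the distribution (mean 50, SD 10) are hypothetical and are not taken from the homework.

```python
def to_z(score, mean, sd):
    """Express a raw score in standard deviation units."""
    return (score - mean) / sd

def from_z(z, mean, sd):
    """Convert a z score back to the raw-score scale."""
    return mean + z * sd

# Hypothetical distribution, for illustration only.
mean, sd = 50.0, 10.0
z = to_z(65.0, mean, sd)       # a score 15 points above the mean
raw = from_z(-0.5, mean, sd)   # a score half an SD below the mean
print(z, raw)
```

A positive z places the score above the mean and a negative z places it below, regardless of the original units.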

Homework

1.      X̄ = 82 and S = 12 for the distribution of scores on a test of introversion-extroversion that is completed by a large group of college students (high scores are in the direction of introversion). Convert each of the following scores to a z score:
(a)          70
(b)         90
(c)          106
(d)         100
(e)          62
(f)           80

2.      Convert the following z scores back to introversion-extroversion scores from the distribution of Problem 1 (round answers to the nearest whole number):
(a)          0
(b)         –2.10
(c)          +1.82
(d)         –.75
(e)          +.25
(f)           +3.10

 3.      Make a careful sketch of the normal curve.   For each of the z scores of Problem 2, pinpoint as accurately as you can its location on that distribution.

 4.      In a normal distribution, what proportion of cases fall (report to four decimal places): 
(a)     above z = +1.00?
(b)     below z = –2.00?
(c)     above z = +3.00?
(d)     below z = 0?
(e)     above z = –1.28?
(f)      below z = +1.62?
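
Proportions of this kind are normally read from a table of the normal curve. As a cross-check, the sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper names and the z value of 0.50 are hypothetical and do not correspond to any item above.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def proportion_below(z):
    """Area under the normal curve to the left of z."""
    return standard_normal.cdf(z)

def proportion_above(z):
    """Area under the normal curve to the right of z."""
    return 1.0 - standard_normal.cdf(z)

# Hypothetical z score, reported to four decimal places.
below = round(proportion_below(0.50), 4)
above = round(proportion_above(0.50), 4)
print(below, above)
```

Because the total area under the curve is 1, the two proportions for any z always sum to 1.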

 5.      In a normal distribution, what proportion of cases fall between:
(a)     z = –1.00 and z = +1.00?
(b)     z = –1.50 and z = +1.50?
(c)     z = –2.28 and z = 0?
(d)     z = 0 and z = +.50?
(e)     z = +.75 and z = +1.25?
(f)      z = –.80 and z = –1.60?

 6.      In a normal distribution, what proportion of cases fall:
(a)     outside the limits z = –1.00 and z = +1.00?
(b)     outside the limits z = –.50 and z = +.50?
(c)     outside the limits z = –1.26 and z = +1.83?
(d)     outside the limits z = –1.96 and z = +1.96?
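
The area between two z scores is the difference of the two cumulative areas, and the area outside the limits is whatever remains of the total area of 1. A minimal sketch, assuming Python's `statistics.NormalDist`; the function names and the limits –.25 and +1.10 are hypothetical:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # defaults to mean 0, SD 1

def proportion_between(z_a, z_b):
    """Area under the normal curve between two z scores (given in any order)."""
    low, high = sorted((z_a, z_b))
    return standard_normal.cdf(high) - standard_normal.cdf(low)

def proportion_outside(z_a, z_b):
    """Area in the two tails beyond the limits."""
    return 1.0 - proportion_between(z_a, z_b)

# Hypothetical limits, for illustration only.
between = proportion_between(-0.25, 1.10)
outside = proportion_outside(-0.25, 1.10)
print(round(between, 4), round(outside, 4))
```

Sorting the limits first means the function gives the same answer whichever z score is listed first.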

 7.      In a normal distribution, what z scores:
(a)          enclose the middle 99% of cases?
(b)     enclose the middle 95% of cases?
(c)     enclose the middle 75% of cases?
(d)     enclose the middle 50% of cases?
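
Finding the limits that enclose a middle percentage works backward from area to z: split the leftover area equally between the two tails, then look up the z score that cuts off the upper tail. The sketch below uses `NormalDist.inv_cdf` from Python's standard library; the function name `middle_limits` and the 90% example are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()

def middle_limits(proportion):
    """z scores that enclose the given middle proportion of cases."""
    # Leftover area is shared equally by the lower and upper tails.
    tail = (1.0 - proportion) / 2.0
    z = standard_normal.inv_cdf(1.0 - tail)
    # The normal curve is symmetric, so the lower limit mirrors the upper.
    return -z, z

# Hypothetical example: the middle 90% of cases.
low, high = middle_limits(0.90)
print(round(low, 3), round(high, 3))
```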

8.      In a normal distribution, what is the z score:
(a)          above which the top 5% of the cases fall?
(b)         above which the top 1% of the cases fall?
(c)          below which the bottom 5% of the cases fall?
(d)         below which the bottom 75% of the cases fall?

 10.    The Maine Educational Assessment (MEA), a test completed annually by all students in the state in select grades, has X̄ = 250 and S = 50.

(a)          What MEA score separates the upper 30% of the cases from the lower 70%?
(b)         What score is the 70th percentile (P70)?
(c)          What score corresponds to the 40th percentile (P40)?
(d)         Between what two MEA scores do the central 80% of scores fall?
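
Percentile questions of this kind combine two steps: find the z score below which the stated percentage of cases falls, then convert that z score to the raw-score scale. A minimal sketch, assuming a normal distribution and using a hypothetical scale (mean 100, SD 15) rather than the MEA values; the function name `score_at_percentile` is also hypothetical.

```python
from statistics import NormalDist

def score_at_percentile(percentile, mean, sd):
    """Raw score below which the given percentage of cases falls."""
    # Step 1: percentage -> z score on the standard normal curve.
    z = NormalDist().inv_cdf(percentile / 100.0)
    # Step 2: z score -> raw score.
    return mean + z * sd

# Hypothetical scale, for illustration only.
p50 = score_at_percentile(50, 100.0, 15.0)
p90 = score_at_percentile(90, 100.0, 15.0)
print(round(p50, 2), round(p90, 2))
```

Note that the score separating the upper 10% from the lower 90% is the same score as the 90th percentile.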

 11.       The mean of a set of z scores is always zero.  Does this suggest that half of a set of z scores will always be negative and half always positive? (Explain.)

SPSS Questions

1. a. Enter the rain data from the article in the Pantagraph. Save the file.
    b. Create a histogram of the Summer Rainfall. Use the SPSS option to draw the normal curve on the histogram.
    c. Within what rainfall values would we find approximately 68% of the summer rainfall values?
    d. By hand, calculate the z-scores for the summer rainfall 1991-96, inclusive.
    e. For each of the years 1991-1996, inclusive, use the z-score and the abbreviated table of the normal curve to calculate the percentage of summer rainfalls that would exhibit more rain and the percentage that would exhibit less rain.
    f. Use SPSS to calculate the z-scores for the summer and year to date rainfall values in your data set. Have SPSS save the standardized scores as new variables. What are the new variables called? Why do you think they are given this name?
    g. Comment on the headline from the Pantagraph. Is the headline appropriate, based on your understanding of z-scores and the normal curve?
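
Standardizing a whole variable, as in the steps above, applies the z-score formula to every case using that variable's own mean and standard deviation. The sketch below is a plain Python illustration and not SPSS output; the rainfall values are hypothetical (not the Pantagraph data), and it assumes the n - 1 standard deviation.

```python
from statistics import mean, stdev

def standardize(values):
    """Return the z score of every value in the list."""
    m = mean(values)
    s = stdev(values)  # Assumption: sample SD with n - 1 in the divisor.
    return [(v - m) / s for v in values]

# Hypothetical rainfall totals in inches, for illustration only.
rainfall = [10.2, 7.5, 12.8, 9.1, 11.4]
z_rainfall = standardize(rainfall)
print([round(z, 2) for z in z_rainfall])
```

A standardized variable always has a mean of 0 and a standard deviation of 1, which is a quick way to check the saved scores.
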
SPSS Output for Question 1

2. Using the CDC smoking data:
    a. Use SPSS to calculate the z-scores for death, tax and smokers. Save the standardized scores as part of the data file.
    b. Look at the z-scores for death and tax. Do the z-scores exhibit a relationship of any sort (i.e., are those for death positive when tax is negative, or vice versa)?
    c. Use SPSS procedure Graphs ==> Scatter ==> Simple (then follow directions) to create a Scatter Diagram of z-scores of death and tax.
    d. Using the tax and death values for the states of Oregon, Illinois, Vermont and Texas, calculate the percentage of scores that
        i. Fall between the population mean and the state value
        ii. Fall below the state value
        iii. Fall above the state value
    e. What percentage of the states have a smoking-related death rate GREATER than 250 per 100,000?
    f. What percentage of the states have a tax LESS than 50 cents per pack?
SPSS Output for Question 2