
CHAPTER 10

Correlation

Correlation is simply a measure of the strength of the relationship between variables. There are procedures available for continuous scale data, for rank-order data, and for nominal scale data. Whenever you have continuous scale data on two variables, X and Y, a scatterplot is a great place to start.

SCATTERPLOTS

A scatterplot is a graph of the data for a correlation study. The X variable is plotted on the horizontal axis of your graph, and the Y variable is plotted on the vertical axis. Each case is represented as a point at the intersection of its X and Y values. Scatterplots give us a great deal of information about the relationship in an easy-to-use form.

Let's look at how you would make a scatterplot from a set of data. Suppose you're interested in the relationship between height and weight in men. Let's call height in inches the X variable and weight in pounds the Y variable. Here are your data from a sample of 20 men:

Case #    X    Y        Case #    X    Y
   1     65  160           11    76  283
   2     69  165           12    71  185
   3     70  173           13    69  180
   4     73  182           14    73  188
   5     75  199           15    72  179
   6     68  173           16    67  159
   7     70  215           17    68  178
   8     67  168           18    65  156
   9     72  230           19    76  258
  10     71  203           20    63  156

To make a scatterplot, we need to set up a graph with height in inches on the X, or horizontal axis. The range of values we need to include is 63 to 76 inches. We'll put weight in pounds on the Y, or vertical axis. The range of values here is 156 to 283 pounds. Once we've set up our graph, we'll begin to plot our data. For case #1, we'll find 65 on the X axis and 160 on the Y axis. Then we'll move up from 65 on the X axis until we come to a point even with 160 on the Y axis and make a dot at this point of intersection. Here's our first point:

You can plot the other 19 cases from this study in just the same way. Try this; the results are below:
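If you'd rather let a computer do the plotting, here's a minimal Python sketch using matplotlib (assuming it's installed; the variable names are just mine). It draws the same scatterplot from the 20 cases above:

    # A minimal sketch of the height/weight scatterplot, assuming matplotlib is installed.
    import matplotlib.pyplot as plt

    heights = [65, 69, 70, 73, 75, 68, 70, 67, 72, 71,
               76, 71, 69, 73, 72, 67, 68, 65, 76, 63]    # X, in inches
    weights = [160, 165, 173, 182, 199, 173, 215, 168, 230, 203,
               283, 185, 180, 188, 179, 159, 178, 156, 258, 156]  # Y, in pounds

    plt.scatter(heights, weights)
    plt.xlabel("Height in inches (X)")
    plt.ylabel("Weight in pounds (Y)")
    plt.title("Scatterplot of weight against height for 20 men")
    plt.show()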

We can gather a great deal of useful information from a scatterplot. For example, the one above tells us something about the direction of the relationship between weight and height and something about the strength of the relationship.

Direction of Relationship

We can try to draw a line that fits the points of the scatterplot as closely as possible. This line summarizes the relationship between the two variables. It is called a regression line. When we get serious about regression lines, there are rules for making the best-fitting line. For our purposes, we can just sketch a line that looks like a pretty good fit. The slope of the regression line tells us the direction of the relationship between X and Y. Look at a possible regression line for our scatterplot of weight and height in men:

The line slopes in a positive direction; that is, as X increases, so does Y. This indicates a positive correlation. The opposite, a negative correlation, occurs when Y decreases as X increases (and vice versa). Here's a scatterplot that shows negative correlation:

So the direction of the relationship is shown by a scatterplot.

Strength of Relationship

The strength of a relationship is also indicated by a scatterplot. Let's compare these three plots:

As you can see, the amount of "scatter" indicates the strength of the association between the X variable and the Y variable. Let's look at a couple of plots that show no scatter at all:

The one on the left is a plot of the relationship between the number of pounds of hamburger purchased and the cost in dollars of the purchase. A regression line fitted to this plot would pass through every one of the plotted points. When this is true, the correlation is said to be perfect; a perfect correlation means it is possible to predict either variable from the other with perfect accuracy. Because hamburger is sold at a set price per pound, that's the case here: if we know the number of pounds, we can predict the cost, and if we know the total cost, we can predict the number of pounds. Because the relationship is positive (as the number of pounds increases, so does the total purchase price), this plot shows a perfect positive correlation.

The plot on the right shows a perfect negative correlation, the relationship between distance from the source of a smell and strength of the smell. Once again, it is possible to fit a regression line to the plot that passes through every one of the points, and it is possible to predict one variable from the other with perfect accuracy.

There are two situations in which a regression line will pass through every point plotted, but where there is no correlation at all. Let's look at these:

On the left, we're studying a group of 42-year-old women and examining the relationship between age in years and weight in pounds. Even though a regression line can be fitted to every point on the plot, it is clear we have no predictive ability at all. Knowing a woman's age doesn't help us at all to predict her weight, because age (the Y variable in this plot) doesn't vary in this sample: everyone is 42 years old.

On the right, we have a similar problem. Here we're studying couples married 5 years and examining the relationship between years of marriage and age in years. Once again, we have no predictive ability because the X variable doesn't vary. So horizontal and vertical lines are exceptions to the straight-line-means-perfect-correlation rule. The reason for this is that, when one of the variables doesn't vary, we have no predictive ability at all.

Linearity

A scatterplot also tells us whether the relationship between X and Y is linear, that is, the best fit is accomplished by a straight line. Nonlinear relationships are ones in which a curve gives the best fit. We've looked at several linear relationships in earlier scatterplots. Now let's take a look at a couple of nonlinear ones:

On the left is a scatterplot of the relationship between driving speed and gas mileage. Note that there is an optimum speed which gives the best gas mileage, but speeds below and above this optimum give lower gas mileage. This nonlinear relationship would be best fitted with a curve. On the right is a scatterplot of the relationship between time spent shopping and the cost savings that are realized. Here it is clear that up to a point, there's a substantial gain in savings by spending more time shopping; after that point the returns start to level off for additional time spent--the famous point of diminishing returns. This plot is also best fitted with a curve.

Homoscedasticity/Heteroscedasticity

These interesting words refer to whether the scatter around a regression line is about the same through the entire range of values of the variables. When the scatter is pretty even throughout the range, the plot is said to be homoscedastic (homo-si-das-tic). When the scatter is wider at some points than others, the plot is called heteroscedastic (hetero-si-das-tic). Here's an example:

The plot shows the relationship between annual income and nutritional status. Note that the relationship between the variables is quite close at the lower end of the income/nutritional status scale, but becomes increasingly weak at the high end. This may be because once a certain level of income is achieved, there is enough money to maintain proper nutrition; after that point the matter is one of choice. Heteroscedastic relationships are more complicated to analyze than homoscedastic ones.

Outliers

A scatterplot also helps us to spot outliers, cases which are very different from the rest of cases in their combined values for X and Y. It is important to note that the value for X may be well within the range of other cases, and the value for Y may also be well within the range of other cases; it is the combined values that mark the case as different. Outliers show up on a scatterplot because they're well separated from the rest of the cases. Let's look at a scatterplot with an outlier:

The outlier, outlined in red, is way off in the corner by itself. Outliers are easy to spot in a scatterplot. They are of interest for a number of reasons: they might represent a data-entry error (always worth a check when something like this shows up); they may represent an exception to whatever relationship the correlation shows; and they distort the computed correlation.

PEARSON PRODUCT-MOMENT CORRELATION

The Pearson product-moment correlation is the most widely used correlational procedure. It measures the strength of the relationship between two variables by measuring the degree to which scores on those variables agree. It is a parametric procedure which assumes that the data are continuous and that the variables are normally distributed.

Computing the Pearson Correlation (rp)

The Pearson r is based on the z-scores of the values of X and Y for each case. Since a z-score expresses the value in terms of how many standard deviations it falls from the mean of the distribution, it standardizes the scales on which the variables were measured.

To compute the Pearson r, we need a z-score for each value of X and Y. Here's the formula:

rp = Σ(zXzY) / N

The zXzY product is called a cross-product. We compute a cross-product for each case, add up the cross-products, and divide by N. As agreement between values of X and Y increases, the cross-products become increasingly positive; as inverse agreement increases, the cross-products become increasingly negative. The Pearson r is really just the mean of the cross-products (add them all up and divide by N), so it is the mean amount of agreement between the locations of cases on X and Y.
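Here's a minimal Python sketch of that recipe (the function name is mine, and, like this chapter, it uses divide-by-N standard deviations for the z-scores):

    import math

    def pearson_r(x, y):
        """Mean of the z-score cross-products, using divide-by-N standard deviations."""
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
        sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
        cross_products = [((xi - mean_x) / sd_x) * ((yi - mean_y) / sd_y)
                          for xi, yi in zip(x, y)]
        return sum(cross_products) / n   # the mean cross-product is the Pearson r

Fed the height and weight lists from the scatterplot sketch above, it should return a value right around the +0.79 we work out by hand below.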

Sample Problem

Let's find the Pearson r for the data on men's height and weight. We need to compute a z-score for each value of X and Y. Remembering that a z-score expresses a score's value in terms of how many standard deviations it is above or below the mean, let's revisit that formula:

z = (score - mean) / standard deviation, so zX = (X - mean of X) / SD of X, and likewise for Y.

Of course, to find z-scores for values of X and Y, we need to know the standard deviation of each set of scores. Looks like we have our work cut out for us. Let's get started:

Case     X     X - mean    (X - mean)²
  1     65        -5           25
  2     69        -1            1
  3     70         0            0
  4     73         3            9
  5     75         5           25
  6     68        -2            4
  7     70         0            0
  8     67        -3            9
  9     72         2            4
 10     71         1            1
 11     76         6           36
 12     71         1            1
 13     69        -1            1
 14     73         3            9
 15     72         2            4
 16     67        -3            9
 17     68        -2            4
 18     65        -5           25
 19     76         6           36
 20     63        -7           49
Sum   1400                    252

Mean of X = 1400 / 20 = 70; SD of X = √(252 / 20) ≈ 3.55

Case     Y     Y - mean    (Y - mean)²
  1    160      -29.5        870.25
  2    165      -24.5        600.25
  3    173      -16.5        272.25
  4    182       -7.5         56.25
  5    199        9.5         90.25
  6    173      -16.5        272.25
  7    215       25.5        650.25
  8    168      -21.5        462.25
  9    230       40.5       1640.25
 10    203       13.5        182.25
 11    283       93.5       8742.25
 12    185       -4.5         20.25
 13    180       -9.5         90.25
 14    188       -1.5          2.25
 15    179      -10.5        110.25
 16    159      -30.5        930.25
 17    178      -11.5        132.25
 18    156      -33.5       1122.25
 19    258       68.5       4692.25
 20    156      -33.5       1122.25
Sum   3790                 22061.00

Mean of Y = 3790 / 20 = 189.5; SD of Y = √(22061 / 20) ≈ 33.21

Case     X      zX       Y      zY      zXzY
  1     65    -1.41    160    -0.89     1.25
  2     69    -0.28    165    -0.74     0.21
  3     70     0       173    -0.50     0
  4     73     0.85    182    -0.23    -0.20
  5     75     1.41    199     0.29     0.41
  6     68    -0.56    173    -0.50     0.28
  7     70     0       215     0.77     0
  8     67    -0.85    168    -0.65     0.55
  9     72     0.56    230     1.22     0.68
 10     71     0.28    203     0.41     0.11
 11     76     1.69    283     2.82     4.77
 12     71     0.28    185    -0.14    -0.04
 13     69    -0.28    180    -0.29     0.08
 14     73     0.85    188    -0.05    -0.04
 15     72     0.56    179    -0.32    -0.18
 16     67    -0.85    159    -0.92     0.78
 17     68    -0.56    178    -0.35     0.20
 18     65    -1.41    156    -1.01     1.42
 19     76     1.69    258     2.06     3.48
 20     63    -1.97    156    -1.01     1.99
Sum                                    15.75

rp = ΣzXzY / N = 15.75 / 20 = 0.79

The Pearson r is also called the Pearson coefficient of correlation. Perfect positive correlation between the values of X and Y for the cases in a sample would result in a value of rp = +1. Perfect negative correlation would result in rp= -1. If there is no relationship at all between values of X and Y, rp = 0. So the range of possible values for rp is from -1 to +1, with values close to either extreme indicating stronger correlation and values toward the middle (0) indicating weaker correlation.

So our Pearson r of +0.79 is a fairly strong correlation, indicating a strong relationship between height and weight in our sample of men.

The Pearson r therefore tells us two things: the direction of the relationship, whether negative or positive; and the strength of the relationship, from -1 to +1. Additionally, an even more useful measure when you get interested in regression (prediction) is rp². This value tells the proportion of variance in Y that is explained by X, and vice versa. For our height-weight study, the value of rp² is 0.624, which means that 62.4% of the variance in height is explained by weight and 62.4% of the variance in weight is explained by height. That's quite a bit.
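If you have Python and scipy handy, you can check both numbers in a couple of lines. This is just a sketch; the list names are mine and scipy is assumed to be installed:

    from scipy.stats import pearsonr

    heights = [65, 69, 70, 73, 75, 68, 70, 67, 72, 71,
               76, 71, 69, 73, 72, 67, 68, 65, 76, 63]
    weights = [160, 165, 173, 182, 199, 173, 215, 168, 230, 203,
               283, 185, 180, 188, 179, 159, 178, 156, 258, 156]

    r, p_value = pearsonr(heights, weights)
    print(round(r, 2))        # about 0.79, matching the hand calculation above
    print(round(r ** 2, 2))   # about 0.62, the proportion of variance explained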

Limitations

There are some problems with the Pearson r that limit its usefulness in certain circumstances. One difficulty is with restricted variance, that is, when either X or Y is not free to vary fully. Restricting variance will reduce the value of rp, no matter how closely related the variables might really be. Note from our scatterplots of a vertical and a horizontal line that allowing no variance at all for either X or Y reduces correlation to zero. Even just restricting variance causes problems. So in studies where a portion of the possible values of one variable is excluded, the Pearson r will give falsely low results. This might happen, for example, if you look at the GPA of graduating students; you've restricted the GPA variable to those over (usually) 2.0, because those with lower GPAs don't graduate.

Another problem is with nonlinear relationships. The Pearson r is designed only to measure linear relationships and will have a falsely low value when used to evaluate a relationship that is nonlinear. You will find that, even though the measure is insensitive to nonlinear relationships, it is widely used in many such cases where there is at least some linear component. I think that's because people are used to it, and because it is widely considered to be the best measure of correlation, even when it is inappropriately used.

And the Pearson r overreacts to outliers. Remember our discussions of how an extreme value affects a mean? Well, the Pearson r is simply a mean, so it is disproportionately influenced by extreme values (outliers).
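Here's a quick, made-up illustration of that sensitivity (the numbers are invented just for this demonstration, and scipy is assumed):

    from scipy.stats import pearsonr

    # Nine points with only a weak positive trend (invented values).
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    y = [3, 7, 2, 8, 5, 1, 9, 4, 6]
    r_without, _ = pearsonr(x, y)              # roughly 0.17, a weak correlation

    # Add one extreme case and the coefficient jumps.
    r_with, _ = pearsonr(x + [30], y + [60])   # roughly 0.95

    print(round(r_without, 2), round(r_with, 2))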

The Statistical Significance of rp

We are faced with the same question when evaluating correlation that we were faced with when doing significant difference testing: is the result we have a probable result of sampling error? We can use much the same set of procedures to make this determination.

Hypotheses. We can write two hypotheses to explain the relationship we are seeing:

H0: The correlation observed is no greater than would be expected in a sample drawn from a population in which the variables X and Y are unrelated. The correlation is an effect of sampling error and is nonsignificant. The variables X and Y are unrelated.

H1: The correlation observed is greater than would be expected in a sample drawn from a population in which the variables X and Y are unrelated. It is unlikely to be an effect of sampling error and is significant. The variables X and Y are related.

The Test Statistic. We need a test statistic which varies in size with the strength of the relationship being tested. Since rp does just that, we can use it as our test statistic.

The Sampling Distribution of rp. Here we draw all possible samples of size N from a population in which X and Y are unrelated and compute the Pearson r for each sample. The resulting distribution of values of rp is our sampling distribution. Once again, the mean and most likely value for rp is 0, with larger values up to +1 and smaller values down to -1 less and less frequent. If N is large enough, the sampling distribution is nearly normal; with smaller Ns the shape will flatten. Once again, we have access to critical values of rp for various values of N, which are provided in a table, Table 8, Critical Values of the Pearson Product-Moment Correlation, on page 417 in your textbook. We enter the table with either a one-tail or two-tail test at a given degrees of freedom, df = N - 2, and a given significance level.

We can use the critical value for rp to mark the critical region, then determine whether our obtained rp falls within that critical region. For our study of height and weight in men, with a two-tail test at a significance level of 0.05 and df = N - 2 = 20 - 2 = 18, the critical value for rp is 0.4438. Our obtained rp of 0.79 falls in the critical region and is significant. We will accept H1 and reject H0, concluding that our correlation is not likely to be due to sampling error and that our variables, X and Y, are indeed correlated in the population from which our sample was drawn.
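If you don't have the table handy, the critical value can be recovered from the t distribution using the standard relationship r = t / √(t² + df). Here's a sketch assuming scipy is available:

    from math import sqrt
    from scipy.stats import t

    n = 20
    df = n - 2
    alpha = 0.05

    t_crit = t.ppf(1 - alpha / 2, df)          # two-tailed critical t
    r_crit = t_crit / sqrt(t_crit ** 2 + df)   # convert to a critical Pearson r

    print(round(r_crit, 4))   # about 0.4438, matching the tabled value
    print(0.79 > r_crit)      # True: our obtained r falls in the critical region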

One word about significance tests on coefficients of correlation: when N is large enough, even a very small correlation is found significant. For a one-tail test, with N = 102, an rp of 0.16 is significant. While this means the correlation is reliable, that is, likely to be seen again in another sample, this doesn't mean the relationship is a strong one. And, in fact, 0.16 is a pretty weak correlation. Don't confuse strength of association with significance. They are two different things.

THE SPEARMAN RANK-ORDER CORRELATION

When would we need an alternative to the Pearson product-moment correlation? Perhaps when our data don't meet the assumptions of the Pearson correlation: when the variables aren't distributed normally in the population, or when we don't have continuous scale data. Other circumstances in which we'd need an alternative include those which cause problems for the Pearson correlation: situations with restricted variance, nonlinear relationships, or an outlier.

The alternative is the Spearman rank-order correlation. This procedure, as its name indicates, uses ordinal scale data. If you have continuous scale data, you can scale it down to ordinal scale by rank-ordering the values of X and Y.

Computing the Spearman correlation (rs)

We've already talked about the first step in computing the Spearman r; it involves rank-ordering the values of X and the values of Y, if this is not already done. X and Y are ranked separately. You already know the rules for ranking from working with the Mann-Whitney U and the Wilcoxon T in Chapter 7--ascending order (lowest to highest) and handling ties.

Once this is done, you will compute a difference score (D) for each case. D is the difference between the case's rank on X and its rank on Y (rank for X - rank for Y). Having that, you use the formula:

rs = 1 - (6ΣD²) / (N(N² - 1))

Don't forget the subtracting-from-1 step; lots of times people get so wrapped up in the difference scores and all the squaring and adding up and dividing that they forget to do this last important step.
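In Python the whole recipe is only a few lines. This sketch leans on scipy's rankdata helper, which ranks in ascending order and gives tied scores the average of the ranks they occupy, just as described above; the function name is mine:

    from scipy.stats import rankdata

    def spearman_rs(x, y):
        """Spearman rank-order correlation via the difference-score formula."""
        n = len(x)
        rank_x = rankdata(x)     # ascending ranks; ties get the average rank
        rank_y = rankdata(y)
        d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
        return 1 - (6 * d_squared) / (n * (n ** 2 - 1))   # don't forget the 1 - ... step

Run on the height and weight lists from earlier, it should land right around the value we'll get by hand below.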

Sample Problem

Let's use for our example the same weight-height study we've already looked at. There isn't any reason we can't do a Spearman r on these data for illustration; the data are suitable for this procedure too, even though the Pearson r is preferred. Here are the data again:

Case #    X    Y        Case #    X    Y
   1     65  160           11    76  283
   2     69  165           12    71  185
   3     70  173           13    69  180
   4     73  182           14    73  188
   5     75  199           15    72  179
   6     68  173           16    67  159
   7     70  215           17    68  178
   8     67  168           18    65  156
   9     72  230           19    76  258
  10     71  203           20    63  156

Now we need to rank the X and the Y scores:

 X     Rank         Y     Rank
63      1          156     1.5
65      2.5        156     1.5
65      2.5        159     3
67      4.5        160     4
67      4.5        165     5
68      6.5        168     6
68      6.5        173     7.5
69      8.5        173     7.5
69      8.5        178     9
70     10.5        179    10
70     10.5        180    11
71     12.5        182    12
71     12.5        185    13
72     14.5        188    14
72     14.5        199    15
73     16.5        203    16
73     16.5        215    17
75     18          230    18
76     19.5        258    19
76     19.5        283    20

And let's put these rankings back into our original set of data so that we can compute the difference (D) and squared difference (D²) for each case.

Case     X    Rank X     Y    Rank Y      D        D²
  1     65      2.5    160      4        -1.5     2.25
  2     69      8.5    165      5         3.5    12.25
  3     70     10.5    173      7.5       3.0     9.00
  4     73     16.5    182     12         4.5    20.25
  5     75     18      199     15         3.0     9.00
  6     68      6.5    173      7.5      -1.0     1.00
  7     70     10.5    215     17        -6.5    42.25
  8     67      4.5    168      6        -1.5     2.25
  9     72     14.5    230     18        -3.5    12.25
 10     71     12.5    203     16        -3.5    12.25
 11     76     19.5    283     20        -0.5     0.25
 12     71     12.5    185     13        -0.5     0.25
 13     69      8.5    180     11        -2.5     6.25
 14     73     16.5    188     14         2.5     6.25
 15     72     14.5    179     10         4.5    20.25
 16     67      4.5    159      3         1.5     2.25
 17     68      6.5    178      9        -2.5     6.25
 18     65      2.5    156      1.5       1.0     1.00
 19     76     19.5    258     19         0.5     0.25
 20     63      1      156      1.5      -0.5     0.25
Sum                                             166.00

Having that done, we can fit things into our formula and compute rs:

rs = 1 - (6 × 166) / (20 × (20² - 1)) = 1 - 996/7980 = 1 - 0.125 = 0.875, or about 0.88

Evaluating Significance

Significance of the Spearman r is tested just as is the significance of the Pearson r. We write hypotheses in the same manner; our test statistic is the coefficient of correlation, this time, rs; and we use tabled values from a sampling distribution (this time of rs) to identify the critical value. Since this is a different sampling distribution, we read a different table, this time Table 9, Values of the Spearman Rank-Order Correlation, on page 418 in the textbook.

Hypotheses. We write two hypotheses to explain the correlation seen:

H0: The correlation observed is no greater than would be expected in a sample drawn from a population in which the variables X and Y are unrelated. The correlation is an effect of sampling error and is nonsignificant. The variables X and Y are unrelated.

H1: The correlation observed is greater than would be expected in a sample drawn from a population in which the variables X and Y are unrelated. It is unlikely to be an effect of sampling error and is significant. The variables X and Y are related.

The Significance Test. Entering the table for a two-tail test at significance level 0.05 with N = 20, we find a critical value for rs of 0.450. Our obtained rs falls in the critical region, so it is significant. We accept H1 and reject H0.

It is useful to note that when there are lots of tied ranks, the Spearman r computed this way comes out a little falsely high. We see this effect here; there were many tied ranks in the X values and a few in the Y values, and the Spearman r is quite a bit higher than the Pearson r. A solution to this problem (which you do NOT have to know for the test) is to find ranks just as you did in this problem, then use the Pearson formula to find a coefficient of correlation on those ranks. This method gives a fairer picture of the correlation in this circumstance.
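That tie-aware approach (rank first, then apply the Pearson formula to the ranks) is essentially how scipy's spearmanr works, so you can let it do the bookkeeping. A sketch, assuming scipy:

    from scipy.stats import spearmanr, rankdata, pearsonr

    heights = [65, 69, 70, 73, 75, 68, 70, 67, 72, 71,
               76, 71, 69, 73, 72, 67, 68, 65, 76, 63]
    weights = [160, 165, 173, 182, 199, 173, 215, 168, 230, 203,
               283, 185, 180, 188, 179, 159, 178, 156, 258, 156]

    # Pearson formula applied to the ranks (the tie-corrected approach described above) ...
    r_on_ranks, _ = pearsonr(rankdata(heights), rankdata(weights))

    # ... which is what spearmanr computes for us directly.
    rho, p_value = spearmanr(heights, weights)

    print(round(r_on_ranks, 2), round(rho, 2))   # the two values agree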

THE CHI-SQUARE TEST OF ASSOCIATION

We've dealt with continuous and ordinal scale variables; now we need a means to evaluate the relationship between nominal scale variables. And that takes us back into very familiar territory--the Chi-square test of association. If you think about it, this makes sense; when we did Chi-square tests in Chapters 6 and 7, it was to evaluate the relationship between variables there too. You'll find that the procedure is pretty much the same as it was when you used Chi-square for two-sample significance tests.

Statistical Significance

This is essentially a significance test, just like the ones we did in Chapter 7. First we write two hypotheses. Then the contingency table is set up with categories for one variable across the top and categories for the other variable down the left side. You then fill in the observed frequencies (actual counts from your sample) in each cell of the table. It is important to note that, just as with two-sample Chi-square tests, every case must fit into only one cell; overlap messes up the test. Then you compute row and column totals so that you can find expected frequencies. From there, the obtained Chi-square is computed and the significance test performed. To refresh your memory, here's the formula for obtaining Chi-square:

Chi-square = Σ [ (fo - fe)² / fe ]

It is useful to remember that all the old rules about small expected frequencies and Chi-square still apply. In a 2X2 table, there should be no expected frequencies below 5; in a larger table at least 80% of the cells should contain expected frequencies of at least 5, and no cell should have an expected frequency less than 1.

Sample Problem

Before I start assuming too much about just what you remember from Chapter 7, let's run a sample problem just to make sure we're all on the same page. Here goes:

Suppose we're looking at marketing for a kitchen cabinet company. We know that in some households the buying decisions are made primarily by a man, and in others they're made primarily by a woman. We're interested in determining whether male and female buyers have different buying preferences, so that we know whether we have to market separately or differently to men and to women. So we want to investigate the relationship between gender of the primary decision-maker and preferred cabinet materials.

Our sample is composed of people who purchased cabinets during a three-month survey period. We asked the buyers whether the buying decision was primarily made by a man or by a woman, and we asked these buyers which cabinet materials they preferred, considering price, appearance, and functionality. The results are as follows:

Our sample consisted of 154 shoppers, 49 male decision-makers and 105 female decision-makers. Among the men 14 preferred natural wood cabinets, 26 preferred stained wood, and 9 preferred laminated surfaces. Among the women 31 preferred natural wood, 20 preferred stained wood, and 54 preferred laminates.

Hypotheses. First we write our 2 hypotheses:

H0: The value of Chi-square obtained from this sample is no greater than would be expected due to sampling error. A sample drawn from a population in which the variables are unrelated would be likely to show a value for Chi-square as large as this one. The differences seen are nonsignificant and are due to sampling error. Male and female shoppers do not differ in their cabinet material preferences.

H1: The Chi-square obtained from this sample is unlikely to be due to sampling error. It is not likely that a population in which these variables are unrelated would show a value for Chi-square as large as this one. The differences seen are significant and are not due to sampling error. Male and female shoppers do differ in their cabinet material preferences.

Setting up the Contingency Table. Now we set up our table with categories for one variable across the top and categories for the other variable down the left side. When that's done, we can fill in the observed frequencies from our sample data.

                  Male    Female    Row Totals    Percentages
  Natural Wood     14       31
  Stained Wood     26       20
  Laminates         9       54
  Column Totals

Computing Expected Frequencies. To compute expected frequencies, remember that the goal is to determine how many men and women would be expected to display each preference if there were no difference between men and women--in other words, if the variables were unrelated. To do this, we need column totals and row totals and percentages.

                  Male    Female    Row Totals    Percentages
  Natural Wood     14       31          45           .292
  Stained Wood     26       20          46           .299
  Laminates         9       54          63           .409
  Column Totals    49      105         154          1.000

So we know that 29.2% of the entire survey group (men and women) preferred natural wood cabinets. That means that, if women and men do not differ on this preference, 29.2% of the men (or 14.3) and 29.2% of the women (or 30.7) will prefer natural wood. These are our expected frequencies. We will undertake the same steps to find the expected frequencies in the remaining cells, using the row percentages to determine the number of men and women with each preference. Go ahead and do this. The results follow, with the expected frequencies shown in parentheses:

                  Male           Female          Row Totals    Percentages
  Natural Wood    14  (14.3)     31  (30.7)          45           .292
  Stained Wood    26  (14.7)     20  (31.4)          46           .299
  Laminates        9  (20.0)     54  (42.9)          63           .409
  Column Totals   49            105                 154          1.000
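Each expected frequency is just (row total × column total) / grand total, which is the percentage approach in a single step. Here's a small Python sketch of that arithmetic (the variable names are mine):

    # Observed frequencies: rows are Natural Wood, Stained Wood, Laminates;
    # columns are Male, Female.
    observed = [[14, 31],
                [26, 20],
                [ 9, 54]]

    row_totals = [sum(row) for row in observed]          # 45, 46, 63
    col_totals = [sum(col) for col in zip(*observed)]    # 49, 105
    grand_total = sum(row_totals)                        # 154

    expected = [[row_totals[i] * col_totals[j] / grand_total
                 for j in range(len(col_totals))]
                for i in range(len(row_totals))]

    for row in expected:
        print([round(cell, 1) for cell in row])   # within rounding of the parenthesized values above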

Finding Obtained Chi-square. We use the formula to compute Chi-square.

   fo       fe      fo - fe    (fo - fe)²    (fo - fe)²/fe
   14     14.3       -0.3         0.09          0.0063
   31     30.7        0.3         0.09          0.0029
   26     14.7       11.3       127.69          8.6864
   20     31.4      -11.4       129.96          4.1389
    9     20.0      -11.0       121.00          6.0500
   54     42.9       11.1       123.21          2.8720
  Sum    154.0                                 21.7565

Our obtained Chi-square, the sum of the last column, is 21.7565.

Statistical Significance. Now we can proceed to test statistical significance just as we did in Chapter 7. We need to determine the critical value for Chi-square, use it to mark off the critical region, and determine whether our obtained Chi-square falls within this critical region. We use the same table to find the critical value for Chi-square as we've used all along. Degrees of freedom are found as before [df = (R - 1)(C - 1)]. So for this problem df = (3 - 1)(2 - 1) = 2. For a significance level of 0.05, the critical value of Chi-square is 5.991. That means our obtained Chi-square of 21.7565 is in the critical region and is significant. We'll accept H1 and reject H0. Gender and cabinet material preference are related.
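scipy will run the whole test in one call if you'd rather not fill in the table by hand. A sketch, assuming scipy; its Chi-square comes out slightly higher than ours because it doesn't round the expected frequencies:

    from scipy.stats import chi2_contingency

    observed = [[14, 31],    # Natural Wood:  Male, Female
                [26, 20],    # Stained Wood:  Male, Female
                [ 9, 54]]    # Laminates:     Male, Female

    chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)

    print(round(chi2, 2))    # a shade above the hand-computed 21.76
    print(dof)               # 2, i.e. (3 - 1)(2 - 1)
    print(p_value < 0.05)    # True: the association is significant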

Now what we've determined is that there is a significant difference between men and women in their cabinet material preference. This means that the difference is reliable--that we could expect to see such a difference between men and women if we studied another sample entirely. With nominal scale variables, we still have a job to do; and that is to describe the sort of difference we see. It looks as though there is a strong tendency for women to prefer laminates, particularly over stained wood, while there is a strong tendency for men to prefer stained wood, particularly over laminates.

Strength of Association

For starters, let's compare Chi-square with the coefficient of correlation (both Pearson and Spearman). The coefficient of correlation has a range of values from -1 to +1, with both of these values indicating a perfect correlation and 0 indicating no relationship at all.

Chi-square can never be negative; this makes sense when you realize that it results from squaring differences. Since squared numbers are always positive, there's no way to get a negative Chi-square. When you think about it, that's OK, because the sign (+ or -) of a correlation coefficient indicates the direction of the relationship--whether Y increases as X increases or increases as X decreases. With nominal scale variables, there is no increasing or decreasing, because nominal scale variables don't have any quantitative component. We can't say that cabinets increase from natural wood to stained wood to laminates; that doesn't make any sense. So to talk about the direction of the relationship is nonsensical. That means that there's no need for negative values of Chi-square; they wouldn't tell us anything anyhow.

So what value of Chi-square indicates a perfect correlation? Well . . . . . that's very difficult to say. This is because the value of Chi-square gets larger as N gets larger. Really big samples could have really big values for obtained Chi-square--the larger the number of cases, the larger the observed and expected frequencies and the larger the obtained Chi-square. So a Chi-square of 0 means there is no relationship between the variables, just as it does for coefficients of correlation. But if we ask ourselves what number would reflect perfect correlation, there isn't an answer. It depends on N.

Chi-square reflects the strength of association between nominal scale variables, but it's hard to interpret. It would be nice to have a measure that works like coefficients of correlation, topping out at +1 for a perfect correlation. Fortunately, there is such a measure. It is called Cramer's V. Its formula follows:

V = √( Chi-square / (N(n - 1)) )

Note that a lower case n is part of the mix now; this stands for the number of rows or the number of columns, whichever is smaller. Note also that we're now accounting for the sample size. This puts the obtained Chi-square into a form that is more directly comparable with a coefficient of correlation. Possible values for Cramer's V range from 0, which indicates no relationship at all, to +1, which indicates a perfect relationship.

Let's apply the Cramer's V computation to the results from the gender and cabinet material study. Since we have 3 rows and 2 columns, and 2 is smaller than 3, n = 2:

V = √( 21.7565 / (154 × (2 - 1)) ) = √0.1413 ≈ 0.38

Even though the significance test indicates that the association between gender and cabinet material preference is reliable, it isn't terribly strong. And that's much easier to see with Cramer's V than it was by looking simply at the obtained value of Chi-square.
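Here's the same arithmetic as a small Python helper (the function name is mine; scipy is assumed, reusing the observed table from the previous sketch):

    from math import sqrt
    from scipy.stats import chi2_contingency

    def cramers_v(observed):
        """Cramer's V: sqrt(chi-square / (N * (n - 1))), n = smaller of rows/columns."""
        chi2, _, _, _ = chi2_contingency(observed, correction=False)
        grand_total = sum(sum(row) for row in observed)
        n_smaller = min(len(observed), len(observed[0]))
        return sqrt(chi2 / (grand_total * (n_smaller - 1)))

    observed = [[14, 31],
                [26, 20],
                [ 9, 54]]

    print(round(cramers_v(observed), 2))   # roughly 0.38: a reliable but modest association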

CORRELATION AND CAUSATION

It is very tempting to assume, when we find a correlation between two variables, that one causes the other. We talked about examples of this back in Chapter 1 (if you can remember back that far), and your book has a wonderful example related to ice cream consumption and drownings (p. 334). Let's look at another example that has real-life significance.

It has been known for a long time that the incidence of colds increases in cold weather; that's where they got the name, cold. Most people think that being cold makes you more likely to get a cold. Parents are forever telling their kids to "put on a coat or you'll catch your death of cold." Laying aside the fact that people don't die from colds--ever, let's look at the easy assumption of causation here. It's pretty clear that these parents think being cold gives you a cold.

Now, do you remember from Chapter 1 what we said about demonstrating causation--that one variable causes another? We said that there is only one way to prove causation; and that way is to do an experimental study. That's a study where we deliberately manipulate one variable (the independent variable) and look for its effect on the other variable (the dependent variable). Well, that study's been done, repeatedly, mostly at the Salisbury Cold Institute in England. Here's what they did: they set up two groups of subjects. One group would get wetted down and put in a wind tunnel at low temperatures so they got thoroughly chilled. The other group would be kept warm and dry. Then both groups of subjects were exposed to identical doses of cold viruses and the numbers of colds that resulted were counted. Over and over again, the results of these studies were the same. The people who were chilled didn't get any more colds than those who were kept warm. The conclusion: we did NOT prove causation; in fact, we proved that being cold does NOT give you a cold or make you more susceptible to cold viruses or make you more likely to get a cold.

What's happened over the years is that people all over the world have mistaken correlation for causation. The fact that cold weather and colds occur together (are correlated) does not inescapably lead to the conclusion that being cold causes colds. In fact, we now know that the reason colds are more common in cold weather is that when it's cold outside people tend to congregate inside buildings where they share the air and are closer together. So when one person gets a cold, the chances are good that the cold viruses will be transferred to others. When the weather is warmer and people hang around outside more, the chances that your sneeze will land on someone else are much smaller than they would be if you were together in a room. That's all.

Statistical Procedures

Now, I'm going to ask you to remember way back to Chapter 1 again. Do you remember how you were encouraged to link significant difference tests with causation studies? Now you might wonder whether you could just as well use a correlation analysis to demonstrate causation in an experimental study. The answer is, "Absolutely!" Either procedure will do nicely. So why do we rely on significant difference tests instead of correlation tests for these studies? Good question. The reason appears to be that people like it that way. There's no big advantage for one over the other, just the way things are done.

You might also have noticed when we were looking at significant difference tests that some of those weren't experimental studies at all; they were studies done on preexisting groups like smokers and nonsmokers or men and women or old people and young people. In fact, significant difference tests work fine in these kinds of studies; but in these studies they don't demonstrate causation--they just show there's a relationship.

This means that the distinction between statistical procedures you were given in Chapter 1 was a little artificial. We can use significant difference tests and correlation tests pretty much interchangeably. (Why did we do this? To keep things simple at the beginning; it was confusing enough then, wasn't it?)

The real bottom line is this: whether we can prove causation depends, not on the statistical procedure we use, but on the kind of study that generated the data we're analyzing. If the study is experimental, we can prove (or disprove) causation. If the study was not, we can't. Period.

STATISTICAL VERSUS PRACTICAL SIGNIFICANCE

The issue of statistical and practical significance has come up before too. It is a particular problem with correlation because N so directly affects the likelihood that we'll find a given correlation significant. Once again, significance only lets us know whether a correlation is reliable; it doesn't tell us whether the correlation is important. For that, we have to use common sense as we look over our data and our conclusions.

CONCLUSION

You've finished Chapter 10. If you haven't gone through the Comprehension Checks in the chapter, I encourage you to do so. Once you've worked your way through these and checked your answers, try the Review Exercises at the end of the chapter. Remember, the answers for these are provided in your book too. This gives you many opportunities to practice and to check your understanding.

When you've finished all of the practice problems in the textbook, request a Chapter 10 worksheet via e-mail. It will be sent along with answers, so that you may check your work. When you feel ready, request the test.

You've now finished the entire course. As soon as you take the Chapter 10 test, you're on your way! We'll handle testing and retesting in the same manner as we have for every previous chapter; the big difference now is that, when you've finished with this, there won't be another chapter waiting for you. I hope you've enjoyed your brief trip through this foreign land and understand the basic concepts of statistics well enough to help you interpret things you read in your personal and professional life. Good luck with the rest of your education!