Chi-Square Test
Statistics can be used to determine whether differences among groups are significant or simply the result of chance variation. The statistical test most frequently used to determine whether experimentally obtained data provide a good fit to the expected (theoretical) data is the chi-square test. This test can be used to determine whether deviations from the expected values are due to chance alone or to some other circumstance. For example, consider corn seedlings resulting from an F1 cross between parents heterozygous for color.
A Punnett square of the F1 cross Gg × Gg predicts that the expected proportion of green:albino seedlings is 3:1. Use this information to fill in the expected (e) column and the (o - e) column in the table below.
| Phenotype | Genotype | # Observed (o) | Expected (e) | (o - e) |
|---|---|---|---|---|
| Green | GG or Gg | 72 | | |
| Albino | gg | 12 | | |
| Total | | 84 | | |
There is a small difference between the observed and expected results, but are these data close enough that the difference can be explained by random chance or variation in the sample?
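The expected counts come from splitting the total number of seedlings according to the 3:1 ratio. A minimal Python sketch of that step follows; the function name `expected_counts` is an illustrative choice, not something taken from the worksheet.

```python
def expected_counts(observed, ratio):
    """Split the observed total across categories according to a ratio such as [3, 1]."""
    total = sum(observed)
    parts = sum(ratio)
    return [total * r / parts for r in ratio]

observed = [72, 12]                       # green, albino
expected = expected_counts(observed, [3, 1])
deviations = [o - e for o, e in zip(observed, expected)]

print(expected)    # [63.0, 21.0]
print(deviations)  # [9.0, -9.0]
```

The deviations always sum to zero, because the expected counts are built from the same total as the observed counts.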
To determine whether the observed data fall within acceptable limits, a chi-square analysis is performed to test the validity of a null hypothesis: that there is no statistically significant difference between the observed and expected data. If the chi-square analysis indicates that the data vary too much from the expected 3:1 ratio, the null hypothesis is rejected and an alternative hypothesis is accepted.
The formula for chi-square is
$$\chi^2 = \sum \frac{(o - e)^2}{e}$$
where
o = observed number of individuals
e = expected number of individuals
Σ = the sum of the values (in this case, the differences, squared, divided by the number expected)
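The formula translates directly into a few lines of Python. This is a sketch rather than a reference implementation; the name `chi_square` is hypothetical, and the expected counts of 63 and 21 assume the 3:1 split of the 84 seedlings.

```python
def chi_square(observed, expected):
    """Sum of (o - e)^2 / e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Corn seedlings: 72 green and 12 albino, against an assumed 3:1 expectation.
print(round(chi_square([72, 12], [63, 21]), 2))  # 5.14
```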
a. This statistical test will examine the null hypothesis, which predicts that the data from the experimental cross above will fit the expected 3:1 ratio.
b. Copy the data from the above table to complete the table below.
| Phenotype | Observed (o) | Expected (e) | (o - e) | (o - e)^2 | (o - e)^2 / e |
|---|---|---|---|---|---|
| Green | 72 | | | | |
| Albino | 12 | | | | |
| | | | | Chi-square | |
c. Your calculations should give a chi-square value of 5.14. This value is then compared to the table of critical values below.
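The arithmetic behind that figure can be written out as a check, taking the expected counts to be 63 green and 21 albino (three quarters and one quarter of the 84 seedlings):

$$
\begin{aligned}
\chi^2 &= \frac{(72 - 63)^2}{63} + \frac{(12 - 21)^2}{21} \\
       &= \frac{81}{63} + \frac{81}{21} \\
       &\approx 1.286 + 3.857 \\
       &\approx 5.14
\end{aligned}
$$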
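Critical Values of the Chi-square Distribution

| Probability (p) | 1 df | 2 df | 3 df | 4 df | 5 df |
|---|---|---|---|---|---|
| 0.05 | 3.84 | 5.99 | 7.82 | 9.49 | 11.1 |
| 0.01 | 6.64 | 9.21 | 11.3 | 13.2 | 15.1 |
| 0.001 | 10.8 | 13.8 | 16.3 | 18.5 | 20.5 |

(df = degrees of freedom)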
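Each entry in the table is the chi-square value that would be reached or exceeded with probability p if the null hypothesis were true. For 1 and 2 degrees of freedom that probability has a simple closed form, so the first two columns can be spot-checked with the Python standard library. The sketch below is limited to those two cases, and `chi_square_tail` is a hypothetical helper name.

```python
import math

def chi_square_tail(chi2, df):
    """Probability of a chi-square value at least this large under the null hypothesis.

    Uses closed forms, so this sketch only supports df = 1 and df = 2.
    """
    if chi2 < 0:
        raise ValueError("chi2 must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2))
    if df == 2:
        return math.exp(-chi2 / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")

# Each tabulated critical value should give back roughly its row's probability.
print(round(chi_square_tail(3.84, 1), 2))  # 0.05
print(round(chi_square_tail(5.99, 2), 2))  # 0.05
```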
How to Use the Critical Values Table:
1. Determine the degrees of freedom for your experiment. It is the number of categories minus 1. Since there are two possible phenotypes, for this experiment df = 1 (2 categories - 1). If the experiment gathered data for a dihybrid cross, there would be four possible phenotypes, and therefore 3 degrees of freedom.
2. Find the critical value. In the column for 1 degree of freedom, find the critical value in the probability (p) = 0.05 row: it is 3.84. What does this mean? If the calculated chi-square value is greater than or equal to the critical value from the table, then the null hypothesis is rejected. In other words, chance alone is unlikely to explain the deviations we observed, and there is therefore reason to doubt our original hypothesis (or to question the accuracy of our data collection). By convention in the sciences, p = 0.05 is the largest probability at which a null hypothesis is rejected.
3. The calculated value of 5.14 is greater than 3.84, so these results are said to be significant at a probability of p = 0.05. This means that a deviation at least this large would be expected less than 5% of the time if the null hypothesis were correct, so the hypothesis that the data fit a 3:1 ratio is rejected at the 5% level.
4. If the calculated value were 7.0, then the null hypothesis would still be rejected, but this time at a probability of p = 0.01, because 7.0 is greater than the critical value of 6.64. This means that a deviation at least this large would be expected less than 1% of the time if the null hypothesis were correct.
5. Since these data do not fit the expected 3:1 ratio, you must consider reasons for this variation. Additional experimentation would be necessary. Perhaps the sample size is too small, or errors were made in data collection. In this example, perhaps the albino seedlings are underrepresented because they died before the counting was performed.
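The lookup procedure in the steps above can be sketched in Python. The critical values are copied from the table; the names `CRITICAL_VALUES`, `degrees_of_freedom`, and `strictest_significance` are illustrative assumptions, not part of the original worksheet.

```python
# Critical values copied from the table: CRITICAL_VALUES[p][df]
CRITICAL_VALUES = {
    0.05:  {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.1},
    0.01:  {1: 6.64, 2: 9.21, 3: 11.3, 4: 13.2, 5: 15.1},
    0.001: {1: 10.8, 2: 13.8, 3: 16.3, 4: 18.5, 5: 20.5},
}

def degrees_of_freedom(num_categories):
    """Number of categories minus 1."""
    return num_categories - 1

def strictest_significance(chi2, df):
    """Smallest tabulated p at which the null hypothesis is rejected, or None if it is not rejected."""
    rejected_at = [p for p, row in CRITICAL_VALUES.items() if chi2 >= row[df]]
    return min(rejected_at) if rejected_at else None

df = degrees_of_freedom(2)                 # green and albino: 2 categories
print(strictest_significance(5.14, df))    # 0.05
print(strictest_significance(7.0, df))     # 0.01
print(strictest_significance(3.0, df))     # None
```

A result of `None` means the calculated value is below the p = 0.05 critical value, so the null hypothesis is not rejected.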
Example 2: In a study of incomplete dominance in tobacco seedlings, the following counts were made from a cross between two heterozygous (Gg) plants:
| Phenotype | Genotype | Observed |
|---|---|---|
| Green | GG | 22 |
| Yellow-green | Gg | 50 |
| Albino | gg | 12 |
| Total | | 84 |
Complete the chi-square test for these data.
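One way to check a hand calculation for Example 2 is the Python sketch below. It assumes the expected ratio is 1:2:1 (GG:Gg:gg), which is what a Gg × Gg cross with incomplete dominance predicts but which the exercise does not state explicitly. It also uses the p = 0.05 critical value for 2 degrees of freedom, 5.99, from the table of critical values. The function and variable names are illustrative.

```python
def chi_square(observed, expected):
    """Sum of (o - e)^2 / e over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [22, 50, 12]   # green, yellow-green, albino
ratio = [1, 2, 1]         # assumed GG:Gg:gg ratio, not stated in the exercise
total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]   # [21.0, 42.0, 21.0]

chi2 = chi_square(observed, expected)
df = len(observed) - 1    # three phenotype categories, so df = 2
critical_value = 5.99     # p = 0.05 row, 2 df column of the critical values table
reject_null = chi2 >= critical_value

print(round(chi2, 2), df, reject_null)  # 5.43 2 False
```

Under these assumptions the calculated value falls just below the critical value, so the null hypothesis of a 1:2:1 ratio is not rejected at p = 0.05.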