Analysis of Variance Review Questions

1.     What are the 2 big “R’s” of experimental design, and why are they important?

Randomization – ensures that each experimental unit has an equal chance of receiving each treatment.  Needed in order to obtain unbiased estimates of treatment effects and experimental error.

Replication – each treatment level should be applied to multiple experimental units.  Needed in order to construct an estimate of experimental error.

2.     What is the between group sum of squares measuring?  What is the within group sum of squares measuring?

The between group sum of squares measures the squared deviations of the treatment means from the overall mean, weighted by the number of observations in each group.  That is, it measures the “separation” between the treatment groups.  The within group sum of squares measures the squared deviations of the individual observations from their respective treatment means.  That is, it measures the variability within the treatment groups, and so the homogeneity of each group.
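As a minimal sketch of these two definitions, the Python snippet below computes both sums of squares directly. The function name and the toy data are hypothetical and are not the clover data from the exercise.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (between, within) sums of squares for a list of groups."""
    all_obs = [y for g in groups for y in g]
    grand_mean = mean(all_obs)
    # Between: squared distance of each group mean from the grand mean,
    # weighted by the number of observations in that group.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within: squared distance of each observation from its own group mean.
    within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return between, within

# Hypothetical toy data: three well-separated, tight groups.
toy = [[10, 12, 14], [20, 22, 24], [30, 32, 34]]
ssb, ssw = sums_of_squares(toy)
print(ssb, ssw)
```

In this toy example the groups are far apart and internally tight, so the between group sum of squares is much larger than the within group sum of squares.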

3.     How can we use the between group and within group sums of squares to determine whether there is a difference between the treatment means for the different levels of our treatment?

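If the between group variability is large in comparison to the within group variability, that indicates a true difference between the means for the different levels of our treatment.  To make the comparison, we divide each sum of squares by its degrees of freedom to get mean squares, and then form the F ratio (between group mean square divided by within group mean square).  A large F ratio, and a correspondingly small p-value, is evidence that the treatment means are not all equal.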

4.     The data for this example come from Exercise 7.4.1 (p.150) in Steel, Torrie, and Dickey.  I have coded the data in SAS as follows:

WEIGHT represents the average plant weight of red clover, in grams

BREED represents the level of inbreeding, with

1 = no inbreeding
2 = slight inbreeding
3 = moderate inbreeding
4 = strong inbreeding

I ran the following SAS code to generate the analysis of variance shown below:

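proc glm ;
  class breed ;                            /* treat breed as a categorical factor */
  model weight = breed ;                   /* one-way ANOVA of weight on breed */
  contrast 'contrast 1' breed 1 -1 0 0 ;   /* level 1 vs. level 2 */
  contrast 'contrast 2' breed 1 0 -1 0 ;   /* level 1 vs. level 3 */
  contrast 'contrast 3' breed 1 0 0 -1 ;   /* level 1 vs. level 4 */
run ;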

The GLM Procedure

Class Level Information

Class         Levels    Values
breed              4    1 2 3 4

Number of observations    39

Dependent Variable: weight

                                        Sum of
Source                      DF         Squares     Mean Square    F Value    Pr > F
Model                        3     56411.78419     18803.92806      23.06    <.0001
Error                       35     28545.13889       815.57540
Corrected Total             38     84956.92308

R-Square     Coeff Var      Root MSE    weight Mean
0.664005      13.58425      28.55828       210.2308

Source                      DF       Type I SS     Mean Square    F Value    Pr > F
breed                        3     56411.78419     18803.92806      23.06    <.0001

Source                      DF     Type III SS     Mean Square    F Value    Pr > F
breed                        3     56411.78419     18803.92806      23.06    <.0001

Contrast                    DF     Contrast SS     Mean Square    F Value    Pr > F
contrast 1                   1      9322.84444      9322.84444      11.43    0.0018
contrast 2                   1     25844.06349     25844.06349      31.69    <.0001
contrast 3                   1     54531.14683     54531.14683      66.86    <.0001

Based on this code and output, answer the following questions:

a.      Report the Between group sum of squares.  Report the Within group sum of squares.

The Between group sum of squares is the model sum of squares:  56,411.78

The Within group sum of squares is the error sum of squares:  28,545.14

b.     Conduct a hypothesis test to determine whether or not the level of inbreeding has any effect on the weight.

Ho:  All treatment means are the same

Ha:  At least one treatment mean is different

F = 23.06 (with 3 and 35 degrees of freedom), p-value < 0.0001

At the 5% significance level, since the p-value is less than 0.05, we reject the null hypothesis and conclude that there are differences between the mean weights for the different levels of inbreeding.
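The F value used in this test can be reproduced from the sums of squares and degrees of freedom in the SAS output. The following is a small Python sketch of that arithmetic; the variable names are my own, and the numbers are copied from the Model and Error rows of the output.

```python
# Values copied from the Model and Error rows of the SAS output.
ss_model, df_model = 56411.78419, 3
ss_error, df_error = 28545.13889, 35

# Mean square = sum of squares divided by its degrees of freedom.
ms_model = ss_model / df_model
ms_error = ss_error / df_error

# F ratio compares between group to within group variability.
f_value = ms_model / ms_error

# R-square is the share of the total sum of squares explained by the model.
r_square = ss_model / (ss_model + ss_error)

print(round(ms_model, 2), round(ms_error, 2), round(f_value, 2), round(r_square, 4))
```

The results should agree with the Mean Square, F Value, and R-Square entries that SAS reports, up to rounding.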

c.     The mean of treatment group 1 (no inbreeding) is 271.56, and the mean of treatment group 2 (slight inbreeding) is 220.67.  Construct a 95% confidence interval for the difference between the mean weight for plants with no inbreeding and the mean weight for plants with slight inbreeding.  Based on this confidence interval, does there appear to be a significant difference between the average weights for these 2 groups?  Why, or why not?

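Using the error mean square (815.5754, with 35 degrees of freedom) as the estimate of the common variance, the group sizes of 9 and 6, and a t critical value of 2.0315:

(271.56 – 220.67) +/- (2.0315)*sqrt{[(815.5754)/9] + [(815.5754)/6]}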

50.89 +/- (2.0315)*sqrt(90.62 + 135.93)

50.89 +/- (2.0315)*(15.05)

50.89 +/- 30.57

So, we are 95% confident that the true difference between the treatment means is between 20.32 and 81.46.  And, since this interval does NOT contain 0, we can say that there is a significant difference between the average weights for plants with no inbreeding and plants with slight inbreeding.
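The same interval can be checked with a few lines of Python. This is only a sketch of the hand calculation: the function name is hypothetical, and the t critical value of 2.0315 is taken as given from the solution rather than computed.

```python
from math import sqrt

def mean_difference_ci(mean_a, mean_b, n_a, n_b, mse, t_crit):
    """Confidence interval for mean_a - mean_b using a pooled error variance."""
    diff = mean_a - mean_b
    # Standard error of the difference between two treatment means.
    se = sqrt(mse / n_a + mse / n_b)
    margin = t_crit * se
    return diff - margin, diff + margin

# Means, group sizes, error mean square, and t value as used in part (c).
lower, upper = mean_difference_ci(271.56, 220.67, 9, 6, 815.5754, 2.0315)
print(round(lower, 2), round(upper, 2))
```

The endpoints can differ from the hand calculation in the second decimal place, because the hand calculation rounds the standard error to 15.05 before multiplying. Either way, the interval lies entirely above 0.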

d.     What hypothesis is being tested by contrast 1?  Based on the p-value given for the F-test, what is your conclusion?  How does this compare to the conclusion from part (c)?

Ho: μ(no inbreeding) - μ(slight inbreeding) = 0

Ha: μ(no inbreeding) - μ(slight inbreeding) ≠ 0

p-value = 0.0018

At the 5% significance level, we see that 0.0018 < 0.05.  Therefore, we reject the null hypothesis and conclude that there is a significant difference between the mean weight for plants with no inbreeding and the mean weight for plants with slight inbreeding.

Part (c) gave us an alternative way of determining whether or not there is a significant difference between the 2 treatment means.  Our conclusion was the same – there is a significant difference between the mean weights for plants with no inbreeding and plants with slight inbreeding.
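The two approaches agree because they are the same test in different form: the squared t statistic for the difference in part (c) equals the F statistic for contrast 1. The Python sketch below checks this with the rounded means from part (c); variable names are my own, and small discrepancies from the SAS output are expected because the means are rounded to two decimals.

```python
from math import sqrt

# Quantities from part (c) and the SAS output.
diff = 271.56 - 220.67      # difference between the two treatment means
mse = 815.5754              # error mean square
n1, n2 = 9, 6               # group sizes used in part (c)

# t statistic for the difference between the two means.
se = sqrt(mse * (1 / n1 + 1 / n2))
t_stat = diff / se

# Sum of squares for the contrast with coefficients 1, -1, 0, 0.
contrast_ss = diff ** 2 / (1 / n1 + 1 / n2)

# Squaring t gives the single degree of freedom F statistic.
f_from_t = t_stat ** 2

print(round(t_stat, 2), round(f_from_t, 2), round(contrast_ss, 1))
```

The squared t statistic should match the reported F Value of 11.43 for contrast 1, and the contrast sum of squares should be close to the reported 9322.84.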

e.      How would you define a contrast to test the hypothesis that the mean for plants with moderate inbreeding is the same as the mean for plants with strong inbreeding?  Report the coefficients for each treatment level and state the null and alternative hypotheses for your test.

Coefficients: 0  0  1  -1

Ho:  μ(moderate inbreeding) - μ(strong inbreeding) = 0

Ha:  μ(moderate inbreeding) - μ(strong inbreeding) ≠ 0

Or, equivalently:

Ho:  μ(moderate inbreeding) = μ(strong inbreeding)

Ha:  μ(moderate inbreeding) ≠ μ(strong inbreeding)
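A quick way to sanity-check a set of contrast coefficients is to confirm that there is one coefficient per treatment level and that they sum to zero. The Python sketch below applies that check to the three contrasts in the SAS code and to the contrast from part (e); the helper name and the label "moderate vs strong" are hypothetical.

```python
# Coefficients in the order of the breed levels 1, 2, 3, 4.
contrasts = {
    "contrast 1": [1, -1, 0, 0],
    "contrast 2": [1, 0, -1, 0],
    "contrast 3": [1, 0, 0, -1],
    "moderate vs strong": [0, 0, 1, -1],  # part (e); label is hypothetical
}

def is_valid_contrast(coeffs, n_levels=4):
    """True if there is one coefficient per level, they sum to zero,
    and at least one coefficient is nonzero."""
    return (
        len(coeffs) == n_levels
        and sum(coeffs) == 0
        and any(c != 0 for c in coeffs)
    )

results = {name: is_valid_contrast(c) for name, c in contrasts.items()}
for name, ok in results.items():
    print(name, ok)
```

All four sets of coefficients pass the check, so each one compares two treatment means while giving zero weight to the other two levels.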