Up to this point, our goal has been to estimate a population parameter from a sample statistic. Sometimes, however, you have a different goal in mind. Suppose that you want to use the data to assess the validity of some claim being made about the population. To do this, you use a hypothesis test.

Example:

A manufacturer claims that a particular sport utility vehicle (SUV) gets 22 mpg on the highway. You suspect that the true highway gas mileage is less than that. The manufacturer also reports that the standard deviation for mileage is 3 mpg. You decide to test drive 30 of these SUVs and find out that the average highway mileage for your sample is 19.2 mpg. Does this sample give you enough evidence to support your theory that the manufacturer’s claim is false, or was it simply the result of chance?

For this problem, the central limit theorem tells us that the sample mean x-bar has an approximately normal distribution since n >= 30. If we assume that the manufacturer’s claim is true, then this distribution has a mean of 22 and a standard deviation of 3/sqrt(30) ≈ 0.55. If we sketch a normal curve, we notice that our sample mean of 19.2 falls approximately 5 standard deviations below the mean. So, our sample mean appears to be a very extreme value for this distribution.
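The arithmetic in this paragraph can be checked without rounding. The following is a minimal sketch using only Python's standard library; the variable names are illustrative choices, not part of the original notes.

```python
import math

# Values taken from the SUV example above
claimed_mean = 22.0   # mpg, the manufacturer's claim
sigma = 3.0           # mpg, reported population standard deviation
n = 30                # number of SUVs test driven
sample_mean = 19.2    # mpg, observed sample average

# Standard deviation of the sampling distribution of x-bar
standard_error = sigma / math.sqrt(n)

# How many standard errors the sample mean lies from the claimed mean
distance_in_se = (sample_mean - claimed_mean) / standard_error

print(f"standard error = {standard_error:.3f}")
print(f"sample mean is {distance_in_se:.2f} standard errors from the claim")
```

A negative distance means that the sample mean lies below the claimed mean.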

Question: How extreme does your sample value have to be in order to reject the manufacturer’s claim with some degree of confidence?

Hypothesis tests tell us how to answer that question.

Definitions:

H_{o} = null hypothesis: the claim that is assumed to be true about the population. We’re trying to find evidence against the null hypothesis.

H_{a} = alternative hypothesis: what you suspect is true of the population, that is, what you’re trying to show.

The null hypothesis will have the form:

H_{o}: μ = μ_{o}

The alternative hypothesis will have one of the following forms:

One-sided alternatives:

H_{a}: μ > μ_{o}

H_{a}: μ < μ_{o}

Two-sided alternative:

H_{a}: μ ≠ μ_{o}

For our example, then, the null and alternative hypotheses will be:

H_{o}: μ = 22

H_{a}: μ < 22

We need to answer the question: "Can we reject the null hypothesis in favor of the alternative?"

If we assume that the null hypothesis is true, we could calculate the probability that we could have gotten a sample value this small simply by chance:

Pr(x-bar <= 19.2) = Pr{Z <= (19.2 - 22)/(3/sqrt(30))} = Pr(Z <= -5.11) ≈ 0

So, there is practically no chance that we could have gotten a sample value this small if the null hypothesis is really true.

We call this probability the p-value.

p-value: Assume that the claim made under the null hypothesis is true. Then, the p-value is the probability that you could have gotten a value at least as extreme (in the direction of the alternative hypothesis) as the one computed from your sample.

BOTTOM LINE: The smaller the p-value, the stronger the evidence AGAINST the null hypothesis.
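To see just how small the p-value for the SUV example is, we can evaluate the lower-tail normal probability directly rather than reading it from a table. This is a minimal sketch that assumes Python 3.8 or later, where `statistics.NormalDist` is available; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# SUV example: H_o: mu = 22 versus H_a: mu < 22
sample_mean, mu_0, sigma, n = 19.2, 22.0, 3.0, 30

# Standardize the sample mean under the null hypothesis
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Lower-tail probability Pr(Z <= z) for a standard normal Z
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.2e}")
```

The result is positive but far smaller than any commonly used significance level, which is why a normal table reports it as 0.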

So, the basic steps for conducting a hypothesis test are:

- Describe your claim in terms of a population parameter.
- State the null hypothesis.
- State the alternative hypothesis (claim you believe to be true).
- Calculate an estimate of the population parameter from your sample.
- Assess whether your sample statistic is extreme enough to reject the null hypothesis by evaluating the p-value.

Suppose that you have a simple random sample of size n. To test the null hypothesis

H_{o}: μ = μ_{o}

at the α-significance level:

1. Compute the test statistic:

z = (x-bar - μ_{o}) / {σ / sqrt(n)}

2. Calculate the p-value as follows (note that the form of the p-value depends on the form of the alternative hypothesis):

H_{a}: μ > μ_{o}
p-value = Pr{ Z >= (x-bar - μ_{o}) / [σ / sqrt(n)] }

H_{a}: μ < μ_{o}
p-value = Pr{ Z <= (x-bar - μ_{o}) / [σ / sqrt(n)] }

H_{a}: μ ≠ μ_{o}
p-value = 2*Pr{ Z >= | (x-bar - μ_{o}) / [σ / sqrt(n)] | }

3. Compare the p-value to α. If the p-value is less than α, then you REJECT the null hypothesis. If the p-value is greater than α, then you FAIL TO REJECT the null hypothesis.
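The procedure above can be sketched as a single function. This is an illustration under stated assumptions, not a standard library routine: the function name `z_test` and the labels "greater", "less", and "two-sided" are hypothetical choices made here, and the 5% significance level in the usage lines is assumed because the SUV example does not state one.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu_0, sigma, n, alternative):
    """Return (z, p_value) for a z-test with known population sigma.

    alternative is "greater" (H_a: mu > mu_0), "less" (H_a: mu < mu_0),
    or "two-sided" (H_a: mu != mu_0). These labels are hypothetical.
    """
    # Step 1: test statistic
    z = (sample_mean - mu_0) / (sigma / sqrt(n))

    # Step 2: p-value, whose form depends on the alternative hypothesis
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1.0 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Usage with the SUV example; alpha = 0.05 is an assumed significance level
alpha = 0.05
z, p_value = z_test(19.2, 22.0, 3.0, 30, "less")

# Step 3: compare the p-value to alpha
reject_null = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.2e}, reject H_o: {reject_null}")
```

Keeping the alternative as an explicit argument mirrors the point made in step 2: the same test statistic leads to different p-values depending on which alternative hypothesis is being tested.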

Example:

Define the null and alternative hypotheses for each of the following scenarios:

1. Past experience indicates that the average time for high school students to complete a standardized test is 35 minutes. The average time for a random sample of 20 students to complete the test is 33.1 minutes. Is there evidence to suggest that the average time it takes for a high school student to complete the test is less than 35 minutes?

H_{o}: μ = 35

H_{a}: μ < 35

2. A manufacturer of sports equipment has developed a new synthetic fishing line that it claims has a mean breaking strength of 8 kg with a standard deviation of 0.5 kg. A random sample of 50 lines is tested and found to have a mean breaking strength of 7.8 kg. Perform a hypothesis test to determine whether or not the manufacturer's claim is true.

H_{o}: μ = 8 kg

H_{a}: μ ≠ 8 kg

3. The Edison Electric Institute has published figures that claim that a trash compactor is run an average of 125 hours per year. If a random sample of 49 homes equipped with a trash compactor indicates an average usage of 126.9 hours per year with a standard deviation of 8.4 hours, does this suggest that trash compactors are used, on average, more than 125 hours per year?

H_{o}: μ = 125

H_{a}: μ > 125

Conduct hypothesis tests for the following two scenarios:

1. In 1998, Habitat for Humanity published figures about fair market rents for 2-bedroom apartments. The fair market rent for Maine was listed as $590. You can assume that the standard deviation for rents on 2-bedroom apartments in Maine is $73.10. A sample of 32 2-bedroom apartments in Maine was taken and the average rent for the sample was $602.28. At the 5% significance level, do the data suggest that the mean monthly rent for 2-bedroom apartments in Maine differs from the fair market rent of $590?

- Is the sampling distribution of x-bar normal?
- Is σ known? If so, what is its value?
- State the null and alternative hypotheses.
- Calculate the test statistic.
- Calculate the p-value.
- At what significance level should the test be performed (i.e., what is α)?
- Use the p-value to draw a conclusion about the null hypothesis.
- Interpret your conclusion in non-statistical terms.

Yes – the sample size is greater than 30 so we can apply the CLT.

Yes – σ = 73.10

H_{o}: μ = 590

H_{a}: μ ≠ 590

z = (x-bar - μ_{o}) / {σ / sqrt(n)} = (602.28 - 590) / {73.10 / sqrt(32)} = 0.95

p-value = 2*Pr(Z >= |0.95|) = 2*{1 - Pr(Z <= 0.95)} = 2*(1 - 0.8289) = 0.3422

α = 0.05

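The p-value is greater than α. Therefore, we FAIL TO REJECT the null hypothesis.

In non-statistical terms, the sample does not provide enough evidence to conclude that the mean monthly rent for 2-bedroom apartments in Maine differs from the fair market rent of $590.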
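As a check on the table lookup, the two-sided test for the Maine rent data can be reproduced numerically. This is a minimal sketch assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Maine rent example: H_o: mu = 590 versus H_a: mu != 590
x_bar, mu_0, sigma, n, alpha = 602.28, 590.0, 73.10, 32, 0.05

# Test statistic
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-sided p-value: twice the upper-tail area beyond |z|
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))

reject_null = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H_o: {reject_null}")
```

Because the code does not round z to two decimal places before computing the tail area, its p-value can differ from the table-based value in the last digit or two.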

- Is the sampling distribution of x-bar normal?
- Is σ known? If so, what is its value?
- State the null and alternative hypotheses.
- Calculate the test statistic.
- Calculate the p-value.
- At what significance level should the test be performed (i.e., what is α)?
- Use the p-value to draw a conclusion about the null hypothesis.
- Interpret your conclusion in non-statistical terms.

Yes – the sample size is greater than 30 so we can apply the CLT.

Yes – σ = 7.61

H_{o}: μ = 40.69

H_{a}: μ > 40.69

z = (x-bar - μ_{o}) / {σ / sqrt(n)} = (44.12 - 40.69) / {7.61 / sqrt(40)} = 2.85

p-value = Pr(Z >= 2.85) = 1 - Pr(Z <= 2.85) = 1 - 0.9978 = 0.0022

α = 0.01

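The p-value is less than α. Therefore, we REJECT the null hypothesis.

In non-statistical terms, the sample provides strong evidence that the population mean is greater than 40.69.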
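The upper-tailed calculation above can be verified the same way. This is a minimal sketch using the values that appear in the worked solution (sample mean 44.12, null value 40.69, standard deviation 7.61, n = 40, significance level 0.01); it assumes Python 3.8 or later, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the worked solution: H_o: mu = 40.69 versus H_a: mu > 40.69
x_bar, mu_0, sigma, n, alpha = 44.12, 40.69, 7.61, 40, 0.01

# Test statistic
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Upper-tail p-value: Pr(Z >= z)
p_value = 1.0 - NormalDist().cdf(z)

reject_null = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H_o: {reject_null}")
```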