Sampling Distribution of the Mean

The heights, in inches, of the five starting players (whom we will call A, B, C, D, and E) on a men’s basketball team are displayed in the following table. Here, the population of interest consists of the five players, and the variable under consideration is height.

 Player   A   B   C   D   E
 Height  76  78  79  81  86

a. Obtain the sampling distribution of the mean for samples of size two.
b. Determine the probability that, for a random sample of size two, the sampling error made in estimating the population mean by the sample mean will be 1 inch or less; that is, determine the probability that the sample mean x-bar will be within 1 inch of the population mean mu.

For part (a), here are all possible samples of size two and their means:

 Sample  Heights  x-bar
 A, B    76, 78   77.0
 A, C    76, 79   77.5
 A, D    76, 81   78.5
 A, E    76, 86   81.0
 B, C    78, 79   78.5
 B, D    78, 81   79.5
 B, E    78, 86   82.0
 C, D    79, 81   80.0
 C, E    79, 86   82.5
 D, E    81, 86   83.5

Notice that the sample mean varies from sample to sample. The mean height for the entire population is mu = (76 + 78 + 79 + 81 + 86)/5 = 80 inches. So, we also see that most samples give an estimate that differs from the population mean.

We could calculate the probability that the sample mean will exactly equal the population mean by counting the number of samples whose mean equals 80 and dividing by the total number of samples. Each of the 10 samples is equally likely, and only one of them (C, D) has a mean of exactly 80, so there is a 1/10 probability that the sample mean will exactly equal the population mean.

For part (b) of the question, we count the sample means that are within 1 inch of the population mean, that is, between 79 and 81 inclusive. Three of the 10 sample means qualify (79.5, 80.0, and 81.0). Therefore, there is a 3/10 probability that the sample mean will be within 1 inch of the population mean.
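The counting for parts (a) and (b) can be checked with a short script. This is only a sketch: the variable names are illustrative, and exact fractions are used so that comparisons with the population mean are not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations

# Heights (inches) of the five players, taken from the table above.
heights = {"A": 76, "B": 78, "C": 79, "D": 81, "E": 86}

# Population mean, kept as an exact fraction.
mu = Fraction(sum(heights.values()), len(heights))

# All possible samples of size two (order does not matter).
samples = list(combinations(heights, 2))
sample_means = {s: Fraction(heights[s[0]] + heights[s[1]], 2) for s in samples}

# Each sample is equally likely, so a probability is a simple proportion.
p_exact = Fraction(sum(m == mu for m in sample_means.values()), len(samples))
p_within_1 = Fraction(
    sum(abs(m - mu) <= 1 for m in sample_means.values()), len(samples)
)

for s, m in sample_means.items():
    print(", ".join(s), float(m))
print("P(x-bar = mu) =", p_exact)
print("P(|x-bar - mu| <= 1) =", p_within_1)
```

The same approach works for other sample sizes by changing the second argument of combinations.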

FACT: For any sample size, the mean of all possible sample means equals the population mean.

This fact can be illustrated numerically by calculating the average of our 10 sample means and noticing that it is exactly equal to 80 (the population mean).

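FACT: When the population is large relative to the sample (or when sampling is done with replacement), the standard deviation of all possible sample means is equal to sigma/sqrt(n), where sigma is the population standard deviation and n is the sample size.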
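Both facts can be checked numerically on the five basketball heights. The sketch below enumerates samples of size two with and without replacement. One point is an addition to these notes: when sampling without replacement from a small population, the standard deviation of the sample means is smaller than sigma/sqrt(n) by the standard finite population correction factor sqrt((N - n)/(N - 1)), so the formula holds exactly only in the with-replacement case here.

```python
from itertools import combinations, product
from math import isclose, sqrt
from statistics import mean, pstdev

heights = [76, 78, 79, 81, 86]
N, n = len(heights), 2

mu = mean(heights)        # population mean
sigma = pstdev(heights)   # population standard deviation (divides by N)

# Every ordered sample of size n drawn WITH replacement (25 samples).
means_with_repl = [mean(s) for s in product(heights, repeat=n)]

# Every sample of size n drawn WITHOUT replacement (the 10 samples in the table).
means_without_repl = [mean(s) for s in combinations(heights, n)]

sd_with = pstdev(means_with_repl)
sd_without = pstdev(means_without_repl)

# Finite population correction (an addition to the notes, see lead-in).
fpc = sqrt((N - n) / (N - 1))

print("mean of sample means:", mean(means_with_repl), mean(means_without_repl))
print("sigma/sqrt(n):       ", sigma / sqrt(n))
print("SD with replacement: ", sd_with)
print("SD without repl.:    ", sd_without)
print("sigma/sqrt(n) * fpc: ", sigma / sqrt(n) * fpc)
```

In both sampling schemes the mean of the sample means equals the population mean; only the spread differs.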

So, for the mean of a simple random sample (SRS) of size n drawn from a large population, we see that:

• The larger the sample size, the smaller the standard deviation of the sample mean.
• The smaller the standard deviation of the sample mean, the more closely the sample means will cluster around the population mean.
• The sample mean is an unbiased estimator of the population mean (i.e., the mean of all possible sample means is equal to the population mean).

So, we know the mean and standard deviation of the sample mean. Now, what is its shape?

FACT: Suppose that we have an SRS of size n from a population that is NORMALLY distributed with mean mu and standard deviation sigma. Then, x-bar, the sample mean, will be normally distributed with mean mu and standard deviation sigma/sqrt(n).

What if the population is NOT normal?

CENTRAL LIMIT THEOREM (CLT):

Suppose that we have an SRS of size n from ANY population with mean mu and standard deviation sigma. Then, for large samples (a common rule of thumb is n>=30), x-bar, the sample mean, will be approximately normally distributed with mean mu and standard deviation sigma/sqrt(n).
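A small simulation can make the CLT concrete. The sketch below is illustrative only: the exponential population, the seed, and the number of repetitions are all assumptions chosen for the demonstration, not part of these notes. It draws many samples of size 30 from a strongly skewed population and compares the center and spread of the sample means with mu and sigma/sqrt(n).

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)        # fixed seed so the run is repeatable (arbitrary)
n = 30                        # sample size, matching the n >= 30 guideline
reps = 5000                   # number of simulated samples (arbitrary)
pop_mean, pop_sd = 1.0, 1.0   # Exponential(rate=1) has mean 1 and sd 1

# Draw `reps` samples of size n and record each sample mean.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

center = mean(sample_means)       # should be close to pop_mean
spread = stdev(sample_means)      # should be close to pop_sd / sqrt(n)
predicted_sd = pop_sd / sqrt(n)

# For a normal distribution, about 68% of values lie within 1 sd of the mean.
within_one_sd = sum(
    abs(m - pop_mean) <= predicted_sd for m in sample_means
) / reps

print("mean of sample means:", center)
print("sd of sample means:  ", spread, "predicted:", predicted_sd)
print("fraction within 1 sd:", within_one_sd)
```

Even though the population is far from normal, the sample means should cluster around the population mean with roughly the spread and shape that the CLT predicts.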

Examples:

Scores on an ACT test have a normal distribution with a mean of 18.6 and a standard deviation of 5.9.

1. What’s the probability that a single student scores at least 21 on the test?

Use the distribution of the POPULATION to answer this question, because it deals with a SINGLE observation from the POPULATION.

P(X>=21) = (standardize and change to z-score) P[Z>=(21 – 18.6)/5.9] = P(Z>=0.41) = 1 – P(Z<=0.41) = 1 – 0.6591 = 0.3409

2. For an SRS of size 50, what is the probability that the average score is greater than or equal to 21?

Use the distribution of the SAMPLE MEAN to answer this question. Since the population is normally distributed, the sample mean will also be normally distributed. The mean of the distribution of the sample means will be 18.6 and the standard deviation of the sample mean will be 5.9/sqrt(50) = 0.834.

P(x-bar>=21) = (standardize) P[Z>=(21 – 18.6)/0.834] = P(Z>=2.88) = 1 – P(Z<=2.88) = 1 – 0.9980 = 0.0020
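The two ACT probabilities can also be computed without a z-table. The sketch below uses NormalDist from Python's standard statistics module; the variable names are illustrative. Because no rounding of z to two decimals is involved, the results may differ slightly in the third or fourth decimal place from table-based values.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 18.6, 5.9   # population mean and standard deviation of ACT scores
n = 50                  # sample size for the second question

# Single student: use the population distribution.
p_single = 1 - NormalDist(mu, sigma).cdf(21)

# Mean of an SRS of size n: same mean, standard deviation sigma/sqrt(n).
se = sigma / sqrt(n)
p_mean = 1 - NormalDist(mu, se).cdf(21)

print("P(X >= 21)     =", round(p_single, 4))
print("sigma/sqrt(n)  =", round(se, 4))
print("P(x-bar >= 21) =", round(p_mean, 4))
```

The contrast between the two results shows how much less variable a mean of 50 scores is than a single score.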

The number of flaws per square yard of carpet is not normally distributed but has a mean of 1.6 and a standard deviation of 1.2. A sample of 200 square yards of carpet was taken. What is the probability that the mean number of flaws per square yard exceeds 2?

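By the CLT (since n = 200 >= 30), the sampling distribution of the sample mean is approximately normal with mean = 1.6 and standard deviation = 1.2/sqrt(200) = 0.085.

P(x-bar>=2) = (standardize) P[Z>=(2 – 1.6)/0.085] = P(Z>=4.71) = 1 – P(Z<=4.71), which is approximately 1 – 1 = 0. The probability is not exactly zero, but it is too small to read from a standard z-table.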
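To see how small this probability is, the normal approximation can be evaluated directly. This is a sketch using the standard library's NormalDist, with illustrative variable names; the result is still only an approximation, because the CLT gives an approximately normal sampling distribution.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.6, 1.2, 200   # flaws per square yard: mean, sd, sample size

se = sigma / sqrt(n)           # standard deviation of the sample mean
z = (2 - mu) / se              # z-score for a sample mean of 2
p = 1 - NormalDist().cdf(z)    # upper-tail probability under the standard normal

print("sigma/sqrt(n) =", round(se, 4))
print("z             =", round(z, 2))
print("P(x-bar >= 2) =", p)
```

The printed probability is tiny but positive, which is why the table-based answer rounds to 0.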