# Homework 3

This homework is a modified version of Homework 1 from Lab 3.

**Questions**

After Lab 3, you should be able to plot the data (always a good idea before
doing anything else), evaluate how the variables are correlated with one
another, and run a regression to determine which variables are important.
Note that SAS produces a LOT of output. Please label it carefully or
(preferably) cut out and attach only the relevant parts (or save the output
to a file, then pull it into the program editor to delete the unnecessary
parts).

A moving company gives you the following dataset:

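| Weight (1,000 lbs.) | Distance Moved (1,000 miles) | Damage (dollars) |
|---------------------|------------------------------|------------------|
| 4                   | 1.5                          | 160              |
| 3                   | 2.2                          | 112              |
| 1.6                 | 1.0                          | 69               |
| 1.2                 | 2.0                          | 90               |
| 3.4                 | 0.8                          | 123              |
| 4.8                 | 1.6                          | 186              |
| 3.2                 | 0.9                          | 120              |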

1. Create a SAS data set called hw3.

2. Plot `damage*weight`. Use appropriate titles for your plot.
Does there appear to be a relationship?
Plot `damage*distance`. Again, does there appear to be a
relationship? Based on these results, what do you expect to
find from your correlation analysis and regression analysis?
Run PROC CORR and comment on how the correlation values relate
to what you see in the plots.

3. Regress damage on weight and distance. Assume the following
equation: y = b0 + b1x1 + b2x2 + e, where y is damage, x1 is weight,
and x2 is distance. Report the prediction equation (based on the
SAS output) and report the MSE.

4. You are planning to move from Raleigh to Kansas City
(about 1,100 miles), and the weight of the load is about
2,000 lbs. How much damage (in dollars) do you
expect to incur? Construct a 95% prediction interval
for your answer. (Hint: You can answer both of these
questions by creating a new data set using the following
data step code and then running a regression (with the CLI
option) on the new data set.)

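Data Step Code for Question 4 (assumes that your original
data set is called hw3):

    /* New observation: weight 2 (1,000 lbs.), distance 1.1 (1,000 miles), damage missing */
    DATA NEWVAL ;
    INPUT WEIGHT DISTANCE DAMAGE ;
    CARDS ;
    2 1.1 .
    ;
    RUN ;

    /* Append the new observation to the original data */
    DATA BOTH ;
    SET HW3 NEWVAL ;
    RUN ;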

Now run the regression on the data set BOTH.
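A minimal sketch of that regression step, assuming the combined data set is named BOTH and the variables are named WEIGHT, DISTANCE, and DAMAGE as in the INPUT statement of the data step code. The CLI option on the MODEL statement requests 95% prediction limits for individual observations. Because DAMAGE is missing for the new row, that row is not used to fit the model, but it still receives a predicted value and prediction limits in the output.

```sas
/* Sketch only: assumes data set BOTH and variables WEIGHT, DISTANCE, DAMAGE */
PROC REG DATA=BOTH ;
   /* CLI prints 95% prediction limits for each observation */
   MODEL DAMAGE = WEIGHT DISTANCE / CLI ;
RUN ;
QUIT ;
```

The row for the planned move is the last observation in the output statistics table, since NEWVAL was listed after HW3 in the SET statement.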

5. Does weight, distance, or both significantly affect damage?
Support your answer with the appropriate test results from your analyses.

6. What percent of the variation seen in damage is explained by the
regression on weight and distance?

7. Is there any evidence of multicollinearity in this regression?