Homework 1
Linear Regression
The data for this problem illustrate the relationship between the average monthly outdoor temperature (in degrees Fahrenheit) and the average monthly gas consumption (in therms) for a household. Use the attached SAS output to answer the following questions.
Yes, it is appropriate to fit a linear regression model to these data. The scatter plot shows a fairly strong decreasing linear trend. The correlation (r) is -0.9030, which supports what we see in the scatter plot.
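As a quick check on the reported correlation, the sketch below recomputes Pearson's r from the 16 observations in the SAS data listing. The function name `pearson_r` is illustrative only and is not part of the SAS output.

```python
import math

# Observations copied from the SAS data listing
# (avtemp in degrees Fahrenheit, avgas in therms).
avtemp = [29, 30, 31, 37, 48, 57, 68, 71, 53, 40, 39, 29, 36, 37, 46, 56]
avgas = [8.9, 11.6, 10.7, 11.6, 7.5, 3.5, 1.5, 0.8,
         1.9, 5.0, 7.3, 9.3, 9.7, 7.9, 5.8, 3.2]


def pearson_r(x, y):
    """Sample Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(avtemp, avgas)
print(f"r = {r:.5f}")  # should agree with the PROC CORR value of -0.90300
```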
Based on the values in the Parameter Estimates section of the output, the prediction equation is:
Y = 17.37 - 0.24x
where x is the average monthly temperature and Y is the predicted average monthly gas consumption.
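The estimates in the prediction equation can be reproduced by hand with the usual least-squares formulas, slope = Sxy / Sxx and intercept = ybar - slope * xbar. The sketch below applies them to the data listing; the variable names `b0` and `b1` are chosen here for illustration.

```python
# Observations copied from the SAS data listing.
avtemp = [29, 30, 31, 37, 48, 57, 68, 71, 53, 40, 39, 29, 36, 37, 46, 56]
avgas = [8.9, 11.6, 10.7, 11.6, 7.5, 3.5, 1.5, 0.8,
         1.9, 5.0, 7.3, 9.3, 9.7, 7.9, 5.8, 3.2]

n = len(avtemp)
xbar = sum(avtemp) / n
ybar = sum(avgas) / n

# Corrected sums of squares and cross-products.
sxx = sum((x - xbar) ** 2 for x in avtemp)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(avtemp, avgas))

b1 = sxy / sxx          # slope estimate
b0 = ybar - b1 * xbar   # intercept estimate

print(f"intercept = {b0:.5f}, slope = {b1:.5f}")
```

The printed values should agree with the Intercept and avtemp rows of the Parameter Estimates table (17.37277 and -0.24295).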
The value of the slope is -0.24. This tells us that for each 1-degree increase in average monthly temperature, the predicted average monthly gas consumption decreases by about 0.24 therms.
b1 +/- t(0.025, 14) * s(b1), where b1 is the estimated slope and s(b1) is its standard error
-0.24 +/- (2.145)(0.03)
-0.24 +/- 0.06435
(-0.30435, -0.17565)
We are 95% confident that the true value of the slope parameter is between -0.30435 and -0.17565.
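The interval arithmetic can be sketched as follows. The critical value 2.145 is taken from the solution rather than computed, and `slope_ci` is a hypothetical helper name. The second call uses the unrounded SAS estimates, which gives a slightly wider interval of roughly (-0.309, -0.177), so the rounding of the inputs matters in the third decimal place.

```python
T_CRIT = 2.145  # t(0.025, 14), the table value used in the worked solution


def slope_ci(estimate, std_error, t_crit=T_CRIT):
    """Return (lower, upper) for estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


# Rounded inputs, as in the worked solution.
rounded_ci = slope_ci(-0.24, 0.03)

# Unrounded inputs from the Parameter Estimates table.
unrounded_ci = slope_ci(-0.24295, 0.03089)

print("rounded inputs:  ", rounded_ci)
print("unrounded inputs:", unrounded_ci)
```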
To answer this question, we need to test the hypothesis that the slope is equal to 0.
H0: beta1 = 0
Ha: beta1 not equal to 0
The value of the test statistic for this test is t = -7.86. The p-value for the test is reported as <0.0001. Because the p-value is less than the 5% significance level (i.e., p < 0.0001 < 0.05), we reject the null hypothesis. Thus, we conclude that there is a significant linear relationship between average gas consumption and average temperature.
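The test statistic is the slope estimate divided by its standard error. The sketch below recomputes it from the printed Parameter Estimates and compares it with the two-sided critical value t(0.025, 14) = 2.145, which is assumed here from a t table. Because the printed inputs are rounded, the result can differ from the reported -7.86 in the last digit. In simple linear regression the square of this t statistic equals the ANOVA F value, which the last line checks informally.

```python
estimate = -0.24295   # avtemp slope from the Parameter Estimates table
std_error = 0.03089   # its standard error from the same table
T_CRIT = 2.145        # t(0.025, 14), assumed from a t table

t_stat = estimate / std_error

# Two-sided test at the 5% level: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t_stat) > T_CRIT

# In simple linear regression, t^2 for the slope equals the overall F statistic.
f_from_t = t_stat ** 2

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}, t^2 = {f_from_t:.2f}")
```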
Plug 45 into the prediction equation.
Y = 17.37 - 0.24*45
= 17.37 - 10.8 = 6.57 therms
No. That would be extrapolation, since the data used to construct the model only had average temperatures ranging from 29 degrees to 71 degrees. 85 degrees is outside the range of these data.
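The two prediction questions can be combined in one small sketch: a function that evaluates the fitted line but refuses to extrapolate beyond the observed temperature range of 29 to 71 degrees. The name `predict_gas` and the choice to raise an error are illustrative assumptions, not part of the original solution.

```python
TEMP_MIN, TEMP_MAX = 29, 71  # observed range of avtemp in the data listing


def predict_gas(temp, b0=17.37, b1=-0.24):
    """Predicted avgas (therms) at a given avtemp, within the observed range only."""
    if not TEMP_MIN <= temp <= TEMP_MAX:
        raise ValueError(
            f"{temp} degrees is outside the observed range "
            f"[{TEMP_MIN}, {TEMP_MAX}]; predicting there would be extrapolation"
        )
    return b0 + b1 * temp


print(round(predict_gas(45), 2))  # 6.57, as in the worked solution

try:
    predict_gas(85)
except ValueError as err:
    print("refused:", err)
```

Passing the unrounded SAS estimates, `predict_gas(45, 17.37277, -0.24295)`, gives about 6.44 therms, so the rounded coefficients shift the prediction by a little over a tenth of a therm.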
The value of R-square is 0.8154. This value tells us that 81.54% of the variability in average gas consumption can be explained by the linear relationship with average temperature.
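R-square is the model sum of squares divided by the corrected total sum of squares. The sketch below recomputes it, along with the adjusted R-square and Root MSE, from the Analysis of Variance table; the variable names are chosen here for illustration.

```python
import math

# Sums of squares from the Analysis of Variance table.
ss_model = 161.51498
ss_error = 36.56252
ss_total = 198.07750
n = 16  # number of observations

r_square = ss_model / ss_total

# Adjusted R-square compares the error mean square (n - 2 df)
# with the total mean square (n - 1 df).
adj_r_square = 1 - (ss_error / (n - 2)) / (ss_total / (n - 1))

# Root MSE is the square root of the error mean square.
root_mse = math.sqrt(ss_error / (n - 2))

print(f"R-square = {r_square:.4f}")
print(f"Adj R-Sq = {adj_r_square:.4f}")
print(f"Root MSE = {root_mse:.5f}")
```

In simple linear regression R-square also equals the squared correlation, and (-0.9030)^2 is approximately 0.8154.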
The plot of residuals vs. predicted values allows us to check for the following problems:
· Outliers
· Non-constant variance
· Non-linearity
If the residuals are randomly scattered about 0, then the regression assumptions have not been violated. In this plot, there appears to be a slight megaphone effect, which suggests that the variance may be non-constant.
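The points in the residual plot can be recomputed from the data listing. The sketch below refits the line, then lists each fitted value with its residual in order of fitted value, which is the order in which the plot is read from left to right. All variable names are illustrative.

```python
# Observations copied from the SAS data listing.
avtemp = [29, 30, 31, 37, 48, 57, 68, 71, 53, 40, 39, 29, 36, 37, 46, 56]
avgas = [8.9, 11.6, 10.7, 11.6, 7.5, 3.5, 1.5, 0.8,
         1.9, 5.0, 7.3, 9.3, 9.7, 7.9, 5.8, 3.2]

n = len(avtemp)
xbar = sum(avtemp) / n
ybar = sum(avgas) / n
sxx = sum((x - xbar) ** 2 for x in avtemp)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(avtemp, avgas))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

fitted = [b0 + b1 * x for x in avtemp]
residuals = [y - f for y, f in zip(avgas, fitted)]

# Error sum of squares; compare with the Error row of the ANOVA table.
sse = sum(e ** 2 for e in residuals)

# List (fitted, residual) pairs from smallest to largest fitted value.
for f, e in sorted(zip(fitted, residuals)):
    print(f"pred = {f:6.2f}   res = {e:6.2f}")
print(f"SSE = {sse:.4f}")
```

The largest residual, a little over 3 therms, belongs to observation 4 (avtemp 37, avgas 11.6), which is the highest point in the plot.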
The sequence plot of the residuals vs. time allows us to look for the possibility of correlated error terms. Patterns or trends in this plot indicate the presence of correlation. Although there is a lot of noise in this plot, there does appear to be a cyclical pattern in the residuals. This pattern suggests that we may need to apply some time series methods to adjust for temporal correlation before we can fit a regression model to the data.
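A numeric companion to the sequence plot, which is not part of the original solution, is the Durbin-Watson statistic: the sum of squared differences between successive residuals divided by the sum of squared residuals. Values near 2 are consistent with uncorrelated errors, and values well below 2 point toward positive serial correlation. The sketch below is only a rough check, because it treats the 16 observations as consecutive even though times 7 and 9 are missing from the data listing.

```python
# Observations copied from the SAS data listing, already in time order.
avtemp = [29, 30, 31, 37, 48, 57, 68, 71, 53, 40, 39, 29, 36, 37, 46, 56]
avgas = [8.9, 11.6, 10.7, 11.6, 7.5, 3.5, 1.5, 0.8,
         1.9, 5.0, 7.3, 9.3, 9.7, 7.9, 5.8, 3.2]

n = len(avtemp)
xbar = sum(avtemp) / n
ybar = sum(avgas) / n
sxx = sum((x - xbar) ** 2 for x in avtemp)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(avtemp, avgas))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

residuals = [y - (b0 + b1 * x) for x, y in zip(avtemp, avgas)]

# Durbin-Watson: squared successive differences over squared residuals.
# Assumption: observations are treated as equally spaced in time.
numerator = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, n))
denominator = sum(e ** 2 for e in residuals)
dw = numerator / denominator

print(f"Durbin-Watson statistic = {dw:.2f}")
```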
The SAS System
[Figure: scatter plot of avgas (vertical axis, 0 to 12) vs. avtemp (horizontal axis, 25 to 75). Legend: A = 1 obs, B = 2 obs, etc.]
Obs    time    avtemp    avgas
  1       0        29      8.9
  2       1        30     11.6
  3       2        31     10.7
  4       3        37     11.6
  5       4        48      7.5
  6       5        57      3.5
  7       6        68      1.5
  8       8        71      0.8
  9      10        53      1.9
 10      11        40      5.0
 11      12        39      7.3
 12      13        29      9.3
 13      14        36      9.7
 14      15        37      7.9
 15      16        46      5.8
 16      17        56      3.2
The CORR Procedure
2 Variables: avgas avtemp
Pearson Correlation Coefficients, N = 16
Prob > |r| under H0:

             avgas      avtemp
avgas      1.00000    -0.90300
avtemp    -0.90300     1.00000
The REG Procedure
Model: MODEL1
Dependent Variable: avgas

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1         161.51498      161.51498      61.85    <.0001
Error              14          36.56252        2.61161
Corrected Total    15         198.07750

Root MSE           1.61605    R-Square    0.8154
Dependent Mean     6.63750    Adj R-Sq    0.8022
Coeff Var         24.34723

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1              17.37277           1.42362      12.20      <.0001
avtemp        1              -0.24295           0.03089      -7.86      <.0001
[Figure: plot of res (Residual, vertical axis, -3 to 4) vs. pred (Predicted Value of avgas, horizontal axis, 0 to 12). Legend: A = 1 obs, B = 2 obs, etc.]
[Figure: plot of res (Residual, vertical axis, -3 to 4) vs. time (horizontal axis, 0 to 17). Legend: A = 1 obs, B = 2 obs, etc.]