cs-5821/hw2/answers
2017-01-30 01:45:45 -05:00

61 lines
2.5 KiB
Plaintext
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

1. Describe the null hypotheses to which the p-values given in Table 3.4
correspond. Explain what conclusions you can draw based on these
p-values. Your explanation should be phrased in terms of sales , TV ,
radio , and newspaper , rather than in terms of the coefficients of the
linear model.
3. Suppose we have a data set with five predictors, X 1 = GPA, X 2 = IQ,
X 3 = Gender (1 for Female and 0 for Male), X 4 = Interaction between
GPA and IQ, and X 5 = Interaction between GPA and Gender. The
response is starting salary after graduation (in thousands of dollars).
Suppose we use least squares to fit the model, and get β 0 = 50, β 1 =
20, β 2 = 0.07, β 3 = 35, β 4 = 0.01, β 5 = 10.
(a) Which answer is correct, and why?
i. For a fixed value of IQ and GPA, males earn more on average
than females.
ii. For a fixed value of IQ and GPA, females earn more on
average than males.
iii. For a fixed value of IQ and GPA, males earn more on average
than females provided that the GPA is high enough.
iv. For a fixed value of IQ and GPA, females earn more on
average than males provided that the GPA is high enough.
(b) Predict the salary of a female with IQ of 110 and a GPA of 4.0.
(c) True or false: Since the coefficient for the GPA/IQ interaction
term is very small, there is very little evidence of an interaction
effect. Justify your answer.
4. I collect a set of data (n = 100 observations) containing a single
predictor and a quantitative response. I then fit a linear regression
model to the data, as well as a separate cubic regression, i.e. Y =
β 0 + β 1 X + β 2 X 2 + β 3 X 3 + .
(a) Suppose that the true relationship between X and Y is linear,
i.e. Y = β 0 + β 1 X + . Consider the training residual sum of
squares (RSS) for the linear regression, and also the training
RSS for the cubic regression. Would we expect one to be lower
than the other, would we expect them to be the same, or is there
not enough information to tell? Justify your answer.
(b) Answer (a) using test rather than training RSS.
(c) Suppose that the true relationship between X and Y is not linear,
but we dont know how far it is from linear. Consider the training
RSS for the linear regression, and also the training RSS for the
cubic regression. Would we expect one to be lower than the
other, would we expect them to be the same, or is there not
enough information to tell? Justify your answer.
(d) Answer (c) using test rather than training RSS.