cs-5821/hw2/answers

1. Describe the null hypotheses to which the p-values given in Table
3.4 correspond. Explain what conclusions you can draw based on
these p-values. Your explanation should be phrased in terms of
sales , TV , radio , and newspaper , rather than in terms of the
coefficients of the linear model.
Each p-value in Table 3.4 tests the null hypothesis that the
corresponding advertising budget (TV, radio, or newspaper) has
no association with sales once the other two budgets are held
fixed. A very small p-value means the observed association would
be unlikely if that null hypothesis were true, so we reject it
and conclude that a relationship exists between that medium's
budget and sales. The p-values therefore tell us which of the
three advertising media are related to sales of this product.
TV and radio advertising both have very small p-values, so we
reject their null hypotheses: each is associated with sales when
the other budgets are held fixed. The p-value for newspaper
advertising is large, so we cannot reject its null hypothesis;
given the TV and radio budgets, there is no evidence that
newspaper spending is related to sales. We can conclude that
cutting back on newspaper advertising will likely have little
effect on sales of the product, and that increasing the TV and
radio budgets likely will have an effect. Furthermore, the
fitted slope for radio is considerably larger than the slope for
TV, so each additional dollar spent on radio is associated with
a larger increase in sales than a dollar spent on TV, and
increasing the radio budget will likely be more effective.
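The link between a coefficient, its standard error, and its
p-value can be sketched in a few lines. The estimates and
standard errors below are the rounded values as reported in
Table 3.4 of the textbook; treat them as assumptions and check
them against your copy. The p-value uses a standard normal
approximation to the t-distribution, which is reasonable for a
sample of this size but is not the exact calculation.

```python
import math

# (coefficient, standard error) per medium, rounded, as reported in
# Table 3.4. These numbers are assumptions to verify against the table.
table = {
    "TV": (0.046, 0.0014),
    "radio": (0.189, 0.0086),
    "newspaper": (-0.001, 0.0059),
}


def two_sided_p(t_stat):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


p_values = {}
for name, (coef, se) in table.items():
    t_stat = coef / se  # test statistic for H0: coefficient = 0
    p_values[name] = two_sided_p(t_stat)
    print(f"{name:10s} t = {t_stat:7.2f}  p = {p_values[name]:.4g}")
```

The t-statistics for TV and radio are far from zero, so their
p-values are tiny; the newspaper t-statistic is close to zero,
so its p-value is large.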
3. Suppose we have a data set with five predictors, X₁ = GPA, X₂ =
IQ, X₃ = Gender (1 for Female and 0 for Male), X₄ = Interaction
between GPA and IQ, and X₅ = Interaction between GPA and Gender.
The response is starting salary after graduation (in thousands of
dollars). Suppose we use least squares to fit the model, and get
β₀ = 50, β₁ = 20, β₂ = 0.07, β₃ = 35, β₄ = 0.01, β₅ = -10.
(a) Which answer is correct, and why?
i. For a fixed value of IQ and GPA, males earn more on
average than females.
ii. For a fixed value of IQ and GPA, females earn more on
average than males.
iii. For a fixed value of IQ and GPA, males earn more on
average than females provided that the GPA is high enough.
iv. For a fixed value of IQ and GPA, females earn more on
average than males provided that the GPA is high enough.
(b) Predict the salary of a female with IQ of 110 and a GPA of
4.0.
(c) True or false: Since the coefficient for the GPA/IQ
interaction term is very small, there is very little evidence of
an interaction effect. Justify your answer.
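The fitted model in question 3 can be evaluated directly, which
helps with parts (a) and (b). This is a minimal sketch, not a
definitive answer key. It assumes the coefficient on the
GPA/Gender interaction is -10, as in the textbook's version of
this exercise; the function names are made up for illustration.

```python
# Coefficients from the problem statement. b5 is taken as -10, as in the
# textbook version of this exercise; adjust it if your copy differs.
b0, b1, b2, b3, b4, b5 = 50, 20, 0.07, 35, 0.01, -10


def predicted_salary(gpa, iq, female):
    """Fitted starting salary in thousands of dollars."""
    return (b0 + b1 * gpa + b2 * iq + b3 * female
            + b4 * gpa * iq + b5 * gpa * female)


def female_minus_male(gpa, iq=110):
    """Fitted salary gap at a fixed GPA and IQ: b3 + b5 * gpa."""
    return predicted_salary(gpa, iq, 1) - predicted_salary(gpa, iq, 0)


salary_b = predicted_salary(gpa=4.0, iq=110, female=1)
crossover_gpa = -b3 / b5  # GPA at which the fitted gap changes sign

print(f"(b) predicted salary: {salary_b:.1f} thousand dollars")
print(f"gap at GPA 3.0: {female_minus_male(3.0):+.1f}")
print(f"gap at GPA 4.0: {female_minus_male(4.0):+.1f}")
print(f"gap changes sign at GPA {crossover_gpa}")
```

Under that assumption the gap is 35 - 10 * GPA, so it is
positive below a GPA of 3.5 and negative above it, and the
prediction in (b) comes out to about 137.1 thousand dollars.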
4. I collect a set of data (n = 100 observations) containing a
single predictor and a quantitative response. I then fit a linear
regression model to the data, as well as a separate cubic
regression, i.e. Y = β₀ + β₁ X + β₂ X² + β₃ X³ + ε.
(a) Suppose that the true relationship between X and Y is
linear, i.e. Y = β₀ + β₁ X + ε. Consider the training residual
sum of squares (RSS) for the linear regression, and also the
training RSS for the cubic regression. Would we expect one to be
lower than the other, would we expect them to be the same, or is
there not enough information to tell? Justify your answer.
(b) Answer (a) using test rather than training RSS.
(c) Suppose that the true relationship between X and Y is not
linear, but we don't know how far it is from linear. Consider
the training RSS for the linear regression, and also the
training RSS for the cubic regression. Would we expect one to be
lower than the other, would we expect them to be the same, or is
there not enough information to tell? Justify your answer.
(d) Answer (c) using test rather than training RSS.
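
The training and test RSS comparisons in question 4 can be
explored with a small simulation. This is a sketch under stated
assumptions, not a proof: the true coefficients (1.0 and 2.0),
the noise level, the range of X, and the helper names are all
arbitrary choices made for illustration. It covers the linear
truth of parts (a) and (b); changing the line that generates ys
gives a nonlinear truth for (c) and (d).

```python
import random


def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations."""
    k = degree + 1
    # Augmented matrix [A^T A | A^T y] for the polynomial design matrix A.
    m = [[sum(x ** (i + j) for x in xs) for j in range(k)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


def rss(xs, ys, beta):
    """Residual sum of squares of the polynomial with coefficients beta."""
    return sum((y - sum(b * x ** p for p, b in enumerate(beta))) ** 2
               for x, y in zip(xs, ys))


def simulate(seed, n=100):
    """Return {model: (training RSS, test RSS)} for one simulated data set."""
    rng = random.Random(seed)

    def draw():
        xs = [rng.uniform(-2, 2) for _ in range(n)]
        ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]  # linear truth
        return xs, ys

    train_x, train_y = draw()
    test_x, test_y = draw()
    out = {}
    for name, degree in (("linear", 1), ("cubic", 3)):
        beta = fit_poly(train_x, train_y, degree)
        out[name] = (rss(train_x, train_y, beta), rss(test_x, test_y, beta))
    return out


results = [simulate(seed) for seed in range(20)]
for seed, res in enumerate(results):
    print(seed, {k: (round(a, 2), round(b, 2)) for k, (a, b) in res.items()})
```

Because the linear model is a special case of the cubic one,
the cubic training RSS can never exceed the linear training RSS
on the same data. The test RSS carries no such guarantee, and
comparing the two test columns across seeds shows how the extra
flexibility behaves on new data.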