From 3b3f51d41b62f76656a625806c2f49b73d79b925 Mon Sep 17 00:00:00 2001
From: caes
Date: Thu, 2 Feb 2017 18:54:50 -0500
Subject: [PATCH] new answers

---
 hw2/answers | 99 +++++++++++++++++++++++++++++++++--------------------
 1 file changed, 62 insertions(+), 37 deletions(-)

diff --git a/hw2/answers b/hw2/answers
index 30f36ac..0228780 100644
--- a/hw2/answers
+++ b/hw2/answers
@@ -1,61 +1,86 @@
-1. Describe the null hypotheses to which the p-values given in Table 3.4
-correspond. Explain what conclusions you can draw based on these
-p-values. Your explanation should be phrased in terms of sales , TV ,
-radio , and newspaper , rather than in terms of the coefficients of the
-linear model.
+1. Describe the null hypotheses to which the p-values given in Table
+   3.4 correspond. Explain what conclusions you can draw based on
+   these p-values. Your explanation should be phrased in terms of
+   sales , TV , radio , and newspaper , rather than in terms of the
+   coefficients of the linear model.
+
+
+      P-values that are very small indicate that the model for that
+      predictor is likely to account for a significant amount of the
+      association between the predictor and the response. If that is
+      true, then, we reject the null hypothesis, and conclude that a
+      relationship exists between the predictor and the response. The
+      p-values computed from the response of sales to marketing budget
+      for each marketing paradigm indicate will give us insight into
+      which of these predictors have a strong relationship with sales
+      of this product.
+
+      TV marketing and radio marketing both have a strong relationship
+      to sales, according to their linear regression p-values, but
+      newspaper advertising does not appear to be effective, given
+      that the linear model does not account for much of the variation
+      in sales across that domain. We can conclude that cutting back
+      on newspaper advertising will likely have little effect on the
+      sales of the product, and that increasing TV and radio
+      advertising budgets likely will have an effect. Furthermore, we
+      can see that radio advertising spending has a stronger
+      relationship with sales, as the best-fit slope is significantly
+      more positive than the best fit for TV advertising spending, so
+      increasing the radio advertising budget will likely be more
+      effective.
 
 
 
 
-
-3. Suppose we have a data set with five predictors, X 1 = GPA, X 2 = IQ,
-X 3 = Gender (1 for Female and 0 for Male), X 4 = Interaction between
-GPA and IQ, and X 5 = Interaction between GPA and Gender. The
-response is starting salary after graduation (in thousands of dollars).
-Suppose we use least squares to fit the model, and get β₀ = 50, β₁ =
-20, β₂ = 0.07, β₃ = 35, β₄ = 0.01, β₅ = −10.
+3. Suppose we have a data set with five predictors, X₁ = GPA, X₂ =
+   IQ, X₃ = Gender (1 for Female and 0 for Male), X₄ = Interaction
+   between GPA and IQ, and X₅ = Interaction between GPA and Gender.
+   The response is starting salary after graduation (in thousands of
+   dollars). Suppose we use least squares to fit the model, and get
+   β₀ = 50, β₁ = 20, β₂ = 0.07, β₃ = 35, β₄ = 0.01, β₅ = −10.
 
       (a) Which answer is correct, and why?
 
-         i. For a fixed value of IQ and GPA, males earn more on average
-            than females.
+         i. For a fixed value of IQ and GPA, males earn more on
+            average than females.
 
          ii. For a fixed value of IQ and GPA, females earn more on
             average than males.
 
-         iii. For a fixed value of IQ and GPA, males earn more on average
-            than females provided that the GPA is high enough.
+         iii. For a fixed value of IQ and GPA, males earn more on
+            average than females provided that the GPA is high enough.
 
          iv. For a fixed value of IQ and GPA, females earn more on
             average than males provided that the GPA is high enough.
 
-      (b) Predict the salary of a female with IQ of 110 and a GPA of 4.0.
+      (b) Predict the salary of a female with IQ of 110 and a GPA of
+      4.0.
 
-      (c) True or false: Since the coefficient for the GPA/IQ interaction
-          term is very small, there is very little evidence of an interaction
-          effect. Justify your answer.
+      (c) True or false: Since the coefficient for the GPA/IQ
+      interaction term is very small, there is very little evidence of
+      an interaction effect. Justify your answer.
 
 
 
 
 
-4. I collect a set of data (n = 100 observations) containing a single
-predictor and a quantitative response. I then fit a linear regression
-model to the data, as well as a separate cubic regression, i.e. Y =
-β₀ + β₁ X + β₂ X² + β₃ X³ + .
+4. I collect a set of data (n = 100 observations) containing a
+   single predictor and a quantitative response. I then fit a linear
+   regression model to the data, as well as a separate cubic
+   regression, i.e. Y = β₀ + β₁ X + β₂ X² + β₃ X³ + .
 
-      (a) Suppose that the true relationship between X and Y is linear,
-          i.e. Y = β₀ + β₁ X + . Consider the training residual sum of
-          squares (RSS) for the linear regression, and also the training
-          RSS for the cubic regression. Would we expect one to be lower
-          than the other, would we expect them to be the same, or is there
-          not enough information to tell? Justify your answer.
+      (a) Suppose that the true relationship between X and Y is
+      linear, i.e. Y = β₀ + β₁ X + . Consider the training residual
+      sum of squares (RSS) for the linear regression, and also the
+      training RSS for the cubic regression. Would we expect one to be
+      lower than the other, would we expect them to be the same, or is
+      there not enough information to tell? Justify your answer.
 
       (b) Answer (a) using test rather than training RSS.
 
-      (c) Suppose that the true relationship between X and Y is not linear,
-          but we don’t know how far it is from linear. Consider the training
-          RSS for the linear regression, and also the training RSS for the
-          cubic regression. Would we expect one to be lower than the
-          other, would we expect them to be the same, or is there not
-          enough information to tell? Justify your answer.
-      (d) Answer (c) using test rather than training RSS.
\ No newline at end of file
+      (c) Suppose that the true relationship between X and Y is not
+      linear, but we don’t know how far it is from linear. Consider
+      the training RSS for the linear regression, and also the
+      training RSS for the cubic regression. Would we expect one to be
+      lower than the other, would we expect them to be the same, or is
+      there not enough information to tell? Justify your answer. (d)
+      Answer (c) using test rather than training RSS.
\ No newline at end of file