new answers

2017-02-02 18:54:50 -05:00
commit 3b3f51d41b
parent 19c2c8704e
1 changed files with 62 additions and 37 deletions
--- a/hw2/answers
+++ b/hw2/answers
@@ -1,61 +1,86 @@
-1. Describe the null hypotheses to which the p-values given in Table 3.4
+1. Describe the null hypotheses to which the p-values given in Table
-correspond. Explain what conclusions you can draw based on these
+   3.4 correspond. Explain what conclusions you can draw based on
-p-values. Your explanation should be phrased in terms of sales, TV,
+   these p-values. Your explanation should be phrased in terms of
-radio, and newspaper, rather than in terms of the coefficients of the
+   sales, TV, radio, and newspaper, rather than in terms of the
-linear model.
+   coefficients of the linear model.
 	A very small p-value indicates that the observed association
 	between that predictor and the response would be unlikely if the
 	null hypothesis of no relationship were true. In that case we
 	reject the null hypothesis and conclude that a relationship
 	exists between the predictor and the response. The p-values
 	computed for the TV, radio, and newspaper advertising budgets
 	therefore give us insight into which of these media have a
 	relationship with sales of this product.
 	TV and radio advertising both have a strong relationship with
 	sales, according to their regression p-values, but newspaper
 	advertising does not appear to be effective: its p-value is
 	large, so we cannot reject the null hypothesis that it has no
 	relationship with sales. We can conclude that cutting back on
 	newspaper advertising will likely have little effect on the
 	sales of the product, and that increasing the TV and radio
 	advertising budgets likely will have an effect. Furthermore,
 	the fitted slope for radio is considerably larger than the
 	fitted slope for TV, so increasing the radio advertising budget
 	will likely be more effective per dollar spent.
-
+3. Suppose we have a data set with five predictors, X₁ = GPA, X₂ =
-3. Suppose we have a data set with five predictors, X 1 = GPA, X 2 = IQ,
+   IQ, X₃ = Gender (1 for Female and 0 for Male), X₄ = Interaction
-X 3 = Gender (1 for Female and 0 for Male), X 4 = Interaction between
+   between GPA and IQ, and X₅ = Interaction between GPA and Gender.
-GPA and IQ, and X 5 = Interaction between GPA and Gender. The
+   The response is starting salary after graduation (in thousands of
-response is starting salary after graduation (in thousands of dollars).
+   dollars). Suppose we use least squares to fit the model, and get
-Suppose we use least squares to fit the model, and get β₀ = 50, β₁ =
+   β₀ = 50, β₁ = 20, β₂ = 0.07, β₃ = 35, β₄ = 0.01, β₅ = −10.
 20, β₂ = 0.07, β₃ = 35, β₄ = 0.01, β₅ = −10.
 	(a) Which answer is correct, and why?
-		i. For a fixed value of IQ and GPA, males earn more on average
+		i. For a fixed value of IQ and GPA, males earn more on
-		than females.
+		   average than females.
 		ii. For a fixed value of IQ and GPA, females earn more on
 		average than males.
-		iii. For a fixed value of IQ and GPA, males earn more on average
+		iii. For a fixed value of IQ and GPA, males earn more on
-		than females provided that the GPA is high enough.
+		average than females provided that the GPA is high enough.
 		iv. For a fixed value of IQ and GPA, females earn more on
 		average than males provided that the GPA is high enough.
-	(b) Predict the salary of a female with IQ of 110 and a GPA of 4.0.
+	(b) Predict the salary of a female with IQ of 110 and a GPA of
 	4.0.
-	(c) True or false: Since the coefficient for the GPA/IQ interaction
+	(c) True or false: Since the coefficient for the GPA/IQ
-	term is very small, there is very little evidence of an interaction
+	interaction term is very small, there is very little evidence of
-	effect. Justify your answer.
+	an interaction effect. Justify your answer.
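The fitted model in question 3 can be evaluated directly, which may help when checking parts (a) and (b). The sketch below only plugs in the coefficients quoted in the question; the function and constant names are illustrative and do not come from the homework file. Algebraically, the female-minus-male gap at fixed GPA and IQ is 35 - 10 * GPA, so it changes sign at a GPA of 3.5.

```python
# Fitted coefficients quoted in question 3 (salary in thousands of dollars).
# All names here are hypothetical, chosen for illustration only.
B0, B_GPA, B_IQ, B_FEMALE, B_GPA_IQ, B_GPA_FEMALE = 50, 20, 0.07, 35, 0.01, -10


def predicted_salary(gpa, iq, female):
    """Evaluate the fitted model; `female` is 1 for female, 0 for male."""
    return (B0 + B_GPA * gpa + B_IQ * iq + B_FEMALE * female
            + B_GPA_IQ * gpa * iq + B_GPA_FEMALE * gpa * female)


def female_minus_male(gpa, iq):
    """Gap at fixed GPA and IQ; algebraically this is 35 - 10 * gpa."""
    return predicted_salary(gpa, iq, 1) - predicted_salary(gpa, iq, 0)


# Part (b): female, IQ 110, GPA 4.0 (roughly 137.1, i.e. about $137,100).
part_b = predicted_salary(4.0, 110, 1)
print(round(part_b, 1))

# Part (a): the sign of the gap depends on GPA, not on IQ.
for gpa in (3.0, 3.5, 4.0):
    print(gpa, round(female_minus_male(gpa, 110), 6))
```

For part (c), note that the size of a coefficient depends on the units of its predictors, so a small value for the GPA/IQ term does not by itself say anything about the evidence for an interaction; that would need the term's standard error or p-value, which the question does not give.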
-4. I collect a set of data (n = 100 observations) containing a single
+4. I collect a set of data (n = 100 observations) containing a
-predictor and a quantitative response. I then fit a linear regression
+   single predictor and a quantitative response. I then fit a linear
-model to the data, as well as a separate cubic regression, i.e. Y =
+   regression model to the data, as well as a separate cubic
-β₀ + β₁ X + β₂ X² + β₃ X³ + ε.
+   regression, i.e. Y = β₀ + β₁ X + β₂ X² + β₃ X³ + ε.
-	(a) Suppose that the true relationship between X and Y is linear,
+	(a) Suppose that the true relationship between X and Y is
-	i.e. Y = β₀ + β₁ X + ε. Consider the training residual sum of
+	linear, i.e. Y = β₀ + β₁ X + ε. Consider the training residual
-	squares (RSS) for the linear regression, and also the training
+	sum of squares (RSS) for the linear regression, and also the
-	RSS for the cubic regression. Would we expect one to be lower
+	training RSS for the cubic regression. Would we expect one to be
-	than the other, would we expect them to be the same, or is there
+	lower than the other, would we expect them to be the same, or is
-	not enough information to tell? Justify your answer.
+	there not enough information to tell? Justify your answer.
 	(b) Answer (a) using test rather than training RSS.
-	(c) Suppose that the true relationship between X and Y is not linear,
+	(c) Suppose that the true relationship between X and Y is not
-	but we don’t know how far it is from linear. Consider the training
+	linear, but we don’t know how far it is from linear. Consider
-	RSS for the linear regression, and also the training RSS for the
+	the training RSS for the linear regression, and also the
-	cubic regression. Would we expect one to be lower than the
+	training RSS for the cubic regression. Would we expect one to be
-	other, would we expect them to be the same, or is there not
+	lower than the other, would we expect them to be the same, or is
-	enough information to tell? Justify your answer.
+	there not enough information to tell? Justify your answer. (d)
-	(d) Answer (c) using test rather than training RSS.
+	Answer (c) using test rather than training RSS.
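A small simulation can make the training-versus-test comparison in question 4 concrete. The sketch below is an illustration under stated assumptions, not part of the homework file: the true relationship is assumed to be Y = 1 + 2X + ε with standard normal noise, the seeds and ranges are arbitrary, and the least-squares solver is a hand-written normal-equations routine so that the example needs only the standard library. Because the linear model is nested inside the cubic one, the cubic training RSS cannot exceed the linear training RSS; the test RSS has no such guarantee and depends on the particular sample.

```python
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (X^T X) b = X^T y."""
    k = degree + 1
    # Augmented matrix [X^T X | X^T y] for powers 0..degree.
    aug = [[sum(x ** (i + j) for x in xs) for j in range(k)]
           + [sum(y * x ** i for x, y in zip(xs, ys))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def rss(coeffs, xs, ys):
    """Residual sum of squares of a polynomial (lowest power first)."""
    return sum((y - sum(c * x ** p for p, c in enumerate(coeffs))) ** 2
               for x, y in zip(xs, ys))


def simulate(n, seed):
    """Draw n points from the assumed linear truth Y = 1 + 2X + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-2, 2) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


train_x, train_y = simulate(100, seed=0)
test_x, test_y = simulate(100, seed=1)

linear = fit_polynomial(train_x, train_y, 1)
cubic = fit_polynomial(train_x, train_y, 3)

train_rss_linear = rss(linear, train_x, train_y)
train_rss_cubic = rss(cubic, train_x, train_y)
test_rss_linear = rss(linear, test_x, test_y)
test_rss_cubic = rss(cubic, test_x, test_y)

print("training RSS:", train_rss_linear, train_rss_cubic)
print("test RSS:    ", test_rss_linear, test_rss_cubic)
```

Rerunning with different seeds, or swapping the truth inside `simulate` for a nonlinear function, gives a feel for parts (c) and (d) as well.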