STA 4234/STA 5236 Regression Analysis
Homework 3

Instructions:
a. You must show all necessary work to get full credit.
b. Present the problems in order.
c. Attach the relevant intermediate calculations, output, figures, and/or tables to each corresponding question.
d. You may use any software for all parts of this homework, unless stated otherwise. Attach program code, if any, to the end of the homework (code only; output is attached to each question).

Problem #1. Show that the hat matrix $H = X(X'X)^{-1}X'$ is symmetric (i.e., $H' = H$) and idempotent (i.e., $H^2 = H$). If you cannot show this mathematically, verify it numerically using the first three observations of any data set used in this homework. Note that X should include a column of 1s. (Stat grads are required to show it both ways.)

Problem #2. Problem 4.2, page 165. Data in Table B.1, page 554. Consider the multiple linear regression model relating the number of games won to the team's passing yardage (x2), the percentage of rushing plays (x7), and the opponents' yards rushing (x8).

Problem #3. Consider the clathrate formation data in Table B.8, page 560. Fit two different models: (i) based on x1 and x2; (ii) based on x2 only. For each model:
(a) Construct a normal probability plot of the residuals and a plot of residuals versus fitted values. Comment on any issues regarding model adequacy.
(b) Perform the appropriate lack-of-fit test and state your conclusion. [For multiple regression, the ANOVA-type model can be fit in R using the formula y~factor(x1+x2).] If a lack-of-fit test cannot be performed, explain why.
(c) Compute the PRESS statistic and the $R^2$ for prediction. Based on the results, which model is more likely to provide better predictions of future data?

Problem #4. Problem 5.2, page 203.

Problem #5. Problem 5.12, page 205. Use the average response ($\bar{y}$), so that you are fitting a single model.

Problem #6. (Stat grads only) Problem 5.8, page 205.
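
Note on Problem #3(c): as a reminder, the PRESS statistic and the R2 for prediction are usually computed from the ordinary residuals of the full fit, so the model does not have to be refit n times. The sketch below states the standard definitions. The symbols are assumptions introduced here and do not appear in the original handout: e_i is the i-th ordinary residual, h_ii is the i-th diagonal element of the hat matrix from Problem #1, and SS_T is the total sum of squares.

```latex
% Standard definitions; e_i, h_{ii}, and SS_T are notation assumed for this note.
\mathrm{PRESS} = \sum_{i=1}^{n} \left( \frac{e_i}{1 - h_{ii}} \right)^{2},
\qquad
R^2_{\text{prediction}} = 1 - \frac{\mathrm{PRESS}}{SS_T},
\qquad
SS_T = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^{2}
```

Under these definitions, a smaller PRESS (equivalently, a larger R2 for prediction) generally points to the model that is expected to predict new observations better.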