Due: Tuesday, April 10

In this homework, you'll work with techniques for multiple regression and model selection covered in class. Please start by downloading the file lastname_firstname_hw4.R and renaming it so that it has your last and first names as appropriate (e.g., ''skinner_burrhus_hw4.R''). You will edit this file, adding the R code you use to complete the assignment and writing comments as requested in the problems. When you are done, you will submit this file on Blackboard. A perfect solution to this homework will be worth 100 points.

Part 1 (50 pts). Regression and alien parties

On the planet zofu, there live particularly gregarious creatures known as crumples, who are famous for the size and extravagance of their parties. You are a scientist interested in the social life of crumples, and you have collected data on how 100 of the creatures spent their last birthdays. You have collected information about the number of guests that came to their party, and you want to understand the variance in what you see. You also have information on the amount of money that they spent per head at their party in the previous year, the mean charisma rating for each creature (rated by other crumples), and the score that they recently received on a test of their understanding of the flora and fauna on the planet zofu. The data is available in the file alienparty.txt. You'll use this data to answer the questions in this part of the homework.

Read in ''alienparty.txt'' as a dataframe called ''alienparty''.

(a) [5 pts] Build a regression model which predicts the number of guests based on how much they spent per head at their last party.

(b) [5 pts] Build a regression model which predicts the number of guests based on how much they spent per head at their last party and their charisma. Does this significantly improve the fit of the model over the model you built for 1a?

(c) [5 pts] Build a regression model which predicts the number of guests based on how much they spent per head at their last party and their charisma AND an interaction between expenditure and charisma. Does this significantly improve the fit of the model over those you built in 1a and 1b?

(d) [5 pts] Select the best fitting model you've built so far and add their recent exam scores as an additional predictor. Does this give a significant improvement in fit over the models you've built so far?

(e) [10 pts] Select the best fitting model you've built so far and explain what each of the values found in the first numeric column of the table of coefficients returned by the summary command corresponds to in the alien world.

(f) [5 pts] Explain why the output includes a t-value. What do these values tell us?

(g) [15 pts] What proportion of the variance in the number of guests seen can be uniquely accounted for by a creature's charisma (i.e., it cannot also be attributed to the other variables)? You can ignore interactions here.

Part 2 (50 pts). Regression and alien grooming habits

On another planet, creebo, there lives a rather more austere-living creature, the zeepap. Their one slightly outlandish feature is a taste for unusual facial hair. You have data on 100 zeepaps, 67 of whom have beards of some kind. You suspect that there is a relationship between their facial hair and a love of jazz, but you are not sure. You have therefore collected data on whether these creatures like jazz. You also know the age of each of the creatures. The data is available in the file aliengrooming.txt. You'll use this data to answer the questions in this part of the homework.

Read in ''aliengrooming.txt'' as a dataframe called ''aliengrooming''.

(a) [25 points] Decide what kind of model is appropriate for predicting whether or not a creature will have a beard. Use the remaining variables to build regression models that predict this.
Build all possible models (of the appropriate type) and pick the model that gives the best fit to the data. Give the commands that you used to decide that this model, and not any of the others, gave the best fit. Remember that we use different statistics for the comparison of nested and non-nested models.

(b) [15 points] What are the values found in the first column of the table of coefficients? What does each of these values tell us about life on the planet creebo?

(c) [10 points] Given the data and your model, what do you estimate to be the probability that each of the following creatures will have a beard? [You can convert a log odds value Z into a probability using the formula exp(Z)/(1+exp(Z)).]

(i) a 15-year-old zeepap who doesn't like jazz.

(ii) a 40-year-old jazz-loving zeepap.
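
The log odds conversion in 2(c) divides exp(Z) by the whole quantity 1 + exp(Z), so the denominator needs its own parentheses when typed into R. The following is a minimal sketch of that conversion only. The helper name log_odds_to_prob and the Z values are made up for illustration and are not taken from the homework data; plogis is base R's logistic distribution function, which applies the same mapping.

```r
# Convert a log odds value Z into a probability: exp(Z) / (1 + exp(Z)).
# Without parentheses around the denominator, exp(Z)/1+exp(Z) would be
# evaluated as exp(Z) + exp(Z), which is not a probability.
# log_odds_to_prob is a hypothetical helper name used only in this sketch.
log_odds_to_prob <- function(z) {
  exp(z) / (1 + exp(z))
}

# Hypothetical log odds values, chosen only for illustration
z <- c(-2, 0, 1.5)

p_manual  <- log_odds_to_prob(z)
p_builtin <- plogis(z)  # base R equivalent of the same formula

print(p_manual)
print(all.equal(p_manual, p_builtin))
```

A log odds value of 0 should map to a probability of 0.5, which is a quick sanity check that the formula has been typed correctly.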
