Here we are looking at the Hosmer and Lemeshow low birthweight data again. Fit a logistic regression model in which the response variable is the low birthweight indicator (low) and the covariates are: the age variable (age), the last weight variable (lwt), a dummy variable for African American individuals, a dummy variable for individuals who are neither white nor African American, the smoking indicator (smoke), and the indicator of uterine irritability (ui). For the purpose of this exercise, allow age and lwt to enter the linear predictor linearly, i.e., do not transform age and lwt.
a) Conduct a likelihood ratio test in which the null hypothesis is that the coefficients on smoke and ui are jointly equal to 0. Provide details of how you conducted the test. Report the results and explain what they mean.
b) Conduct a Wald test in which the null hypothesis is that the coefficient on the African American dummy is equal to the coefficient on the dummy for individuals who are neither African American nor white. Note that the null here is not necessarily that the coefficients are both equal to 0, only that they are equal to each other. Provide details of how you conducted the test. Report the results and explain what they mean.
c) Explain how you would construct a Wald test of the null hypothesis that three coefficients are simultaneously equal to each other, but not necessarily equal to 0.
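As a reminder of the mechanics behind parts a) through c), the sketch below computes a likelihood ratio statistic from two maximized log-likelihoods and a Wald statistic for linear restrictions of the form R beta = 0. It is only an illustration under stated assumptions: the function names (chi2_sf, lr_test, wald_test) are hypothetical, the chi-square tail probability is coded only for 1 or 2 degrees of freedom, and every number in the usage lines is made up for illustration rather than estimated from the birthweight data. In practice the log-likelihoods, coefficient estimates, and covariance matrix would come from your fitted models.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def lr_test(loglik_full, loglik_restricted, df):
    """Likelihood ratio test; df = number of restrictions under the null."""
    stat = 2.0 * (loglik_full - loglik_restricted)
    return stat, chi2_sf(stat, df)


def wald_test(R, beta, V):
    """Wald test of H0: R beta = 0, where R has one or two rows.

    R    : list of restriction rows (each of length k)
    beta : list of k coefficient estimates
    V    : k x k estimated covariance matrix of beta (list of lists)
    """
    q, k = len(R), len(beta)
    # R beta: estimated value of each restriction
    Rb = [sum(R[i][j] * beta[j] for j in range(k)) for i in range(q)]
    # M = R V R': covariance matrix of the restrictions
    RV = [[sum(R[i][m] * V[m][j] for m in range(k)) for j in range(k)]
          for i in range(q)]
    M = [[sum(RV[i][m] * R[j][m] for m in range(k)) for j in range(q)]
         for i in range(q)]
    if q == 1:
        stat = Rb[0] ** 2 / M[0][0]
    elif q == 2:
        det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
        inv = [[M[1][1] / det, -M[0][1] / det],
               [-M[1][0] / det, M[0][0] / det]]
        stat = sum(Rb[i] * inv[i][j] * Rb[j]
                   for i in range(2) for j in range(2))
    else:
        raise ValueError("this sketch only handles one or two restrictions")
    return stat, chi2_sf(stat, q)


# Made-up inputs for illustration only (NOT results from the birthweight data).
# Two coefficients dropped under the null, so df = 2.
lr_stat, lr_p = lr_test(loglik_full=-100.0, loglik_restricted=-103.0, df=2)

# H0: beta_1 = beta_2, written as the single restriction beta_1 - beta_2 = 0.
w1_stat, w1_p = wald_test([[1.0, -1.0]], [1.0, 0.0],
                          [[0.5, 0.0], [0.0, 0.5]])

# H0: beta_1 = beta_2 = beta_3, written as two restrictions:
# beta_1 - beta_2 = 0 and beta_2 - beta_3 = 0.
R3 = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]
V3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w3_stat, w3_p = wald_test(R3, [1.0, 0.0, 0.0], V3)
```

Note that equality of three coefficients imposes only two independent restrictions, so the reference distribution for that Wald statistic is chi-square with 2 degrees of freedom. When applying this to a full model, R needs one column per estimated coefficient (including the intercept), with zeros in the columns of coefficients not involved in the hypothesis.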
Here we are examining survey data that look at the self-reported expected voting behavior, issue positions, and demographic characteristics of a random sample of American voters. These data are from the 1996 American National Election Study and are provided at the bottom of this page.
Your task is to determine an appropriate model or models for a citizen's voting decision. You have a great deal of freedom here. The only constraint is that the vote variable must be the response variable. Beyond that, what goes on the right-hand side is completely up to you. Bear in mind that your models should be interpretable and should have good explanatory power.
Justify and explain your decisions regarding any variable transformations and which variables you ended up including in your model(s).
Write up the results, paying special attention to substantive interpretation of the results. Include graphs of key quantities such as predicted probabilities as appropriate. What are these results telling us about voting behavior in the US? Are the results in line with your prior opinions about American voting behavior? Any surprises? What are the effects of key variables on the probability of voting for Dole vs. Clinton? Are there any variables that have an especially strong effect?