POLS 6482 ADVANCED MULTIVARIATE STATISTICS
Eleventh Assignment
Due 19 November 2001


  1. In this problem we are going to apply probit, logit, and the linear probability model to some data from the 1968, 1996, and 2000 NES Presidential election surveys. These estimation methods are designed for binary dependent variables. The 1968 and 1996 data are similar to the data we used in part (1) of the 2nd assignment. The variables are:
    
    
          Party Identification:  0=strong democrat
                                 1=weak democrat
                                 2=lean democrat
                                 3=independent
                                 4=lean republican
                                 5=weak republican
                                 6=strong republican
                                  
          Family Income:  Raw Data (we will not use this variable)
                                  
          Family Income Quintile:  1 is the lowest quintile, 5 is the highest
     
          Race:  0 = White
                 1 = Black
    
          Sex:   0 = Male
                 1 = Female
          
          South: 0 = North
                 1 = South
    
          Education:  1 = High School or less
                      2 = Some College
                      3 = College degree
    
          Age:   In Years
    
          Presidential Vote:  0 = Did Not Vote
                              1 = Voted for Democratic Candidate For President
                              2 = Voted for Republican Candidate For President
                              3 = Voted for 3rd Party Candidate for President
    
    The data are in three text files:

    1968 Data
    1996 Data
    2000 Data

    1. Download these three files and load them into EVIEWS and Stata. Turn in the output of the Stata d (describe) and summ (summarize) commands for all three datasets.

    2. In Stata, a binary dependent variable is always defined as 0 being the "negative" outcome with all other nonmissing values being the "positive" outcome. Use Presidential Vote as the dependent variable, so that the model separates non-voters (0) from voters (1, 2, or 3), with the remaining variables as independent variables; that is, run the following model on all three election years:

      probit voted party income race sex south education age

      You can interpret the sign and statistical significance of the probit coefficients roughly the same way that you interpret the regular multiple regression coefficients. A positive bj means that an increase in the independent variable increases the probability of a "positive" outcome (and decreases the probability of a "negative" outcome). Compare the results for all three elections. What is your interpretation of the coefficients (what do they tell you about American politics)? Be specific.
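
The sign rule above can be illustrated with a minimal sketch of how a probit model turns a linear index into a probability. The coefficient values and the two-variable setup (education, age) are hypothetical and are not estimates from the NES files; `normal_cdf` and `probit_probability` are illustrative helper names.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(intercept, coefs, x):
    """P(y = 1 | x) = Phi(b0 + sum_j b_j * x_j) for a probit model."""
    index = intercept + sum(b * v for b, v in zip(coefs, x))
    return normal_cdf(index)

# Hypothetical coefficients for (education, age); NOT estimates from the data.
b0, b = -1.0, [0.30, 0.02]

p_low = probit_probability(b0, b, [1, 40])   # high school or less, age 40
p_high = probit_probability(b0, b, [3, 40])  # college degree, age 40
print(round(p_low, 3), round(p_high, 3))
```

Because the education coefficient is positive, the probability of the "positive" outcome is higher for the second respondent. The size of the change depends on where the index starts, which is why the coefficients read only roughly like regression coefficients.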

    3. In EVIEWS, a binary dependent variable is always defined as 0 being the "negative" outcome and 1 being the "positive" outcome. Create a dependent variable from Presidential Vote where 0 = Voted for the Democratic Party Candidate and 1 = Voted for the Republican Party Candidate (note that non-voters and 3rd party voters become missing data!). Run the following logit model on all three election years:

      logit y c party income race sex south education age

      You can interpret the sign and statistical significance of the logit coefficients roughly the same way that you interpret the regular multiple regression coefficients. A positive bj means that an increase in the independent variable increases the probability of a "positive" outcome (and decreases the probability of a "negative" outcome). Compare the results for all three elections. What is your interpretation of the coefficients (what do they tell you about American politics)? Be specific.
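
The recoding of Presidential Vote described in this part can be sketched as follows. This shows the mapping only, not EVIEWS syntax; `recode_vote` is a hypothetical helper and the sample responses are made up.

```python
def recode_vote(vote):
    """Map the four-category vote variable to a binary Republican-vote indicator.

    1 (Democratic) -> 0, 2 (Republican) -> 1; anything else (did not vote,
    3rd party, or already missing) -> None, i.e. missing.
    """
    if vote == 1:
        return 0
    if vote == 2:
        return 1
    return None

votes = [0, 1, 2, 3, 2, 1, None]      # made-up responses, not NES data
y = [recode_vote(v) for v in votes]   # recoded dependent variable
usable = [v for v in y if v is not None]
print(y, len(usable))
```

Only the major-party voters remain in the estimation sample, so the number of usable observations will be smaller than in the turnout model.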

    4. The linear probability model is simply regular (OLS) regression applied to a binary dependent variable, with White's heteroskedasticity-consistent standard errors. Replicate the estimations of part 3 using linear probability in EVIEWS. To compare the logit and linear probability coefficients, normalize the bj's (except for b0) so that their sum of squares is equal to 1. That is, square the k bj's, add them up, take the square root of this sum, and divide each bj by this number. Make a table showing these normalized bj's (except for the intercept term) and their p-values for the two models.
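
The normalization described above can be sketched in a few lines. The slope values are hypothetical (the second set is deliberately a rescaled copy of the first) and `normalize` is an illustrative helper name.

```python
import math

def normalize(slopes):
    """Divide each slope by the square root of the sum of squared slopes."""
    length = math.sqrt(sum(b * b for b in slopes))
    return [b / length for b in slopes]

# Hypothetical slope coefficients, intercept excluded; NOT estimates from the data.
logit_slopes = [0.9, -1.2, 0.4]
lpm_slopes = [0.18, -0.24, 0.08]   # same pattern on a smaller scale

logit_norm = normalize(logit_slopes)
lpm_norm = normalize(lpm_slopes)
print([round(b, 3) for b in logit_norm])
print([round(b, 3) for b in lpm_norm])
```

After normalization each set of slopes has a sum of squares equal to 1, so two models whose coefficients differ only in scale produce identical normalized values. This is what makes the logit and linear probability columns directly comparable.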

  2. In this problem we are going to apply ordered probit to the three Presidential election datasets. An ordered probit estimation is designed for a dependent variable with multiple categories where it is reasonable to assume that the categories can be rank ordered. For example, the party variable ranges from 0 to 6 where 0 = strong democrat, 1 = weak democrat, 2 = lean democrat, 3 = independent, 4 = lean republican, 5 = weak republican, 6 = strong republican.

    1. Run the standard regression of Party on income, race, sex, south, education, and age in both Stata and EVIEWS.

    2. In EVIEWS, to run an ordered probit issue the command:

      ordered party income race sex south education age

      In EVIEWS, to run an ordered logit issue the command:

      ordered(d=L) party income race sex south education age

      In Stata to run an ordered probit issue the command:

      oprobit party income race sex south education age

      In Stata to run an ordered logit issue the command:

      ologit party income race sex south education age

      Note the absence of C in the EVIEWS ordered probit and logit commands. As I will explain in class, the intercept term is picked up by the estimation of the cutting points on the latent dimension of the dependent variable. In EVIEWS if you issue the command:

      ordered party c income race sex south education age

      you will get exactly the same answer.

      Make a table for each election showing the normalized coefficients (normalized as in part 4 of problem 1) and their p-values for all three models -- regular regression, ordered probit, and ordered logit.
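
The remark about the intercept and the cutting points can be illustrated with a minimal sketch of ordered probit category probabilities. The cut points and index value are hypothetical, and `category_probabilities` is an illustrative helper name rather than anything EVIEWS or Stata provides.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probabilities(index, cutpoints):
    """P(y = j) = Phi(c_j - index) - Phi(c_{j-1} - index),
    with the outermost cut points taken as -infinity and +infinity."""
    cdf = [0.0] + [normal_cdf(c - index) for c in cutpoints] + [1.0]
    return [cdf[j + 1] - cdf[j] for j in range(len(cdf) - 1)]

# Hypothetical values: 6 cut points give the 7 party identification categories.
cutpoints = [-1.0, -0.3, 0.2, 0.6, 1.1, 1.8]
index = 0.4   # x'b with no intercept

probs = category_probabilities(index, cutpoints)

# Add an intercept a to the index and shift every cut point by the same a.
a = 2.5
shifted = category_probabilities(index + a, [c + a for c in cutpoints])
print([round(p, 3) for p in probs])
```

The two sets of probabilities are the same, so an intercept cannot be distinguished from a common shift in the cut points. That is why no separate constant is estimated in the ordered models.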

    3. EVIEWS has two nice tables that you can produce from the ordered probit results. Under the View button in the ordered probit results window you will find two options -- Dependent Variable Frequencies and Expectation-Prediction Table. The former is simply a table of the frequencies and is self-explanatory. The latter contains the predicted categories (3rd column in the table) for the dependent variable. Interpret the results shown in the Expectation-Prediction Tables corresponding to the three ordered probit estimations.