POLS 6482 ADVANCED MULTIVARIATE STATISTICS
Final Examination
Due 10 December 2001


  1. This problem deals with the Lublin/Jacobson congressional elections dataset from Homework 10. Per our discussion in class, I have augmented the descriptions of incumbency and challenger quality variables -- incumbst, challeng, and challenh -- in the listing below:
    
    Contains data from D:\statadat\lublin5.dta
      obs:         7,832                          
     vars:            39                          1 Nov 2001 11:35
     size:     1,057,320 (98.7% of memory free)
    -------------------------------------------------------------------------------
                  storage  display     value
    variable name   type   format      label      variable label
    -------------------------------------------------------------------------------
    year            int    %8.0g                  year
    congress        byte   %8.0g                  congress (87-104)
    icpsrid         long   %12.0g                 icpsr id #
    icpsrst         byte   %8.0g                  icpsr state code
    cdist1          byte   %8.0g                  cong. district (p&r)
    statenm         str7   %9s                    state name
    cdist2          byte   %8.0g                  cong. district (lublin)
    dempct          float  %9.0g                  demo. % two party vote
    blkpct          float  %9.0g                  black percent of pop.
    whpct           float  %9.0g                  white percent of pop.
    forpct          double %10.0g                 foreign born % of pop.
    south           byte   %8.0g                  south (1=confederacy + KY +OK,
                                                    0=north)
    incomewh        float  %9.0g                  white median family income
    incomebl        long   %12.0g                 black median family income
    hs25            float  %9.0g                  percent 25 and older completing
                                                    high school or more
    college         float  %9.0g                  percent 25 or older completed 4
                                                    yrs college or more
    party1          int    %8.0g                  party code (100=Dem, 200=Rep)
    blackrep        byte   %8.0g                  blackrep =1 if black
                                                    representative, 0 otherwise
    latinorp        byte   %8.0g                  latinorp=1 if mexican, 2=PR,
                                                    3=Cuban, 0 otherwise
    womanrep        byte   %8.0g                  woman representative (1=woman,
                                                    0=man)
    incumb1         byte   %8.0g                  incumbency (0=repub, 1=demo.,
                                                    2=open)
    votesd          long   %12.0g                 number of votes for democrat
    votesr          long   %12.0g                 number votes for republican
    demvshr         float  %9.0g                  democrats share two-party vote
    whowon          byte   %8.0g                  0 = repub won, 1= demo. won,
                                                    99=3rd party won
    incshr          float  %9.0g                  incumbents share 2-party vote,
                                                    99.9=unopposed
    incshrl         float  %9.0g                  incumbents share 2-party vote
                                                    last elect, 99.9=unpposed
    redist          byte   %8.0g                  redistricted: 0=district
                                                    unchange, 1=re-districting
    incumbst        byte   %8.0g                  incumbency status:  
                                                    0 = republican incumbent
                                                    1 = democratic incumbent
                                                    2 = open seat formerly held by democrat
                                                    3 = open seat formerly held by republican
                                                    4 = open seat, new (from redistricting)
                                                    5 = two incumbents (from redistricting)
                                                    9 = third-party incumbent
    challeng        byte   %8.0g                  challenger quality
                                                    0 = challenger has not held elective office  
                                                    1 = challenger has held elective office  
                                                    2 = only Democratic candidate for open seat has held office  
                                                    3 = only Republican candidate for open seat has held office  
                                                    4 = both candidates for open seat have held office  
                                                    5 = no challenger  
                                                    6 = no Democrat candidate (open)  
                                                    7 = no Republican candidate (open)  
    challenh        byte   %8.0g                  challenger misc. information
                                                    0 = Nothing special  (ignore)
                                                    1 = At Large or multi-candidate race  
                                                    2 = unopposed  
                                                    3 = incumbent switched parties since last election  
                                                    4 = challenger was state legislator  
                                                    5 = only Democrat was state legislator (open seat)  
                                                    6 = only Republican was state legislator (open seat)  
                                                    7 = both candidates for open seat were state legislators  
                                                    8 = challenger is former U.S. Representative  
                                                    9 = odd race, third party; in general, DO NOT USE  
    icpsrid2        long   %12.0g                 icpsr id number
    party2          int    %8.0g                  party id (100=Dem, 200=Repub)
    name            str11  %11s                   member name
    dwnom1          float  %9.0g                  dwnominate 1st dimension
    dwnom2          float  %9.0g                  dwnominate 2nd dimension
                                                    (multiply by .3)
    partynm         str13  %13s                   name of political party
    xincome         long   %12.0g                 median family income
    xhispct         float  %9.0g                  percent hispanic
    -------------------------------------------------------------------------------
    
    1. Using EVIEWS, augment your model of the Democratic Vote Share with indicator variables that control for Challenger Quality and Incumbency Status. You are free to use any independent variables you want but you must include median family income in 1967 dollars in your specification. Defend the reasonableness of your specification.

    2. Using EVIEWS, test for the presence of heteroskedasticity and autocorrelation. Estimate your model with AR(1), AR(2), and AR(3) terms.

    3. In EVIEWS, interpret the results from the HIST RESID command.

  2. In this problem you are going to apply probit to some data from the 1956 to 2000 NES Presidential election surveys. The 1968, 1996, and 2000 data are the same as those used on Homework 11. The variables are:
    
    
          Party Identification:  0=strong democrat
                                 1=weak democrat
                                 2=lean democrat
                                 3=independent
                                 4=lean republican
                                 5=weak republican
                                 6=strong republican
                                  
          Family Income:  Raw Data (we will not use this variable)
                                  
          Family Income Quintile:  1 is the lowest quintile, 5 is the highest
     
          Race:  0 = White
                 1 = Black
    
          Sex:   0 = Male
                 1 = Female
          
          South: 0 = North
                 1 = South
    
          Education:  1 = High School or less
                      2 = Some College
                      3 = College degree
    
          Age:   In Years
    
          Presidential Vote:  0 = Did Not Vote
                              1 = Voted for Democratic Candidate For President
                              2 = Voted for Republican Candidate For President
                              3 = Voted for 3rd Party Candidate for President
    
    The data are in the text files:

    1956 Data
    1960 Data
    1964 Data
    1968 Data
    1972 Data
    1976 Data
    1980 Data
    1984 Data
    1988 Data
    1992 Data
    1996 Data
    2000 Data

    1. Download these files and load them into Stata. Specifically, stack the elections on top of one another and add a variable called year that is equal to the year of the election for every respondent in that NES survey. When you are finished you should have 19,449 observations for the combined 1956 - 2000 election dataset. Turn in the output of the d and summ commands for this dataset.
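
      A minimal sketch of one way to do the stacking. The file names (nes1956.txt, ..., nes2000.txt) and the name rawincome are hypothetical, and the sketch assumes each text file is free-format with no header row and holds the nine variables in the order listed above; adjust the infile line if the files are laid out differently.

      ```stata
      * Hypothetical file names: substitute the names of the files you downloaded.
      * Assumes free-format text with the nine variables in the order listed above.
      clear
      tempfile stacked
      save "`stacked'", emptyok          // start from an empty dataset

      foreach yr of numlist 1956(4)2000 {
          infile party rawincome income race sex south education age voted ///
              using "nes`yr'.txt", clear
          generate year = `yr'           // tag every respondent with the election year
          append using "`stacked'"       // add the elections read so far
          save "`stacked'", replace
      }

      sort year
      describe
      summarize
      ```

      The variable names are chosen to match the probit commands used later in this problem. Check the observation count reported by describe against the total given above.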

    2. In Stata, a binary dependent variable is always defined as 0 being the "negative" outcome with all other nonmissing values being the "positive" outcome. Use Presidential Vote as a dependent variable with the remaining variables as independent variables; that is, run the following model on the combined dataset:

      probit voted party income race sex south education age year

      What is your interpretation of the coefficients (what do they tell you about American Politics)? Be Specific.

    3. The above model is clearly mis-specified because one would expect the propensity to vote to vary with the absolute strength of party id. Create a variable called strongparty from party (report the steps you used to create strongparty from party) and run the following model on the combined dataset:

      probit voted strongparty income race sex south education age year

      What is your interpretation of the coefficients (what do they tell you about American Politics -- note the changes from the mis-specified model)? Be Specific.
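
      One plausible construction, sketched under the assumption that "absolute strength" means distance from the independent midpoint (party = 3) of the seven-point scale; other foldings are defensible if you explain them.

      ```stata
      * Fold the 0-6 party scale at its midpoint (3 = independent).
      * Result: 0 = independent, 1 = leaner, 2 = weak partisan, 3 = strong partisan.
      generate strongparty = abs(party - 3) if !missing(party)

      * Check the recode against the original scale.
      tabulate party strongparty, missing
      ```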

    4. In Stata, create a dependent variable from Presidential Vote where 0 = Voted for the Democratic Party Candidate and 1 = Voted for the Republican Party Candidate (note that non-voters and 3rd party voters are missing data!). Run the following probit model on the combined dataset:

      probit y party income race sex south education age year

      What is your interpretation of the coefficients (what do they tell you about American Politics)? Be Specific.
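
      A short sketch of the recode, assuming Presidential Vote is stored in a variable named voted with the coding listed above (the name is taken from the earlier probit commands):

      ```stata
      * Two-party vote choice: non-voters (0) and 3rd party voters (3) stay missing.
      generate y = .
      replace y = 0 if voted == 1        // Democratic candidate
      replace y = 1 if voted == 2        // Republican candidate

      * Confirm that codes 0 and 3 fall into the missing column.
      tabulate voted y, missing
      ```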

    5. The above model is somewhat mis-specified because we know that the "gender gap" opened up dramatically in just the past 20 years. To account for this, try interacting sex with year -- sexyear -- and run the following model on the combined dataset:

      probit y party income race sex sexyear south education age year

      What is your interpretation of the coefficients (what do they tell you about American Politics -- note the changes from the mis-specified model)? Be Specific.
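
      The interaction can be built as a simple product; a minimal sketch, assuming sex is coded 0/1 as listed above and year holds the four-digit election year:

      ```stata
      * Interaction term: equals year for women (sex = 1) and 0 for men (sex = 0).
      generate sexyear = sex * year
      ```

      With this coding, the effect of sex on the probit index in a given election is the coefficient on sex plus the coefficient on sexyear times the year, so the coefficient on sex by itself refers to year = 0 and should not be read in isolation.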

    6. A better specification for this model is to replace the variable year with indicator variables for each election. This has the effect of controlling for the unique circumstances of each election that are not captured by our independent variables -- for example, Lyndon Johnson's landslide victory over Barry Goldwater in 1964. To generate these indicator variables use the command:

      tabulate year, gen(elec)

      This creates 12 indicator variables named elec1 (equals 1 if 1956, 0 otherwise) to elec12 (equals 1 if 2000, 0 otherwise). Re-run the model from part 5 of this problem, but replace year with elec2, elec3, ... , elec12; specifically:

      probit y party income race sex sexyear south education age elec2 elec3 elec4 elec5 elec6 elec7 elec8 elec9 elec10 elec11 elec12

      Interpret the coefficients on the election indicators and interpret the coefficient for the intercept term.

  3. In this problem you are going to apply ordered probit to the stacked Presidential election dataset using party as the dependent variable.

    1. In Stata, run the specification used in part 6 of problem 2, but with party as the dependent variable:

      oprobit party income race sex sexyear south education age elec2 elec3 elec4 elec5 elec6 elec7 elec8 elec9 elec10 elec11 elec12

      What is your interpretation of the coefficients? Be Specific.