POLS 6482 ADVANCED MULTIVARIATE STATISTICS
Eighth Assignment
Due 29 October 2001


  1. This problem is a continuation of problem 1.c of the 3rd homework, problem 2 of the 4th homework, and problem 2 of the 5th homework. I have made some additional corrections to the file, so download the new Stata file below:

    105th Congressional District Data (HDMG105ZZ.DTA)

    Note that I have added two variables, bush00 and gore00, which are the Bush and Gore vote percentages in the 2000 election in each congressional district.

    1. Download the following text file:

      106th House DW-NOMINATE Scores (H106.TXT)

      and use Epsilon to paste H106.TXT into Stata. Use the following variable names and definitions:
        
      cong2           byte   %8.0g        congress no.
      icpsrid2        long   %12.0g       id no (icpsr and Poole/Rosenthal)
      state           byte   %8.0g        icpsr state code
      district        byte   %8.0g        cong. district no.
      statenm3        str7   %9s          name of state
      party2          int    %8.0g        political party
      name3           str11  %11s         name of member
      dwnom1n         float  %9.0g        1st dim. dw-nom. 106th
      dwnom2n         float  %9.0g        2nd dim. dw-nom. 106th

      Follow the instructions shown in part 1 of Homework 3 to merge this file into HDMG105ZZ.DTA. Sort on state and district in both files! Name the merged STATA file HDMG106.DTA, run the d and summ commands, and report the results.
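
      The merge step can be sketched as follows. This is only a sketch under stated assumptions, not the Homework 3 procedure itself: it assumes the pasted H106.TXT data are already in memory with the variable names listed above, that state and district are coded identically in both files and uniquely identify a district, and that the release in use supports the merge 1:1 syntax (Stata 11 or later). The scratch file name h106 is hypothetical.

```stata
* Pasted 106th House data are assumed to be in memory at this point
sort state district
save h106, replace

* Open the district file and sort it on the same keys
use HDMG105ZZ, clear
sort state district

* Stops with an error if state and district do not uniquely identify rows
isid state district

* Match each district to its 106th House DW-NOMINATE scores
merge 1:1 state district using h106

* Inspect how many districts matched before relying on the merged file
tabulate _merge

save HDMG106, replace
describe
summarize
```

      If tabulate _merge shows unmatched observations, check the state and district coding in both files before running any regressions.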

    2. In STATA run the regressions:

      regress clint96 black south hisp income dwnom1 dwnom2
      regress gore00 black south hisp income dwnom1n dwnom2n
      regress dole96 black south hisp income dwnom1 dwnom2
      regress bush00 black south hisp income dwnom1n dwnom2n

      Compare and contrast the results of these four regressions. What do you think accounts for the differences between the 1996 and 2000 results? Be specific.
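
      One way to line the four sets of coefficients up side by side is sketched below. It assumes a release that has the estimates store and estimates table commands (Stata 8 or later); the stored names m_clint96, m_gore00, m_dole96, and m_bush00 are hypothetical.

```stata
quietly regress clint96 black south hisp income dwnom1 dwnom2
estimates store m_clint96

quietly regress gore00 black south hisp income dwnom1n dwnom2n
estimates store m_gore00

quietly regress dole96 black south hisp income dwnom1 dwnom2
estimates store m_dole96

quietly regress bush00 black south hisp income dwnom1n dwnom2n
estimates store m_bush00

* dwnom1/dwnom2 and dwnom1n/dwnom2n appear as separate rows
* because the variable names differ between the two Congresses
estimates table m_clint96 m_gore00 m_dole96 m_bush00, b(%9.3f) se(%9.3f) stats(N r2)
```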

    3. In STATA run the regressions:

      regress bush00 black hisp income dwnom1n dwnom2n if south==0
      regress bush00 black hisp income dwnom1n dwnom2n if south==1

      where "south==0" selects the congressional districts in the North, and "south==1" selects the congressional districts in the South (recall that South is defined as the 11 states of the Confederacy plus Kentucky and Oklahoma).

      Interpret the results of these two regressions. What do you think accounts for the differences? Be explicit.
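
      As a quick check on the regional split, the sketch below counts the districts in each region and then runs the same regression once per value of south. It assumes south is coded 0/1 with no missing values.

```stata
* Number of northern (0) and southern (1) districts
tabulate south

* Same two regressions as above, one per value of south
bysort south: regress bush00 black hisp income dwnom1n dwnom2n
```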

    4. Paste HDMG106.DTA into EVIEWS and replicate the regional Bush 2000 vote regressions from part 3 above. To do this, sort the dataset using the south variable, use the SHOW command to find the observation number of the last northern district, and then issue the command:

      SMPL 1 XXX

      where "XXX" is the observation number of the last northern district. (This command tells EVIEWS to use observations 1 to XXX -- SMPL stands for "sample".) Now, run the regression:

      LS BUSH00 C BLACK HISP INCOME DWNOM1N DWNOM2N

      To get the southern regression simply change the sample range and run the same regression; that is:

      SMPL XXY 435
      LS BUSH00 C BLACK HISP INCOME DWNOM1N DWNOM2N

      where "XXY" is the observation number of the first southern district. For example, if "XXX" was 200 then "XXY" is 201.

    5. Use the SORT, SHOW, and SMPL commands in EVIEWS to run the regression

      LS BUSH00 C BLACK SOUTH HISP INCOME DWNOM1N DWNOM2N

      separately on congressional districts that are 9% or less Black and on congressional districts that are 25% or more Black. Interpret the results of these two regressions. What do you think accounts for the differences? Be explicit.

  2. This problem is a continuation of problem 2 on the 6th homework using the Cigarette data discussed in Epple Notes V.

    1. Use EVIEWS to perform the Chow Breakpoint test discussed on pages V-8 to V-12 of the Epple notes. Replicate the calculations shown on page V-10 and use

      scalar pval=@fdist(_,_,_)

      to get the P-Value (report the first 10 digits). Then replicate the method discussed on page V-12.
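
      For reference, the Chow breakpoint statistic in its standard textbook form is given below. The notation is an assumption and may differ from the Epple notes: RSS_P is the residual sum of squares from the pooled regression, RSS_0 and RSS_1 are those from the two subsample regressions, n_0 and n_1 are the subsample sizes, and k is the number of estimated coefficients including the constant (k = 4 for the cigarette regression).

```latex
F \;=\; \frac{\bigl[\, RSS_P - (RSS_0 + RSS_1) \,\bigr] / k}
             {(RSS_0 + RSS_1) / (n_0 + n_1 - 2k)}
\;\sim\; F\bigl(k,\; n_0 + n_1 - 2k\bigr)
\quad \text{under the null of equal coefficients in both subsamples.}
```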

    2. Perform the Chow Breakpoint test in STATA. To replicate the tables shown on page V-11 use the commands:

      regress carmon nicot tar weight if litedum==0
      regress carmon nicot tar weight if litedum==1
      regress carmon nicot tar weight

      The p-value can be computed with the command:

      display fprob(df_num,df_denom,f_stat)

      Below is the output for the first command. The residual sum of squares (the SS entry in the Residual row) is what you need for the calculations shown on page V-10.

      . regress carmon nicot tar weight if litedum==0
      
        Source |       SS       df       MS                  Number of obs =      18
      ---------+------------------------------               F(  3,    14) =   68.15
         Model |  467.096051     3  155.698684               Prob > F      =  0.0000
      Residual |  31.9839391    14  2.28456708               R-squared     =  0.9359
      ---------+------------------------------               Adj R-squared =  0.9222
         Total |   499.07999    17  29.3576465               Root MSE      =  1.5115
      
      ------------------------------------------------------------------------------
        carmon |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
      ---------+--------------------------------------------------------------------
         nicot |  -2.282413   5.320555     -0.429   0.674      -13.69387    9.129043
           tar |   .9897387   .3369055      2.938   0.011       .2671483    1.712329
        weight |  -4.604357   4.923935     -0.935   0.366      -15.16515    5.956433
         _cons |   6.669547   4.326866      1.541   0.146      -2.610657    15.94975
      ------------------------------------------------------------------------------
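
      Rather than copying the residual sums of squares by hand, they can be picked up from the results regress leaves behind, as sketched below. This is a cross-check under assumptions, not the method in the notes: it uses the standard Chow formula, assumes litedum is 0 or 1 for every observation, and the scalar names are hypothetical (chosen so they do not collide with variable names). Newer Stata releases name the upper-tail F probability function Ftail(), with the same argument order as fprob().

```stata
* Subsample with litedum==0
quietly regress carmon nicot tar weight if litedum==0
scalar rss_0 = e(rss)
scalar n_0   = e(N)

* Subsample with litedum==1
quietly regress carmon nicot tar weight if litedum==1
scalar rss_1 = e(rss)
scalar n_1   = e(N)

* Pooled regression; k_par counts the slopes plus the constant
quietly regress carmon nicot tar weight
scalar rss_p = e(rss)
scalar k_par = e(df_m) + 1

* Chow F-statistic and its upper-tail p-value
scalar df_den = n_0 + n_1 - 2*k_par
scalar chowF  = ((rss_p - rss_0 - rss_1)/k_par) / ((rss_0 + rss_1)/df_den)
display "F(" k_par ", " df_den ") = " chowF
display %12.10f fprob(k_par, df_den, chowF)
```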
      
    3. Replicate the Chow Forecast Test discussed on pages V-13 to V-16. To do this, sort the data using the tar variable:

      sort tar

      After sorting, the last 3 observations are the cigarettes with the highest tar. Open the data editor in STATA, find the value of tar for the 23rd observation, and use it in the following command:

      regress carmon nicot tar weight if tar < XX.X

      where "XX.X" is the value of tar. The results should allow you to replicate the test given on page V-14.

      Compute the F-statistic and calculate the P-Value using EVIEWS.
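
      For reference, the Chow forecast statistic in its standard textbook form is given below. The notation is an assumption and may differ from the Epple notes: RSS_all is the residual sum of squares from the regression on all observations, RSS_1 is the residual sum of squares from the regression that excludes the n_2 = 3 highest-tar observations, n_1 is the number of observations in that restricted regression, and k = 4 is the number of estimated coefficients including the constant.

```latex
F \;=\; \frac{(RSS_{all} - RSS_1) / n_2}{RSS_1 / (n_1 - k)}
\;\sim\; F\bigl(n_2,\; n_1 - k\bigr)
\quad \text{under the null that the same coefficients hold for the held-out observations.}
```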