Optimal Classification Scores
Updated 5 January 2009
The files below contain two-dimensional Optimal Classification (OC) scores
for the 1st to the 110th Congresses (1789 - 2008).
The OC algorithm is explained in
"Non-Parametric Unfolding of Binary Choice Data," Political
Analysis, 8:211-237, 2000, and in
Non-Parametric Unfolding of Binary Choice Data (1998 APSA Paper) (PDF
file). The coordinates for 1947 - 2002 (80th through
107th Congresses) are analyzed in
Changing Minds? Not in Congress (PDF file).
The files below contain OC coordinates for all members of the
House, all members of the Senate, and many Presidents, along with
all roll calls with at least 2.5% in the minority
(43,423 roll calls in the House and 43,739 roll calls in the Senate).
For Presidents prior to Eisenhower, the coordinates are based on roll calls
corresponding to Presidential requests. These roll calls were compiled by an
NSF project headed by Elaine Swift (Study No. 3371, Database of Congressional
Historical Statistics, 1789-1989). Many of these Presidential scores are based
upon a small number of roll calls, so use them with caution!
The scaling was done on both chambers simultaneously using the 631 members
who served in both the House and Senate as "glue" to define a common
metric. See the discussion in
Changing Minds? Not in Congress for the advantages and disadvantages
of this assumption.
The overall fit of the scaling was 87.92 percent in two dimensions -- 13,533,045 of
15,392,635 total choices were correctly classified, for an aggregate proportional
reduction in error (APRE) of .6391. There
were 10,483 unique members of the House and 1,840 unique members of the Senate, of whom
631 served in both chambers. Hence, there were 10,483 + 1,840 - 631 =
11,692 unique legislators who served in Congress in American history. Known major
political party switchers (e.g., Strom Thurmond) are in the data twice. (For lists
of party switchers, see Congressional Party
Defection in American History).
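The fit and head-count figures quoted above can be re-derived with a few lines of arithmetic. The sketch below uses only numbers stated on this page; the variable names are illustrative.

```python
# Figures quoted in the text above.
correct_choices = 13_533_045
total_choices = 15_392_635

# Classification errors are the choices that were not correctly classified.
errors = total_choices - correct_choices
percent_correct = 100 * correct_choices / total_choices

house_members = 10_483
senate_members = 1_840
served_in_both = 631
# Legislators who served in both chambers are counted only once.
unique_legislators = house_members + senate_members - served_in_both

print(f"classification errors: {errors:,}")
print(f"percent correct: {percent_correct:.2f}")
print(f"unique legislators: {unique_legislators:,}")
```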
The unique members file looks like this:
1 3698 NORTH D 200 AANDAHL, 110 10 0.909 0.363 -0.511
2 40 4 VIRGINI 100 ABBITT W 2600 340 0.869 0.096 0.992
etc etc etc
etc etc etc
99909 99 0 USA 100 CLINTON 963 95 0.901 -0.386 -0.154
99910 99 0 USA 200 BUSH 119 5 0.958 0.396 -0.076
The first column is the ICPSR ID number (as corrected by Howard
Rosenthal and me --
see Corrected ICPSR Member ID Numbers Congress 1 - 107
for a discussion and listing of all members of Congress). The second column is
the ICPSR State Code (99 if President) and
the third is the congressional district
(0 if Senate or President). These are followed by the name of the state, the party code (see the party
codes page for a complete listing of the party codes), the name of the member,
total choices, errors, and proportion correct ([total - errors]/total). The last
two columns are the two-dimensional coordinates.
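A minimal sketch of reading the numeric end of each record and checking the proportion-correct column follows. It makes one assumption: because state and member names can contain spaces, and the state code and district appear to run together in some sample records (e.g. "3698"), it reads only the ICPSR ID from the left and the five numeric fields from the right. The function names are hypothetical, and the exact fixed-width layout of the real files is not given here.

```python
def parse_member_tail(line: str) -> dict:
    """Parse the unambiguous fields of one unique-members record.

    Only the first token (ICPSR ID) and the last five tokens
    (total, errors, proportion correct, two coordinates) are read.
    """
    tokens = line.split()
    total, errors = int(tokens[-5]), int(tokens[-4])
    proportion, dim1, dim2 = (float(t) for t in tokens[-3:])
    return {
        "icpsr_id": int(tokens[0]),
        "total": total,
        "errors": errors,
        "proportion": proportion,
        "dim1": dim1,
        "dim2": dim2,
    }


def proportion_checks_out(rec: dict, tol: float = 5e-4) -> bool:
    """Check proportion correct against (total - errors) / total."""
    expected = (rec["total"] - rec["errors"]) / rec["total"]
    return abs(expected - rec["proportion"]) <= tol


# The four sample records shown above.
samples = [
    "1 3698 NORTH D 200 AANDAHL, 110 10 0.909 0.363 -0.511",
    "2 40 4 VIRGINI 100 ABBITT W 2600 340 0.869 0.096 0.992",
    "99909 99 0 USA 100 CLINTON 963 95 0.901 -0.386 -0.154",
    "99910 99 0 USA 200 BUSH 119 5 0.958 0.396 -0.076",
]
records = [parse_member_tail(s) for s in samples]
for rec in records:
    print(rec["icpsr_id"], rec["dim1"], rec["dim2"], proportion_checks_out(rec))
```

For real work the Stata or Excel versions listed below are likely the safer starting point, since they avoid guessing at column boundaries.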
All Unique Members of the House and Senate Estimates 1st to 110th (Text File, 11,620 lines)
All Unique Members of the House and Senate Estimates 1st to 110th (Excel File, 11,620 lines)
All Unique Members of the House and Senate Estimates 1st to 110th (Stata 8 File, 11,620 lines)
All Unique Members of the House and Senate Estimates 1st to 110th (Stata 7 File, 11,620 lines)
The separate legislator coordinate files by Congress look like this:
1 9062 198 CONNECT 5000 STURGES 0.567 0.350
1 9706 198 CONNECT 5000 WADSWORTH 0.697 0.013
etc etc etc
etc etc etc
107 14657 25 9 WISCONS 200 SENSENBRENN 0.594 -0.804
107 29584 68 1 WYOMING 200 CUBIN 0.571 0.280
The first column is the Congress number, the second is the ICPSR ID number,
the third is the ICPSR State Code (99 if President), the fourth is the congressional district
number (0 in the Senate file), the fifth is the state name, and the sixth is the
party code, followed by the name of the member and the two-dimensional coordinates. Note that
these coordinates are the same as those in the unique members file above.
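Because the by-Congress coordinates match those in the unique members file, each ICPSR ID should carry a single coordinate pair in every Congress where it appears. The sketch below checks that. It assumes whitespace-separated tokens, reading the Congress and ID from the left and the coordinates from the right; the function names are hypothetical. How party switchers are identified in these files is not stated here, so treat any IDs it reports as prompts for inspection rather than errors.

```python
from collections import defaultdict


def parse_legislator_line(line: str) -> dict:
    """Read Congress, ICPSR ID, and the two coordinates from one record."""
    tokens = line.split()
    return {
        "congress": int(tokens[0]),
        "icpsr_id": int(tokens[1]),
        "dim1": float(tokens[-2]),
        "dim2": float(tokens[-1]),
    }


def members_with_changing_coordinates(records: list) -> list:
    """Return ICPSR IDs that appear with more than one coordinate pair."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec["icpsr_id"]].add((rec["dim1"], rec["dim2"]))
    return sorted(i for i, coords in seen.items() if len(coords) > 1)


# The four sample records shown above.
sample_lines = [
    "1 9062 198 CONNECT 5000 STURGES 0.567 0.350",
    "1 9706 198 CONNECT 5000 WADSWORTH 0.697 0.013",
    "107 14657 25 9 WISCONS 200 SENSENBRENN 0.594 -0.804",
    "107 29584 68 1 WYOMING 200 CUBIN 0.571 0.280",
]
legislators = [parse_legislator_line(s) for s in sample_lines]
print(members_with_changing_coordinates(legislators))
```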
Legislator Estimates 1st to 110th Houses (Text File, 35,739 lines)
Legislator Estimates 1st to 110th Houses (Excel File, 35,739 lines)
Legislator Estimates 1st to 110th Houses (Stata 8 File, 35,739 lines)
Legislator Estimates 1st to 110th Houses (Stata 7 File, 35,739 lines)
Legislator Estimates 1st to 110th Senates (Text File, 8,644 lines)
Legislator Estimates 1st to 110th Senates (Excel File, 8,644 lines)
Legislator Estimates 1st to 110th Senates (Stata 8 File, 8,644 lines)
Legislator Estimates 1st to 110th Senates (Stata 7 File, 8,644 lines)
The separate roll call coordinate files by Congress look like this:
1 1 1 41 8 49 1 6 1 -0.710 -0.966 0.260
1 2 2 36 1 37 1 1 6 0.675 -0.560 0.829
etc etc etc
etc etc etc
107 988 44774 366 19 385 17 6 1 -0.649 1.000 -0.027
107 990 44776 244 116 360 27 6 1 -0.306 0.998 -0.061
The first column is the Congress number, the second column is the roll call number
(note that unscaled roll calls are omitted!), the third is an overall counter
for the chamber, the fourth is the number of Yeas, the fifth the number of Nays,
the sixth the total number of votes cast, and the seventh the number of classification
errors. The eighth and ninth numbers indicate where the Yeas and Nays fell
relative to the projected midpoint on the normal vector (if the eighth number is
a "6", Nay was predicted below the cutpoint; if it is a "1", Yea was).
The tenth number is the projected midpoint on the normal vector, and the last
two numbers are the normal vector for the roll call.
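The sketch below shows how one roll call record could be used to predict a vote: project the legislator's coordinates onto the normal vector and compare the result with the projected midpoint. It rests on several assumptions. The ninth number is read as the vote predicted above the cutpoint, the codes are 1 for Yea and 6 for Nay as described above, and the per-roll-call PRE uses the usual definition (minority votes minus errors, divided by minority votes), which this page does not spell out. The class and function names are hypothetical.

```python
from dataclasses import dataclass

YEA, NAY = 1, 6  # vote codes used in the eighth and ninth columns


@dataclass
class RollCall:
    congress: int
    number: int
    counter: int
    yeas: int
    nays: int
    total: int
    errors: int
    below_code: int  # vote predicted below the cutpoint (eighth column)
    above_code: int  # assumed: vote predicted above the cutpoint (ninth column)
    midpoint: float
    normal: tuple


def parse_roll_call(line: str) -> RollCall:
    """Parse one 12-column roll call record."""
    tokens = line.split()
    if len(tokens) != 12:
        raise ValueError(f"expected 12 columns, got {len(tokens)}")
    ints = [int(t) for t in tokens[:9]]
    midpoint, n1, n2 = (float(t) for t in tokens[9:])
    return RollCall(*ints, midpoint, (n1, n2))


def predicted_vote(rc: RollCall, point: tuple) -> int:
    """Project a legislator's coordinates onto the normal vector and
    return the vote code predicted for that side of the cutpoint."""
    projection = point[0] * rc.normal[0] + point[1] * rc.normal[1]
    return rc.below_code if projection < rc.midpoint else rc.above_code


def pre(rc: RollCall) -> float:
    """Proportional reduction in error for a single roll call."""
    minority = min(rc.yeas, rc.nays)
    return (minority - rc.errors) / minority


# First sample record shown above.
rc = parse_roll_call("1 1 1 41 8 49 1 6 1 -0.710 -0.966 0.260")
print(rc.yeas + rc.nays == rc.total)   # Yeas and Nays sum to the total
print(pre(rc))
print(predicted_vote(rc, (0.5, 0.3)))  # hypothetical legislator position
```

Summing the numerators and denominators of `pre` over all roll calls, rather than averaging the ratios, gives an aggregate figure in the spirit of the APRE reported above.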
Roll Call Estimates 1st to 110th Houses (Text File, 43,423 lines)
Roll Call Estimates 1st to 110th Houses (Excel File, 43,423 lines)
Roll Call Estimates 1st to 110th Houses (Stata 8 File, 43,423 lines)
Roll Call Estimates 1st to 110th Houses (Stata 7 File, 43,423 lines)
Roll Call Estimates 1st to 110th Senates (Text File, 43,739 lines)
Roll Call Estimates 1st to 110th Senates (Excel File, 43,739 lines)
Roll Call Estimates 1st to 110th Senates (Stata 8 File, 43,739 lines)
Roll Call Estimates 1st to 110th Senates (Stata 7 File, 43,739 lines)