## How does DW-Nominate determine whether a specific member of congress or vote is liberal or conservative?

The statistical methodology DW-NOMINATE is frequently seen as a more objective way of determining how liberal or conservative politicians are, based on their voting records. Here are a few of the many examples of such analysis.

I've never been able to fully grasp how this process works. One of the articles above describes the process as:

Unlike the scoring done by interest groups, DW-NOMINATE doesn't rely on subjective determinations of what constitutes a liberal vote or a conservative vote--it sorts members of a population according to how similar each member's choices are to those of other members of the population. Two senators who vote the same way 90 percent of the time will be much closer to each other than two senators who only vote the same way 10 percent of the time.

But I get lost as to how this is used to generate an overall ranking of members of Congress on a scale from most liberal to most conservative. For example, if a Senator votes extensively for hawkish foreign policy and for regulations in the economy, will he look more conservative in DW-NOMINATE than an isolationist who always votes against regulation? And of course there are more issues of disagreement than just those two.

How is the DW-NOMINATE analysis used to rank members of congress on a single liberal vs conservative scale? Is there a way to understand which specific votes/issues are driving the score of an individual member of Congress in one way or another?

You might find this primer helpful, but after reading through part of it I still don't feel like I understand it well enough to explain it. It does discuss the multi-dimensionality, though.

– Bobson – 2015-08-20T16:51:23.380

For example, if a Senator votes extensively for hawkish foreign policy and for regulations in the economy, will he look more conservative in DW Nominate than an isolationist who always votes against regulation? And of course there are more issues of disagreement than just those two. How is the DW-NOMINATE analysis used to rank members of congress on a single liberal vs conservative scale?

Basically, the answer is that while it is theoretically possible for legislators to have one ideology on one set of issues and a different ideology on another set, the researchers who developed DW-NOMINATE found that a single dimension of political ideology is enough to describe how members of Congress vote in all but a few brief periods of the time frame they studied. Even in those brief periods, no more than two dimensions were ever needed to explain Congressional voting behavior accurately.

Thus, a single ranking is possible because roll call voting in Congress has historically been far more orderly, along a simple liberal-to-conservative line, than it had to be. The underlying reality that the people who came up with this model validated when they developed this ranking system is as important an insight into how the political system works in the United States as the actual rankings of legislators that flow from applying their model to the data.

First, NOMINATE scores are functions of the roll call voting behavior of legislators. The algorithm is set up to assign similar scores to legislators who have similar voting patterns, while simultaneously determining how liberal or conservative a particular piece of legislation is based upon which legislators vote for and against it.
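To make "similar voting patterns" concrete, here is a minimal sketch that counts how often pairs of legislators vote the same way. The legislators, the roll calls, and the `agreement` helper are all made up for illustration, and NOMINATE itself fits a statistical model to the votes rather than using raw agreement rates directly.

```python
from itertools import combinations

# Hypothetical roll calls: 1 = yea, 0 = nay. One list of six votes per legislator.
votes = {
    "A": [1, 1, 1, 0, 0, 0],
    "B": [1, 1, 1, 1, 0, 0],
    "C": [0, 0, 0, 1, 1, 1],
}

def agreement(x, y):
    """Fraction of roll calls on which two legislators voted the same way."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

# Agreement rate for every pair of legislators.
pairs = {(p, q): agreement(votes[p], votes[q]) for p, q in combinations(votes, 2)}

for (p, q), rate in pairs.items():
    print(f"{p}-{q}: {rate:.2f}")
```

In this toy data A and B agree on five of six votes while A and C never agree, so any sensible scoring would put A and B close together and A and C far apart.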

In principle, this could be very complicated and there could be elected officials with very different mixes of policy positions on different kinds of issues.

In practice, however, the vast majority of the time, you can predict who will vote for a bill and who will vote against it by simply assigning a single score to each legislator and to each bill on a liberal to conservative scale (put another way, there are lots of liberals and conservatives in legislative office and there are few libertarians and populists in elected office in the U.S.). And, in a few brief transitional periods, there have been times when each piece of legislation and each legislator needs to have two scores attached to explain the data well (one for economic policies and one for social policies).
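As a rough illustration of predicting votes from a single score, the sketch below places five invented legislators on a line, gives one invented roll call a dividing point, and checks how many votes the one-number model gets right. The scores, the `predict` helper, and the -1 to +1 scale are assumptions for the example, not output from the real system.

```python
# Hypothetical positions on a -1 (liberal) to +1 (conservative) line.
scores = {"A": -0.8, "B": -0.3, "C": 0.1, "D": 0.6, "E": 0.9}

def predict(score, cutpoint, yea_side):
    """Predict 1 (yea) if the legislator lies on the yea side of the cutpoint."""
    on_right = score > cutpoint
    return int(on_right == (yea_side == "right"))

# One hypothetical roll call: the model says everyone left of 0.0 votes yea.
cutpoint, yea_side = 0.0, "left"
# What actually happened: C, slightly right of the cutpoint, voted yea anyway.
actual = {"A": 1, "B": 1, "C": 1, "D": 0, "E": 0}

predicted = {name: predict(s, cutpoint, yea_side) for name, s in scores.items()}
correct = sum(predicted[n] == actual[n] for n in scores)
accuracy = correct / len(scores)
print(f"correctly classified {correct} of {len(scores)} votes")
```

The one miss is the legislator sitting closest to the dividing point, which is typically where a one-dimensional model makes its mistakes.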

There are very few bills, for example, in real life, that win roll call voting support from both strongly conservative and strongly liberal members of Congress, but don't win roll call voting support from more moderate liberal and conservative members of Congress. The system is robust enough that if these outlier bills aren't very common, they won't throw the whole system out of whack.

Once you have a database of liberal to conservative scores for elected officials from all past sessions, you can use the voting patterns of those elected officials to rate particular pieces of legislation on a scale from liberal to conservative, and then you can use the way previously unranked legislators vote on more or less liberal legislation to assign a liberal to conservative score to that legislator. More liberal legislators more often vote for more liberal legislation and less often vote for more conservative legislation, than more conservative legislators in a very predictable pattern.
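The two steps in that paragraph can be sketched as follows, assuming a crude "count the misclassified votes" rule in place of the likelihood that NOMINATE actually maximizes. The function names `best_cutpoint` and `place_legislator`, the scores, and the grid of candidate positions are all hypothetical.

```python
def best_cutpoint(scores, votes):
    """Return (cutpoint, yea_side, errors) misclassifying the fewest votes.

    scores: name -> known position; votes: name -> 1 (yea) or 0 (nay).
    """
    xs = sorted(scores.values())
    # Candidates: midpoints between neighbours, plus one point beyond each end.
    candidates = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    best = None
    for cut in candidates:
        for yea_side in ("left", "right"):
            errors = sum(
                ((scores[n] < cut) == (yea_side == "left")) != bool(votes[n])
                for n in scores
            )
            if best is None or errors < best[2]:
                best = (cut, yea_side, errors)
    return best

def place_legislator(roll_calls, votes, grid):
    """Pick the grid position consistent with the most of a new legislator's votes.

    roll_calls: list of (cutpoint, yea_side); votes: 1/0 in the same order.
    """
    def errors(x):
        return sum(
            ((x < cut) == (side == "left")) != bool(v)
            for (cut, side), v in zip(roll_calls, votes)
        )
    return min(grid, key=errors)

# Step 1: rate a bill using legislators whose scores are already known.
known = {"A": -0.8, "B": -0.3, "C": 0.1, "D": 0.6, "E": 0.9}
bill_votes = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0}
cut, yea_side, errors = best_cutpoint(known, bill_votes)

# Step 2: score a newcomer from how they voted on already-rated bills.
rated_bills = [(-0.5, "left"), (0.0, "left"), (0.5, "right")]
newcomer_votes = [0, 1, 0]
newcomer = place_legislator(rated_bills, newcomer_votes, [-0.75, -0.25, 0.25, 0.75])
```

Here the bill is split cleanly between B and C with the yeas on the left, and the newcomer lands at -0.25, the only grid position that fits all three of their votes.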

The process of iteratively updating legislators' scores each session makes it possible to track the shifts in a legislator's political values over time in a well validated statistical manner. The DW-NOMINATE score looks at one session at a time (the "D" in "DW" stands for "dynamic") rather than a whole career at once to determine where a legislator fits on the liberal to conservative scale, and it assumes that legislators can drift over time. As long as not too many re-elected legislators change their political views too rapidly, the system doesn't break down, and in real life big sudden changes in the political views and voting behavior of re-elected officials are rare.

Another trick that could be used to train the model is to start out with a few litmus test bills, chosen at random or on purpose, to provide an initial rough sort of legislators to get the analysis started, and then to refine it with more data.

It turns out that this is easier than it could be because the more liberal a piece of legislation is, the more likely it is that more liberal legislators will vote for it and more conservative legislators will vote against it. These patterns are so regular that the vast majority of bills can be unambiguously ranked from liberal to conservative, and the vast majority of legislators can likewise be unambiguously ranked.

A procedure that sorts both legislation and legislators along a single liberal-conservative line is all that is needed, since that combined sorting turns out to be sufficient to explain the lion's share of roll call voting behavior. The results match what you would expect from common sense, even without quantifying the distances between adjacent positions in the rank ordering.

There are also technical ways to overcome the chicken and egg problem in the earliest session of whether to rank legislators first or legislation first by making both rankings simultaneously, which is something you can do mathematically with a computer, if the data is as orderly as it is in real life, even though it is conceptually hard to imagine thinking about both sides of rankings at once with only factory standard wetware available.
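One way to see how both rankings can emerge together is the alternating sketch below: guess legislator scores, score the bills from them, rescore the legislators from the bills, and repeat. This is a plain rank-one power iteration on an invented vote matrix, swapped in for illustration; it is not the estimation procedure NOMINATE itself runs, and the resulting bill numbers say which side votes yea rather than where the bill sits on the line.

```python
import math

# Hypothetical roll calls coded +1 (yea) / -1 (nay); rows = legislators, columns = bills.
names = ["A", "B", "C", "D"]
V = [
    [+1, +1, +1, -1, -1, -1],
    [-1, +1, +1, +1, -1, -1],
    [-1, -1, +1, +1, +1, -1],
    [-1, -1, -1, +1, +1, +1],
]

def normalize(v):
    """Rescale a vector to unit length so the iteration stays bounded."""
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]

n_leg, n_bill = len(V), len(V[0])

# Arbitrary starting guess for the legislators; no ideology is assumed.
x = normalize([1.0, 0.0, 0.0, 0.0])
for _ in range(100):
    # Score each bill from the current legislator scores...
    y = normalize([sum(V[i][j] * x[i] for i in range(n_leg)) for j in range(n_bill)])
    # ...then rescore each legislator from the current bill scores.
    x = normalize([sum(V[i][j] * y[j] for j in range(n_bill)) for i in range(n_leg)])

legislator_scores = dict(zip(names, x))
bill_scores = y
```

On this tidy data the legislators come out in the order A, B, C, D along the line even though nothing told the procedure who was on which side; which end gets called "liberal" is still a labeling decision made afterwards.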

The political scientists just have to decide which end of the line to call liberal and which to call conservative based upon the legislation and legislators who end up on each end of the line, which turns out to be trivially easy for anyone remotely familiar with politics. Other than labeling which end of the axis is liberal and which is conservative, the rest of the process is purely automatic and deterministic (i.e. there aren't judgment calls for researchers to make on someone's score).

It is possible to quantify just how liberal or conservative legislators are by scaling the scores such that legislators who vote X% in common in a session have distance Y from each other in their liberal to conservative score. So, in the extreme, someone who always votes for legislation ranked at least 51% liberal and never votes for legislation ranked 49% liberal or lower would have a 100% liberal score. The frequency with which someone deviates from this pattern in their voting determines their score.
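A deliberately crude reading of that "how often do you deviate" idea is sketched below: count the share of a legislator's votes that fall on the liberal side of already-rated bills. The `liberal_share` helper, the 0 to 100 bill ratings, and the choice to skip bills rated exactly 50 are assumptions for the example; the real scores come from a fitted model, not a simple tally.

```python
def liberal_share(bill_scores, votes):
    """Fraction of a legislator's votes cast on the liberal side.

    bill_scores: 0 (most conservative) to 100 (most liberal), one per bill.
    votes: 1 for yea, 0 for nay, in the same order.
    Bills rated exactly 50 are skipped because they lean neither way.
    """
    liberal_votes = 0
    counted = 0
    for score, vote in zip(bill_scores, votes):
        if score == 50:
            continue
        counted += 1
        # A yea on a liberal bill or a nay on a conservative bill counts as liberal.
        if (score > 50) == bool(vote):
            liberal_votes += 1
    return liberal_votes / counted

bill_scores = [80, 60, 30, 10]  # hypothetical bill ratings
always_liberal = liberal_share(bill_scores, [1, 1, 0, 0])
mixed = liberal_share(bill_scores, [1, 0, 0, 1])
```

The first legislator never deviates from the liberal pattern and scores 1.0, while the second deviates on half of the bills and scores 0.5.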

Similarly, when you know the scores of legislators, it turns out that almost all bills can be easily ranked from liberal to conservative based upon who votes for the bill, since the data turns out to be well ordered. Basically, in time periods like now where Republicans are consistently more conservative and Democrats are consistently more liberal, the more Democrats there are who vote for a bill and the fewer Republicans there are who vote for it, the more liberal the legislation will be rated (and vice versa).

Then, once you have a ranked set of bills, you just look at the scores of the bills a member votes for and against. If bills are ranked from 0 for most conservative to 100 for most liberal, and a legislator has a 51% or greater chance of voting for bills on one side of a particular score and a 51% or greater chance of voting against bills on the other side of it, then that balancing point is the legislator's score.

In real life, legislators who are free to sit where they want tend to sit closest to people who vote most similarly to them causing them to self-sort in rank order, which happened in one of the first well documented legislative bodies leading up to the First Republic in France. Legislators then arranged themselves from left to right based on ideology and a new piece of political terminology was born. Seating arrangements aren't actually used at all to come up with NOMINATE scores, but do tend to line up based on NOMINATE scores when people are free to choose whom to sit next to.

In the rarely necessary two dimensional case, where politicians' voting preferences are more complicated, you have to use tools from linear algebra called "eigenvectors" to figure out how much of a vote is based upon legislators' economic stances and how much is based upon their social issue stances. This is the same kind of mathematics that is used to make principal component analysis charts of multi-dimensional data when multiple dimensions can contribute to the outcome of a particular data point.
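For a feel of what the eigenvectors do, here is a small sketch using NumPy on an invented vote matrix in which four roll calls split legislators on economics and two split them on social issues. The matrix and the legislator labels are assumptions, and this plain eigen-decomposition is only an illustration of the linear algebra involved, not the full NOMINATE estimation.

```python
import numpy as np

# Hypothetical roll calls coded +1 (yea) / -1 (nay).
# Rows: legislators P, Q, R, S. Columns: four "economic" votes, then two "social" votes.
V = np.array([
    [+1, +1, +1, +1, +1, +1],   # P: left on both
    [+1, +1, +1, +1, -1, -1],   # Q: left on economics, right on social issues
    [-1, -1, -1, -1, +1, +1],   # R: right on economics, left on social issues
    [-1, -1, -1, -1, -1, -1],   # S: right on both
])

# Legislator-by-legislator similarity: large positive entries mean frequent agreement.
G = V @ V.T

# eigh returns eigenvalues in ascending order, with eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(G)
first_dim = eigenvectors[:, -1]    # direction capturing the most voting variation
second_dim = eigenvectors[:, -2]   # next most important direction

print("eigenvalues:", np.round(eigenvalues, 6))
print("first dimension: ", np.round(first_dim, 3))
print("second dimension:", np.round(second_dim, 3))
```

The largest eigenvector groups P with Q against R with S (the economic split), and the second groups P with R against Q with S (the social split). The signs of each eigenvector are arbitrary, which is the two-dimensional version of having to decide which end to call liberal.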

There are statistical measures, such as the Chi-squared test from non-parametric statistics, that tell you how well your simple model with one score for each legislator and one score for each bill replicates real life, and let you compare it to a more complicated model in which you assign a social and an economic score to each legislator and to each bill.

If your one-number-per-person-and-bill model fits the data almost as well as a two-number model, then you know that the data is basically one dimensional, and you can quantify the margin of error with the same Chi-squared or similar statistic that describes goodness of fit. (Unless the one-number model fits perfectly, the two-number model will always fit slightly better, but often not by enough to justify using two numbers instead of one.) On the other hand, if the fit is a lot better with two numbers describing each person and bill than with just one, then we say that the behavior takes at least two dimensions to explain. There are rules in non-parametric statistics that allow you to compare the goodness of fit of the two models.

(Never in U.S. history have three dimensions been necessary to get the best fit, once you account for the fact that you need three numbers instead of one or two to describe a legislator's or bill's political leanings.)
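The comparison between a one-number and a two-number model can be imitated with a singular value decomposition, as in the sketch below. It measures the share of variation in an invented +1/-1 vote matrix captured by the best one-dimensional and two-dimensional approximations. This is an SVD-based stand-in that I am assuming for illustration, not a Chi-squared test, and the data are made up.

```python
import numpy as np

# Hypothetical roll calls coded +1 (yea) / -1 (nay).
# Five votes split the chamber on economics; only the last one splits it on social issues.
V = np.array([
    [+1, +1, +1, +1, +1, +1],
    [+1, +1, +1, +1, +1, -1],
    [-1, -1, -1, -1, -1, +1],
    [-1, -1, -1, -1, -1, -1],
])

# Singular values come back in descending order.
singular_values = np.linalg.svd(V, compute_uv=False)

# Cumulative share of squared variation captured by the best rank-k approximation.
shares = np.cumsum(singular_values**2) / np.sum(singular_values**2)
one_dim_fit, two_dim_fit = shares[0], shares[1]
gain_from_second_dim = two_dim_fit - one_dim_fit

print(f"one dimension:  {one_dim_fit:.3f}")
print(f"two dimensions: {two_dim_fit:.3f}")
```

Whether a gain of that size justifies the second number is the judgment the goodness-of-fit rules are meant to formalize; the more votes that follow the main split, the smaller the gain becomes.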

I hope this answer strikes a reasonable balance between providing the gist of how the NOMINATE scoring system works and not getting too technical. This is based upon the explanation of the system linked to in the comments and my background as an undergraduate math major who specialized in applied math.

The kind of math used is not particularly sophisticated and really doesn't get harder than the first couple of courses in calculus-based probability and statistics that an undergraduate student in a STEM field or other quantitative field would take. It is much easier, for example, than the math needed for quantum physics or general relativity.

Nice answer. I'd call more attention to the "decide which end of the line to call liberal and which to call conservative" bit, just because it emphasizes that the score itself is just a description of how similar things are, and doesn't make any categorizations. The conservative/liberal labels are applied after the fact based on the results. – Bobson – 2019-01-02T22:19:16.003

@Bobson Fair enough. This does go to the objectivity of the method, although the labeling becomes trivial once the objective part is done. The really key point in my mind, however, is that this is a method that, in a rigorous mathematical sense, doesn't have to work and only does work because the data are much better behaved than they could have been. – ohwilleke – 2019-01-07T03:59:01.037