What is the relationship between what happens in these two early primary races and what actually happens later on in the election cycle?

It turns out that this is a difficult question to answer. One very simple way of asking the question is this: Does the winner of a given contest also become the nominee? We could also ask if that person becomes president, but this would involve so many additional contingencies that it might be better left alone. So let’s just stick with the link between winning either race and becoming the nominee, and also consider the predictive power of winning both races and being the nominee.

This is also difficult because things may change over time … the actual importance of early primaries vs. later primaries, for instance … such that earlier data may not be very useful. Fortunately, there isn’t much earlier data as the system we have now is really only a few decades old. The earliest relevant data is probably from New Hampshire in 1952 and Iowa no earlier than 1972.

Also, for a given primary race for a given party, sometimes these early races don’t matter because a sitting president is running who either has no opposition or only token (or at least, irrelevant) opposition. In addition, the pattern of predictability may be different for the two parties.

The Iowa caucus was a relevant race for Democrats in 1972, 1976, 1980, 1984, 1988, 1992, 2000 and 2004. The New Hampshire primaries were relevant for Democrats in 1952, 1956, 1960, 1968 (but see below), 1972, 1976, 1980, 1984, 1988, 1992 and 2000. Out of the eight relevant Iowa caucuses, the winner went on to become the nominee four times. Of the eleven relevant New Hampshire primaries, six of the winners became the party’s nominee.

However, it is reasonable to consider excluding 1968. The (probably) most likely nominee, Kennedy, was not running in the early primaries because he (and others) chose to not oppose a sitting president.
Then, the sitting president (who had won the primary) dropped out. Then the likely nominee was murdered. Furthermore, the eventual nominee, Vice President Humphrey, did not actually campaign in most of the primaries because he entered the race late and instead used the “favorite son” method, whereby proxies ran on his behalf, pledging to ask their delegates to pass their votes on to Humphrey at the convention. And so on. In other words, there are so many oddities associated with the 1968 Democratic race, any one of which might reasonably exclude it from use in this kind of analysis, that perhaps we’d better exclude it.

Taking out 1968, there were 10 relevant New Hampshire primaries that “predicted” the party’s nominee 6 times.

So the trend is that the winner of the Democratic nomination is “selected” correctly by these early contests somewhat more than half the time, which is slightly more impressive than it may sound considering that there are often more than two candidates running (we’ll get to that later). So yes, it matters. On the other hand, with a success rate of just over 50% and numbers this small, the result could easily be an artifact of the small sample, so one should not put too much stock in this.

For the Republicans, five Iowa caucuses successfully predicted three nominees, and eight New Hampshire primaries predicted five nominees. Again, the rate is just over half, and again the numbers are small.

Since the pattern is roughly similar for Democrats and Republicans, and for both of these contests, it seems reasonable to combine the numbers. Overall, there were 31 or 32 contests (depending on whether you leave the Democratic 1968 contest in or out of the data) in which 18 winners went on to be the party nominee. I hasten to add that there were not 18 different winners, and these data do not have the very important characteristic of independence. In other words, the same person may have run in, and won or lost, both New Hampshire and Iowa.
But let’s pretend that is not really a problem and calculate a provisional statistic. A simple proportion will do.

The number is 0.5806 (18 of 31, leaving 1968 out). If every race was a two-person race, the expected value would be 0.5000 if these early contests were random. If every race was a three-person race, it would be 0.3333. And so on.

Well, it turns out that the average number of people running in each of the relevant races (across both parties and both contests), not including the Democratic 1968 race, is 4.344. A randomly chosen candidate would be expected to win, if the entire process was random, about 23% of the time. This seems to indicate that a slightly more than 50% success rate is actually pretty darn good.

Now, the actual number of candidates in each race may vary if one wishes to exclude crazy people. I did not count, for instance, Pat Paulsen at all, and Harold Stassen is counted only once, and probably legitimately since he had been a serious contender for the nomination in the previous cycle.

Since I eliminated irrelevant races from all consideration, there are always a minimum of two candidates per race, and the total is never more than seven. Surely, some of the candidates can be viewed as having never really had a chance even at the time, as we might, for instance, view several of the candidates in this week’s races for both of the parties. Indeed, I would suggest that a meaningful number of candidates, on average, is almost always either 2, 3 or 4, but rarely 4. A thumbsuck estimate of the average number of candidates per race could reasonably be set at 2.8 (which might indeed be a somewhat high estimate).

Thus, a random winner would, if the nomination was also random, have about a 35% chance of winning, yet in real life, winners of either Iowa or New Hampshire seem to have a nearly 60% chance of winning. So it matters.

Is there a synergistic effect in the predictive power of these two early races?
In other words, is winning both New Hampshire and Iowa a better predictor of winning the nomination than winning just one? Well, there are twelve seasons in which the race for a particular party was relevant and both of these contests took place. The number of times in which the two races agreed and were correct is 3, or 25% of the time. In a two-person race, we would expect random agreement on a correct (but also random) answer to be 25%. With 3 candidates there are nine equally likely combinations of winners for the two sequential races, only one of which has both races picking the eventual nominee, meaning an 11.1 percent random rate of correspondence between randomized races and outcomes. So if our average number of candidates is just under 3, then the expected frequency of a candidate winning both races and the nomination is about 12%.

So yes, again, it seems to matter, a little. The magnitude of reality over random for a given race (about 60:35) is somewhat less than the magnitude of reality over random for winning both races (about 25:12). So, not surprisingly, a candidate who sweeps Iowa and New Hampshire is a better bet for the nomination than a candidate who wins only one.

There are so many ways in which this analysis is invalid that it is not funny. For instance, winning one race vs. two is not independent of winning the nomination … you win the nomination by winning primaries. So when we use these two measures to test for a relationship, we are allowing a number to be correlated to itself in a very real sense. However, had this bit of number crunching shown that reasonable estimates for a random outcome are very similar to the empirical reality of the last few decades, then one might justifiably be very cynical about putting too much weight behind the results we are seeing now. The Iowa Caucus and the New Hampshire Primary are reasonably important.
On the other hand, it is clear that the other primaries are going to matter too!

By the way, in only two of the twelve relevant seasons did a candidate who lost both races go on to gain the nomination. This is one fewer than the number who won both races and then won the nomination. What does this mean, if anything? Well, if it means anything, it may mean that candidates who lose both Iowa and New Hampshire should not be written off as easily as one might otherwise expect.

Which is probably why Harold Stassen and Pat Paulsen were always in the race, one way or another…
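For anyone who wants to check the arithmetic above, here is a minimal Python sketch. It is not part of the original analysis: the function names (`pooled_hit_rate`, `random_single`, `random_sweep`) are made up for illustration, the contest counts are the ones quoted in the post with the 1968 Democratic New Hampshire primary left out, and the random baselines assume every candidate is equally likely to win each contest and the nomination, independently of everything else, which is a big simplification.

```python
from fractions import Fraction

# (contests, winners who became the nominee), as quoted in the post.
# The post lists eleven Democratic New Hampshire primaries including 1968;
# dropping 1968 (whose winner was not the nominee) leaves ten.
CONTESTS = {
    ("Democratic", "Iowa"): (8, 4),
    ("Democratic", "New Hampshire"): (10, 6),
    ("Republican", "Iowa"): (5, 3),
    ("Republican", "New Hampshire"): (8, 5),
}


def pooled_hit_rate(contests):
    """Share of all contests whose winner went on to be the nominee."""
    total = sum(n for n, _ in contests.values())
    hits = sum(k for _, k in contests.values())
    return Fraction(hits, total)


def random_single(n_candidates):
    """ASSUMPTION: equal odds. Chance that a random contest winner
    is also the (random) nominee."""
    return 1 / n_candidates


def random_sweep(n_candidates):
    """ASSUMPTION: equal odds, independent contests. Chance that two
    random contests both pick the (random) nominee."""
    return 1 / n_candidates ** 2


if __name__ == "__main__":
    rate = pooled_hit_rate(CONTESTS)
    print(f"pooled hit rate: {rate} = {float(rate):.4f}")

    # 4.344 is the post's raw average field size; 2.8 is its rough
    # estimate of the number of candidates with a real chance.
    for n in (2, 3, 4.344, 2.8):
        print(f"n = {n}: single {random_single(n):.3f}, "
              f"sweep {random_sweep(n):.3f}")

    # Observed: 3 of 12 seasons had one candidate win both contests
    # and then the nomination.
    print(f"observed sweep rate: {float(Fraction(3, 12)):.2f}")
```

The pooled rate comes out to 18/31, or about 0.5806. Under the equal-odds assumption the sweep baseline is 1/n², which is 25% for two candidates, about 11% for three, and about 13% for 2.8, so the observed 25% sweep rate sits at roughly twice the random expectation for a realistic field size.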
One function of the Iowa caucuses and New Hampshire primary that you didn’t consider or discuss is simply that of narrowing the field of candidates for the following primaries. I suspect that if you look at the overall results for the top two candidates in those races you’ll find that in the vast majority of cases those early contests have weeded out the non-viable contestants.

I’m also unsure that going back to 1952 is really relevant in measuring the current function of primaries, since the political conventions still had much more influence on deciding a final candidate up through at least 1968 (as your example of Humphrey illustrates).
chezjake:

To some extent your first question is the obverse of what I was looking at: If a candidate is wiped out in this early stage, can they keep running and win? I think there are three kinds of candidates in this regard: two kinds of front runners and everybody else. Whoever ends up in the everybody else range is no longer viable.

Regarding your second question, one could argue that post 1968 is the cutoff for relevancy. The modern system is pretty recent. The problem, though, is that with a race every 4 years, and for a given party the primary system not being relevant at all in a given year, there is very little data.

There are other differences between the early races and later that I did not discuss (I really tried to limit what I was doing here). For instance, the total number of candidates in NH or Iowa is typically smaller in the earlier years than in the later years.

Personally, I think that 1968 was a pretty scary year for American politics. The Dem convention was violent. One of the candidates (earlier in the process) was murdered. The streets were virtually and sometimes literally on fire. The post 1968 system is, I feel, very much a response to that transition, with significant efforts to make things more calm and systematic.

But it is still true that we hear on a daily basis during the early primaries how important the New Hampshire Primary or the Iowa Caucus is. Or, we hear how unimportant each is. It all depends on who is talking, when, and why. My effort here was simply to put a couple of numbers together to see if there was actually a pattern that could be referred to.

To me, the answer is this: Yes, there is a pattern and these political events matter, but no, they are nothing close to determinative.

Thanks for your comments.
Another thing to consider (new to 2008 if I’m not mistaken) is that the early caucuses can expand the number of relevant candidates. Mike Huckabee would probably have been a no-chance candidate if it weren’t for his victory in Iowa. He probably wouldn’t have spent as much money (indeed, he wouldn’t have had as much money to spend) in NH if he hadn’t won in Iowa.
Chris’s comment brings up another point. Earlier on, and to a certain extent still today, having Iowa and NH start things off allowed candidates to test the waters without having to raise huge wads of cash. With the way the media are turning the presidential race into a two-year marathon now, that’s not as easy any more.

It also bothers me that although the media have been fairly decent at covering at least 5 Republican candidates, they essentially reduced the Democrats to 3 candidates beginning last spring; coverage of Richardson, for example, has been almost non-existent except in the debates. Many of the polls didn’t even report results for anyone but Clinton, Obama, and Edwards.