I recently developed a model of how the primary race will play out between Democratic presidential hopefuls Hillary Clinton and Bernie Sanders.
That model made certain assumptions and allowed me to produce two projections (well, many, but I picked two), depending on how each candidate actually fares with different ethnic groups (White, Black, Hispanic, since those are the groupings typically used).
The two different versions of this model were designed to favor each candidate differently. The Clinton-favored model started with the basic assumption that among white Democratic Party voters, both candidates are similar, and that Clinton has a strong lead among Hispanic voters and an even stronger lead among African American voters. The Sanders-favored model assumes that Sanders has a stronger position among White voters and less of a disadvantage among non-White voters.
The logic behind the equivalence among White voters is that this is how the two candidates did in Iowa, which is representative of the United States White vote, unadulterated by a favorite son effect like the one in New Hampshire. Nevada failed to indicate that this assumption should be changed.
The favoring of Clinton among non-White voters is based on national polling with respect to ethnic effects. The logic behind the Sanders-favored version is that Sanders’ strategy, to win, has to involve a large young, white, male turnout (evidenced in the polls) and a narrowing of the gap among African American and Hispanic voters.
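To make the arithmetic behind the two versions concrete, here is a minimal sketch in Python. It assumes a state's result is simply the average of each group's support, weighted by that group's share of the electorate. The support figures, the example electorate, and the names SCENARIOS and clinton_share are hypothetical placeholders, not the values or code used in the model.

```python
# Hypothetical per-group support for Clinton (Sanders gets the remainder).
# Placeholder values chosen only to mirror the shape of the two scenarios:
# even among White voters vs. a Sanders edge, wide vs. narrow gaps elsewhere.
SCENARIOS = {
    "clinton_favored": {"white": 0.50, "black": 0.70, "hispanic": 0.60},
    "sanders_favored": {"white": 0.45, "black": 0.55, "hispanic": 0.50},
}


def clinton_share(electorate_mix, support):
    """Average group-level support, weighted by each group's electorate share."""
    total = sum(electorate_mix.values())
    return sum(electorate_mix[g] * support[g] for g in electorate_mix) / total


# Hypothetical electorate: 60% White, 30% Black, 10% Hispanic.
example_mix = {"white": 0.6, "black": 0.3, "hispanic": 0.1}

for name, support in SCENARIOS.items():
    share = clinton_share(example_mix, support)
    print(f"{name}: Clinton {share:.3f}, Sanders {1 - share:.3f}")
```

With these placeholder inputs the Clinton-favored scenario gives her a clear majority and the Sanders-favored scenario gives him a narrow one, which is the kind of spread the two versions are meant to bracket.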
In that model, presented here, I used statewide demographic data to establish the ethnic term. However, that is incorrect, because one’s chances of engaging in the Republican vs. Democratic process in one’s state are tied to ethnicity. More Whites are Republicans, more Blacks are Democrats. I knew that at the time I worked out the model, but sloth and laziness, combined with lack of time, caused me to simplify.
The newer version of the model adjusts for likely Democratic Party membership. The results are the same but less dramatic, with a much longer slog to the finish line and the two candidates doing about the same as each other for the entire primary season.
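One way to picture this adjustment is as a reweighting: multiply each group's share of the state population by that group's rate of Democratic affiliation, then rescale so the shares add up to 1. The sketch below assumes exactly that. The function name and every number in it are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def democratic_electorate_mix(population_mix, dem_rate):
    """Reweight a population mix by per-group Democratic affiliation rates.

    Returns each group's share of the Democratic electorate, normalized to sum to 1.
    """
    weighted = {g: population_mix[g] * dem_rate[g] for g in population_mix}
    total = sum(weighted.values())
    return {g: w / total for g, w in weighted.items()}


# Hypothetical statewide population shares (placeholders, not census data).
population = {"white": 0.70, "black": 0.20, "hispanic": 0.10}
# Hypothetical share of each group that takes part on the Democratic side.
dem_rate = {"white": 0.40, "black": 0.80, "hispanic": 0.60}

mix = democratic_electorate_mix(population, dem_rate)
for group, share in mix.items():
    print(f"{group}: {share:.2f}")
```

In this made-up example a state that is 70% White ends up with a Democratic electorate that is 56% White, which shows why the unadjusted statewide figures overstate the White share of the primary vote.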
The outcome of my modeling (reflected in the non-adjusted and adjusted versions, each with a Clinton- and Sanders-favored version) is different from the expectations of either campaign, as far as I can tell. Clinton boosters are claiming that the Democratic Party is mainly behind her, and these first primaries are aberrant. Sanders boosters are claiming the Sanders strategy of having a surge of support will carry him to victory. Both of these characterizations require that the candidate surge ahead pretty soon and not look back. The opportunity to surge ahead is, certainly, Super Tuesday (March 1st).
The models I produced, with the assumptions listed above, show a close race all along, so either the campaigns are wrong or I am wrong.
The graphic at the top of the post represents how far ahead each candidate will be across the primary season, for each of their respective favored strategies.
So for Clinton, the ethnic gap is maintained as wide, and the blue line shows that she will surge nearly 40 committed delegates ahead of Sanders (a modest surge), continue to widen that gap past mid-March, and thereafter maintain, but not increase, a gap of about 80 committed delegates until the end.
For Sanders (the orange line), the initial gap formed on Super Tuesday does not start out very large, but it steadily increases until the end of the primary season, ending at over 120 committed delegates.
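Lines like these are running totals: the delegate margin from each contest day is added to everything that came before. A minimal sketch of that bookkeeping, assuming a list of per-day delegate counts, follows. The contest labels and counts are hypothetical placeholders, not output from the model.

```python
from itertools import accumulate

# Hypothetical (label, clinton_delegates, sanders_delegates) per contest day.
contests = [
    ("day 1", 100, 90),
    ("day 2", 50, 55),
    ("day 3", 80, 70),
]


def running_lead(contests):
    """Cumulative Clinton-minus-Sanders delegate margin after each contest day."""
    margins = [clinton - sanders for _, clinton, sanders in contests]
    return list(accumulate(margins))


lead = running_lead(contests)
for (label, _, _), value in zip(contests, lead):
    print(f"{label}: Clinton lead {value:+d}")
```

A positive value means Clinton is ahead at that point and a negative value means Sanders is, so a single candidate's line on a chart like this is just the running sum plotted against the calendar.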
So, that is the new model. But, it is a bogus model.
I’m trying to stick with empirical data that do not rely on polling. Why? Because everybody else is relying on polling, and this is an election season where the polling is not doing a good job of predicting outcomes. Also, my modeling gives credit to each campaign’s claims, which is at least interesting, if not valid, as a way of approaching this problem. If Clinton is right, she wins this way. If Sanders is right, he wins that way.
However, the data are insufficient to have much faith in this model. Super Tuesday will provide a lot more information, and with that information I can rework the model and have some confidence in it.
Who will win the South Carolina Primary, Clinton or Sanders?
While working this out, I naturally came up with predictions for what will happen in all of the future primaries. So let’s look at some of that.
In South Carolina, according to my model, if Clinton’s strategy holds, she will win 29 delegates, and Sanders will win 24 delegates. If the Sanders strategy pertains, they will tie, or possibly, Clinton will win one more delegate than Sanders.
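The step from a vote share to a delegate count can be sketched as below. The sketch assumes a single statewide proportional split with simple rounding, which ignores district-level allocation and any viability threshold. The total of 53 is just the sum of the 29 and 24 above. The function name and the 0.55 vote share are hypothetical.

```python
def split_delegates(total_delegates, clinton_vote_share):
    """Split delegates in proportion to vote share (statewide, simple rounding).

    Simplifying assumption: no district-level allocation and no threshold.
    """
    clinton = round(total_delegates * clinton_vote_share)
    return clinton, total_delegates - clinton


# 53 = 29 + 24, the South Carolina split quoted above.
# The 0.55 share is a hypothetical input that happens to reproduce that split.
clinton, sanders = split_delegates(53, 0.55)
print(clinton, sanders)
```

Because the total is fixed, small changes in vote share move only a delegate or two, which is part of why the projected margins in each state look so narrow.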
Who will win the Super Tuesday primaries?
The following table shows the results predicted by this model, for both the Clinton-favored and Sanders-favored versions, for all the Super Tuesday state primaries or caucuses.
The Clinton-favored model suggests that Clinton will win 6 out of 11 primaries, and take the majority of uncommitted delegates. The Sanders-favored model suggests that Sanders will take 9 out of 11 primaries, and win the majority of uncommitted delegates.
Notice that I put Vermont in italics, because Sanders is likely to win big in Vermont no matter what happens. This underscores the nature of this model in an important way. I’m not using any data from the actual states, other than the ethnic mix from census data, with an adjustment applied to produce an estimate of Democratic Party membership across ethnic groups. That estimate is based on national data as well as data specifically from Virginia, to provide some empirical basis.
I suspect most people will have two responses to this table. First, they will say that a model that incorporates Clinton’s strategic expectations should have her winning more. Second, they will say that all the numbers, for all states and all models, are too close.
These are both legitimate complaints about my model, and will explain why it will turn out to be totally wrong. Or, they are suppositions people are making that are totally wrong, and when my model turns out to be uncannily accurate, those suppositions will have to be put aside for the rest of the primary season. (Or, some other outcome happens.)
I will restate this: I’m looking for Super Tuesday to provide the best empirical data to make this model work for the rest of the primary season. But, in the meantime, this seemed like an interesting result to let you know about.