Tag Archives: Polling

Dem Vs Trump: How are we doing?

There is a new poll pitting various Democrats against Trump. Before you complain to me that we should not be looking at polls because it is not election day, think again and take note of the fact that polls are data and I’m a data-oriented scientist, so don’t even say that to me. (I’m working on a post that will serve as an answer to that complaint every time it comes up on Facebook)

Anyway, this is a Quinnipiac University poll taken in Pennsylvania. Quinnipiac is a good poll. Details are here. Also note that I’m not posting this poll because its results show something I want to push, or use to cause your hair to burst into flames. I’ve not looked at the results, yet here I am writing this blog post. I will now look at the results, figure out a good way to show them to you, then finish the post. brb.

The poll has a LOT of interesting data that will become important down the road as the field of candidates is culled and we get to see the results of a bunch of natural experiments (like, which non-dropping-out candidate tends to accrete which demographic as they drift away from dropper-outer candidates).

But here is what the head to head shows:

These results vary considerably when adjusting for age, gender, and race. Note that in this sort of matchup, reaching above 50 is considered by pollsters as a sort of magic number. Only Biden does that here, but Sanders is (obviously) very close.

I think the most important message here is this: The candidates do not vary much in this very early indication of their electability, even if they vary a great deal in how they rank among Democrats.

Putting aside the head to head and looking at some of the other data, among registered Democrats, Biden has 39% support, Sanders has 13% support, with Harris and Warren at 8% each, Buttigieg at 6%, Booker at 5%, O’Rourke at 2%, and Klobuchar at 1%. Nobody else registers. “Someone else” has dropped, interestingly, to an apparent low at 2%. I’m thinking people realized, “no, no, NOT someone else, pleasssseee!!!”

Again, that varies by age, gender, race, etc. Among the young, Biden and Sanders are essentially tied (Biden just ahead, 29-27), while among the old, Sanders barely registers and Biden swoops (Biden: 47%, Sanders 4%). Putting both Biden and Sanders aside for a moment, and digging into the demographic weeds, Harris, Warren, and Buttigieg pop among those with higher incomes (Sanders gets very little support there, Biden plenty). Harris pops among older folks, Booker and Buttigieg do a bit better with younger folks. Liberals like Buttigieg and Warren.

UPDATED: Was there a Clinton Surge or not?

Updated to include polls through Oct 26th (AM, more polls later in the day on the 26th will be added at the next update):


Updated, 25 October AM

As I expected, and demonstrated much to the consternation of everyone, the meme of an ever-widening, double-digit Clinton lead over Trump in an increasing number of polls is a falsehood. Here is the latest graphic using the same approach as described below, but updated to reflect additional polls.


Rather than a widening, or even consistent, gap, or a gap that is double digit, we see Clinton continuing to lead, but pretty much in the same way that she has led since the conventions. In other words, the three presidential debates, the release of Trump’s tax records, the sexual assault tape, the confirmation of many actual groping cases, and the VEEP debate, may have had some short term effects on the polls, and if you look closely and squint, may have actually re-widened Clinton’s lead to post convention levels a bit, but for the most part, we are looking at a pretty steady relationship between the two candidates from the end of the convention period to the present.

When the general polls conform to expectations, they matter. When they don’t conform to expectations, say “yeah, but what really matters is the electoral college, and in the electoral college … bla bla bla.”

And yes, since we attempt to choose our president using the Electoral College (though that doesn’t always work) that is what matters, and it may be the case, though I can not independently confirm this at this exact moment in time (Tuesday AM), that Clinton is either taking or widening the lead in some of the swing states, and some red states are turning less red, as we speak. But, it turns out that we DO look at the general numbers for a number of reasons, including the fact that we expect general trends to conform to statewide trends, as a check on what we are seeing, and general trends may matter down ballot.

The original reason that I wrote this post is that I was concerned that a lot of commenters (and maybe voters) had come to the conclusion that Clinton’s lead was growing, nearing or in the double digit range, and that the Clinton campaign need not look back, and could start doing other things, but, my read on the polls was that the debate/scandal swing looked like earlier swings, and I had little faith that it was long lasting. I took a look at the data and saw preliminary information suggesting that this may be the case. And now, that is confirmed. I conclude for now that the three presidential debates, the release of Trump’s tax records, the sexual assault tape, the confirmation of many actual groping cases, and the VEEP debate, may have had some short term effects on the polls, and if you look closely and squint, may have actually re-widened Clinton’s lead to post convention levels a bit, but for the most part, we are looking at a pretty steady relationship between the two candidates from the end of the convention period to the present.

And yes, I said the part that the incredulous will ignore twice.

I may do another electoral projection to replace this one later today.

Original Post:
America. Democracy. Decency. Thoughtfulness. Everybody and every thing, it feels like.

Everyone is upset this morning about Trump’s comment that he will wait and see about the results before he accepts them. His comments are deplorable and astonishing, but I think they are also a distraction. If he ignores the results, it may be a bit messy but he will be ignored. A few militia groups will go and take over a Federal facility or two, but that will be managed. Unless the Congress gets on board with denying Clinton the presidency, nothing really bad will happen.

I’m more alarmed by all the comments he made in this debate, and previously, about how he would handle wars, the military, the economy, the law, the Supreme Court, trade, ethnic/race relations, and his comments about women (which continued last night). Those are all problems that will ruin us as a country if he wins, and that have damaged us as a country already even if he walks away from this race right now. I’m not all that worried about him having a tantrum if he loses.

And, of course, it is far more concerning that Trump might win the election than that he might lose and refuse to go quietly. This is because it is simply not the case that Hillary Clinton and the Democrats have this sewn up. Let me show you why.


This graph shows the daily averaged-out polls, all of them, as listed by RCP’s site, since July 1st (plotted against an x-axis of days before the election). There is a 3 day moving average imposed on this (a shorter moving average than usual, but this is an average of averages, and those averages are of polls taken over varying numbers of prior days, so we have plenty of helpful smoothosity on that curve).
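For anyone who wants to see what that smoothing step amounts to, here is a minimal sketch of a 3 day moving average. It assumes a trailing window (each day averaged with the two days before it), which may not be exactly how the graph was built, and the numbers in `daily_lead` are made up for illustration, not real polling data.

```python
def moving_average(values, window=3):
    """Trailing moving average; early points use as many values as are available."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)      # first index inside the window
        chunk = values[start:i + 1]         # up to `window` most recent values
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical daily averages of one candidate's lead, in points (not real data).
daily_lead = [4.0, 6.0, 5.0, 3.0, 1.0, 2.0]
print(moving_average(daily_lead))
```

A longer window smooths more but also lags more, which is why a short window is reasonable when the inputs are already averages.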

Never mind the details for a moment. Notice first that over this time, which starts in the month of the conventions and goes up to the present, there is an overall pattern of oscillation. For much of the time all of the polls are within the margin of error, but Clinton’s polls are usually higher than Trump’s, when averaged out. If you apply the FiveThirtyEight method, or use similar approaches, to combine the different polls into probability statements, you can be more definitive about Clinton’s overall and consistent lead since the conventions.

But, notice that about 50 days out, the two candidates’ polling became close before Clinton started to separate again, and also notice, that this cycle of Clinton pulling ahead and then drawing down again seems to be happening one more time. There was probably a lot of pressure separating Clinton and Trump, with Trump’s bizarre and generally poor performance in the debates, the revelation of the tape in which he seems to have no clue that sexual harassment is not OK, and the revelations seeming to confirm that he is a serial sexual molester, and the tax story from the NYT, and all of that. But then, about 27 days out, that pressure relaxes, and all the numbers regress towards the mean again.

Let me put this another way, as a stark but supportable hypothesis. About 50% of the United States would vote for Trump, and about 50% would vote for Clinton. People talk about the 35% to 40% Trump base, and that’s real. And Clinton has a similar base. But the rest of the country, the 20% to 30% that are not part of those groups, are divided roughly in half, in terms of preference for either candidate, and their preference is soft.

If there are no more strong events pushing people away from Trump, the numbers will settle down to where they were between days 40 and 50. This will place Trump within about one point of Clinton. And, one point is very very close.

The current widespread rhetoric that Clinton is going to win no matter what may be the exact cause of her losing. How many people will not bother to vote, when they otherwise might have, because they are confident that Clinton will win? If the two candidates are 1% apart, then only 1 in 200 voters have to do that to put Trump in the White House.
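A rough sketch of that arithmetic, assuming a strict two-way race with Clinton at 50.5% and Trump at 49.5% (illustrative numbers, not from any poll). The function names are hypothetical. It separates two cases: voters switching sides, where one voter in 200 is enough to erase a one-point gap, and Clinton voters simply staying home, where the break-even point is closer to one in 100 of all voters. Either way, the margin is tiny.

```python
from fractions import Fraction


def switch_fraction(clinton, trump):
    """Share of ALL voters who must switch from Clinton to Trump to tie the race."""
    return (clinton - trump) / 2


def stay_home_fraction(clinton, trump):
    """Share of ALL voters (Clinton supporters) who must stay home to tie the race."""
    return clinton - trump


# Assumed two-way race with a one-point gap (not real poll results).
clinton = Fraction(505, 1000)
trump = Fraction(495, 1000)

print("switching sides:", switch_fraction(clinton, trump))
print("staying home:", stay_home_fraction(clinton, trump))
print("staying home, as a share of Clinton voters:",
      stay_home_fraction(clinton, trump) / clinton)
```

Exact fractions are used instead of floats so the "one in N" figures come out cleanly.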

Let me note what may end up being the greatest situational irony of our times. MSNBC has lots of great commentators and reporters, like Rachel Maddow and Chris Hayes. They are providing the most thoughtful and coherent analyses of what is going on during this election cycle. But, they are also constantly repeating and supporting the rhetoric that Trump can’t win. And, their audience corresponds closely to that subset of people who are going to vote for Clinton.


Unless MSNBC and other sources fail to shut up about how Clinton can’t possibly lose, and one in 200 otherwise-Clinton-voters stay home.

There are, of course, other possibilities. The apparent closing of the gap we see on the above chart could be an artifact of polling and disappear by itself over the next 48 hours, or it could be real, but reverse because of something Trump does. However, keep this in mind: Trump is being such a distraction from the race that a lot of information that could be used against Clinton (legitimately or not) is currently piling up and not coming into play. It is quite possible that forces that work to push Trump down on this graph could be weak, and forces that work to push Clinton down on this graph could be strong, and we might not be looking at a dangerously weak 1% lead by Clinton when the first week of November rolls around. We may be looking at a distinct Trump lead.

I should mention that today’s polls are not shown on this graph because they are mostly not available. Those that are available are in that subset that tends to favor Trump, but they are all showing a virtual dead heat.

Today, tomorrow, through Monday, we should be looking very closely at the polls. If they show narrowing, then my Hypothesis from Hell can’t be ruled out and the idea that the race is really about 50-50 between scandals needs to be taken seriously.

Whom Should I Vote For: Clinton or Sanders?

You may be asking yourself the same question, especially if, like me, you vote on Tuesday, March 1st.

For some of us, a related question is which of the two is likely to win the nomination.

If one of the two is highly likely to win the nomination, then it may be smart to vote for that candidate in order to add to the momentum effect and, frankly, to end the internecine fighting and eating of young within the party sooner. If, however, one of the two is only somewhat likely to win the nomination, and your preference is for the one slightly more likely to lose, then you had better vote for the projected loser so they become the winner!

National polls of who is ahead have been unreliable, and also, relying on those polls obviates the democratic process, so they should be considered but not used to drive one’s choice. However, a number of primaries have already happened, so there is some information from those contests to help estimate what might happen in the future. On the other hand, there have been only a few primaries so far. Making a choice based wholly or in part on who is likely to win is better left until after Super Tuesday, when there will be more data. But, circling back to the original question, that does not help those of us voting in two days, does it?

Let’s look at the primaries so far.

Overall, Sanders has done better than polls might have suggested weeks before the primaries started. This tells us that his insurgency is valid and should be paid attention to.

There has been a lot of talk about which candidate is electable vs. not, and about theoretical match-ups with Trump or other GOP candidates. If you look at ALL the match-ups, instead of the one cherry-picked match-up that a supporter of one candidate or the other might choose, both candidates do OK against the GOP. Also, such early theoretical match-ups are probably very unreliable. So, best to ignore them.

Iowa told us that the two candidates are roughly matched.

New Hampshire confirmed that the two candidates are roughly matched, given that Sanders has a partial “favorite son” effect going in the Granite State.

Nevada confirmed, again, that the two candidates are roughly matched, because the difference wasn’t great between the two.

So far, given those three races, in combination with exit polls, we can surmise that among White voters, the two candidates are roughly matched, but with Sanders doing better with younger voters, and Clinton doing better with older voters.

The good news for Sanders about younger voters is that he is bringing people into the process, which means more voters, and that is good. The bad news is two part: 1) Younger voters are unreliable. They were supposed to elect Kerry, but never showed up, for example; and 2) Some (a small number, I hope) of Sanders’ younger voters claim that they will abandon the race, or the Democrats, if their candidate does not win, write in Sanders, vote for Trump, or some other idiotic thing. So, if Clinton ends up being the nominee, thanks Bernie, but really, no thanks.

Then came South Carolina. Before South Carolina, we knew that there were two likely outcomes down the road starting with this first southern state. One is that expectations surrounding Clinton’s campaign would be confirmed, and she would do about 70-30 among African American voters, which in the end would give her a likely win in the primary. The other possibility is that Sanders would close this ethnic gap, which, given his support among men and white voters, could allow him to win the primary.

What happened in South Carolina is that Clinton did way better than even those optimistic predictions suggested. This is not good for Sanders.

Some have claimed that South Carolina was an aberration. But, that claim is being made only by Sanders supporters, and only after the fact. Also, the claim is largely bogus because it suggests that Democratic and especially African American Democratic voters are somehow conservative southern yahoos, and that is why they voted so heavily in favor of Clinton. But really, there is no reason to suggest that Democratic African American voters aren’t reasonably well represented by South Carolina.

In addition to that, polling for other southern states conforms pretty closely to expectations based on the actual results for South Carolina.

I developed an ethnic-based model for the Democratic primary (see this for an earlier version). The idea of the model is simple. Most of the variation we will ultimately observe among the states in voting patterns for the two candidates will be explained by the ethnic mix in each state. This is certainly an oversimplification, but has a good chance of working given that before breaking out voters by ethnicity, we are subsetting them by party affiliation. So this is not how White, Black and Hispanic people will vote across the states, but rather, how White, Black and Hispanic Democrats will vote across the states. I’m pretty confident that this is a useful model.

My model has two versions (chosen by me, there could be many other versions), one giving Sanders’ strategy a nod by having him do 10% better among white voters, but only 60-40 among non-white voters. The Clinton-favored strategy gives Clinton 50-50 among white voters, and a strong advantage among African American voters, based on South Carolina’s results and polling, of 86-14%. Clinton also has a small advantage among Hispanic voters (based mainly on polls) with a 57:43% mix.
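The mechanics of a model like this can be sketched in a few lines. This is a minimal illustration, not the actual spreadsheet behind the post. The Clinton-favored shares are the ones quoted above. The Sanders-favored shares are one possible reading of "10% better among white voters" (taken here to mean 55-45 for Sanders) and are an assumption. The state mix, the delegate count, and the simple proportional rounding are all hypothetical.

```python
# Clinton's assumed share of the vote within each group of Democratic voters.
CLINTON_FAVORED = {"white": 0.50, "black": 0.86, "hispanic": 0.57}  # from the post
SANDERS_FAVORED = {"white": 0.45, "black": 0.60, "hispanic": 0.60}  # assumed reading


def clinton_share(mix, version):
    """Weighted average of Clinton's within-group share over the electorate mix."""
    total = sum(mix.values())
    return sum(mix[group] * version[group] for group in mix) / total


def split_delegates(delegates, share):
    """Naive proportional split (real allocation rules are more complicated)."""
    clinton = round(delegates * share)
    return clinton, delegates - clinton


# Hypothetical state: the mix and delegate count are invented for illustration.
example_mix = {"white": 0.60, "black": 0.30, "hispanic": 0.10}
for name, version in [("Clinton-favored", CLINTON_FAVORED),
                      ("Sanders-favored", SANDERS_FAVORED)]:
    share = clinton_share(example_mix, version)
    print(name, round(share, 3), split_delegates(60, share))
```

Swapping in a different state is just a matter of changing `example_mix` and the delegate count, which is what makes a model this simple easy to recalibrate after each round of primaries.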

These are the numbers I’ve settled on today, after South Carolina. But, I will adjust these numbers after Super Tuesday, and at that point, I’ll have some real confidence in the model. But, at the moment, the model seems to be potentially useful, and I’ll be happy to tell you why.

First, let us dispose of some of the circular logic. Given both polls and South Carolina’s results, the model, based partly on South Carolina, predicts South Carolina pretty well using the Clinton-favored version (not the Sanders-favored version), with a predicted vs. actual delegate outcome of 34:19 vs. 39:14. This is obviously not an independent prediction, but rather a calibration. The Sanders-favored model predicts a nearly even outcome of 27:26.

The following table shows the likely results for the Clinton-favored and Sanders-favored models in each state having a primary on Tuesday.
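[Table image: Clinton-favored and Sanders-favored model results for each Super Tuesday state, with polling estimates in the two right-hand columns]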
The two columns on the right are estimates from polling where available. This is highly variable in quality and should be used cautiously. I highlighted the Clinton- or Sanders-favored model that most closely matches the polling. The matches are generally very close. This strongly suggests that the Clinton-favored version of the model essentially works, even given the limited information, and simplicity of the model.

Please note that in both the Clinton- and Sanders-favored models, Clinton wins the day on Tuesday, but only barely for the Sanders-favored model (note that territories are not considered here).

I applied the same model over the entire primary season (states only) to produce two graphs, shown below.

The Clinton-favored model has Clinton pulling ahead in committed delegates (I ignore Super Delegates, who are not committed) on Tuesday, then widening her lead over time, winning handily. The Sanders-favored model projects a horserace, where the two candidates are ridiculously close for the entire primary season.


So, who am I going to vote for?

I like both candidates. The current model suggests I should vote for Clinton because she is going to pull ahead, and it is better to vote for the likely winner, since I like them both, so that person gets more momentum (a tiny fraction of momentum, given one vote, but still…). On the other hand, a Sanders insurgency would be revolutionary and change the world in interesting ways, and for that to happen, Sanders needs as many votes on Tuesday as possible.

It is quite possible, then, that I’ll vote for Sanders, then work hard for Hillary if Super Tuesday confirms the Clinton-favored model. That is how I am leaning now, having made that decision while typing the first few words of this very paragraph.

Or I could change my mind.

Either way, I want to see people stop being so mean to the candidate they are not supporting. That is only going to hurt, and be a regrettable decision, if your candidate is not the chosen one. Also, you are annoying the heck out of everyone else. So just stop, OK?