WISCONSIN PRIMARY: FINAL FORECAST

Hello everyone,

Bernie Sanders seems to have maintained about the same projected lead as indicated in my previous post. Recent polling, the Benchmark Politics’ benchmark, and the FiveThirtyEight projection seem to corroborate this. Here is my final estimate:

Screen Shot 2016-04-04 at 6.21.48 PM

This is a particularly strong number, because Hillary Clinton has generally done much better in the open primary format, with the exception of Vermont and Michigan (though Michigan may as well have been a tie). Still, the demographics of Wisconsin favor Bernie more than Hillary: the population is 83.3% non-Hispanic White and 6.3% Black. If the above numbers are accurate, this would produce a delegate allocation of 35 for Clinton and 51 for Sanders.
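For readers curious how a vote share turns into a delegate count, here is a minimal sketch of a proportional split using the largest-remainder method. It is a simplification and not necessarily how the numbers above were produced: it treats the state as one pool of 86 delegates (35 + 51), whereas the real rules allocate by congressional district and statewide pools with a viability threshold. The function name `allocate_delegates` and the 59/41 split are hypothetical, chosen only for illustration.

```python
from math import floor


def allocate_delegates(vote_shares, total_delegates):
    """Split delegates proportionally using the largest-remainder method.

    vote_shares: dict mapping candidate -> share of the two-way vote.
    Simplification (an assumption of this sketch): the whole state is one
    pool, with no district-level allocation and no viability threshold.
    """
    quotas = {c: s * total_delegates for c, s in vote_shares.items()}
    seats = {c: floor(q) for c, q in quotas.items()}
    leftover = total_delegates - sum(seats.values())
    # Hand any remaining delegates to the largest fractional remainders.
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - seats[c], reverse=True)
    for c in by_remainder[:leftover]:
        seats[c] += 1
    return seats


# Hypothetical 59/41 split, used only to illustrate the arithmetic.
print(allocate_delegates({"Sanders": 0.59, "Clinton": 0.41}, 86))
```

Under that hypothetical split the quotas are 50.74 and 35.26, so the one leftover delegate goes to the larger remainder.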

However, it has been reported that a record-breaking number of early votes have been cast in Wisconsin over the last two weeks. Early voting has strongly favored Hillary in previous contests, and it stands to reason that this will likely hold true in Wisconsin as well. For this reason I expect Hillary’s vote share to be slightly higher than the above number (10:11 PM edit: A recent Emerson poll shows Clinton trailing Sanders among early voters, 38% to 52%, which seems to indicate the opposite is true).

-Tyler

APRIL DEMOCRATIC PRIMARIES: OUTLOOK

Hello everyone. I’ve been receiving a lot of requests to publish some early numbers for the April states, so I’ve put some preliminary numbers together for you.

I will continue updating this post as new data comes in.

Here are the current projections:

Screen Shot 2016-03-30 at 7.56.02 PM

Producing the following delegate allocation:

Screen Shot 2016-03-30 at 7.55.26 PM

  • Wisconsin: Bernie should do well here, though I’m not sure that he will do as well as the above numbers indicate. He has a significant presence on social media, and the demographics favor him. Wisconsin is an open primary, however, and the crossover anti-Trump votes cast by Democrats or Independents who would otherwise have supported Bernie will be damaging to him. This effect is accounted for in the above numbers, though.
  • Wyoming: Bernie will win Wyoming by a margin somewhere between 25 and 60 points. Wyoming is a caucus and is only 0.8% African American.
  • New York: Hillary will do very well here. She has a massive social media presence among New Yorkers, and the state has a slightly larger than average percentage of African Americans. New York also has a closed primary.
  • Connecticut: This state is a toss up at the moment. Sanders has a fair social media presence here, but Connecticut has a closed primary. He has lost every fully closed primary (not semi-closed) thus far.
  • Delaware: Hillary should win Delaware by 10-40 points. This is because of the closed primary format, as well as the 21.4% African American population.
  • Maryland: Hillary will, more than likely, win Maryland by the biggest margin of any of the April primaries. This is because of the 29.8% African American population (more than Alabama, and effectively the same as Louisiana) and the closed primary format.
  • Pennsylvania: Because it is still several weeks until Pennsylvania votes, I can still see this one going either way, though it is clearly leaning Hillary. Pennsylvania has a closed primary as well, though Bernie has a decent social media presence in the state.
  • Rhode Island: It will be a while before Rhode Islanders vote, but it is currently leaning Bernie. Rhode Island has a semi-closed primary, which Sanders has done relatively well in so far (New Hampshire, Massachusetts, Oklahoma, North Carolina) compared to closed primaries. Rhode Island is 5.7% African American, but Bernie has only an average social media presence in the state. I would classify Rhode Island as a toss up.

As you can see, even if Bernie does remarkably well in Wisconsin, Wyoming, and Rhode Island, the delegate deficit he will incur in Maryland alone will more than wipe out those surpluses. Hopefully the Sanders campaign works intensively in New York, Pennsylvania, and Maryland to try to control the damage. Bernie’s campaign may, in fact, be mathematically better off forgetting Wyoming and Rhode Island altogether if (for example) a couple of points of over-performance in New York offsets twenty delegates’ worth of deficit he would otherwise have incurred, though outright wins are without a doubt important.

As always, thank you everyone for the interest. I am truly honored that so many people enjoy and look forward to my work.

-Tyler

DEMOCRATIC PRIMARY PROJECTIONS: ALASKA, HAWAII, WASHINGTON

Let me first address the elephant in the room.

Arizona was a catastrophe. Thankfully, the controversy has picked up enough media attention that many of you already know what happened. For those of you who don’t, this article touches on some of the issues, though I don’t agree with everything that the author says. I have been aware of several instances of election fraud (though these were through manipulation of votes on electronic voting machines) in this election cycle already through the incredible work of a statistician named Beth Clarkson, but have largely remained silent on the issue because the instances thus far haven’t altered the results so much that the candidate who should’ve won lost. Not to mention, anyone who speaks out against perceived electoral injustices is immediately deemed a sore loser and totally discredited.

I encourage you to read through Beth’s work. She has received a great deal of media attention over the past couple of years and is actively working to improve the electoral process. I know many of you will disagree, but I stand by my Arizona projection and believe that if the election had been conducted in a normal, reasonable way, Hillary would’ve lost or come very close to losing. I have honestly lost a lot of sleep over this, and I can only hope that none of us witness anything like that again. Like many of you, I just want a fair election.

Now, for the elections today. Here are the numbers:

Screen Shot 2016-03-26 at 12.48.37 AM

Bernie Sanders should win Alaska, Hawaii, and Washington, largely for three reasons:

  • Extremely low populations of African Americans, 1.6-3.6%, among the nation’s lowest
  • All three are caucuses
  • Hillary Clinton has an unusually low proportion of Facebook likes in all three states, 17-19%, which is among her worst

With all this being said, there is once again the question of how a particular ethnic group will vote, but this time it is with respect to Hawaii. Hawaii has a large population of Asians, native Hawaiians, and Pacific Islanders, unlike any state we have seen thus far. These groups could be predisposed to favor Hillary Clinton, but the null hypothesis, which I currently cannot reject, is that they aren’t. I have tested the effect of Asian population size on previous results specifically for the sake of Hawaii after a friend suggested that I do, but it was nowhere near statistically significant, with a p-value of ~0.8, and the coefficient for Bernie vote share was actually positive at that. Regardless, Hillary Clinton won the Northern Mariana Islands as well as American Samoa, so perhaps it is the case that in locales with Asian majorities, the dynamic changes. Hawaii is a politically unique state in many other ways, so it will be interesting to see if this estimate holds true.
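To give a feel for what a significance check like this involves, here is a rough sketch of a permutation test on the correlation between a demographic share and a vote share. This is a stand-in technique, not the regression coefficient test described above, and every number in it is made up for illustration; the function names `pearson_r` and `permutation_p_value` are hypothetical.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided p-value: how often a shuffled pairing is at least as
    correlated (in absolute value) as the observed pairing."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up illustrative data: % Asian population and Sanders vote share.
pct_asian = [1.2, 2.5, 3.1, 4.8, 5.5, 7.9, 9.0, 14.4]
sanders_share = [48.0, 61.0, 39.0, 52.0, 44.0, 57.0, 41.0, 50.0]
print(permutation_p_value(pct_asian, sanders_share))
```

A large p-value from a check like this means the data give no grounds to reject the null hypothesis of no relationship; it does not prove that no relationship exists.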

Also, I want to sincerely thank everyone for the outpouring of support. I received countless emails and messages after Tuesday’s elections, even immediately after the initial Arizona results made me look like a complete moron. To all of you that I haven’t yet been able to respond to personally, I apologize for the delay, but I will get to you! I have no agenda, and I’m not doing anything remarkable, though I’m flattered by those that suggest as much. I just want to perform solid regression analysis and statistical work to give you all the most accurate electoral projections (without using polls!).

-Tyler

 

 

FINAL DEMOCRATIC PRIMARY PROJECTIONS: ARIZONA, IDAHO, UTAH

Sanders search interest has fallen dramatically in Arizona over the past two days, and it remains to be seen if this will have a significant impact on the results tonight, but this same rapid downward search trend happened in Minnesota and did not ultimately change anything. Meanwhile, search interest for Bernie in Idaho and Utah is through the roof. Here are my final estimates for tonight:

Screen Shot 2016-03-22 at 12.03.04 PM

Hillary’s greatest advantage at this time is likely all of the early ballots that have been cast in Arizona. Other states have shown us that residents who are proactive enough to cast early ballots seem to vote disproportionately for Hillary Clinton (older people, of course). Who knows if this trend will hold true in Arizona, though I imagine it will.

Here are some charts to demonstrate a few relationships between variables.

In all charts, the Y-axis is the %Vote Share of Bernie Sanders.

 

Screen Shot 2016-03-22 at 12.17.00 PM

The chart looking at Facebook like proportions should demonstrate that Bernie’s current “polling average” of ~23% in Arizona is not reflective of reality. Bernie almost has to land somewhere between 45% and 63% because this is such a strongly correlated variable.

Screen Shot 2016-03-22 at 12.14.28 PM

Hopefully this convinces at least a few people that what I am proposing with Arizona is not in any way a radical idea. As you can infer from this chart, Hispanics in America don’t tend, in general, to favor either Hillary or Bernie. There is actually almost no correlation at all.

Screen Shot 2016-03-22 at 12.11.13 PM

As for the %Black variable, as you can see in that chart, I am actually estimating that Bernie will underperform relative to it. Bernie almost has to land somewhere in the 48% to 75% interval because this variable is also so strongly correlated with vote share.

Thanks for all of the interest,

Tyler

DEMOCRATIC PRIMARY PREDICTIONS: ARIZONA, IDAHO, UTAH

I know the numbers I am posting today will look especially suspicious to those who have accused me of manipulating my model for the sake of increasing Bernie’s projected vote share. For this reason, I will also be sharing a screenshot of the model fit to previous results to demonstrate that even after correcting for many different factors, even when the model has adjusted to fit last Tuesday’s results, it is still projecting Bernie wins on Tuesday.

There remains one lurking question in my mind, however, and that is the question of how Arizonan Hispanics will vote; and if they are inherently more likely to vote for one candidate over the other. BenchmarkPolitics believes that Hispanics are far more predisposed to voting for Clinton over Sanders, but as much as I have tried to prove this within all of my own data, I just cannot get this result. Clinton has won a few states with a large Hispanic population, yes, but after I control for other factors (primarily Facebook presence which is the primary driver of my model), there is no negative correlation whatsoever between Hispanics and Bernie vote share. I have tried and tried to prove myself wrong here, but the numbers just don’t agree with that assessment. There are a few reasonable arguments to be made why Bernie Sanders will win Arizona:

  • Arizona has one of the lowest African American populations of any state in the country, 4.1%, which is half that of Nevada (8.1%), and roughly a third that of Texas (11.8%).
  • AZ has about 4 percentage points more Non-Hispanic Whites (57.8%) than Nevada (54.1%), and more than 12 points more than Texas (45.3%).
  • Bernie has about 4 percentage points more of the Facebook likes among the Democratic candidates in Arizona (76.6%) than he did in Nevada (72.7%), and 10 points more than in Texas (66.6%). This is almost as much as he had in Kansas (78%) and more than he had in Massachusetts (74.5%) and Oklahoma (75%).
  • Arizona is also a closed primary, just like Massachusetts and Oklahoma, which doesn’t help Hillary Clinton as much* as open primaries do (edit: for the reason outlined in my previous post that I made a few days ago).
  • Arizona is also a younger state, with a median age of 36.9, which is to Hillary’s detriment.
  • Bernie, at this time, has 1.8 times Hillary’s relative search interest on Google (a three-day average). This is among the highest relative interest measures he has achieved in any state so far. It is slightly greater than in Colorado (1.79), and well above Minnesota (1.55), Nevada (1.51), Texas (1.32), and many other states.

Regardless of all of this, Hispanics will decide the Arizona primary. I don’t know how they will vote, but after sifting through and testing all of this data over and over again, I have zero reason to believe they will inherently favor Hillary. If we assume that Hispanics have no built-in preference for either Hillary or Bernie, here are the projections for Tuesday:

Screen Shot 2016-03-20 at 6.44.59 PM

I realize this seems ridiculous, but the regression model I have simply will not produce a Hillary victory in Arizona. I have spent a great deal of time trying to challenge this result in the data, but this is all I get. If you are dissatisfied with this, think I’m a Bernie shill, or believe that I am purposefully inducing this result, that’s not true, and I don’t know what else to tell you. Believe it or don’t. Unofficially, I don’t believe that Sanders will win by more than 10%, but I’m not going to throw a number to you folks based on a gut feeling.

I expect a large loss for Hillary in Idaho and Utah. As far as I know this is relatively non-controversial and other outlets are expecting the same. This is due in large part to the overwhelmingly large white populations, Bernie’s massive Facebook presence from users in those states, and the open caucus format which has hurt Hillary in the past.

To demonstrate that I got these results from the same model that (now) fits last Tuesday’s results, the following is the model fit to all previous results. This model has an r^2 of 0.9701. These ARE NOT the projections that I posted here; this is what the model estimates retrospectively, knowing what it knows now:

Screen Shot 2016-03-20 at 6.46.35 PM
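For anyone unfamiliar with r^2, here is a small sketch of how the statistic is computed from actual and fitted values. The four data points are made up for illustration and have nothing to do with the model's real inputs; `r_squared` is a hypothetical helper name.

```python
def r_squared(actual, fitted):
    """Share of the variance in `actual` that the fitted values explain."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))  # unexplained
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)        # total
    return 1.0 - ss_res / ss_tot


# Made-up vote shares (percent) and retrospective fits, illustration only.
actual = [40.0, 55.0, 62.0, 48.0]
fitted = [42.0, 54.0, 60.0, 49.0]
print(round(r_squared(actual, fitted), 4))
```

Keep in mind that an in-sample r^2 describes how well a model fits results it has already seen, which is a different question from how well it will project new contests.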

Thanks for your support everyone, tweet at me or email me with any questions.

-Tyler

WHAT HAPPENED LAST TUESDAY?

Though I seek to be accurate with margins of victory and loss with the projections I post here, even more important than that are the predictions of whether a candidate will win or lose a contest. As many of you already know, I got two consequential calls wrong last Tuesday, and missed two more by significant amounts. Hillary Clinton won Missouri by 0.2%, and won Illinois by 1.6%; both very small margins. Though numerically I missed the win/loss in these states by 0.2% and 1.6%, I fully recognize that the difference is night and day. This is why I started over, from scratch, and have spent the last two days building a more robust and comprehensive model that can account for factors that I had previously thought were indirectly contained within the variables I was using.

  • Why did Bernie under-perform my estimates in almost every state Tuesday? Was it coincidence or a systemic mathematical bias of my model?

I believe it was more coincidence than mathematical bias, though I will concede both to some degree. I do want to make it clear that there was no intentional bias (I have been accused numerous times of inflating Bernie’s numbers for some imaginary reason), but rather the structure of the model itself created a mathematical bias in four of these last five elections. I say it was coincidental because the factors that allow this bias to show appeared disproportionately in most of Tuesday’s states, particularly states with an open primary.

Illinois, Missouri, and Ohio all have open primaries. Up until this point, the open primary was not a statistically significant driver of results for either candidate, and therefore was not included in my model. However, over this past month, more and more Democrats (apparently a disproportionate number of Sanders rather than Clinton supporters) have been requesting Republican ballots in open primaries to cast anti-Trump votes. They seem to harbor more disdain for Donald Trump than support for Bernie Sanders. I was able to isolate this effect and subsequently include it in the new model by interacting the amount of Trump support on social media in a state with a binary variable that defines whether the state has an open primary or not. This is a powerful variable, because it accounts for the scale of anti-Trump sentiment. In states that have more Trump support, more Democrats will cast anti-Trump votes, disproportionately helping Hillary Clinton. This happened to a substantial extent in Illinois, Missouri, and Ohio.
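As a rough sketch of what interacting those two variables means in practice: the Trump-support measure is multiplied by a 0/1 open-primary flag, so it only contributes in open-primary states and scales with the amount of Trump support there. The state names, values, and the helper name `interaction_feature` below are hypothetical, and this shows only how the feature is built, not how the model is fit.

```python
def interaction_feature(trump_support, is_open_primary):
    """Trump social-media support, counted only where the primary is open."""
    return trump_support * (1.0 if is_open_primary else 0.0)


# Hypothetical states and values, for illustration only.
states = [
    {"name": "State A", "trump_support": 0.40, "open_primary": True},
    {"name": "State B", "trump_support": 0.40, "open_primary": False},
    {"name": "State C", "trump_support": 0.25, "open_primary": True},
]

for s in states:
    # This column would sit alongside the other regressors in the model.
    s["trump_x_open"] = interaction_feature(s["trump_support"], s["open_primary"])
    print(s["name"], s["trump_x_open"])
```

States A and B have identical Trump support, but only State A, with its open primary, gets a nonzero value for the interaction column.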

I am also now factoring in the median age of the state in question. Though Sanders has won some “older” states like Maine, New Hampshire, and Vermont, he does better overall in “younger” states, statistically speaking. Florida and Ohio are both older states, with median ages of 41.6 and 39.4, respectively. This is now being accounted for and will help produce more accurate results.

I have heard the claim many times that northerners and southerners, and particularly minorities, just vote differently from an ideological perspective. I don’t disagree, but I had previously believed that this bias was contained in the social media data that I was using. I have been experimenting with including a variable to track whether a state is in the “Deep South,” and as it turns out, this variable is statistically significant. In my opinion, this is the primary reason that Hillary Clinton performed so much better than my expectations in Florida. Even accounting for so many different things, people that reside in an area that possesses a southern culture will simply vote for a more conservative candidate.

I am happy for the opportunity to refine the model in so many different ways. This is, at its very core, an experiment to determine whether it is possible to model primary elections without the aid of public polling. I have a renewed confidence in the projections for the next few weeks, and look forward to determining once and for all which candidate Hispanics prefer with the Arizona contest next week.

-Tyler

 

DEMOCRATIC PRIMARY PROJECTIONS: SUPER TUESDAY 2

There is a non-zero chance that Hillary Clinton will have a bad day tomorrow.

My model is estimating two Sanders wins on Tuesday, in Missouri and Illinois. However, Illinois and Ohio are both effectively coin flips with such thin margins between victory and defeat (if you recall, I put Bernie at 53.48% in Michigan and he won by less than 1%, though my model should be more accurate now). It is also estimating two wide victories for Hillary in North Carolina and Florida, which is and has been expected. Here are tomorrow’s projections:

Screen Shot 2016-03-14 at 11.37.02 PM

A lone Bernie win in Missouri is unlikely to lead to any permanent change in the perception of Hillary as the candidate destined to win the nomination. Two upsets will likely change the narrative of the presidential race, and bolster Bernie’s image as a threat to the prospect of Hillary being the Democratic nominee. Three upsets tomorrow will likely transform Bernie from “challenger” status to “probable nominee” status, and I say this because early numbers indicate to me that Bernie will win (at least) the next eight states in a row, all the way until April 19th. If Sanders wins three states tomorrow, this means that in mid-April he will be able to say that he has won eleven of the last thirteen state primaries. That’s some serious momentum.

I’ve also been putting together a GOP model over the past week. Though the model seems to fit previous elections extremely well, the GOP elections are just far too volatile for me to have much confidence in the numbers. Regardless, it is estimating at least two upsets tomorrow, in Florida and North Carolina. If it turns out to be acceptably accurate, I will begin posting projections for the GOP as well.

-Tyler

SUPER TUESDAY 2: PRELIMINARY DEMOCRATIC PRIMARY PROJECTIONS

I received countless emails from all over the world expressing support after my very questionable Michigan projection turned out to be the only one that was correct this past week. To all of you that I haven’t yet personally responded to, thank you so much for the interest.

There is a great deal of uncertainty surrounding Super Tuesday 2. With three states having fairly even odds between Bernie and Hillary in the betting markets, it is not immediately clear who will emerge victorious in Illinois, Missouri, and Ohio. Today I am posting this to essentially echo that sentiment of uncertainty, because these are three remarkably close races. Florida and North Carolina will go to Hillary on Tuesday unless something catastrophic happens to her campaign.

Screen Shot 2016-03-12 at 8.40.58 PM

Illinois is, in my opinion, going to be the most interesting to watch. We know that politicians almost always get a bonus in their home state for obvious reasons, just as Bernie received in Vermont and Ted Cruz received in Texas. But what about Hillary? Where does she have the strongest ties? She grew up in Illinois, went to college and law school in Massachusetts and Connecticut, lived and served as First Lady in Arkansas, and was elected Senator for the state of New York. According to my calculations, she did get a bonus in Arkansas that can be attributed to her history with the state, but received no such bonus in Massachusetts. The question is, will she receive another “home state bonus” in the state of Illinois in addition to the bonus she already received in Arkansas? This is something that I genuinely don’t know, but if she does, I doubt it will be a significant number because of Bernie’s historical ties to Illinois. Furthermore, we are unable to even look to 2008 to make a better guess, because her most significant opponent was also from Illinois, Barack Obama.

If Bernie Sanders can maintain pressure in the state of Missouri, he should win it. He had one event in Springfield today, and has events in St. Louis tomorrow and Monday, which will more than likely be enough to secure him a victory there.

Ohio is where Bernie must focus his energy if he wants to continue shifting the narrative of the presidential race (one win in Missouri won’t be enough) and build on his success from Michigan. This seems like what his campaign is trying to do, with events in different Ohioan cities over the next three days. Whether he will be able to get two points out of that remains to be seen, but if the outreach effort in that state is anything close to what happened in Michigan, and if Hillary focuses her efforts primarily in Illinois at the expense of Ohio, he may win.

-Tyler

MICHIGAN, MISSISSIPPI DEMOCRATIC PRIMARY PROJECTIONS

It’s a bit unsettling to go against the grain with this forecast. As far as I know, every outlet is projecting a Clinton win tomorrow in both Michigan and Mississippi.

The Sanders campaign must be doing something remarkable in Michigan right now, because the upswing in Sanders’s popularity among my data sources is undeniable. I am seeing levels of interest in Bernie Sanders in Michigan similar to those in Colorado, Oklahoma, Kansas, and Nebraska. This, along with Michigan’s relatively normal demographic makeup, leads me to personally believe that he does have a chance. It leads my model to estimate that he will win there. Hillary leads every conventional poll, however, which makes me skeptical of these numbers.

Bernie Sanders will be lucky to get above 20% in Mississippi, but I do believe that if he doesn’t win Michigan, the final results will be very close. Here are the numbers:

Screen Shot 2016-03-07 at 11.35.52 PM

My official prediction is that Bernie will win Michigan and Hillary will win Mississippi, but in reality Michigan is too close to call with a mathematical model. Elections culminate in a single number after the movement of hundreds or thousands of variables, and as statisticians we can only select a few of those and hope that we account for as much variance as possible. Given the outcome of all the other elections so far this season, the positions of those variables right now in Michigan seem to indicate that a massive upset will happen tomorrow night.

-Tyler

KANSAS, LOUISIANA, NEBRASKA, MAINE DEMOCRATIC PRIMARY/CAUCUS PROJECTIONS

After projecting one incorrect result on Super Tuesday, in the state of Minnesota, I was able to refine my forecasting models further. There does seem to be some variability in these outcomes that I am currently unable to account for, e.g. if the models predict a win in Minnesota, a win in Iowa also should have happened. Iowa could without a doubt be a special case for our purposes, as it was indeed the first state to hold a Democratic Caucus, both candidates had been campaigning there relentlessly for months, and so on. Therefore, it stands to reason that, looking backwards, perhaps it was Iowa that was the anomaly, and not Minnesota. Sanders outperformed my estimates in every state except Texas, Alabama, and Arkansas, states that of course have in common being southern and having larger minority populations.

Fortunately, these next four states seem to be firmly in one or the other candidate’s favor. Here are the new projections:

Screen Shot 2016-03-04 at 9.35.51 PM

Hillary Clinton will win Louisiana by a significant margin, but the subtle and interesting characteristic of this estimate (as my colleague Matt pointed out to me) is that the estimated margin of victory seems to be smaller than in other similar states. For instance, Georgia, Alabama, and South Carolina are almost identical in demographic makeup to Louisiana, yet Clinton won all those states with greater than 70% of the vote. This could signal that Bernie Sanders is becoming increasingly popular with the minority community.

Bernie Sanders is projected to win the other three states, Kansas, Nebraska, and Maine. Though these states have relatively few delegates up for grabs, this will still be a victory for his campaign insofar as it should create some positive momentum after he lost the majority of the Super Tuesday states. Honestly, I expected the estimates for Kansas and Nebraska to signal a more hotly contested race, but the data from the past three days shows that the residents of these states are certainly feeling the Bern.

Special thanks to Andrew, Phil, and Matt for their collaboration and thoughts.

-Tyler