DEMOCRATIC PRIMARY PREDICTIONS: ARIZONA, IDAHO, UTAH

I know the numbers I am posting today will look especially suspicious to those who have accused me of manipulating my model for the sake of increasing Bernie’s projected vote share. For this reason, I will also be sharing a screenshot of the model fit to previous results to demonstrate that even after correcting for many different factors, even when the model has adjusted to fit last Tuesday’s results, it is still projecting Bernie wins on Tuesday.

There remains one lurking question in my mind, however: how Arizonan Hispanics will vote, and whether they are inherently more likely to vote for one candidate over the other. BenchmarkPolitics believes that Hispanics are far more predisposed to voting for Clinton over Sanders, but as much as I have tried to prove this within my own data, I just cannot get this result. Clinton has won a few states with a large Hispanic population, yes, but after I control for other factors (chiefly Facebook presence, which is the primary driver of my model), there is no negative correlation whatsoever between Hispanic population share and Bernie’s vote share. I have tried and tried to prove myself wrong here, but the numbers just don’t agree with that assessment. There are a few reasonable arguments to be made for why Bernie Sanders will win Arizona:

  • Arizona has one of the lowest African American populations of any state in the country, 4.1%, which is half that of Nevada (8.1%), and about a third that of Texas (11.8%).
  • AZ has ~4 percentage points more Non-Hispanic Whites (57.8%) than Nevada (54.1%), and ~12 points more than Texas (45.3%).
  • Bernie has 4 percentage points more of the Facebook likes among the Democratic candidates in Arizona (76.6%) than he did in Nevada (72.7%), and 10 points more than in Texas (66.6%). This is almost as much as he had in Kansas (78%) and also more than he had in Massachusetts (74.5%) and Oklahoma (75%).
  • Arizona is also a closed primary, just like Massachusetts and Oklahoma, which doesn’t help Hillary Clinton as much* as open primaries do (edit: for the reason outlined in my previous post that I made a few days ago).
  • Arizona is also a younger state, with a median age of 36.9, which is to Hillary’s detriment.
  • Bernie, at this time, has 1.8 times the relative search interest on Google that Hillary has (a three-day average). This is among the highest relative interest measures he has achieved in any state so far. It is greater than in Colorado (1.79) and Minnesota (1.55), and far greater than in Nevada (1.51), Texas (1.32), and many other states.
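
The comparisons in the list above are percentage-point gaps and simple ratios. A minimal sketch in Python that recomputes them from the figures quoted in the bullets; the dictionary layout and the function names are illustrative assumptions, not part of the model itself:

```python
# Percentages quoted in the bullets above.
states = {
    "Arizona": {"black": 4.1, "white": 57.8, "fb_share": 76.6},
    "Nevada": {"black": 8.1, "white": 54.1, "fb_share": 72.7},
    "Texas": {"black": 11.8, "white": 45.3, "fb_share": 66.6},
}


def point_gap(metric, state, baseline="Arizona"):
    """Percentage-point difference: baseline value minus the other state's value."""
    return round(states[baseline][metric] - states[state][metric], 1)


def ratio(metric, state, baseline="Arizona"):
    """Baseline value expressed as a fraction of the other state's value."""
    return round(states[baseline][metric] / states[state][metric], 2)


for other in ("Nevada", "Texas"):
    print(
        other,
        "| black share ratio:", ratio("black", other),
        "| white gap (pts):", point_gap("white", other),
        "| Facebook gap (pts):", point_gap("fb_share", other),
    )
```

Keeping gaps (in points) and ratios as separate helpers avoids mixing up "4% more" in the relative sense with "4 points more" in the absolute sense.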

Regardless of all of this, Hispanics will decide the Arizona primary. I don’t know how they will vote, but after sifting through and testing all of this data over and over again, I have zero reason to believe they will inherently favor Hillary. If we assume that Hispanics do not favor either candidate right off the bat, here are the projections for Tuesday:

[Image: Screen Shot 2016-03-20 at 6.44.59 PM (model projections for Tuesday)]

I realize this seems ridiculous, but the regression model I have simply will not produce a Hillary victory in Arizona. I have spent a great deal of time trying to challenge this result in the data, but this is all I get. If you are dissatisfied with this, think I’m a Bernie shill, or believe that I am purposefully inducing this result: that’s not true, and I don’t know what else to tell you. Believe it or don’t. Unofficially, I don’t believe that Sanders will win by more than 10 points, but I’m not going to throw a number at you folks based on a gut feeling.

I expect a large loss for Hillary in Idaho and Utah. As far as I know this is relatively non-controversial and other outlets are expecting the same. This is due in large part to the overwhelmingly large white populations, Bernie’s massive Facebook presence from users in those states, and the open caucus format which has hurt Hillary in the past.

To demonstrate that I got these results from the same model that (now) fits last Tuesday’s results, the following is the model fit to all previous results. This model has an r^2 of 0.9701. These ARE NOT projections that I posted here; this is what the model estimates retrospectively, knowing what it knows now:

[Image: Screen Shot 2016-03-20 at 6.46.35 PM (model fit to all previous results)]
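
For readers unfamiliar with the fit statistic quoted above: r^2 compares the model's squared errors against those of a baseline that always guesses the average result. A minimal sketch in Python; the vote shares below are made-up placeholder values chosen only for illustration, not the model's actual inputs or outputs:

```python
def r_squared(actual, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical vote shares (%) in five past contests, and what a model
# fitted afterwards estimates for those same contests.
actual = [50.0, 86.0, 33.0, 62.0, 43.0]
fitted = [51.0, 84.0, 35.0, 60.5, 44.0]

print(f"r^2 = {r_squared(actual, fitted):.4f}")
```

Because the fitted values here come from a model that has already seen the results, a high r^2 of this kind describes how well the model matches the past, not how well it will predict new contests.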

Thanks for your support, everyone. Tweet at me or email me with any questions.

-Tyler

76 thoughts on “DEMOCRATIC PRIMARY PREDICTIONS: ARIZONA, IDAHO, UTAH”

  1. Pingback: Does Bernie Sanders Perform Better In Open Primaries? - Statistica

  2. Hillary will crush Sanders in Pennsylvania, New York, New Jersey, Puerto Rico and California. These are all closed or semi-closed primaries. She will win all of them by at least 20 or 30 points over Sanders. There are 1,097 delegates up for grabs across those five. If she wins 60% of those delegates, that’s 658 delegates, and she might win more. Unless Uncle Bernie becomes a black person, we won’t win.

    • Algorithm may not be off. AZ is going through a storm right now. Phoenix mayor requested DoJ investigation, AZ House hearing scheduled for the 28th, tens of thousands of voters turned away, votes missing, lifelong Dems getting their party allegiance switched by the system, independents who registered Dem prior to the deadline weren’t processed, and more. It’s ugly over there. ACLU already filed a lawsuit, too.

  3. And Hillary just won Arizona by 20 points, exactly as the public polls predicted. Your magic prediction model failed again, trolls! But hey, he’s going to erase a 50-point deficit in NY, 30 in NJ, MD, and Penn, and 10 in California, right? Magical thinking!

    • Born the bastard unwanted mistake of his father’s one-night stand in Reno, Nevada, Jonathan Sanders would never grow to reach his father’s expectations.

      In his late teens and early twenties he challenged himself and began taking an interest in politics. “My dad is a mayor so one day I can do something cool like that” he boasted.

      But soon enough his attempts would prove futile, sealing his fate as the failed son of a politician.

      What ever happened to Jonathan? Many years passed and Jonathan tried other aspirations, only to fail again. The only thing that ever brought him pleasure was criticizing others over the internet. Legend has it that his only form of pleasure is spewing venomous hate towards anyone who supports the single thing that causes him the most pain: his father.

      According to folklore if you listen very closely to any keyboard you can hear his violent keystrokes which echo throughout eternity.

  4. Very interesting stuff, Tyler. I would echo the comments above that encourage you to continue your work and damn the trolls. Some people’s only reason for being online appears to be to shit on other people. Who needs them?

    It is fascinating to have a model different than the conventional ask 200 LVs and use sample theory to predict blah blah blah. The thing is conventional polling has become more about *driving* opinion than measuring it, with margin of error caveats that make the “science” of it completely ridiculous. Back when I was in college (hesitate to say how long ago that was…sigh) MOEs outside even 2-3% were an exercise in paid guessing. Now it’s completely out of control. One YouGov “poll” on MI that predicted Clinton +11 was actually one of the only ones that was statistically accurate when you realize it had an MOE of 7.7%!!! So kudos for trying to break that silly mold.

    I am skeptical about Arizona, if only because registered dems tend to be more machine-sensitive and back the party line and Sanders has taken so much support from independents. But time will tell, I guess.

    I also had a question about comments above showing drops in interest via google and FB. I don’t know jack about any of this, but was just wondering how such analysis accounts for interruptions in traffic flow by sudden cataclysmic events–which in the last 24 hours or so would include the Brussels attacks. Anyone have any insight?
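
The margin-of-error point in this comment can be made concrete. A minimal sketch in Python, assuming a simple random sample and the usual 95% normal approximation (real polls apply weighting and design effects, so published MOEs will differ); the function names are illustrative:

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for one candidate's share, in points."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


def sample_size_for(moe_points, p=0.5, z=1.96):
    """Smallest simple random sample that achieves the given margin of error."""
    return math.ceil(p * (1 - p) * (z * 100 / moe_points) ** 2)


for n in (200, 400, 1000):
    print(f"n = {n:>4}: +/- {margin_of_error(n):.1f} points")

print("Sample needed for +/- 3 points:", sample_size_for(3))
print("Sample implied by +/- 7.7 points:", sample_size_for(7.7))
```

Note that this margin applies to a single candidate's share; the uncertainty on the lead between two candidates is roughly twice as large, which is why a wide MOE can make a double-digit lead compatible with a near tie.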

  5. Fawk the haters. You are doing terrific work. So what if you are off a bit. It’s called statistics not fortune telling. I do have a question, I thought Bernie does better in caucus states than primary states??? Or are there certain primary states that would work in his favor?

    • Correct. He does best in open caucuses, followed by closed caucuses, followed by closed primaries, followed by open primaries. Doesn’t make a lot of sense, but that’s what has been happening.

      • Tyler Pedigo, wrong again. It goes like this: Uncle Bernie does best in open caucuses, followed by closed caucuses, followed by open primaries, followed by closed primaries. In other words, he does worst in closed primaries with a minority population over 20%. He does well in anything that’s open because independents can vote. I think you should either let all independents vote or none, not just in some states; it’s not fair.

  6. Very interesting read, Tyler. I was really impressed by how you predicted the Michigan upset, and it looks like you have tweaked your model into near optimization. Hoping your predictions will come alive tomorrow. I think you’re doing a stellar job. Your model has a lot of potential.

  7. Wishful thinking about predicting Bernie to win Arizona, but all the current conventional factors go against this. First of all, you can’t compare Michigan, an open primary, to Arizona, the non-Democrats (independents, largely) won it for Bernie. That means that independent voters made Bernie win. That’s not going to happen in Arizona, because non dems or republicans, aren’t able to vote in this primary. So Sanders has a couple things against him, which are Hispanic voters and Black voters, which together make up 30% of the population. Hillary does well in primaries, Michigan was an exception due to the independent voters. In, conclusion, you should stop predicting as you are biased to Bernie and you obviously don’t know what you’re doing.

    • Seriously?! Why don’t you build us your own predictive model that can outperform traditional polling if this is all just sooo simple for you… You might want to brush up on those grammar skills too before you go around insulting someone’s work and intelligence.

    • If you can show predictions as accurate as Tyler’s then do it yourself. You’re just an unhappy person that is disappointed in her life so she feels she has to take down others.

    • That was the worst comment I’ve ever read. He acknowledges that Bernie could lose AZ. In ’08 black people made up 8% of the vote, I believe, and Hispanics are not a solid group; they’ve voted differently in every state that has a high representation of them. Plenty of other problems with your comment, but since you might just be a troll I’ll leave it alone.

    • In closed primary states, Independents can change their affiliation to Democrat very easily so they can vote in the primary; this is what I am doing in my state. AZ is only 4-5% AA, a bit misleading there. 30% Hispanic, they will have a large influence on the outcome, as Tyler said.

      There have only been 3 fully closed primaries so far. Democrats Abroad, which Bernie won by a large margin, then Florida and Louisiana which Hillary won largely because of the AA vote. Almost no AA vote in AZ, so this is essentially something we haven’t seen yet.

  8. To what share of the Hispanic vote would Bernie have to fall for your model to predict him getting 50% of the total Arizona vote?

  9. Seems like you’re overfitting the hell out of your model. It’s easy to retroactively fit a model but that doesn’t mean the fitted model has much predictive power going forward.
    The fact that your 99.8% CI in Arizona has Bernie getting 50.8% of the vote when the polls we have along with conventional wisdom both point towards a solid Clinton victory should be a cause for concern. That isn’t to say that polls and conventional wisdom are always correct. We obviously saw in Michigan that isn’t the case. It just makes your confidence interval look incredibly suspect.
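
The overfitting concern raised here is usually checked by comparing the in-sample fit with a held-out fit. A minimal sketch in Python using leave-one-out validation on a one-predictor least-squares line; the data points are made-up placeholders (a hypothetical Facebook share against a hypothetical vote share), not figures from the model discussed in the post, and the real model presumably has more predictors:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(ys, preds):
    """1 - SS_res / SS_tot, with SS_tot taken around the full-sample mean."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def in_sample_r2(xs, ys):
    """Fit on all points, then score on those same points."""
    intercept, slope = fit_line(xs, ys)
    return r_squared(ys, [intercept + slope * x for x in xs])


def leave_one_out_r2(xs, ys):
    """Predict each point from a line fitted to all the other points."""
    preds = []
    for i in range(len(xs)):
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(intercept + slope * xs[i])
    return r_squared(ys, preds)


# Hypothetical placeholder data, for illustration only.
facebook_share = [55.0, 60.0, 66.0, 70.0, 73.0, 75.0, 78.0, 82.0]
vote_share = [30.0, 41.0, 35.0, 52.0, 47.0, 50.0, 66.0, 62.0]

print(f"in-sample r^2:     {in_sample_r2(facebook_share, vote_share):.3f}")
print(f"leave-one-out r^2: {leave_one_out_r2(facebook_share, vote_share):.3f}")
```

A large gap between the two numbers suggests the model is matching past results more closely than it can predict new ones; with many predictors and few contests, that gap typically widens.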

      • Yes, yes it does. Just because variables are significant does not mean that they don’t share a confounding variable. In other words, let’s say a confounding variable explains 99% of the variance in 10 other variables, each of which is significant, and the remaining variation is random. You could overfit any model enormously with that.

  10. Tyler,
    I’d love to see Bernie win in AZ, but the Google Trends numbers seem dismal. Did your calculations that showed a high GT ratio factor in the Super Tuesday 2 numbers? It just seems so low through yesterday and into today to deliver a win there for Bernie.

    Thanks,
    Emily

  11. Hi Tyler,
    As a Bernie Sanders supporter, I would love to believe your predictions for tomorrow’s primaries, but obviously last week has left us a bit concerned about his chances. Can you tell me how/why you think your model predicts Sanders winning as confidently as you do, especially in Arizona, where it seems that the number of retirees (similar to FL), and the support that the Clinton family name has there are challenging factors for Sen Sanders to overcome? Thanks so much! I think you’ve done a great job so far, and even though the numbers were off slightly last week in your predictions, I think in general you did better than the polling data projected.
    Kind regards,
    Colby

    • Hey there fellow feeler of the Bern. Florida didn’t have open primaries like my home state of Michigan. Here, we got out in the streets, door to door, making sure people knew who Bernie is and reminded them of the date of the primary. Get out there! Spread the word! We are counting on you! We did it! You can, too!

  12. Small correction: Oklahoma Democrats voted to allow independents to vote in the Democratic primaries this year and next. This semi-open primary was a big factor in Sanders winning OK. The exit polls show Clinton won among registered Democrats (52-43) but lost to Sanders among independents (69-21).

  13. Hi Tyler!

    First, personally, I was impressed with how you predicted the results last Tuesday. That you managed to come to similar error as the polling with a model that has at other times come to results opposing polling is interesting.

    Second, when I see such low error in something as volatile as elections, I have to ask about overfitting. How many variables is your model fit to (And of those, how many are binary)? Whatever the number is, what makes you comfortable saying that this new model does not over-fit (i.e., how did you decide the variables you chose affect the error significantly)?

    Thanks!

  14. I’m sorry for the negative reaction you got when your model didn’t predict the correct results on Super Tuesday.

    I also woke up super disappointed (couldn’t pull an all-nighter to watch the results coming in live – I live in Europe) with the ST results but I DO think you’re right in saying that Pro-Bernie people are more anti-Trump.

    It makes sense but sadly, for those voters, they are voting away their best chance of actually defeating Trump. Bernie has shown to consistently out-perform Hillary against Trump.

    I’m curious: do you still think he has a chance of winning the nomination? 🙂

  15. How many parameters does your model have? Are you training it on state level data or were you able to get precinct or congressional district data?

  16. Mass. is semi-closed, meaning you have to be Dem or unaffiliated. Zero closed primaries outside of the deep South thus far, but surely closed primaries will favor Hillary, because party line Dems do. The whole notion of closed primaries is silly, because it ignores a crucial general election voting bloc: independents… But I digress.

    Bernie will pretty much have to win Latinos to win Arizona, I’d guess. Hate to say it, but any variable you are using for open/closed can’t really be supported by any data because this is the first closed primary. Though in Michigan, an open primary, the non-Dems (independents, largely) won it for Bernie.

  17. You mentioned last week that you’d talk about a preliminary model for the GOP you came up with- provided the upsets it predicted in NC and FL came to pass.

    While I understand you’ve likely been preoccupied with refactoring the Dem model instead, what did that model predict, and did it come to pass? What does it predict for Super Threesday?

    • The GOP model fit previous results well, but the results last Tuesday pretty much destroyed all of it. I’m not going to attempt to successfully predict the outcomes of the GOP elections this year; far too volatile.

  18. “Arizona is also a closed primary, just like Massachusetts and Oklahoma, which doesn’t help Hillary Clinton as much* as open primaries do.”

    This is quite a non-intuitive claim since Bernie leads independents. Could you say what led to the claim?

    • So it’s pretty counter-intuitive, I agree. But what has been happening is Democrats crossing over to vote anti-Trump. I explain this phenomenon in more detail in my previous article. With closed primaries, more registered Democrats vote on the Democratic ballot. Everything I’m saying is statistically significant, and the crazy thing is that the more Trump support there is in a state that has an open primary, the more Democrats will cross over to vote anti-Trump.

      • Are you saying that most of the Democrats crossing over to the Republican ballots would otherwise have voted for Bernie? Is there data showing that? Thanks.

      • Die-hard Bernie supporter here for context…

        You really think the closed primaries are better for Bernie, than open primaries? I understand the cross-over anti-Trump votes probably hurt Bernie more than Hillary.

        But still, in the open primaries, Bernie has gotten 30-50% of his support from Independent voters. I’d love further explanation/data if you have it. I want to believe this, because I want to have high hopes for Pennsylvania and NY!

      • That statement is incorrect, you said “closed primary like Massachusetts”
        Massachusetts is an open primary. I know because I am an independent voter in Massachusetts and voted on Super Tuesday

      • In closed primaries so far Sanders has been destroyed both times, Florida and Louisiana. In all primary type events he’s 4-19 with two of those wins being Vermont and NH. And a 3rd being a massive 1.6% win in Michigan. Sanders is better in closed primaries??? Hahahahahahahaha, too funny indeed. Almost sounds as much fantasy as the bulk of his Alice in Wonderland platform ideas.

      • I believe Massachusetts is listed as “semi-closed” or a hybrid type because technically it is closed, but independent voters can still vote. It’s sort of complicated: an independent voter temporarily aligns with one party, while those enrolled in either party cannot cross over (enrolled voters can only vote in their own party’s primary). So it is functionally similar to an open primary, but with technicalities that make it a closed primary.

    • “Can you think of any procedural nuances of the Massachusetts Democratic primary that would warrant this classification?”
      For an independent voter, it would work like an open primary, since independents / non-affiliated can vote in either in a semi-closed primary. For an affiliated voter (D or R), it would work like a closed primary, since affiliated voters cannot cross over to vote in the other primary in a semi-closed primary.

  19. As a Hispanic I would like to pitch in this thought. Hispanic voters, unlike African-Americans, are not monolithic. More often than not we split on certain issues. Cuban voters in Miami are much more conservative than Mexican voters in Denver. Mexicans, unlike Cubans, have no bias against leftist politics, and that could actually work to Bernie’s benefit when you compare his policies with those of someone like Lazaro Cardenas. You also have to take into account the current socioeconomic and political standing of each Hispanic group in every state. Arizona Hispanics, who are largely Mexican, have been disenfranchised by conservative Republicans, and thus are more politically motivated to pick a candidate who has strong progressive convictions and beliefs in civil rights.

    • Thank you for remarking on this! I teach at a college in Southern California and probably a third to a half of my students are Latino, mostly of Mexican descent. Another third are of various Asian ethnicities. I read and watch the MSM on this, and I’m so confused. First of all, when pundits say “people of color” or “minorities,” they are almost never thinking of my Asian American students. Secondly, there are so many people from Central and South America who get jammed into that designation “Latino.” Third, Los Angeles is the third-largest Spanish speaking city in the world–you’d think that if candidates actually wanted to meet a lot of Latinos, they’d come here, wouldn’t you? Fourth, people marry people of other ethnicities. Put that all together, and I don’t see a lot of monolithic anything. I don’t really know why anyone would assume that every Latino person would vote for Hillary.

      Except that ALL of my students are worried about tuition and debt, and yes, a lot of them think about and value similar things, but I think that’s more because they’re English students. I don’t ask them who they are supporting, but I bet a lot of them are Feeling the Bern.

  20. Tyler, I hope you are not fazed by those criticizing your work or calling you a Bernie puppet. This has always been trial and error. As such, I think that you will undoubtedly run into some more error throughout the rest of primary season.

    However, I think that refining the model and working hard on it now will greatly benefit you come General Election season, as well as primary season 4 years from now. Each state is different, so while the model might hit some states on the nose (Michigan) there are bound to be big errors elsewhere (Ohio, Colorado). But learning about the inefficiencies of the model in some states and refining it is the only way to perfect it.

    In fact, I think that your model is interesting because if you are able to pinpoint differences in individual states and still manage to use those differences in a cohesive model, you will have done something incredible. There will be bumps along the road, but you’re doing a great job.

      • I suppose after the Minnesota result came in, the model adjusted the weighting of the variables to put less emphasis on the Google data and more on the other measures that I use. The Google trend that we saw in Minnesota hasn’t happened the same way anywhere else, and I still can’t really explain it. Perhaps something to do with the caucus format. That’s my theory.

      • Kansas could probably be brought in line if you account for absentee caucus votes. Kansas had them; Nebraska didn’t. They went 75%+ for Hillary.
