Presenting the Model, Part I: Data Inputs
If you're a rabid follower of electoral politics you probably know that polling for the 2020 Democratic nomination is already underway in full force. Along with the polls comes a lively debate over how informative polling can be this early. 538's Nate Silver recently weighed in with his take, arguing that there's good information to be had, along with a huge amount of uncertainty. I wanted to see if I could shed a little more light on the state of the race by supplementing the polls with other sources of data, and what I have (so far) is a fairly simple model that averages these inputs together to give an overall score for each candidate. My goal is not to predict who will win the nomination. What I really want is to be able to track trends over time, and maybe spot a dark horse who's gaining momentum before we see it in the headlines. So here we go.
There are (currently) five inputs to the model:
National Polls
Source: Real Clear Politics National Polling Average
For all their flaws, the polls are the best snapshot we have of where the electorate actually stands. For a candidate to be successful, that success will ultimately have to show up in the polls. Polls tend to be a trailing indicator of momentum in the race, however, especially national polls at this very early stage. Simple name recognition is driving a lot of the numbers right now, so we need inputs that may serve as leading indicators.
As for the source, RCP maintains the handiest aggregate of recent polls that I've found. I would like to take advantage of 538's pollster grades to increase accuracy, but that requires manual aggregation at this point, as 538 isn't producing a weighted average, at least not yet.
Early State Polls (Iowa, New Hampshire, Nevada, South Carolina)
Source: Real Clear Politics State Primary Polls
One place where we might hope to spot an early trend is in the polls for the first four states to choose delegates for the nomination. This is where retail politics at the grassroots level is already happening, so voters in these states are getting to know the candidates much earlier than the nation at large. Name recognition is going to increase rapidly here over the next several weeks, so if a lesser-known candidate's message is resonating with voters we'll see it here first.
For this source I'm simply averaging the most recent poll from each state, as long as that poll is no more than about a month old. Right now there haven't been any recent polls from Nevada, so only the other three states are included.
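The averaging rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the model: the `Poll` class, the `early_state_average` function, and the 31-day cutoff standing in for "about a month" are all hypothetical names and choices. It also assumes a candidate missing from a state's poll is simply left out of that state's contribution rather than counted as zero.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Poll:
    """Hypothetical record for one state poll."""
    state: str
    end_date: date
    shares: dict  # candidate name -> percent support


def early_state_average(polls, as_of, max_age_days=31):
    """Average each candidate's share over the latest fresh poll per state."""
    # Keep only the most recent poll for each state.
    latest = {}
    for poll in polls:
        current = latest.get(poll.state)
        if current is None or poll.end_date > current.end_date:
            latest[poll.state] = poll

    # Drop states whose latest poll is older than the cutoff
    # (assumption: "about a month" is taken as 31 days).
    cutoff = as_of - timedelta(days=max_age_days)
    fresh = [p for p in latest.values() if p.end_date >= cutoff]

    # Average each candidate over the states where they were polled
    # (assumption: a missing candidate is skipped, not scored as zero).
    totals, counts = {}, {}
    for poll in fresh:
        for name, share in poll.shares.items():
            totals[name] = totals.get(name, 0.0) + share
            counts[name] = counts.get(name, 0) + 1
    return {name: totals[name] / counts[name] for name in totals}
```

With this rule, a state with no poll inside the window (Nevada, at the moment) drops out of the average automatically instead of dragging in stale numbers.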
Pundits' Candidate Power Rankings
Source: CNN's Monthly Power Rankings, Washington Post Power Rankings
Pundits are highly subjective and often wrong, but they bring a wealth of experience and breadth of knowledge that offers insight into the complexities of the race. Their views may add credibility to trends showing up elsewhere in the data.
Right now I'm pulling power rankings from two sources. Each month CNN's Chris Cillizza and Harry Enten survey the field and rank their Top 10 Democratic candidates. At the Washington Post, Megan McArdle and others are publishing a weekly ranking of their Top 15.
Candidate Endorsements
Source: 538 Endorsement Tracker
The endorsements each candidate picks up from prominent party colleagues represent the influence the Establishment has on the nominating process, and thus may show different patterns from the polls and pundits, especially early on. They will also serve to differentiate the establishment favorites from outsiders in the race.
538 has already done the hard work here of tracking endorsements and devising a scoring system, so I'm simply using their point system.
Political Betting Markets
Source: PredictIt
The final input is more recent in origin than polls, pundits, and endorsements. Betting markets for politics have taken off over the last two decades, and while I'm not aware of any dramatic successes they've had over the other sources, they certainly offer useful information in reflecting the views of a broad swath of political junkies. They are also likely to serve as a leading indicator, as information on candidate momentum will be transmitted more rapidly among this population than among voters at large.
PredictIt is a prominent, reliable source for this, although other markets are out there.
What's Not Included
The five inputs I've chosen are not an exhaustive list, and I'll consider adding more data if it's available in a convenient format. A significant factor I hope to address is fundraising, and I'm monitoring Open Secrets for this information as it becomes available. Another consideration is Google name searches, which a lot of election junkies like to monitor. I'm not convinced they add much to the picture I've already laid out, but I could change my mind.
That's it for data inputs! In the next post I'll show how I score each of these and combine them for an overall score for each candidate, plus I'll share the first set of results. Stay tuned!