Polling

Chris Wright

2024/09/20

The picture above is of George Gallup. Gallup is considered to be the father of public opinion polling and founded the namesake polling organization, Gallup. Since Gallup revolutionized polling, it has been a mainstay when talking about elections. Although polls are widely conducted and a focal point of campaign news, the validity and accuracy of polls has always been in question. In the 1948 election, polls famously predicted a Dewey victory over Truman and more recently some polls underestimated the support for Trump in the 2016 election. These polling mishaps lead to one essential question: What makes a good poll?

538 Pollster Ratings

538 attempts to answer the question by giving each poll that it includes in its aggregation a rating. In the past this rating was on a scale from A+ to D-, but now the rating is on a scale from 0 to 3. According to 538’s website the rating takes into account several factors

Evaluating the Ratings

Utilizing polls from the 2016 and 2020 election cycle (before the numeric pollster ratings) I analyzed the difference in polling ratings. For the analysis, I grouped the pollster ratings into three categories: A (A+, A, and A-), B (B+, B, and B-), and C (C+, C, and C-).

The first observation from this analysis is the large number of C rated polls compared to A and B rated polls. There are almost 1000 more polls in the C range than the A and B range. C polls also stand out on sample size. The mean sample size of C polls is 4x the mean of A polls. C polls also seem to have a more Democratic tilt, as they predict Democratic support 4 points higher than A and B polls on average. Lastly, C polls appear more prone to outliers as they have both a poll with a sample size of 88 and one with a 94% prediction.

Pollster RatingNumber of PollsMean Sample SizeMax Sample SizeMin Sample SizeMean DEM predictionMax Dem PredictionMin Dem Prediction
A27831122545930146.4569.4434.00
B232327227178920146.1264.0020.00
C36435043428588850.3994.0021.48

Additionally, in an attempt to see accuracy over time, I plotted the average poll prediction for Democratic candidates. The dotted line represents the actual election outcome. Although the C polls have more range, they seem to get closer to the actual prediction as the election date is closer

Note: The dotted line is a sum of the 2016 and 2020 election since the dataset covers the 2016 and 2020 elections

Predicting Based on the Ratings

Lastly, I used the A, B, and C polls to predict the 2024 election to see if there is any difference in the predictions. Before fitting the model, I subset it to only predictions for the Democratic party. Then I fit the model using the projected Democratic vote share and the actual election outcome in either 2016 or 2020. Lastly, I used 2024 polling data to predict based on the model. In future analysis, I would include older polls to have more data points; however, I was unable to find older polling data with the associated 538 pollster ratings.

ModelPredicted DEM VoteR-squared
A49.15%0.2853
B50.65%0.0004535
C50.53%0.06187

Ultimately, all of the models had pretty low R-squared values which means that they explain little of the variance in the data. However, the A prediction model had the best R-squared. The models also gave similar predictions and the B model gave the most support to Democrats. When all the models are combined, they predict that Democrats will capture 50.1% of the vote in the 2024 election. According to my analysis, there is little variation in the prediction based on pollster ratings. However, the higher R-squared leads me to believe that A polls are better than lower rated ones.