Reverse-engineering the problematic tail behavior of Fivethirtyeight forecast (statmodeling.stat.columbia.edu)

185 points by buddhiajuke 5 years ago | 234 comments

screye 5 years ago |

I disagree with the author on the idea that the tails are too fat for isolated anomalies. There are most certainly events that can happen which may lead to a red California or a blue Alabama.

Presidential assassination, war, video proof of something incredibly heinous (pedophilia?), etc. can absolutely lead to these outcomes. You don't even have to go that far back. Nixon and Reagan flipped states like no-one's business.

I do, however, agree that 538's state-to-state correlation model seems weak.

California and Alabama would only flip during a wave, and that wave would consume any and all states. The fact that 538's model doesn't strongly show that pattern is a failing of it. But a model that inaccurately models the unlikeliest of events (California flipping while Florida stays blue) is not necessarily a terrible predictor of its primary target (presidential win likelihoods).

As a data scientist, I can totally understand Nate's hesitation. Do you impose strong priors on the model to reflect strong domain intuition, or do you build a model that best characterizes the data it is based on? In the presence of infinite data, you should abandon all domain-based priors. For single-digit data points, priors are essential. For any amount of data in between, it is anyone's best guess.

yodon 5 years ago | |

I've always liked Enrico Fermi's attitude on this. When you're Enrico Fermi, you get to say things like "One data point gives you a curve. Two data points gives you the distribution about the curve."

pierrefermat1 5 years ago | | |

Curious, is there a source for this? Is it meant as satire?

wisty 5 years ago | |

OK, so is this your point?

Something could flip California and Alabama (for example, Trump starts defending Roe v. Wade and in response Biden somehow manages to sound like he's opposing it). This would probably be some latent hidden variable, like whether the candidates are seen as socially conservative, which would affect all states (though California and Alabama would be the most impacted).
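A minimal sketch of that latent-variable idea, assuming each state's error is a single shared factor times a state-specific loading plus independent noise. The function name, the loadings, and the standard deviations are all illustrative assumptions, not anything taken from the 538 or Economist models.

```python
import math


def factor_correlation(load_a, load_b, factor_sd=1.0, noise_sd=1.0):
    """Correlation between two states' errors under a one-factor model.

    Each state's error is assumed to be load * factor + own_noise, with the
    factor and the noise terms independent. All inputs are hypothetical.
    """
    shared_cov = load_a * load_b * factor_sd ** 2
    var_a = (load_a * factor_sd) ** 2 + noise_sd ** 2
    var_b = (load_b * factor_sd) ** 2 + noise_sd ** 2
    return shared_cov / math.sqrt(var_a * var_b)


# Same-sign loadings: a "wave" moves both states together.
print(factor_correlation(2.0, 2.0))
# Opposite-sign loadings: an ideological reversal pushes the states apart.
print(factor_correlation(2.0, -2.0))
```

Under these assumptions the sign of the correlation is just the sign of the product of the loadings, so a "wave" and a "candidates swap positions" story lead to opposite correlations from the same structure.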

nullc 5 years ago |

Meh. If you fit a model and don't explicitly constrain against "un-physical" results like negative correlations, you'll end up with them.

Constraining against them won't improve your model's fit (usually by definition), and it doesn't always improve robustness (at least for situations near average)-- because they're acting to debias the model in ways that you otherwise don't have enough degrees of freedom to address.

A negative correlation here is also potentially historically supported, in the sense that sometimes DEM/GOP candidates are philosophically reversed in some way relevant to the state. As in, "The only way a GOP would get elected in X is if they had the DEM position on subject Y which would make them lose state Z, who cares as much about that subject as X but in the opposite direction."

Now-- it doesn't seem a likely case in this election (e.g. Trump is not (currently) a massively pro-choice Republican), so it probably shouldn't apply here-- but it isn't hard for me to imagine how a negative correlation might show up out of the historical data.

RivieraKid 5 years ago | |

> If you fit a model and don't explicitly constrain against "un-physical" results like negative correlations, you'll end up with them.

The Economist model does exactly that, and all of their correlations are positive.

I recommend reading their methodology, they know what they're doing (I wouldn't say the same about 538). Andrew Gelman has developed some of the Bayesian methods and software that people like Nate Silver use, he's the main author of what's considered a reference book on Bayesian statistics.

noirbot 5 years ago | | |

I think the question is if it matters to the predictive accuracy of the model. Just because it puts out results you can't envision actually happening on the margins doesn't mean they can't happen, or that they can't be valuable in presenting a holistic result.

It's clear that the models are tuned differently, but from Silver's replies in the PS's, it seems that he's ok with these artifacts being part of the model.

sangnoir 5 years ago | | |

My statistics knowledge has withered away, but isn't this quibbling over overall approach? 538 seems to be doing a top-down approach while the Economist is more bottom-up. What is strange is that a person affiliated with the Economist is then asking why the 538 model's emergent properties aren't exhibiting more bottom-up characteristics.

preommr 5 years ago |

> It didn't take very long to do the analysis. But it did then take another hour or so to write it up.

It's very interesting to see how long it takes people to do things. I am amazed that entire article took 1 hour to type up. I've spent entire afternoons trying to write shallower pieces of work.

i_love_limes 5 years ago | |

I think a lot of people (myself included when I've felt the pressure to) lowball how long things like this take because a) it makes me look smart and b) people could judge me if they knew how much time I actually wasted on it.

taeric 5 years ago | | |

I think you read the parent in the opposite direction than intended. It was praise for getting this done so quickly, by my read.

Granted, I could just be misreading this post. :)

chrisfrantz 5 years ago | |

It looks like it was written as a single stream of consciousness. While I couldn’t write that article, if I hit a flow state and was interested in the topic, it seems possible.

jacquesm 5 years ago | | |

The trick is to think before you write. The same goes for programming. If you already know what you are going to write or build then you can reach very high apparent productivity, the time spent on thinking about it isn't accounted for.

cwhiz 5 years ago |

Not sure if it’s wrong to put this here, but here is a link to their election forecast.

https://projects.economist.com/us-2020-forecast/president

You can compare this to the 538 model and see where these two teams and forecasts disagree.

mrtnmcc 5 years ago |

FiveThirtyEight has a page where you can choose winning states (condition on a certain outcome) and it will regenerate the prediction map: https://projects.fivethirtyeight.com/trump-biden-election-ma... This appears to be what Andrew Gelman is also trying to do with their raw data.

At the bottom of the 538 page it says, " If you choose enough unlikely outcomes, we’ll eventually wind up with so few simulations remaining that we can’t produce accurate results. When that happens, we go back to our full set of simulations and run a series of regressions to see how your scenario might look if it turned up more often."

I interpret that as running a regression (linear?) and extrapolating it out to the tail where the conditioning is happening. This should eliminate the issue Andrew is seeing?
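For illustration, conditioning on a set of simulations can be sketched as a simple filter. The simulated margins below are made up (Trump minus Biden, in points, with an assumed shared swing); they are not 538's draws, and `conditional_win_prob` is a hypothetical helper.

```python
import random


def conditional_win_prob(sims, given_state, target_state):
    """Estimate P(target margin > 0 | given margin > 0) from simulation draws.

    Returns the estimate and the number of simulations that survived the
    filter, since the estimate is only as good as that count.
    """
    kept = [s for s in sims if s[given_state] > 0]
    if not kept:
        return None, 0
    wins = sum(1 for s in kept if s[target_state] > 0)
    return wins / len(kept), len(kept)


# Hypothetical draws: a shared national swing plus independent state noise.
rng = random.Random(0)
sims = []
for _ in range(40_000):
    swing = rng.gauss(0, 4)
    sims.append({
        "NJ": -18 + swing + rng.gauss(0, 4),
        "PA": -5 + swing + rng.gauss(0, 4),
    })

prob, n_kept = conditional_win_prob(sims, "NJ", "PA")
print(prob, n_kept)
```

With these made-up numbers only a few dozen of the 40,000 draws survive the filter, which is the situation the quoted 538 note describes before it falls back to regressions.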

paulgb 5 years ago | |

From the plots Andrew posted, it looks like the problem is not just sample size and that (some) individual state pairs have inverse correlations, e.g. https://statmodeling.stat.columbia.edu/wp-content/uploads/20...

mrtnmcc 5 years ago | | |

I'd argue negative correlation on conditional distributions can be reasonable here.

In that particular WA-MS example, if Trump suddenly took more liberal positions and somehow won WA (e.g., announces he's pro abortion), he would in fact be more at risk of losing Mississippi. The idea that these two states are in play already is fringe and would require some major ideological (or other third variable) shifts.

noirbot 5 years ago |

To me, this mostly tracks with what 538 has said on record about how their model works and the design philosophy behind parts of it. To me, what Nate means when he says "directionally the right approach in terms of our model's takeaways" is that these sorts of wild and unintuitive outcomes are part of the point of the way the model is constructed.

Specifically, that when you get off into the weird situations like Trump winning Washington state, it's likely something incredibly weird has happened - something that likely has no historical precedent, so it may actually be a more sane thing to do to assume that now almost everything is backwards and Biden would win a bunch of states he shouldn't either.

To me, this points to a general willingness in the 538 model to just go "who knows" and build in some room for insane things to happen on the fringes. The Friday podcast episode about the 538 model specifically mentions that they have large/fat tails on their distribution that make it nearly impossible for someone to get over 95% chances of winning on a national level, and these sorts of wild results seem like the outcome of that. If you bake in an assumption that there's always a 5% chance of something crazy happening, that chance has to come from something in the data somewhere that reflects the ability of that to happen numerically, and thus will have numerical outcomes that seem impossible.

drblast 5 years ago | |

Yeah, I think for Trump to win Washington state, he'd have to do something to appeal to voters there in a way that would likely cause his red state base to abandon him.

The negative correlation makes sense when we think about how difficult it is for everyone in Washington to suddenly turn conservative and everyone in Mississippi to turn liberal. Much more likely is that the crazy thing is that the candidate or circumstances changed in some way.

It makes more sense if we ask...if a candidate wins NJ what is the chance they also won AK?

Invictus0 5 years ago | | |

I think there's nothing Trump can do to win Washington; rather, more accurate to say there are many things Biden could do to lose Washington.

justinzollars 5 years ago |

Every single one of these models will break down this year. We are living in an unprecedented time. I can't understand how we can model how many people will vote, when we don't even know how many people have moved out of cities this year. Half of my friends have left San Francisco - if as many people left Philadelphia, the Twin Cities, Milwaukee or Pittsburgh, then that really affects the outcome.

an_opabinia 5 years ago |

Andrew Gelman designed the 538 model in 2007.

Nate Silver authored an adjustment to polls used in that model. Polls have more impact if they are more representative of statewide turnout among demographic things he chose like “black” and “low income.” This is why his predictions were so accurate for Obama’s 2008 and 2012 elections, and likely why they were so inaccurate in 2016.

Gelman’s own grad student is the only person to have academically published this approach, in a paper about polling Xbox Live users.

These guys sort of make a thing that is the same in many more ways than it is different. Why not just share the code is the biggest question?

Traster 5 years ago |

> I’d think that if Trump were to win New Jersey or, even more so, California, that this would most likely happen only as part of a national landslide of the sort envisioned by Scott Adams or whatever.

That's a valid intuition to have but you can also clearly make the argument that if Trump wins California you're in such a weird scenario that using the traditional wisdom about correlation is dangerous. The point that 538 have tried repeatedly to make is that firstly: if you're conservative in your level of confidence you'll give a higher likelihood to outliers, and secondly: It's not particularly useful to focus on whether X has a 3% or 4% chance.

If Trump wins California, we aren't going to be talking about whether the chance was 3% or 0.3%, we're going to be talking about that nuclear explosion that wiped out 25 million Californians.

By the same logic, the reason that Trump winning Alaska given winning New Jersey is lower than given losing New Jersey is that your sample size is rubbish. The chance of Trump winning Alaska given losing New Jersey is an accurate number; the number for Trump winning Alaska given winning New Jersey is like saying "How likely is it Trump wins Alaska given the UK gains US statehood". It's like.... well... if that happens then we're so far outside of what the model thinks can happen that you should expect we're just gonna say it's 50:50, because who the hell knows.

It's not like saying "Oh well if X swing state goes blue, Y will probably follow", the scenarios in this article are so bizarre that the model should rightly be very cautious and probably default to either refusing to give an answer, or to 50:50, or to the same probability it would give ignoring that data. The implicit bias in this analysis seems to be that if NJ went red that would be because Trump won by a big margin, but that's not a likely enough scenario to actually get numbers for, and is so unlikely that things like "The Supreme Court threw out all the ballots for inner city areas" start to become valid possibilities.

jsnell 5 years ago |

See also this Twitter thread from Nate Cohn (who has no dog in the fight): https://twitter.com/Nate_Cohn/status/1320042092694065153

happytoexplain 5 years ago | |

That's a pretty bold claim to make about essentially anybody in regards to the US presidential election. Not that I don't believe his account is even-handed and valuable, just that I'm curious what makes you say he has no dog in the fight.

chki 5 years ago | | |

I think he is not talking about the election in general but referencing the "fight" between the 538 model/Nate Silver vs. the Economist/Andrew Gelman.

noirbot 5 years ago | |

I think the third post in that thread really nails the underlying issue here. We, as humans, know that it would be silly to think that some configurations could happen. We know there's no reason to expect Biden to win Alabama outside of him winning nearly every state.

A statistical model only has a vague idea of context/the real world. It looks at polls (and probably not really that many polls of Alabama or Mississippi or Alaska) and sees that, statistically, Biden should win 3% of the time or so.

It doesn't have a specific world set of events in mind that would cause that, it just knows that that's how the numbers go, and thus may lead to weird circumstances in the grander results because it has to make the world match the numbers in these small corners.

vitus 5 years ago |

One thing to note: 538 uses a t-distribution (and calls out regularly on their podcast that this yields much heavier tails than a normal distribution). Even 40,000 samples are not enough to characterize the tails.
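To put rough numbers on how much heavier a t tail is, here is a sketch that compares upper-tail probabilities and the expected number of draws beyond a cutoff in 40,000 samples. The degrees of freedom and the cutoff are assumptions chosen for illustration (538's actual parameters are not given in this thread), and the t tail is computed by plain numerical integration rather than a library routine.

```python
import math


def normal_tail(c):
    """P(Z > c) for a standard normal."""
    return 0.5 * math.erfc(c / math.sqrt(2))


def t_tail(c, df, steps=20_000):
    """P(T > c) for Student's t, by midpoint integration.

    The range [c, inf) is mapped onto [0, 1) with x = c + u / (1 - u).
    """
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        x = c + u / (1 - u)
        log_pdf = log_norm - (df + 1) / 2 * math.log1p(x * x / df)
        total += math.exp(log_pdf) / (1 - u) ** 2
    return total * h


n_draws = 40_000
cutoff = 4.0   # in scale units; hypothetical
df = 4         # hypothetical degrees of freedom
print("normal: expected draws beyond cutoff =", n_draws * normal_tail(cutoff))
print("t:      expected draws beyond cutoff =", n_draws * t_tail(cutoff, df))
```

Note the comparison is in raw scale units; a t-distribution with few degrees of freedom also has a larger variance than a unit normal, which is part of why its tails look so different.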

So, it seems to me that the entire article is predicated on a faulty conjecture, namely that 538 uses a mixture of a normal distribution with an independent heavy-tailed one. (It's not explicitly stated what the author thinks the base model is, but I think "normal" is a reasonable guess.)

I'd be interested in seeing a reverse-engineering analysis of 538's choice of distribution parameters, and extrapolation from there to see if these pathologies still arise with (much) larger samples.

...

That said, ultimately, the choice of how fat to make the tails is a modeling decision, and how the models behave outside the regime of interest isn't as important as how they behave within the operating region. There are key ways we can evaluate goodness of fit once we have results (e.g. bias, MSE) which we can use to determine just how wrong the model was as a predictor, and chances are pretty good that we won't see, say, Trump winning NJ, so we won't actually be able to validate the tail correlation with the vote in PA. But we will be able to validate the correlation in margin between PA and NJ.

Maybe 538's tails are too fat, and every prediction in the 80-95% range ends up going as predicted. Or maybe they're not fat enough, and some races in the 99% bucket end up going the opposite way. Point is, we won't know for sure which models were the best predictors until we can verify the predictions.

(see: all models are wrong, etc. Newtonian mechanics work great as long as your objects are big and slow, for instance.)

olliej 5 years ago |

It seems like the behavior between WA and MS could just be statistics saying that WA and MS always[1] vote for the opposite candidate, rather than considering a massive sudden change in the direction that one of them votes in. E.g. it's not reflecting who they vote for, just who they most vehemently disagree with.

I'm not sure why that kind of interstate correlation should impact predictions?

<incoherent rambling :D> IANAS but it feels like these correlations were added to compensate for the failure in 2016 to recognize that state A going one way implied that state B would also go that way. It "feels" like a more correct approach would be to compute some kind of error/weakness measure in a states polls by bringing in those of its geographical neighbors and incorporating the polling error of that entire block vs prior years. Or something.

The intuition I'm having difficulty conveying is that actual voting correlation is based on neighboring states only because you've got bubbles of ideology that aren't strictly cut along state lines. If strength of opinion in a bubble is going one way, then you'll see that mostly in the state at the center of the bubble, but the bubble still spreads into neighboring states, and a "stronger" bubble could push it geographically further into those neighbouring states, and/or could increase the bias in areas inside the bubble. </rambling>

[1] "Always" == most recent history

RivieraKid 5 years ago | |

> I'm not sure why that kind of interstate correlation should impact predictions?

538 has low positive correlations between states on average, which actually has a big impact, it increases overall uncertainty (and therefore Trump's win probability). Why? If the states are not correlated, you usually end up with a few states going off the rails, like Trump winning Colorado without any nationwide swing.

DavidSJ 5 years ago | | |

Other way around: uncorrelated errors tend to cancel each other, correlated errors tend to reinforce each other.
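The arithmetic behind this can be sketched with the standard formula for the variance of a sum of equally correlated terms. The 50 states, unit error size, and the correlation values are illustrative assumptions.

```python
def sd_of_total(n_states, state_sd, rho):
    """SD of the sum of n equally noisy errors with common pairwise correlation rho.

    Uses Var(sum) = n * sd^2 * (1 + (n - 1) * rho), which requires
    rho >= -1 / (n - 1) for the variance to be non-negative.
    """
    variance = n_states * state_sd ** 2 * (1 + (n_states - 1) * rho)
    return variance ** 0.5


for rho in (0.0, 0.5, 1.0):
    print(rho, sd_of_total(50, 1.0, rho))
```

With zero correlation the total error grows like the square root of the number of states; with perfect correlation it grows linearly, which is why higher between-state correlation widens the national uncertainty.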

aakilfernandes 5 years ago |

Could this just be a result of low sampling by the author? If you reduce a sample to only include some tiny edge case, the resulting data points are going to be weird in random ways.

RivieraKid 5 years ago | |

No, there's enough data to determine all between-state error correlations.

Edit: Why the downvote? Each of the between-state correlations can be calculated from 40,000 datapoints.
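As a rough check on that claim, the Fisher z-transform gives an approximate interval for a correlation estimated from n draws. This is a sketch assuming roughly bivariate-normal data; heavy tails would widen the interval somewhat, and the correlation value used is hypothetical.

```python
import math


def correlation_interval(r, n, z_crit=1.96):
    """Approximate 95% interval for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# All 40,000 simulations versus a slice of about 30 of them.
print(correlation_interval(-0.4, 40_000))
print(correlation_interval(-0.4, 30))
```

The two calls show why both sides of this sub-thread can be right: the overall correlation is pinned down tightly by 40,000 draws, while anything computed inside a tiny conditional slice is not.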

yk 5 years ago |

Somewhat interesting, however the guy lost me more and more the longer he argues. So, the various anomalies in the dataset are somewhat interesting, but having weird outliers in the margins is an entirely expected effect. Just because there are not many datapoints. So when you filter for something marginal like Trump winning New Jersey, then the statistical error increases and therefore it is entirely unsurprising that something weird happens. Thankfully, these systems are designed to work with probabilities, and these outliers are weighted down.

Additionally, getting worked up about a 3% chance of Biden winning Alabama. I mean, what does a 3% chance even mean for a one-off event, compared to a 5% chance or a .3% chance? I know full well that it means I should bet $100 if I can get more than $3000 payout, but the trouble is that is only if we bet often enough. (Perhaps often enough on different things.) For a one-off thing, the important part is, it is with a very high degree of certainty a loss of $100. So any claims that Biden's chances of winning are too high should be regarded with high suspicion.

Also, I listened earlier to Nate Silver's model talk [0], where he discusses quite a few problems with low quality polls in some states.

[0] https://fivethirtyeight.com/features/politics-podcast-nation...

RivieraKid 5 years ago | |

> Just because there are not many datapoints.

There are more than enough data points to determine the between-state error correlations, many of which seem to be very off.

> Additionally, getting worked up about a 3% chance

The weird between-state correlations actually have a large effect, they increase state and nationwide uncertainty and as a result Trump has a higher chance of winning.

noirbot 5 years ago | | |

It seems more likely that it's actually the other way. Nate Silver has specifically said he built the model to have relatively high uncertainty, especially with the volatility of this year, so this seems more like the outcome of intentional decisions to not let the model be overly confident.

lunchladydoris 5 years ago |

Not to nitpick, but it's Andrew Gelman. Murray Gell-Mann is the physicist.

kevmo 5 years ago | |

Haha I rolled in here to see if it was his son.

sequoia 5 years ago |

As a US voter I am very frustrated with and tired of this obsession with election forecasting. Is it going to influence whether or not you go out and vote? If not, what is the point of it?

What value does something like fivethirtyeight add to our democracy, if any? Is this motivation the same as that of diving deep into baseball stats or Star Wars starship engineering, just like “nerding out” for its own sake?

Contrast the voter who never looks at any of these polls with one who keeps up with them daily. Is the latter voter better off in some way? Is this just about trying to read the tea leaves so you can strut and preen later about having been correct, should the dice roll be in your favor?

My concern is that these things are distracting and may actually dissuade some people from voting because they think they “don’t have to.”

Here’s an idea: everyone go vote for whoever you think the best candidate is regardless of what a stack of polls say.

Someone set me straight here, what is the point of all this stuff?

klyrs 5 years ago |

The negative correlation between NJ and AK is curious. What I'd like to see here (and in all of these forecast sites) is some confidence analysis. If you pick Trump in NJ, looking at the plots, you've selected a tiny fraction of the data to examine. Who cares if the predicted value is wonky; show me the confidence interval!
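One way to attach an interval to a conditional probability read off a small slice of simulations is the Wilson score interval. The counts below are hypothetical, not taken from the 538 data.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval (about 95% for z=1.96) for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Same observed rate, very different certainty.
print(wilson_interval(12, 30))          # a tiny conditional slice
print(wilson_interval(16_000, 40_000))  # the full set of simulations
```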

Treating this as a tea-leaf reading (that is, deliberately searching for meaning via free association, without investing it with a truth value) I'm reminded of the "own the libs" meme. I see folks on foxnews.com comments bragging about it; I see lefties complaining about it, but I suspect that it's overblown and not actually a driver behind people's decision-making. But that's what comes up for me when I see "NJ goes Trump" forcing "AK goes Biden".

I'm amused by the resulting thought experiment... if dems started airing "socialists for Trump" campaigns in otherwise safe GOP states, would it move the needle there? Even sillier: if you aired those ads in NJ, would it move the needle in AK?

runawaybottle 5 years ago | |

This is the 2008 election result county map in NY:

https://upload.wikimedia.org/wikipedia/commons/thumb/1/1f/Ne...

This is it in 2016:

https://upload.wikimedia.org/wikipedia/commons/thumb/e/e6/Ne...

Long Island is really red there. It’s really hard to say how a democratic stronghold like NYC and something literally a 45 min train ride next to it could vote so differently. Long Islanders are not separate from NYCers, they commute to and work in the city.

To your question, could experiments work in similar situations like this across the country for either side? I think so in the next 50 years as demographics shift (and I don’t think it’s as simple as urban liberals taking over, people do become more conservative as they get older). God knows the dynamic at work between NYC and Long Island in 2016, but it’s obvious things are in flux.

I’ll make a bold prediction here. If Long Island is that red again, yeah, you better believe the typical rust belt states are staying red.

lordnacho 5 years ago |

I only briefly scanned this, but it sounds like there's some correlations that turn up when conditioning on specific scenarios?

Doesn't that sound like Berkson's Paradox?
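For reference, Berkson's paradox is easy to reproduce: two independent variables become negatively correlated once you keep only the cases where their sum is large. The sketch below uses made-up standard normal draws and an arbitrary selection threshold; it illustrates the selection effect only and makes no claim about what drives the correlations in the 538 output.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)
pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40_000)]

# Selection step: keep only draws where the combined value is unusually high.
selected = [(x, y) for x, y in pairs if x + y > 2.0]

full_corr = pearson([x for x, _ in pairs], [y for _, y in pairs])
selected_corr = pearson([x for x, _ in selected], [y for _, y in selected])
print(full_corr, selected_corr, len(selected))
```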

RivieraKid 5 years ago |

I'm surprised that a lot of people almost worship the 538 model, when there is a better model with more competent people behind it. The Economist is also more open about how their model works and it's open-source. After reading the methodology I was pretty impressed.

Edit: Why am I being downvoted?

abnry 5 years ago | |

While numerically literate, I don't understand the details of the 538 or Economist models. What I do know is that 538's model has had a great track record. It gave Trump one of the highest chances of winning in 2016. It did very well in prior elections. And both models are essentially predicting the same results: ~10% chance of Trump winning.

ceilingcorner 5 years ago | | |

How is being slightly less wrong than everyone else "having a great track record"? Serious question. Because I find it hard to take any of them seriously after the debacle that was 2016.

TTPrograms 5 years ago |

For a post on a blog presumably focused on statistics the author seems oddly concerned with a model that predicts odd events as being "possible". The difference between "possible" and "impossible" is \eps << 1 - there's no real distinction to be made there in practical statistical terms.

Likewise asserting that California with a 3% chance of going Trump is absurd is an unreasonable degree of overconfidence. Assuming maximizing expected return, the author is implying that they would be willing to take a bet that Trump would lose California with odds >> 97::3, i.e. presumably they would take a bet where I bet $1 to every $99 they bet. To be critical of a model based on outcomes it predicts with tiny probability you need truly remarkably biased priors.
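The betting argument reduces to a break-even probability and an expected value. A small sketch using the $1 against $99 stakes from the comment; the function names are illustrative.

```python
def break_even_probability(my_stake, their_stake):
    """Win probability at which risking my_stake to win their_stake has zero expected value."""
    return my_stake / (my_stake + their_stake)


def expected_value(p_win, my_stake, their_stake):
    """Expected profit of risking my_stake to win their_stake at win probability p_win."""
    return p_win * their_stake - (1 - p_win) * my_stake


# Risking $1 to win $99 on the long shot.
print(break_even_probability(1, 99))
print(expected_value(0.03, 1, 99))    # if the model's 3% is right
print(expected_value(0.003, 1, 99))   # if the true chance is 0.3%
```

The long-shot side of the bet is profitable in expectation only if the true probability exceeds the break-even value, so the disagreement in this sub-thread is really about whether the chance is above or below 1%.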

artichokes 5 years ago | |

I don't know if I'm reading your comment wrong but I would happily take that bet- I think most people would. Want to make it? The election result of California isn't a matter of probability; it's an empirical matter that the number of people who will vote for Biden in California far exceeds the number that will vote for Trump.

TTPrograms 5 years ago | | |

To you and the other comment - as much as I dislike Taleb's rhetoric, this is precisely the sort of bet he has made a lot of money on. People round rare event probability down to zero, and if you bet against them enough with sufficiently extreme odds you'll eventually (and in expectation) hit a home run.

I would be more than happy to make this bet with anyone willing to take the other side - as in literally, find a modern middle-man system and I'm game.

You may want to be aware that you have provided me a nontrivial arbitrage opportunity, as the odds on predictit are closer to 7:93 : https://www.predictit.org/markets/detail/6611

Der_Einzige 5 years ago | |

I'll take that exact bet for you. I like those odds because I know that the chance of California going red is effectively 0.

altdatathrow 5 years ago |

polls and votes don’t matter when one side controls faithless electors and the supreme court. none of these models are accounting for that black swan.

phonebucket 5 years ago |

This is undoubtedly useful to know when playing around with different scenarios.

However, for my use of 538, I’m perfectly happy to ignore such scenarios (such as Trump taking New Jersey). I can call the election in his favour by myself in these scenarios without needing the model.

asplake 5 years ago |

An interesting read and a great concluding paragraph

newfeatureok 5 years ago |

The thing about this election is that what if Trump himself is some sort of wildcard that can't really be properly forecasted in the polls?

Why is there so much fascination with polls to begin with? I understand that there are betting markets, but it seems sort of silly. If you had a 100% accurate poll, for instance, then what would be the purpose of the actual election?

techsupporter 5 years ago | |

> Why is there so much fascination with polls to begin with?

Polling is useful for candidates and gives them ideas on where to target outreach and spending.

For the rest of us, it gives us something to watch. With Presidential campaigns running for almost two years in advance of the actual voting weeks, there’s a huge gap between when the thing starts and when we see results. This way, people have something to fill the time. Even now that voting has started, we are still another 10 days until the voting is done and likely another 7 after that until we have a sufficient count to know who has been elected.

That’s a long time for a populace that’s worried, distracted, and interested, especially since so many of us live in states where—due to the mechanics of a broken election system—we can’t do much to influence the national outcome.

rsynnott 5 years ago | |

> The thing about this election is that what if Trump himself is some sort of wildcard that can't really be properly forecasted in the polls?

The only way that would work is if he made people more likely to lie about their voting intentions. Now, there may be something about Trump that makes polling methodologies less accurate (notably, many pollsters have started to take into account education, which turned out to be unexpectedly important last time round) but that points just to bad methodology, not inherent unpredictability.

unwoundmouse 5 years ago | |

Now that you bring that up, if we had a 100% accurate poll that would be really good for productivity wouldn't it? Perhaps it wouldn't give voters the same feeling of self-determination but it'd save a lot of resources in fundraising, going out to vote, counting votes

bigbubba 5 years ago | | |

The sensation of self determination is the entire point of democracy though. We don't use democracy because we think masses of people are particularly wise; we use these systems because they feel more fair than the alternatives and that perception of fairness produces good results (peaceful power transitions.)

jedberg 5 years ago | | |

I mean technically the election is a 100% accurate poll. You get data from every voter.

To run a 100% accurate poll would require you to sample every voter, so it would literally be an election.

ceilingcorner 5 years ago | |

The soothsayers and fortune tellers never went out of business, they just adopted a new name.

peteradio 5 years ago |

The problem is that 538 is correctly factoring the voting fuckery done at the state level which ruins the voting correlations. Gelman seems to be modelling fair elections - ha! Now the question, how did 538 come up with the correct model which takes into account vote manipulations at the state level? /s

fisherjeff 5 years ago |

I’ve decided to start my own election forecasting site that only ever gives 50/50 odds. Then I’ll just have to wait for the next 2016-style underdog win and my inevitable victory lap in the press as The Guy Who Called the Election.

sampo 5 years ago |

Funny thing, can't submit this to r/politics, as they seem to have a tightly curated whitelist of allowed domains that must "Be notable, as defined by our domain notability guidelines. Notable domains will consist of news organizations, research organizations, political advocacy groups, governmental agencies / bodies, and political parties."

And apparently columbia.edu does not fulfill those criteria.

thetinguy 5 years ago | |

Did you try asking the moderators to add it to the whitelist?

sampo 5 years ago | | |

In a way. They have a form for suggesting new domains to the whitelist: https://goo.gl/forms/lRQikA1rI0bVbKCl1

secondcoming 5 years ago | |

Don't bother, /r/politics is one of the most censored places on the internet. It's best avoided. Despite its description don't expect any actual adult discussion of politics there.

mathattack 5 years ago |

The first problem with the model is it grossly missed the last election. I still think it’s good reading.

hn_throwaway_99 5 years ago | |

Are you referring to the 2016 election? If so, you are wrong. 538 gave Trump a higher chance of winning than pretty much every independent pollster.

disown 5 years ago | | |

So they were "very very" wrong rather than "very very very" wrong like other pollsters? I think you proved the OP right rather than wrong.

Ibethewalrus 5 years ago | | |

nope, you're wrong.

538 gave Hillary a 71.4% chance of winning

https://projects.fivethirtyeight.com/2016-election-forecast/

KKKKkkkk1 5 years ago |

My biggest issue is when people say that it's a probabilistic model, and therefore it wasn't wrong in 2016 because 28% chance of winning is pretty high and you don't get probabilities. Well, guess what, this kind of model that provides a probabilistic estimate on a future event that cannot be repeated cannot be validated or falsified. It's basically junk science (if it has any aspirations of being scientific).

Marazan 5 years ago | |

If I tell you an unweighted 6 sided die only has a 1 in 6 chance of coming up 6 and we roll it once and it comes up a 6 then that is not junk science.

bena 5 years ago | | |

Yeah, but what if we could never roll that particular die again?

I think that's what he's talking about.

bigbubba 5 years ago | | |

Sounds more like gambling than science; you're making a one in six bet that you can dupe me. More seriously, I would not accept your claim with evidence as weak as that. Roll the dice a few more times and then I'll credit your claim. Rolling the dice only once when you could just as easily roll them a dozen times is junk science.

Incidentally, this trick is something magicians sometimes do. Sometimes when a trick has gone wrong they'll make a wild guess. If they're right, the audience is impressed. If they're wrong, they'll brush it aside with some joke and the audience won't notice/mind much. This works for things like card-guessing tricks and pseudo-psychic/cold reading stuff.

stjohnswarts 5 years ago | |

So all statistics and probabilities are junk science because they can't predict the future 100% of the time? Surely you can't be asserting that...

apsec112 5 years ago | |

You can check their forecasts on Senate and House races, where we have a lot more data points.

https://projects.fivethirtyeight.com/checking-our-work/

civilized 5 years ago | |

Or... if you think for a second about what probabilities are supposed to mean, there is an obvious way to check if a probabilistic forecaster is accurate https://projects.fivethirtyeight.com/checking-our-work/
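The kind of check being pointed to can be sketched as a calibration table plus a Brier score: group many forecasts by stated probability and compare each group's average forecast with how often the event actually happened. The data below is a toy example, and this is a generic sketch rather than a description of how the linked 538 page computes its numbers.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes, n_bins=10):
    """Map bin index -> (mean forecast, observed frequency, count)."""
    bins = defaultdict(list)
    for p, happened in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, happened))
    table = {}
    for idx in sorted(bins):
        rows = bins[idx]
        mean_p = sum(p for p, _ in rows) / len(rows)
        freq = sum(h for _, h in rows) / len(rows)
        table[idx] = (mean_p, freq, len(rows))
    return table


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Toy data: four 75% calls (three came true) and four 25% calls (one came true).
forecasts = [0.75] * 4 + [0.25] * 4
outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
print(calibration_table(forecasts, outcomes))
print(brier_score(forecasts, outcomes))
```

A single presidential race contributes one row to such a table, which is why the check only becomes informative across many races, such as the Senate and House forecasts mentioned above.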