A recent study shows a prediction market is a better predictor of scientific reproducibility than the editors of the journals Nature and Science (whose published studies replicated only 62% of the time). Telltale signs of irreproducibility include newsworthy results.
“Consider the new results from the Social Sciences Replication Project, in which 24 researchers attempted to replicate social-science studies published between 2010 and 2015 in Nature and Science—the world’s top two scientific journals. … As it turned out, that finding was entirely predictable. While the SSRP team was doing their experimental re-runs, they also ran a ‘prediction market.’ … Overall, the traders thought that studies in the market would replicate 63 percent of the time—a figure that was uncannily close to the actual 62-percent success rate. … The traders’ instincts were also unfailingly sound when it came to individual studies. … The 62-percent success rate…is…galling…since the project specifically looked at the two most prestigious journals in the world. …several of the studies that didn’t replicate have another quality in common: newsworthiness. They reported cute, attention-grabbing, whoa-if-true results that conform to the biases of at least some parts of society. … ‘I did a sniff test of whether the results actually make sense,’ says Paul Smeets from Maastricht University [a participant in the prediction market]. ‘Some results look quite spectacular but also seem a bit too good to be true, which usually means they are.’ … [Says Vazire, the study’s author,] these journals ‘are not especially good at picking out really robust findings or excellent research practices. And the prediction market adds to my frustration because it shows that there are clues to the strength of the evidence in the papers themselves.’ … If prediction-market participants could collectively identify reliable results, why couldn’t…the journal editors who decided to publish them? ‘Maybe they’re not looking at the right things,’ says Vazire. ‘They probably put too little weight on markers of replicability, and too much on irrelevant factors…’ ” [such as their own political bias.]