The discussion on publishing models, the problems of peer-review and the lack of reproducibility in science took a new turn last week when Nature Publishing Group stiffened their policies on data integrity.
This is a great move, motivated by concerns over the reproducibility of published work. Indeed, I have heard figures from industry as high as “50% of biomedical papers are not reproducible”. Examples include a study by Amgen, which found that 47 of the 53 papers they examined failed the reproducibility test, and one by Bayer, where 43 of 47 papers were found not to be reproducible.
The problem is not restricted to clinical science. So a simple question: why? After all, reproducibility is meant to lie at the heart of science. A compelling article in the New York Times, based on an interview with Diederik Stapel, the fraudulent Dutch psychologist, provides some key insights. The article, entitled “The Mind of a Con Man”, by Yudhijit Bhattacharjee, should be compulsory reading.
What follows is not an attempt to place blame, but to see what lessons we can learn. I have selected a few excerpts.
“Each case of research fraud that’s uncovered triggers a similar response from scientists. First disbelief, then anger, then a tendency to dismiss the perpetrator as one rotten egg in an otherwise-honest enterprise. But the scientific misconduct that has come to light in recent years suggests at the very least that the number of bad actors in science isn’t as insignificant as many would like to believe.”
A conclusion that is becoming more acceptable, but why might this be the case and what are we doing about this problem?
Why? Part of the answer lies in the very next paragraph:
“He insisted that he loved social psychology but had been frustrated by the messiness of experimental data, which rarely led to clear conclusions. His lifelong obsession with elegance and order, he said, led him to concoct sexy results that journals found attractive.”
On page 4
“In doing these studies, Stapel had to go through the tedium and messiness that are the essence of empirical science.”
On page 5
“The experiment — and others like it — didn’t give Stapel the desired results, he said. He had the choice of abandoning the work or redoing the experiment. But he had already spent a lot of time on the research and was convinced his hypothesis was valid. “I said — you know what, I am going to create the data set,” he told me.”
Further down page 5
“The results were published in The Journal of Personality and Social Psychology in 2004. “I realized — hey, we can do this,” he told me.”
“If Stapel’s status served as a shield, his confidence fortified him further. His presentations at conferences were slick and peppered with humor. He viewed himself as giving his audience what they craved: ‘structure, simplicity, a beautiful story.’”
What happened next? A young member of academic staff and two graduate students looked at some of the data and realised they were made up. Note that a senior colleague in the US told the young professor to let it drop. He didn’t, and the result is the exposure of Stapel as a fraud, with 51 papers retracted to date (see the summary on Retraction Watch).
Beyond the obvious, that Stapel was a fraud, what can we learn from this?
First, as discussed at length on page 9 of the NYT article, Stapel didn’t actually do anything new scientifically. He read the literature carefully and his “experiments” merely added to existing dogma. That is, the “results” and the “conclusions” were those expected by the community, who, therefore, felt vindicated.
To quote from the NYT article “Everybody wants you to be novel and creative, but you also need to be truthful and likely. You need to be able to say that this is completely new and exciting, but it’s very likely given what we know so far.”
So the field was massively at fault, for simply not engaging in any form of critical thought. This problem is NOT restricted to psychology. Remember the Amgen and Bayer studies at the start of this post. For a scientist, as opposed to a purveyor of snake oil, the experimental noise and the assumptions made in measurement and analysis are some of the most interesting parts of any paper. After all, these are the places where you can be certain that discovery lies.
Second, returning to page 3 of the NYT article, “He soon realized that journal editors preferred simplicity. “They are actually telling you: ‘Leave out this stuff. Make it simpler,’ ” Stapel told me. Before long, he was striving to write elegant articles.”
The conclusion is that our system of peer review and publishing actually encourages fraud, and that journals and universities are complicit. Though a fair number of commentators have stated that Stapel is simply trying to shift the blame, I disagree. What he is saying is that our system exerts a strong selective pressure that favours doctoring data and fraud. I don’t think the counter-argument has a leg to stand on. Again, remember the numbers of irreproducible papers identified by Amgen and Bayer: these were published in “excellent”, top-notch, peer-reviewed journals. So Stapel’s comment should be taken at face value: our system encourages fraud. Remember, too, that Stapel taught research ethics to students. I would argue that he had a very good understanding of how the system works and figured he could play the system rather than do any work. A secondary conclusion, drawn from the fact that Stapel taught research ethics, is that we can have as many courses and rules as we want, but a selection pressure operating in the other direction wins every time.
This selection pressure is very, very powerful. To do science you need funding; to get funding you need papers; and the more “prestigious” the paper, the greater the likelihood that you will obtain funding. The more you get, the higher you fly: tenure, promotion and joining the travelling circus. You will fly often, in and out of meetings, giving the same talk and jetting out ASAP to the next destination, never actually engaging in any scientific activity worthy of the name. Fiddling the books then becomes very tempting: it is one way to maintain the lifestyle and not have to sink back into the morass of data, noise, and signal coming from the experiment rather than the sample.
What can we do? The steps taken publicly by Nature Publishing Group last week are a start. However, before we allow them to crow, we should remember some facts. Nature Publishing Group are no paragons of virtue. For example, as Raphael Lévy pointed out in a post on his blog, they have publicly refused to accept that they should help pressure Francesco Stellacci to release the primary data relating to his STM images of stripes on nanoparticles (many posts; examples here, here and here). The same is true for a paper published in the NPG journal Scientific Reports (hat tip: Alexander Lerchl).
This particular NPG publication allows readers to comment directly on the paper, a so-called “new model” that, in principle, circumvents the lengthy process of extracting the truth through authors and institutions, as in the case of the stripy nanoparticle controversy (where Philip Moriarty waited a long time before the data were made available). Read the Scientific Reports paper and look at the comments.
This paper has serious problems.
Editorial action = 0.
So whilst Nature Publishing Group are making a great deal of noise about their new rules, they are actively flouting those very same rules. Nature are not alone. You only have to look across at Science and the debacle over “arsenate” DNA, and the immense work done by Rosie Redfield to uncover the truth (nice summary here), to see that the problem is pervasive. Note that the original Science article has yet to be retracted, though it is clearly wrong.
This doesn’t look good. Until we have a system where publication actually means that authors supply the raw data to whoever requests it, and where reward is tied not to some absurd notion of journal quality but to the intrinsic value of the work, we will have a regular flow of Stapels, in all fields. My reading regularly uncovers “problem” data. I try to teach my graduate students how to dissect a paper, its methods and data, so they too can spot such problems. But how will they fare in the world of science in the future? Will they be pushed into a “backwater” by a system intent on tabloid sensationalism, or will their talents actually be given room to flourish?
From a personal perspective, one motivation for engaging in this debate is to ensure that they have room to flourish, in a system where the only way to figure something out is to actually READ the paper AND do the experiments.
A last word.
Imagine that the two Dutch graduate students had been brushed aside, and that the young professor had followed the advice of his senior colleague in the US. The Stapel papers would still stand, poisoning psychology.
Finally, it isn’t all bad! There are people who publish innovative science and then find out that it is wrong. They go through the work with a fine-toothed comb, figure out what went wrong and then retract the paper, explaining exactly why. Two “exemplary retractions” on Retraction Watch (here, and my comment on an earlier case here) show that science is indeed alive and kicking. So for those starting out in science: if you enjoy the idea of understanding the world, keep at it. For some of us, going through that morass of data and noise from strange sources, and rising to the challenge of teasing apart a measurement system to find a reproducible signal, isn’t “work”; it is fun. Being paid to have fun is the greatest of privileges.
Update 3 November 2013