An excellent example of either crappy science reporting or crappy science …

Let’s have a look and see if we can decide. ScienceDaily has this piece on a paper published in the Journal of Medical Ethics in which the claim is made that “US Scientists Significantly More Likely to Publish Fake Research.” The problem is that the statistics given don’t show that.

The study is said to look at papers withdrawn according to PubMed between 2000 and 2010, which, read literally, means 2001 through 2009 inclusive (though I’m guessing that is not what was meant). Here’s the data:

Papers retracted: 788
… because of error: 545
… “attributed” to fraud (the write-up gives no indication of what this attribution is based on): the rest of the papers, i.e., 243. (We are left to do our own math, possibly because the number is so small that it might fail to impress us?)

So that’s 243 papers over 9, 10 or 11 years.

Now, here’s the tricky part:

The highest number of retracted papers (260) were written by US first authors.

Interesting. We were talking about fraudulent papers, but now we are citing numbers that include papers with errors. We are also told that “one in three” of these are fraudulent; again, we need to do the math ourselves. The answer is about 87 (260 / 3 ≈ 86.7).

Now, at this point, I should mention that there are between about 3,200 and 4,000 days in the period “between 2000 and 2010,” and there are probably more than 100 scientific papers published every day. So the rate of production of US-based alleged fraudulent papers is minuscule.
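The back-of-the-envelope arithmetic above takes only a few lines to check. Note that the “100 papers per day” figure is this post’s own rough guess, not a measured number:

```python
# All counts are from the post; "100 papers per day" is the post's rough guess.
retracted = 788
errors = 545
fraud = retracted - errors           # the subtraction we are left to do ourselves
print(fraud)                         # 243

us_retracted = 260
us_fraud = us_retracted / 3          # "one in three" of the US retractions
print(round(us_fraud, 2))            # 86.67

days_low, days_high = 9 * 365, 11 * 365   # the 9-to-11-year window
print(days_low, days_high)                # 3285 4015
papers_in_window = days_low * 100         # assuming ~100 papers published per day
print(us_fraud / papers_in_window)        # a tiny fraction of all papers published
```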

But that is not the real problem with this report. This is the problem: “The highest number of retracted papers were written by US first authors (260), accounting for a third of the total. One in three of these was attributed to fraud” … “The UK, India, Japan, and China each had more than 40 papers withdrawn during the decade. Asian nations, including South Korea, accounted for 30% of retractions. Of these, one in four was attributed to fraud.”

Well, that’s pretty meaningless unless we want to get the calculator out again, and even then it is useful only if we assume that the ratio of “fraud” to “withdrawn” is identical everywhere; yet here we have a hint that it varies by as much as about 30%. But who cares about that … you see the problem, right? We are told (in the article’s title) that the RATE of fraud in the US is higher than anywhere else, but we are then told in the body of the piece that the NUMBER of fraudulent cases is highest in the US.

Not the same thing at all.
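A toy example makes the distinction concrete. The numbers below are invented purely for illustration (nothing here comes from the study): a country can have the highest count of fraudulent retractions while another has a far higher rate.

```python
# Hypothetical figures, invented for illustration only:
# (papers published in the window, fraudulent retractions)
countries = {
    "Bigland":   (1_000_000, 90),   # highest COUNT of fraud...
    "Smallland": (   50_000, 20),   # ...but the fraud RATE is over four times higher
}
for name, (published, fraud) in countries.items():
    print(name, fraud / published)
# Bigland 9e-05
# Smallland 0.0004
```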

But, wait, there’s more. Or should I say, less.

One would think that the rate of fraud being alarmingly higher (or even moderately but statistically significantly higher) in the US would be important enough that it would be mentioned in the abstract of the original published paper. Let’s see if it is:


Background Papers retracted for fraud (data fabrication or data falsification) may represent a deliberate effort to deceive, a motivation fundamentally different from papers retracted for error. It is hypothesised that fraudulent authors target journals with a high impact factor (IF), have other fraudulent publications, diffuse responsibility across many co-authors, delay retracting fraudulent papers and publish from countries with a weak research infrastructure.

Methods All 788 English language[1] research papers retracted from the PubMed database between 2000 and 2010 were evaluated. Data pertinent to each retracted paper were abstracted from the paper and the reasons for retraction were derived from the retraction notice and dichotomised as fraud or error. Data for each retracted article were entered in an Excel[2] spreadsheet for analysis.

Results Journal IF[3] was higher for fraudulent papers (p<0.001). Roughly 53% of fraudulent papers were written by a first author who had written other retracted papers ('repeat offender'), whereas only 18% of erroneous papers were written by a repeat offender (χ²=88.40; p<0.0001). Fraudulent papers had more authors (p<0.001) and were retracted more slowly than erroneous papers (p<0.005). Surprisingly, there was significantly more fraud than error among retracted papers from the USA (χ²=8.71; p<0.05) compared with the rest of the world.

Conclusions This study reports evidence consistent with the ‘deliberate fraud’ hypothesis. The results suggest that papers retracted because of data fabrication or falsification represent a calculated effort to deceive. It is inferred that such behaviour is neither naïve, feckless nor inadvertent.

A few notes:
[1] Note that since only English-language papers are used, there will be a slight reduction of papers from non-US sources in the sample.

[2] Funny how an article looking into fraud and retracted papers would use a non-open-source, and thus unverifiable, mathematics tool in their research. Is this a case of product placement or merely a misplaced understanding of the concepts of replicability and transparency?

[3] IF = Impact Factor

The claim in the ScienceDaily headline is not even mentioned.

Now, please revisit this statement from the abstract: “Surprisingly, there was significantly more fraud than error among retracted papers from the USA (χ²=8.71; p<0.05) compared with the rest of the world.”

Surprising? Why? Because of the original hypothesis that fraud would more likely come from places with “weak research infrastructure.” I’m not sure why they thought that, but the evidence contradicted it, so they are surprised. Does the quoted sentence say “US Scientists Significantly More Likely to Publish Fake Research, Study Finds”? No. It says that among US retractions, fraud occurs at a higher rate relative to error. Perhaps US scientists have the same or even lower rates of fraud, but screw up their data more often.
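As an aside, a χ² statistic on a 2×2 table is pencil-and-paper arithmetic, so we could check the quoted figure if the underlying table were given. Below is the computation run on counts reverse-engineered from the figures in this post (87 ≈ one third of 260 US retractions; 243 fraud and 545 error overall). These are guesses, not the paper’s actual data:

```python
# 2x2 table: rows = (US, rest of world), columns = (fraud, error).
# Counts are rough reconstructions from the figures quoted in this post,
# NOT the paper's own table (which the abstract does not report).
observed = [[87, 173],    # US:   fraud, error (260 total)
            [156, 372]]   # rest: fraud, error (528 total)

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)       # 788 retracted papers in all

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

print(round(chi2, 2))     # 1.25
```

For one degree of freedom the p<0.05 critical value is 3.84, so these guessed counts would not even be significant; whatever table produced the reported χ²=8.71 must differ from the rounded numbers in the press release, which just shows we aren’t given enough to check.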

Conclusion: Interesting paper mauled by crappy science reporting. I expect to see a retraction!

Steen, R. (2010). Retractions in the scientific literature: do authors deliberately commit research fraud? Journal of Medical Ethics DOI: 10.1136/jme.2010.038125


25 thoughts on “An excellent example of either crappy science reporting or crappy science …”

  1. Not too related to the original point of this post but of interest: At the end of the ScienceDaily piece, is this warning:

    Editor’s Note: This article is not intended to provide medical advice, diagnosis or treatment.

    I find this interesting because it is an example of policy doing someone’s thinking for them.

    Then, more related we have this below the warning about medical advice:

    “Need to cite this story in your essay, paper, or report? Use one of the following formats:

    BMJ-British Medical Journal (2010, November 15). US scientists significantly more likely to publish fake research, study finds. ScienceDaily. Retrieved November 17, 2010, from /releases/2010/11/101115210944.htm

    Note: If no author is given, the source is cited instead.

    So, if you want to cite the original story in your High School or College paper without reading it, this is how.

  2. Judging from the abstract of the actual paper (the paper itself is behind a paywall) I would say that this is a case of bad reporting of bad science. Did Dr. Steen take the obvious step of dividing the number of retracted papers by total number of papers? That absolute numbers are higher in the US than in the other countries in the study should not be a surprise, but is that merely because the US has a much bigger and more established research infrastructure than any of those other countries, or is there something more nefarious afoot? Show me the rate of retractions for error and fraud for these countries, and then we’ll talk. For instance, Dr. Steen’s hypothesis might still be valid, because he has only shown that the US has a larger share of fraud among its retracted papers, not that the US has a higher overall fraud rate.

    An alternative hypothesis: The journal system is good at catching errors but not so good at catching intentional fraud. American researchers, who have been under more pressure for longer than some of their colleagues elsewhere to publish in journals, have therefore optimized their internal lab procedures to catch cases of error before the papers get published, meaning that a larger fraction of papers that have to be retracted later are cases of fraud rather than error. I have no idea whether that hypothesis is true, but it’s at least as plausible as Dr. Steen’s initial hypothesis.

  3. “Funny… use a non-OpenSource, and thus unverifiable, mathematics tool in their research.”

    Is it now? It shouldn’t matter which calculator you use.
    Most of the time the calculator would be SPSS or the like, not exactly freeware, is it? But one can do the same tests (if implemented) in PSPP, for example, no problem.

  4. The fact that proprietary closed-source software was used does not make the results unverifiable in this case. The Chi-Square test is not based on secret algorithms; it can be calculated using a pencil and a piece of paper.

  5. I call it the “Statistica Syndrome” (it sounds better than the S Syndrome). People load a table into software like Statistica, press a button, and make outrageous claims. I’m surprised (at least to the best of my limited knowledge) that this didn’t feature in a movie first – such as the case with Agent Scully and the PhD by Google Search phenomenon. Anyone who actually understands the statistics that they’re fooling around with wouldn’t make such outrageous claims unless they really intend to jerk people off. It reminds me of all those decades ago with the advent of computer analysis of X-ray diffraction patterns. Suddenly there were all these expert wannabes who didn’t understand the physics of X-ray diffraction who were making fantastic claims based on well-known artifacts (even something as basic as the higher order diffraction patterns). The software back in those days wasn’t refined enough to eliminate the majority of artifacts. I can’t comment on modern software; I haven’t done X-ray diffraction in about 20 years.

    Oh well, at least for once it’s not a “Bayesian” routine – to which I usually tell people “There’s something wrong with your Bayesian ass – you know the one that sits on your shoulders.” People ignorant of basic statistics is one thing, but if they think saying “Bayesian” or “Inverse Bayesian” will make me any friendlier they’re seriously mistaken.

  6. The fact that proprietary closed-source software was used does not make the results unverifiable in this case.

    It almost never would, but there is an irony in research that flags 0.0-whatever-0 percent of papers as problematic doing the whole Excel product placement thing, apparently blind to the closed-source problem.

    The fact that χ² is not based on a secret algorithm is utterly unrelated to the issue at hand. We know that 1+1=2 … no secrets. But if your software unpredictably, in a way that can’t be tested or fixed, tells you that 1+1=1.0002, then your software is fucked, and your decision to use closed-source software places the responsibility for the math being done wrong on your head.

  7. First author!?!? They used the fucking first author. Have the authors of this study actually been in a fucking research lab? Last author is probably the most relevant person to use, I mean if you have to choose.

  8. Last author is probably the most relevant person to use

    Depends on which field you are in, and which subfield you are in. In my area of physics, the first author is the responsible party of whom you ask any questions about the science content (the Authority Figure may know, too, but won’t be as familiar with the paper), and relative contributions decrease as you proceed down the author list (i.e., there is no special significance to being last author). Other fields, such as experimental particle physics, often just list the authors alphabetically, and you have to look carefully for the note identifying the corresponding author as position in the author list carries zero significance.

    Even when the last author is considered relevant, often that’s the Authority Figure, and the first author is the person who actually did the work. Sometimes the first author screwed up or committed fraud and the Authority Figure discovers it after publication. Other times the Authority Figure is the fraudster. The abstract does not describe any obvious way by which these two cases are distinguished.

  9. Of course it’s an excellent example of crappy science reporting, it’s a press release. Science Daily is not a journalistic news outlet. It is an aggregator of press releases issued from educational and institutional organizations. It is common to find weak or misleading statements in press releases of all stripes. One would hope for a higher standard in a release from the BMJ, or any journal, but it’s still just a press release.

  10. “Just a press release” – which is what is usually read by the lay public because most people don’t read the original research. False or misleading press releases are all too common and are an area in which the perception of science in the public can be greatly improved, but we’ve all become far too complacent with the status quo of utter rubbish.

  11. Nationality was given primacy in the study’s purpose, and then selectively ignored.

    Not: “What percentage of all bad papers were US, UK, etc.”,

    It should be: “What percentage of US papers were bad, vs. what percentage of UK papers were bad, etc.”.

    This is not so much a study, as an attempt to hide a nationalistically motivated smear behind science’s good reputation.

  12. Kimbo: Exactly. This one is not the worst I’ve seen but it is on the poor end of the spectrum, and there are examples of good press releases out there. We bloggers need to be pushing the mean (and extremes) by pointing this out all the time.

  13. @Eric You raise a reasonable issue with my “first author” rant. However, I respectfully contend that you are wrong in this instance. Yes fields and subfields use varying conventions. However, PubMed does not aggregate many experimental particle physics papers. The vast majority of papers in PubMed are funded by NIH or agencies that use NIH styles. Authorships that are not first or senior are not looked on favorably by these funding agencies and the last (senior) author is generally considered the go to person of primary responsibility.

  14. MadScientist @ 7:

    Suddenly there were all these expert wannabes who didn’t understand the physics of X-ray diffraction who were making fantastic claims based on well-known artifacts

    Like claiming to have found weak superstructure peaks that were actually W La contamination peaks?

  15. The fact that proprietary closed-source software was used does not make the results unverifiable in this case.

    Perhaps, perhaps not. It depends on which data are presented in a transformed way. If they present all the raw data, then you are correct, if not, then you are not necessarily correct. I’d love to hear your argument.

    Doesn’t really matter anyway. Regular use of proprietary numerical software is the same as regular use of proprietary lab bench machinery or methods.

  16. @NJ: Unfortunately. Don’t remind me of these things, it always makes me want to throw stuff at people. I wonder if the general competence level is going down or if there are just so many people involved these days that it’s easier to find people who don’t know what they’re doing.

  17. Wouldn’t this be evidence that more fraud was discovered in the US, not that more fraud was committed? By that reasoning, wouldn’t you expect more retractions due to fraud from countries with a stronger research infrastructure, not fewer?

    Despite that, if I have to choose, I’m putting my vote in the “bad science reporting” category. The number of fraud cases in the US compared to the rest of the world is only a small part of the paper, and not even the most interesting part. And yet it gets the headline.

  18. They are – IMO – using the wrong statistical test. With ~10 years of data, some of those papers have had much more time for errors to be found and reported. Instead they should be looking at time-to-retraction and using the log-rank test (survival analysis). This would be a more powerful analysis and would give less biased results.

    Just off the top of my head, studies with more authors also have more people who might notice errors in a study and do something about them. It could be that studies with more authors are at greater risk of having errors detected.

    (1) Kaplan-Meier plots of the retraction rate would show more early retractions with “many” authors compared to “few” authors (non-proportional hazards).
    (2) The retraction rate for both groups (few and many authors) will tend to converge to the same proportion over time.
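A minimal Kaplan-Meier estimator is only a few lines of Python. The times below are invented purely to show the mechanics (the paper’s time-to-retraction data are not reproduced here):

```python
def kaplan_meier(data):
    """Return [(event_time, S(t))] for (time, event) pairs.

    event=1 means the paper was retracted at that time; event=0 means it was
    censored (still unretracted when observation ended). S(t) is the classic
    product-limit estimate: the product of (1 - d_i/n_i) over event times,
    where d_i = retractions at t_i and n_i = papers still at risk.
    """
    data = sorted(data)
    survival, at_risk = 1.0, len(data)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for time, ev in data if time == t and ev == 1)
        seen = sum(1 for time, ev in data if time == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, round(survival, 4)))
        at_risk -= seen       # everyone observed at t leaves the risk set
        i += seen
    return curve

# Hypothetical (months-to-retraction, retracted?) pairs, for illustration only:
sample = [(1, 1), (2, 1), (3, 0), (4, 1), (5, 0)]
print(kaplan_meier(sample))   # [(1, 0.8), (2, 0.6), (4, 0.3)]
```

Comparing two such curves (few vs. many authors) with a log-rank test, as the comment suggests, would account for the unequal follow-up time that a raw retraction count ignores.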
