I inadvertently started a Twitter storm when I posted a few thoughts about bioRxiv (Figure 1). Submissions to bioRxiv have grown tremendously to more than 2000 per month, yet the percentage of papers that never move into publication in a peer-reviewed journal appears flat at ~30%. Assuming the analysis that appeared on bioRxiv is correct, 30% of submissions do not proceed into the traditional peer-review process for eventual publication in a scholarly journal. At a rate of 2000 new submissions per month, that is 600 preprints each month that will persist only as preprints. Why are these preprints not proceeding into the scholarly peer-reviewed literature?
For those who may not be familiar, bioRxiv is a biological sciences “preprint” server (https://www.biorxiv.org/content/what-unrefereed-preprint). This means it is an online database where scientists can self-publish their scientific manuscripts before publication in a peer-reviewed primary research journal. According to the bioRxiv site, the intention of the service is to let scientists “make their manuscripts available as ‘preprints’ before peer review, allowing other scientists to see, discuss, and comment on the findings immediately.” The implication is that these preprints will eventually be subjected to peer review and published in a journal. The rationale behind this need is that the process of publication in a peer-reviewed journal takes too long. Thus, the preprint server provides a way for information to be distributed to the scientific community (and the entire world) before final publication, or while the work is undergoing in-depth review and revision at a journal.
Because the system is online and bioRxiv has enabled commenting, the barrier to providing feedback, or just pontificating, is low. So scientists (actually, anyone with a Disqus account) can easily critique or comment on the work. One benefit is that the authors may receive pre-publication public feedback that they can use to make the work better. Another is that the preprint may generate buzz that helps the manuscript eventually be accepted and published in a traditional journal. Indeed, Rxivist is a site where preprints are ranked according to their Twitter buzz, and preLights is a site providing commentary by early-career life scientists on preprints.
From the Twitter responses, what has happened is a bit different. Some scientists use the preprint server as intended: as a way to distribute their science, get feedback, and improve the work before submitting the study for peer review and eventual publication in a peer-reviewed journal. For others, publication on the preprint server is the end goal. From the Twitter storm, I gained some insight into the rationales for using bioRxiv as an online non-reviewed “journal,” rather than as a step along the publication process. Some use it to publish negative results that they feel will not be of sufficient interest to journal editors and reviewers to be published as a peer-reviewed article. Some authors want to avoid the fees associated with publication in a journal (bioRxiv is free for submitting authors and readers). Some use it to publish work that they have no intention of revising and so do not want to subject to the peer-review process. There are various reasons for this: the lab is closing; budgetary constraints rule out any chance of performing more experiments; or the students or postdoctoral fellows who did the work have left, and no one remaining in the lab wants to pursue the study. There are undoubtedly other reasons.
For articles that never proceed past the preprint stage, are they still preprints? In effect, they are the final publication, because they will never move to the next stage of peer review and publication in a traditional scientific journal. To me, this creates a dilemma. If self-publication as a preprint is sufficient to serve as a basis for scientific research, then how should preprints be cited? Should distinctions be made between peer-reviewed and non-reviewed references in a bibliography or reference list? Should this discussion be broadened to include how to distinguish peer-reviewed journal articles from other types of non-reviewed scientific literature? I would argue yes. As scholarly publishing evolves, readers, especially those who are not experts in a particular field, need to be able to distinguish among different kinds of publications.
In this context, though, it is important to remember that not all peer-review processes are equal. Articles published in predatory open-access journals without proper peer review are a major concern. Of course, even well-recognized, high-profile journals publish flawed work. As I was engaged in the flurry of tweets and replies related to bioRxiv, Nature was taken to task on Twitter for a high-profile retracted paper with multiple authors (Figure 2).
Impact factor is one way of quantifying journal “quality,” but other metrics could be helpful: a retraction factor (retractions in a given year for research articles divided by the number of research articles published that year), an erratum and corrections factor (errata plus corrections in a given year for research articles divided by the number of research articles published that year), and finally a peer-review factor. Exactly how to calculate a peer-review factor is less obvious. There could be weighted scores for the average number of reviewers per research article, the number of revision cycles, the number of editors, and the evaluation time. The retraction factor, erratum and corrections factor, and peer-review factor could all be rolled into a “trustability” factor (TF). Of course, preprint servers would not have such a rating.
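The proposed factors are simple ratios, and they could be combined in many ways. As a minimal sketch, the following illustrates one possible calculation; the function names, the example numbers, and the equal-and-opposite weighting in the combined score are my own illustrative assumptions, not an established formula:

```python
# Illustrative sketch of the proposed journal metrics. All names, numbers,
# and the weighting in trustability_factor are hypothetical assumptions.

def retraction_factor(retractions: int, articles: int) -> float:
    """Retractions in a given year divided by research articles published that year."""
    return retractions / articles

def erratum_corrections_factor(errata: int, corrections: int, articles: int) -> float:
    """(Errata + corrections) in a given year divided by research articles
    published that year."""
    return (errata + corrections) / articles

def trustability_factor(rf: float, ecf: float, prf: float) -> float:
    """Combine the three factors into one score. In this illustrative weighting,
    a higher peer-review factor raises trustability, while retractions and
    errata/corrections lower it."""
    return prf - rf - ecf

# Example: a journal publishing 500 research articles in a year,
# with 2 retractions, 10 errata, and 5 corrections.
rf = retraction_factor(2, 500)                 # 0.004
ecf = erratum_corrections_factor(10, 5, 500)   # 0.03
```

How to weight the components, and how to normalize across journals of very different sizes, would of course need real debate; the point of the sketch is only that the inputs are already countable quantities.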
Some of the Twitter comments indicated that the level playing field of the preprint environment was the best way for scientific information to be disseminated. In this view, anyone could read the work and judge its quality for themselves, read the comments, decide whether to cite it, and decide whether to plan future studies on the basis of papers published on the preprint server. My first reaction to this suggestion is that not all readers are sufficiently knowledgeable to judge the quality of the work. My second reaction was that there is already so much to read; why do I want to have to read everyone else’s comments? Why should I trust the comments associated with a preprint? Providing this evaluation behind the scenes is why well-executed peer review is valuable. Someone else is responsible for finding appropriate reviewers, evaluating the study and the reviewers’ concerns, and making a decision about whether the work merits publication. In my new world with journal TF ratings, I could be confident in papers published in journals with a high TF.
If the future of scientific publication is a level playing field of open-access repositories without expert peer review, then each discipline may need some form of Faculty of 1000 to help guide readers to the best-quality research. I am not convinced that visibility on Twitter is the best way to decide what to read, although I admit to hearing about interesting studies through Twitter. I am not ready to give up on expert peer review or expert curation of the scientific literature. What is clear is that scholarly publishing is in a period of transition and change.
Whether curation before publication in a scientific journal continues as the standard remains to be seen. One could argue that, with close to 70% of preprints eventually moving into peer-reviewed journals, the curation-before-publication model is still an important part of scholarly publishing. For this 70%, preprint servers have simply added another step in the process: perform scientific research, present the research at conferences or meetings, post the research on a preprint server, then submit the research for in-depth review and publication in a journal (Figure 3). Alternatively, one could argue that there is a place in the literature for preprint publication independent of in-depth review and journal publication. It remains to be seen how much research will be disseminated with preprints as the final publication, or whether certain fields in biology will adopt this as the primary mode of publication.
R. J. Abdill, R. Blekhman, Tracking the popularity and outcomes of all bioRxiv preprints. bioRxiv (13 January 2019) DOI: 10.1101/515643
bioRxiv (accessed 25 February 2019) https://www.biorxiv.org/
Rxivist (accessed 25 February 2019) https://rxivist.org/
Faculty of 1000 (accessed 25 February 2019) https://f1000.com/
Retraction Note: A homing system targets therapeutic T cells to brain cancer. Nature (20 February 2019) DOI: 10.1038/s41586-019-0967-z
@NancyRGough https://twitter.com/NancyRGough (as seen on 25 February 2019)
Tweet that launched the Twitter storm:
“Less than 70% of bioRxiv papers from 2013 published in a peer-reviewed journal. This is what scares me about making preprints as citable as peer-reviewed studies.” pic.twitter.com/vcdzqqGtZa
- Nancy R Gough (@NancyRGough) February 24, 2019
Cite as: N. R. Gough, Preprints as Final Publications. BioSerendipity (25 February 2019) https://www.bioserendipity.com/preprints-as-final-publication/ (as seen on Medium.com)
Originally published at https://www.bioserendipity.com on February 26, 2019.