

 

A Future for Scientific Discourse

Alex Bäcker

September 2004

 

Computation, Computers, Information and Mathematics Center, Sandia National Laboratories*, P.O. Box 5800, Albuquerque, NM 87185

and

California Institute of Technology, MC 139-74, Pasadena, CA 91125

 

 

The future of scientific publishing is the subject of intense debate in government, academic and industry circles these days (Nature 431, 111 (2004)). The introduction late last week of a beta release of Google Scholar, a freely available search engine for the scientific literature, makes the issue as topical as ever: articles in non-peer-reviewed archives, such as arXiv, stand side by side with those in peer-reviewed journals. Is this a democratization of science under way? Or is it a loathsome drop in standards? I make the case here that the current system of peer review and scientific publishing is obsolete. Below, I explain why, review alternative systems that have emerged recently and their own problems, propose a new emerging paradigm of scientific publication and validation, and discuss some of its potential advantages.

 

Roles of scientific publishing and peer review

 

There is no doubt that scientific publishing and peer review serve crucial functions in today’s scientific community. These functions include:

1. distinction of good science from nonsense;

2. prioritization, ranking or selection of the “most important” science that “deserves” to be published in highly read journals;

3. feedback about a piece of scientific work to its authors from their colleagues, sometimes resulting in an improvement of the work;

4. segmentation of the scientific literature into clusters more or less aligned with the interests of specific scientific communities;

5. dissemination of scientific work in an enduring and citable way;

6. evaluation of the work of research scientists for purposes of tenure, promotion, etc.

 

 

Problems with the traditional system of science dissemination

 

Problems with the current system of publication and peer review have long been known to exist (see, for example, Goodstein, 2000, p.9). These problems include:

1. the peer doing the reviewing is often a competitor for the same resources (pages in prestigious journals, funds from government agencies) being sought by the authors, presenting a conflict of interest;

2. “one size fits all” rankings of scientific work cannot be as useful as rankings customized for an individual scientist, taking into account his or her interests and previous knowledge;

3. review in a limited period of time by a small number (typically around two) of busy, anonymous scientists, who are neither compensated for their effort nor held to the same standards of rigor that apply to authors, leads to frequent imperfections in the review process, resulting both in the inclusion of errors in the scientific literature and in the exclusion of valuable science from it. If the two or three referees of a paper do not understand it, or its significance, the paper does not get published, or at the very least is significantly delayed;

4. reliance on a few “experts” in the field introduces a bias against revolutionary science and in favor of incremental additions to the current paradigm, since this paradigm is, after all, what the expert referees have spent their careers building;

5. the cumbersome nature of the peer review process prevents comparatively small pieces of scientific work from being submitted for publication, leaving substantial amounts of potentially useful scientific knowledge (including massive amounts of “negative” results) unpublished, and for most intents and purposes lost to scientific knowledge;

6. the duration of the peer review process delays the dissemination of scientific results, delaying progress and increasing duplication of efforts by independent groups;

7. scientific work often, and to a growing degree, straddles multiple scientific disciplines; classification into one field (made necessary by the selection of a journal) is therefore rather futile and can lead, in the absence of truly multi-disciplinary search tools that can “translate” the lingo of one field into that of another, to the withholding of valuable results from part of their potential audience, unless scientists are willing to rewrite the article and submit it to multiple journals. That is a wasteful practice, discouraged by the scientific community and made difficult by its demands on the time of the most productive scientists, who get (or should get) their rewards more for the production of novel discoveries than for the exhaustive dissemination of their existing results;

8. the growing size of the scientific literature and its increasingly multidisciplinary character, coupled with the constant 24 hours in a day, make it increasingly difficult for scientists to keep abreast, in a timely fashion, of all the journals that contain science relevant to them;

9. the high and steeply growing costs of scientific journal subscriptions make scientific results less than universally accessible, particularly in the developing world;

10. the static nature of paper journals makes it difficult to keep results updated with errata, corrections, etc.

 

Regrettably, examples of peer or editorial reviews that reject papers by induction (“other papers on this topic have failed to survive the scrutiny of independent referees, thus yours is unlikely to and will not be sent for refereeing at all”), that cite papers to support statements not in the least supported by the paper cited, that betray a failure to understand the paper at hand, or that miss crucial mistakes in a manuscript, are not hard to find.

 

Furthermore, the impact of the problems with peer review is growing now that the exponential growth of science is over (Boyack and Bäcker, 2003). In the words of David Goodstein (1994):

Peer review, one of the crucial pillars of the whole edifice, is in critical danger…it is not at all suited to arbitrate an intense competition for research funds or for editorial space in prestigious journals. There are many reasons for this, not the least being the fact that the referees have an obvious conflict of interest, since they are themselves competitors for the same resources. This point seems to be another one of those relativistic anomalies, obvious to any outside observer, but invisible to those of us who are falling into the black hole. It would take impossibly high ethical standards for referees to avoid taking advantage of their privileged anonymity to advance their own interests, but as time goes on, more and more referees have their ethical standards eroded as a consequence of having themselves been victimized by unfair reviews when they were authors. Peer review is thus one among many examples of practices that were well suited to the time of exponential expansion, but will become increasingly dysfunctional in the difficult future we face.

As a young high school graduate, I recall thinking that one of the attractions of science was the objective nature of the correctness of scientific results, the irrefutability of data. With science, it is possible to make discoveries about Nature independently of whether anyone believes them. Although success as a professional scientist no doubt depends on the scientist’s ability to advocate for his/her results, correctness does not, and in the long run, the truth always prevails. So, while a suspect can be wrongly convicted, or exonerated, on the basis of a manipulated jury, the jury is always out in science. This continual self-correcting character ensures that it eventually arrives at the truth.

 

And yet this character of science is unfortunately not shared by the process of scientific publication. Arbitrary and biased decisions by a small number of competitors routinely determine the publication fate of scientific findings. Every scientist knows a fair number of horror stories of unfair, ignorant, misjudging or outright nasty reviews. Are these problems insurmountable and intrinsic to scientific dissemination, or can they be addressed? I claim scientists can do better.

 

Alternative science dissemination models

 

First, I very briefly review some of the recent publishing alternatives that are already being pursued. For much more exhaustive coverage, see Suber (ongoing). To address the lack of universal access to scientific papers, open-access journals have emerged (e.g. the Public Library of Science, BioMed Central). But these remain subject to all the other perils of traditional journals, including peer review. To allow fast, barrier-free dissemination of results, public archives have emerged (e.g. arXiv). But except for very narrow and well-defined communities, where all papers can be sent to a list of all scientists in the field with little risk of missing important readership segments or of overwhelming the community with too many papers and too low a signal-to-noise ratio, these archives leave many of the above-mentioned problems unsolved (including the important issue of allowing the evaluation of a scientist’s work by non-experts, or by experts who have not read every paper the scientist has ever published), and have in fact not been widely adopted by the largest scientific communities, such as chemistry and biology.

 

Among the most revolutionary online publishers is EPrints (http://www.eprints.org/), dedicated to self-archiving. Associated with EPrints are the Open Citation Project, providing reference linking and citation analysis for open archives, and CiteBase, a search tool for the open-access scientific literature. And yet EPrints does not consider its archives a substitute for publication:

 

…for scholarly and scientific purposes, only meeting the quality standards of peer review, hence acceptance for publication by a peer-reviewed journal, counts as publication. http://www.eprints.org/self-faq/#What-self-archive

 

 

Is peer review defensible?

The defense of peer review, which has been surprisingly staunch even among critics of other aspects of the system (Harnad, 1998, 2000; Buck et al., 1999), usually relies on: a) the volume of literature that would otherwise have to be read by scientists (role #2 of peer review, above); b) the inability of scientists to decide by themselves what journal a paper is “worthy of” (again, #2); and c) the ineffectiveness of public commentary (as opposed to anonymous peer review), owing to scientists’ fear of publicly criticizing a paper by a colleague who could review their next grant proposal or paper, which relates to role #1 above, the distinction of science from nonsense. This last argument is made particularly eloquently by Harnad (1998):

 

If someone near and dear to you were ill with a serious but potentially treatable disease, would you prefer to have them treated on the basis of the refereed medical literature or on the basis of an unfiltered free-for-all where the distinction between reliable expertise and ignorance, incompetence or charlatanism is left entirely to the reader, on a paper by paper basis?

 

Upon close examination, however, these arguments do not hold water. Tackling them in turn:

a) With good ranking and search algorithms, the volume of science produced is a non-issue: Google and other search engines routinely deal with much larger volumes of raw data on the WWW, and yet finding information on the Web is often faster than finding it in the scientific literature.

b) Classification of papers into journals to separate them by field and/or “quality” is an unnecessary and useless relic of papyrocentric times (Harnad, 1999). Again, good search tools can provide personalized groupings of papers that are much more meaningful to the interests of individual researchers, and network-based metrics, such as citation counts, can provide much more meaningful evaluations of the quality of a paper than the brand name of the journal it was published in. The latter is static, low resolution (there are only so many journals in a field), and decided by a handful of hand-picked referees and editors; the former is dynamic, relies on significantly more data, has high resolution, and can be customized to the interests of individual scientists.

c) The first problem with this defense of peer review is the very system it seeks to defend: if anonymous peer review were replaced by more democratic measures that relied on more evaluations, the fear of retaliation would be significantly diminished. Even if peer review persisted in arenas such as grant proposal evaluation, the very self-correcting nature of science (no pun on the journals intended) depends on scientists’ willingness to correct other scientists’ mistakes; I would go further and venture that, fear of retaliation notwithstanding, a mistake in the literature is to scientists what fresh blood is to piranhas. A dynamic rating system that tallied how many confirmatory and critical citations each paper received, as sketched below, would be a far better indicator of whether the findings are reliable and reproducible or a fluke than peer review by two or three competitors of the authors.
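To make the idea concrete, here is a minimal sketch (in Python) of how such a tally of confirmatory and critical citations could be kept. The citation-type labels, the data layout and the crude verdict rule are illustrative assumptions of mine, not features of any existing service:

    from collections import Counter
    from typing import Dict, List, Tuple

    # Each citation is a (citing_paper, cited_paper, citation_type) triple.
    # The citation types are illustrative assumptions; any controlled vocabulary
    # agreed upon by the community would do.
    Citation = Tuple[str, str, str]  # type is e.g. "confirms", "refutes" or "neutral"

    def citation_tallies(citations: List[Citation]) -> Dict[str, Counter]:
        """Tally, for every cited paper, how many citations of each type it received."""
        tallies: Dict[str, Counter] = {}
        for citing, cited, ctype in citations:
            tallies.setdefault(cited, Counter())[ctype] += 1
        return tallies

    def reliability_signal(tally: Counter) -> str:
        """A crude, purely illustrative verdict derived from the tally."""
        confirms, refutes = tally["confirms"], tally["refutes"]
        if confirms + refutes == 0:
            return "untested"
        return "disputed" if refutes >= confirms else "supported"

    if __name__ == "__main__":
        citations = [
            ("paperB", "paperA", "confirms"),
            ("paperC", "paperA", "confirms"),
            ("paperD", "paperA", "refutes"),
            ("paperE", "paperD", "neutral"),
        ]
        for paper, tally in citation_tallies(citations).items():
            print(paper, dict(tally), reliability_signal(tally))

Because the tallies are recomputed whenever new citations arrive, the verdict on a paper keeps evolving with the literature rather than being fixed at the moment of publication.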

 

It may take a generation for scientists to get over their indoctrination in the virtues of peer review (as my good friend, the distinguished neurosurgeon Joe Bogen, says, the best way to win an argument is to outlive your opponents), but I believe a switch to less archaic, more democratic and more effective science dissemination methods is inevitable.

 

Dissociating publication from evaluation

For a hint of a direction that might lead to a better alternative, it is worth looking outside the realm of science. How does the world at large disseminate information? Government, corporations and private citizens increasingly rely on the World Wide Web as a medium of practically free distribution for information they want to make available to the world. Of course, it would be impossible to find anything on the vast WWW were it not for search engines. So how do search engines decide which pages are most relevant for a user given a specific query? Google was made famous by a metric, PageRank, that it developed to rank the “importance” of individual webpages. This system has a number of advantages over the current scientific publishing system. First, it allows the publication of everything. Second, ranking is a dynamic process, one that incorporates changes in the community’s view of the relevance of a page. Third, by virtue of counting links to a page from any other page on the WWW, for better or worse, it is a much more democratic process, incorporating a potentially very large number of opinions into the rating of any one page. Fourth, both because the rules of the game are a lot clearer than those for peer review and because of the larger numbers involved, it leaves less room for subjective discrepancies and individual conflicts of interest to play a big role in any particular rating.
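To illustrate how the same link-counting idea could carry over to papers, the following sketch runs a PageRank-style power iteration over a toy citation graph, treating a citation the way PageRank treats a hyperlink. The toy data, damping factor and iteration count are assumptions chosen only for the example; this is not Google's actual implementation:

    from typing import Dict, List

    def pagerank(citations: Dict[str, List[str]], damping: float = 0.85,
                 iterations: int = 50) -> Dict[str, float]:
        """PageRank-style power iteration in which a citation from one paper to
        another plays the role of a hyperlink: papers cited by highly ranked
        papers end up highly ranked themselves. (Rank held by papers that cite
        nothing is simply dropped in this simplified sketch.)"""
        papers = set(citations) | {c for cited in citations.values() for c in cited}
        n = len(papers)
        rank = {p: 1.0 / n for p in papers}
        for _ in range(iterations):
            new_rank = {p: (1.0 - damping) / n for p in papers}
            for citing, cited_list in citations.items():
                if not cited_list:
                    continue
                share = damping * rank[citing] / len(cited_list)
                for cited in cited_list:
                    new_rank[cited] += share
            rank = new_rank
        return rank

    if __name__ == "__main__":
        # Toy citation graph: each key cites the papers in its list (illustrative data).
        toy = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["A", "C"]}
        for paper, score in sorted(pagerank(toy).items(), key=lambda kv: -kv[1]):
            print(paper, round(score, 3))

Because rank flows from citing papers to cited ones, a paper cited by already well-regarded papers is itself rated highly, which is precisely the recursive notion of quality that journal brand names only approximate.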

 

Mining scientific citation networks for relevance

The wealth of data in scientific citation networks can be harnessed to yield highly useful metrics of relevance. Yet contemporary search engines still have a long way to go before they are of real value to the scientific community. Some of the remaining hurdles are:

1. A way to rank the credibility of each scientific source must be established, since democracy may not be the best way to rate scientific results. To quote Albert Einstein in response to a pamphlet entitled 100 Authors Against Einstein: "If I were wrong, one would be enough."

2. A way to ensure the immutability of, and continued access to, papers over time, since individual web pages and URLs scattered all over the WWW can change, leading to broken links and worse. CrossRef’s Digital Object Identifier (DOI) Resolver (http://www.crossref.org/) provides one solution to this issue.

3. Ways to automatically notify individual scientists when a finding likely to be of interest to them has been published, reducing the hours of needless browsing through noisy collections to find a nugget of signal, and reducing the chance that a relevant paper in a different discipline is missed.

4. Algorithms to make the ranking of a paper dependent on the topic of interest: while citations to the Bible might make it the most cited publication of all time, it might not be the most relevant document to someone interested in viticulture, even if it mentions vines, grapes and wine.

5. Results must be personalized to provide customized rankings based on each reader’s interests and past knowledge (a minimal sketch of such topic- and reader-weighted ranking follows this list).
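As a rough illustration of points 4 and 5, the sketch below weights a paper's global citation score by its textual similarity to a reader's interest profile, reusing the viticulture example from point 4. The bag-of-words cosine similarity, the multiplicative combination and the toy data are assumptions made purely for illustration:

    import math
    from collections import Counter

    def cosine_similarity(a: Counter, b: Counter) -> float:
        """Cosine similarity between two bag-of-words term-frequency vectors."""
        dot = sum(a[t] * b[t] for t in set(a) & set(b))
        norm = (math.sqrt(sum(v * v for v in a.values()))
                * math.sqrt(sum(v * v for v in b.values())))
        return dot / norm if norm else 0.0

    def personalized_score(paper_text: str, citation_score: float,
                           interest_profile: str) -> float:
        """Weight a paper's global citation score by its textual relevance to the
        reader's interests; the multiplicative combination is an assumption."""
        paper_terms = Counter(paper_text.lower().split())
        profile_terms = Counter(interest_profile.lower().split())
        return citation_score * cosine_similarity(paper_terms, profile_terms)

    if __name__ == "__main__":
        profile = "viticulture vine grape cultivation"
        # (title, global citation score) pairs -- illustrative data only.
        papers = [("The Bible", 100000.0),
                  ("Effects of pruning on grape vine yield", 50.0)]
        for title, cites in papers:
            print(title, round(personalized_score(title, cites, profile), 2))

Raw citation counts alone would place the most-cited document first; weighting by the reader's profile instead surfaces the modestly cited but topically relevant paper.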

 

Google Scholar (http://scholar.google.com/) begins to address some of these issues, but not others. It ranks papers using undisclosed algorithms that purport to take into account the full text of each article as well as the article's author, the publication in which the article appeared and how often it has been cited in scholarly literature. Whether such ranking systems will need to operate using disclosed standards in order to be trusted as proxies for the quality and relevance of scientific publications, or whether their apparent effectiveness will engender trust in the absence of disclosure, remains an open question.

 

Is a solution within reach? I think so. These are some of the features I think a comprehensive solution should boast:

1. Scientific findings should be stored in open-access, permanent, citable, actively maintained repositories with no publication restrictions (i.e. no pre-publication review). Whether these repositories are localized or distributed, unique or duplicated, is not of the essence. One consequence of such unrestricted publication would be the lowering of the “barrier to entry” for publications, reducing the publon (Feibelman, 1993), or minimum publishable unit, to anything a scientist judges likely to be of interest to others, rather than restricting it to the structure imposed by a finite set of journals, their editorial policies, and referees’ criteria. This will expand the domain of scientific publication to a much wider set of scientific findings (similarly to the way new electronic payment technology has expanded the range of commerce to small transactions previously deemed of too low an economic value to sustain viable economic models) and allow findings to reach the public domain in a much more timely fashion. Importantly, the repository must be indexed so that citations to and from papers in the repository are recorded, allowing the success of a paper to be determined not by the crude measure of where it was published (a pre-publication measure decided by only two or three people), but by who cited it.

2. “Dead” papers should be replaced by living and evolving documents, with links to errata, updates, rebuttals and derived knowledge. This need not be in lieu of the original manuscript, but in addition to it. The British Medical Journal (BMJ), for example, is among journals that already allow rapid responses to articles, posted within 24 hours. I see no reason why scientists could not keep citable blogs, which have recently become so popular and interconnected outside the scientific community, posting scientific results soon after the discoveries themselves. Although errors would likely surface, these would be corrected subsequently and, by virtue of the corrections citing the original (perhaps with a special “correction” type of citation analogous to the errata of the printed world), readers would find the corrected version in lieu of, or appended to, the original.

3. Classification of papers into journals has no value in the electronic era (Harnad, 1999). Tools must be developed that allow individual scientists to effortlessly find the papers most relevant to them, and rank them. Metrics for this purpose can combine information from a variety of sources, including age-normalized citation counts (Bäcker, 2004), network-based ranking metrics for temporally structured document collections (A.B., manuscript in preparation), viewing counts, subjective ratings by readers, reading times, etc. The ranking metric need not be static: while some metrics may be less noisy and more valuable in the long term, others have shorter rise times and constitute better indicators of potential interest shortly following publication when citation counts, for example, are still scant and noisy. Indicators of scientific relevance, like biological indicators, need not be the same for early and late warning systems. That said, a strong and significant correlation between hit counts, an early measure, and citation counts, a late measure, has recently been demonstrated (Perneger, 2004).

4. Automatic personalized notification should be provided when a new paper likely to be of interest appears in the collections.

5. The brand names of journals should give way to a dynamic impact rating that quantitatively measures the impact of the paper to date, as measured by the number of citations and the quality of those citations (measured by their own impact rating), normalized to account for time since publication (see Bäcker, 2004); a minimal sketch of one such rating follows this list. Such a dynamic rating should accompany every citation, and come to replace journal names in conveying the “quality” of individual publications. From a technical perspective, it could easily be implemented by having the rating be provided by a link to a dynamically updated database entry, so that references to a paper anywhere on the WWW would be simultaneously and effortlessly updated every time the paper’s rating changes. Furthermore, query-specific versions of this rating could be developed that quantify the impact of a paper within a particular field, its relevance to a particular query, or to a particular paper (Batagelj, 2003). Even better, citations could be classified by type, so that separate tallies could be kept for confirmatory evidence, failures to reproduce, counterarguments, and follow-up articles that neither confirm nor deny the cited paper.
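The sketch below (continuing the illustrative Python used earlier) shows one way such a dynamic, age-normalized rating could be recomputed from a citation index whenever it changes: each paper's rating sums the ratings of the papers citing it, divided by the paper's age in years. The base rating of 1.0, the normalization rule and the toy index are assumptions in the spirit of, not reproductions of, the metrics cited above:

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class Paper:
        year: int
        cited_by: List[str] = field(default_factory=list)  # ids of papers citing this one

    def dynamic_impact(papers: Dict[str, Paper], current_year: int,
                       rounds: int = 20) -> Dict[str, float]:
        """Illustrative dynamic rating: a base rating of 1.0 plus the summed ratings
        of citing papers, normalized by the cited paper's age in years. Re-running
        this whenever the citation index changes keeps every rating up to date."""
        rating = {pid: 1.0 for pid in papers}
        for _ in range(rounds):
            new_rating = {}
            for pid, paper in papers.items():
                age = max(current_year - paper.year, 1)  # avoid division by zero
                new_rating[pid] = 1.0 + sum(rating[c] for c in paper.cited_by) / age
            rating = new_rating
        return rating

    if __name__ == "__main__":
        # Toy citation index (illustrative data only).
        index = {
            "classic_1990": Paper(1990, cited_by=["recent_2003", "recent_2004a"]),
            "recent_2003": Paper(2003, cited_by=["recent_2004a", "recent_2004b"]),
            "recent_2004a": Paper(2004),
            "recent_2004b": Paper(2004),
        }
        for pid, score in sorted(dynamic_impact(index, 2004).items(),
                                 key=lambda kv: -kv[1]):
            print(pid, round(score, 2))

Because the citing papers' own ratings feed into the rating of the papers they cite, a citation from a well-regarded paper counts for more than one from an obscure paper, and the age normalization keeps recent, fast-rising work from being drowned out by older classics.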

 

To go back to the beginning, none of the functions of the current peer-review system would be lost with the new system; rather, each would be replaced:

1. Distinction of good science from nonsense would emerge from tallies of confirmatory and critical citations.

2. Prioritization or ranking of science would be accomplished by dynamic network-based ratings and other correlated measures such as hit counts.

3. Feedback about a piece of scientific work from colleagues would come in the form of anonymous and/or signed commentary, which would be published and citable itself, providing the incentive for insightful commentary that has always motivated scientists: the possibility of recognition through citations.

4. Segmentation of the scientific literature into journals would be replaced with modern search and automated clustering and notification tools, just as, in another domain, usage of Yahoo’s web directories has given way, for the most part, to query-based search.

5. Dissemination of scientific work in an enduring and citable way would be done by citable archives.

6. Evaluation of scientists’ work would be done using dynamic impact metrics, including citation-based ones (for a very early approach to this, see, for example, ISI’s Highly Cited).

 

 

A viable and gradual paradigm shift

Importantly, the transition between systems need not be abrupt, which makes it much more viable: if tools are developed to rate the importance and relevance of self-archived digital versions of papers published in traditional journals, the brand names of journals will slowly start to erode in favor of dynamic, customizable and paper-specific ratings. When this gradual transition is complete, authors will no longer feel the need to publish in journals for recognition, and the paradigm shift will be complete. In the long run, the new system will benefit:

a) authors, by eliminating the long and tedious process of getting papers accepted into a journal,

b) underprivileged scientists in particular (e.g. from underdeveloped countries), by leveling access to readers (see Bäcker, 2004 for evidence that this reduction in the barriers of entry is already occurring), and

c) science and society at large, by speeding the dissemination of discoveries and by apportioning impact among papers more fairly, accurately and rapidly (reaching equilibrium impact faster, if you will).

 

In effect, I believe this model allows a commercially viable future for the publishing industry. Importantly, its role would not be as a repository of information (information itself is rapidly becoming a commodity), but as an information technology provider offering the best tools to access the most relevant information in a timely manner. Such an open-access repository with no publication restrictions, combined with state-of-the-art query-specific and queryless personalized search, might go a long way toward endowing scientific publication with the truth-seeking and self-correcting virtues of science itself.

 

 

References:

1. V. Batagelj (2003). Efficient algorithms for citation network analysis. arXiv:cs.DL/0309023 v1.

2. A. Bäcker (2004). Papers are being cited more and more uniformly. 4th International Conference on University Evaluation and Research Evaluation.

3. K. Boyack and A. Bäcker (2003). The memory of science, ISSI IX.

4. A.M. Buck, R.C. Flagan and B. Coles (1999). Scholars’ Forum: A New Model for Scholarly Communication, http://library.caltech.edu/publications/ScholarsForum/ .

5. P. J. Feibelman (1993). A PhD is NOT Enough! A Guide to Survival in Science. Addison-Wesley.

6. D. Goodstein (1994). The Big Crunch. http://www.its.caltech.edu/~dg/crunch_art.html . See also "Scientific Ph.D. Problems", American Scholar, vol. 62, no. 2, spring 1993; "Scientific Elites and Scientific Illiterates", Ethics, Values and the Promise of Science, Forum Proceedings, Sigma Xi, The Scientific Research Society, February 25-26, 1993, p. 61; and Engineering and Science, spring 1993, vol. 56, no. 3, p. 22.

7. D. Goodstein (2000). How Science Works. http://www.its.caltech.edu/~dg/HowScien.pdf

8. S. Harnad (1998) The invisible hand of peer review. Nature, http://www.nature.com/nature/webmatters/invisible/invisible.html .

9. S. Harnad (2000) The Invisible Hand of Peer Review, Exploit Interactive, issue 5, http://www.exploit-lib.org/issue5/peer-review/ .

10. S. Harnad (1999). http://library.caltech.edu/publications/ScholarsForum/042399sharnad.htm

11. Institute of Scientific Information (ongoing). http://isihighlycited.com/ .

12. P. Suber (ongoing). Free Online Scholarship Newsletter, http://www.earlham.edu/~peters/fos/newsletter/archive.htm .

13. T.V. Perneger (2004). Relation between online "hit counts" and subsequent citations: prospective study of research papers in the BMJ. BMJ 329:546-547.

 

 

 

 
