Physiology News Magazine

The open science movement

Revolution is underway

Features

Keith Siew
University of Cambridge, UK


https://doi.org/10.36866/pn.107.24

‘Information is power. But like all power, there are those who want to keep it for themselves. The world’s entire scientific and cultural heritage, published over centuries in books and journals, is increasingly being digitized and locked up by a handful of private corporations.’ Aaron Swartz, in Guerilla Open Access Manifesto, 2008


The world’s first academic science journal, Philosophical Transactions, was published by the Royal Society in 1665. By 2015 there were, at last count, some 11,365 science journals spanning 234 disciplines, and yet the primary model of scientific publishing has remained largely unchanged across the centuries.

As a fresh-faced, naïve PhD student, I recall the horror I felt upon learning that my hard work would be at the mercy of a veiled, political peer-review process, that I’d be left with little option but to sign away my rights to publishers, and too often forced to choose between burning a hole in my wallet or forgoing access to a potentially critical paper!

The open science movement offers an alternative to this unjust system. In its purest form, the movement advocates for making scientific research and its dissemination an entirely transparent process, freely accessible to all levels of society. The aim of this two-part series is to present an overview of the various incarnations of open science and recent paradigm shifts. In particular, I address some of the more radical elements of the movement, existing open science opportunities and the reasons behind life scientists’ relatively slow adoption of open science. This first instalment details the ongoing struggle for open access, the growing angst towards closed peer review and fundamental shifts on the horizon in both the ways we communicate (i.e. preprints) and carry out science (i.e. open data and open notebook science). Featuring in the next issue of Physiology News, the second half of the series (Opening Up Science Education by Vivien Rolfe) will illustrate how open education has democratised access to knowledge and fostered engaging and creative approaches to learning and teaching.

Open access – ‘tear down this wall’

In 1991, Sir Tim Berners-Lee gifted us the World Wide Web, forever changing the way we access information. Fast-forward a mere quarter of a century, and mobile-broadband networks now reach an estimated 84% of the global population, while almost half the planet engages in regular internet use. Giants like JSTOR, PubMed and Google Scholar have displaced libraries as the bastions of knowledge, ushering in an era of private, freely available, internet-based repository and database searching from the comfort of one’s office or home.

During the rapid transition from paper to PDF, publishers took advantage of opportunities to boost profits (Fig. 1), and continued the practice of hiking subscription fees at rates far in excess of annual inflation (Dingley, 2006).

In the face of increasing soft-copy demand and dwindling print production costs, they chose to erect electronic paywalls and enforce copyright transfers – a crude yet effective transplant of their existing business model. It was this perceived blatant profiteering, exemplified by the continued imposition of nonsensical ‘traditional print’ service charges for online materials (e.g. colour figures, page counts and supplements), and the ongoing disenfranchisement of authors from their work, that fuelled growing frustrations and a hunger for alternatives.

The wind changed in 1996 when the editor of the Journal of Clinical Investigation whimsically declared: ‘The vexing issue of the day is how to appropriately charge users for this electronic access. The nonprofit nature of the JCI allows consideration of a truly novel solution—not to charge anyone at all!’ Early adopters of this refreshing ‘open access’ approach were perfectly positioned to reap the benefits of increased exposure from new, government-run, digital repositories such as PubMed Central, and would later go on to inspire pioneers of wholly open access publishing models like BioMed Central and PLOS (Public Library of Science).

The Budapest Open Access Initiative, the Bethesda Statement on Open Access Publishing and the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities, issued between 2002 and 2003, proved most influential in the open access movement. Together they made a call to action, setting forth the principles and definition of open access, boldly stating: ‘By ‘open access’ to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.’

The legacy of these declarations was the ‘Open Access Mandate’, a policy which requires researchers to make their articles open access by self-archiving the author’s final peer-reviewed (and often non-typeset) version in a freely accessible repository, so called ‘Green OA’ (usually after an embargo period), or by publishing them in an open access journal, also known as ‘Gold OA’. As of April 2017, open access mandates had been adopted by more than 860 research and funding organisations worldwide.

This has been coupled with an explosive proliferation of open access scholarly journals across all disciplines, which now number over 9,427 from 129 countries, with Gold OA representing a significant proportion of new life science articles (Fig. 2). Governments have also implemented mandates, with the infamous UK REF (Research Excellence Framework) adopting an open access policy last year and the Council of the European Union calling for immediate open access as the default by 2020 (Enserink, 2016).

In spite of great leaps forward in open access, little has been done to address works that predate, or are unaffected by, open access mandates. As a result, this half-measure has been rejected by those unwilling to settle for anything less than the total liberation of scientific knowledge. The most notable of these crusaders was the focus of an award-winning documentary, The Internet’s Own Boy: The Story of Aaron Swartz (Knappenberger, 2014). In 2001, the fourteen-year-old Aaron helped develop the RSS web feed, and by fifteen he was collaborating with Sir Tim Berners-Lee to enhance internet functionality. He also co-founded Reddit and worked on the team that launched Creative Commons – the easy-to-use copyright licences that empower authors and have become the mainstay of open access publishing.

Surprisingly, it was his highly influential Guerilla Open Access Manifesto and the events that followed which would inspire the movement. Published in 2008, the manifesto called on academics to revolt against the system by downloading copies of paywall-protected articles and publishing them online for the whole world to see (Swartz, 2008). Two years later, while a research fellow at Harvard, Aaron singlehandedly downloaded over 4 million documents from JSTOR using MIT’s network. His intention was to publish the entire JSTOR repository online and make it freely accessible, but he was arrested in 2011 before finishing. Despite JSTOR reaching a settlement in the civil case with Swartz, wherein he surrendered the downloaded data, federal prosecutors sought to make an example of this ‘hacktivist’ and brought 13 felony charges against him, to the tune of 50 years in prison and $1 million in fines. Tragically, on 11 January 2013, afflicted with depression and under enormous pressure from the trial, Aaron Swartz committed suicide at the age of 26.

In the aftermath, Alexandra Elbakyan – a young neuroscience graduate student and computer programmer – was hailed as the spiritual successor to Aaron Swartz. Launched in September 2011, her Sci-Hub project became the first pirate website in the world to provide free access to more than 62 million research papers. It works by first searching for an existing copy of the paper in question on LibGen (Library Genesis), a repository of pirated literary materials (where one can also find a myriad of academic textbooks). If LibGen cannot deliver a copy, Sci-Hub will bypass publishers’ paywalls and download one to LibGen by running through multiple institutional access systems, utilising login credentials anonymously donated by academics sympathetic to the cause. By February 2016, Sci-Hub was averaging 200,000 downloads per day from all corners of the globe. Surprisingly, some of the most intense use was concentrated in resource-rich European and American universities, suggesting that, aside from genuine access issues, the convenience of a fail-proof centralised search engine is just too great a temptation to ignore.

According to Science Magazine, Sci-Hub is either ‘… an awe-inspiring act of altruism or a massive criminal enterprise, depending on whom you ask.’ Elsevier would certainly place it in the latter category, with Alexandra residing in Russia to evade lawsuits and extradition, and Sci-Hub safeguarding itself against shut-down attempts by taking refuge on the dark web and replacing compromised domain names as quickly as a decapitated hydra sprouting new heads. Despite some success with heavy-handed litigious efforts in the past, there is now great anxiety among publishers that refusal to adapt may leave them on the losing side of this culture war. For example, over half of all materials hosted on the popular scholarly networking site ResearchGate were uploaded in direct defiance of publishers’ copyright, while nearly 88% of surveyed Science Magazine readers see nothing wrong with downloading pirated papers with three in five having actually used Sci-Hub in the past (Travis, 2016). Even those averse to dabbling in the morally grey can now comfortably circumvent paywalls using a free, fast web-browser extension called Unpaywall, which trawls 5,300+ public repositories to retrieve legal copies of papers.
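For readers who want the legal route Unpaywall offers, its maintainers also expose a public REST API that resolves a DOI to the best known open-access copy. Below is a minimal sketch of how one might query it; the helper names are mine, and the `best_oa_location` field reflects my understanding of the API's JSON response (the DOI shown is the Baker (2016) Nature article cited later in this piece):

```python
import json
import urllib.request

UNPAYWALL_API = "https://api.unpaywall.org/v2"


def unpaywall_url(doi, email):
    """Build the Unpaywall query URL for a DOI.

    Unpaywall asks callers to identify themselves with an email address.
    """
    return f"{UNPAYWALL_API}/{doi}?email={email}"


def find_open_copy(doi, email):
    """Return the URL of a legal open-access copy of a paper, or None."""
    with urllib.request.urlopen(unpaywall_url(doi, email)) as resp:
        record = json.load(resp)
    best = record.get("best_oa_location")  # None when no OA copy is known
    return best["url"] if best else None


# Example: look up the open-access copy of Baker (2016), Nature 533, 452-454.
# find_open_copy("10.1038/533452a", "you@example.org")
```

A browser extension is more convenient for casual reading, but an API like this is what makes it feasible to check an entire reference list for open copies in one pass.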

This changing landscape leaves us with a question: in a world where total open access seems inevitable (by either legitimate or nefarious means), can the centuries-old scientist–publisher relationship survive and do traditional publishers belong in the future of online community-driven science communication?

Open review – ‘out of the darkness’

Scholarly communication forms the bedrock of science and is the currency in which academics trade. Its integrity and quality were to be safeguarded by editorial oversight and peer review; a process instituted by the Royal Society of Edinburgh when Medical Essays and Observations became the first formally peer-reviewed publication in 1731. Applied irregularly and in numerous variations at first, the practice would not become commonplace until the mid-20th century, finally transmuting into the externally recruited, anonymous peer-review process we are familiar with today. Now, amidst crises in both study replication and academic publishing, we must question whether our current system of peer review has become unfit for purpose and is in need of radical redress.

Much of the criticism levelled against masked peer review centres on the potential abuse of power by editors and reviewers shielded by anonymity. While the vast majority are honourable, authors must trust that reviewers will not suppress dissent against mainstream theories, exploit ‘insider information’ to gain advantage, plagiarise, or deliberately delay publications from competing groups with unnecessary revisions.

Although not a new idea, open peer review seeks to solve these issues by introducing complete transparency to the process. In this system, reviewers’ comments and correspondence with editors and authors are published alongside their names in transparency documents. The advantages of this approach are many: it not only deters professional misconduct but also credits reviewers for their work (via initiatives like Publons), allays authors’ misgivings, and educates the reader on the perceived weaknesses of the study and the steps taken to strengthen key conclusions. Publishers have in the past been slow to implement this, with critics fearing that reviewer self-censorship would diminish the rigour of peer review. However, the data do not support this assertion: randomised controlled trials comparing open with closed peer review found no negative impact on review quality.

In parallel with the normalisation of open access practices, open peer review has experienced a resurgence in popularity. Recently, several journals have introduced editorial transparency policies either mandating or giving authors the option to have their pre-publication peer review proceedings published (although not always with reviewers’ identities disclosed). This has led to a doubling in the number of European life science peer reviews published (with reviewers named) in 2016 compared to 2014, although the percentage is still low at ~2% of all peer reviews (Fig. 3).

Funding bodies have also shown keen interest in moving towards more transparent systems, paying close attention to the emergence of novel post-publication open peer review models, such as that spearheaded by F1000Research. This unorthodox process typically involves a quick sanity check by in-house editors before articles are published online (often within days of submission), with invited open peer review arranged after publication; articles that pass review are then indexed in PubMed. In acknowledgement of the value of this approach, NCBI developed PubMed Commons to add a further layer of post-publication peer review to PubMed articles. Here, indexed authors can openly question or share information on the work of their peers, with responses publicly displayed under the relevant abstract for discussion.

Proponents of post-publication peer review argue that its greatest strength is the increased speed of access to information. The median time from submission to publication has hovered at 4–5 months over the past 30 years, although the traditional model has also meant that over half of all authors have had to endure a wait of 1–5 years for a paper to be published at some point during their careers (Powell, 2016). Notwithstanding the delay in time to publication, there is still a general reluctance to replace pre-publication review with solely post-publication models for fear it may reduce the overall quality of papers. Attitudes may soon change, though, as funders recognise the potential benefits of providing their fellows with rapid, transparent publishing routes and begin to create their own custom solutions. So far the Wellcome Trust and Gates Foundation have contracted F1000Research to manage their own open publishing ventures, namely Wellcome Open Research and Gates Open Research, with the European Commission rumoured to soon follow suit with its own platform. However, the two systems need not be mutually exclusive, and life scientists can borrow a tried and tested middle-ground solution from their cousins in the physical sciences – preprints!

Since 1991 physicists have been depositing electronic, open access, preprint manuscripts on the online repository arXiv, whereas the vast majority of life scientists have either never heard of preprint servers or don’t fully understand them. In physics, astronomy, mathematics and computer science it is now standard practice that completed manuscripts be deposited on public preprint servers prior to, or with simultaneous submission to, scholarly journals. This has been accepted as a means for authors to rapidly disseminate important findings to boost the visibility and citation impact of their work (e.g. astrophysics papers with a preprint counterpart are cited twice as much as non-preprinted publications), whilst also establishing priority for discoveries, documenting evidence of ongoing projects for career progression, and gathering early feedback to improve their papers for later formal peer review publication.

The first life science preprint server, Nature Precedings, was launched in 2007, although with seemingly little appetite it ceased accepting new submissions after 5 years. Shortly after, renewed interest in open science led to exponential growth in preprint numbers and to the creation of several servers catering to the life science community. The lion’s share of these new preprints now go to bioRxiv – a platform launched by Cold Spring Harbor Laboratory in 2013 and modelled on the success of arXiv (Fig. 4). It now hosts over 10,000 preprints, which are searchable on prepubmed.org, and recently received a cash injection from Facebook co-founder Mark Zuckerberg and his physician wife, Priscilla Chan, to develop an open-source platform and web-friendly article formats.

Nonetheless, many still fear that preprints could lead to their being scooped by competitors, cost them credit for ideas or jeopardise the chances of manuscripts appearing in peer-reviewed journals. To address these concerns, a series of ASAPbio (Accelerating Science and Publication in biology) meetings began in 2016, with invited stakeholders representing junior and senior working scientists, funding agencies, scientific societies, industry, databases and journals (asapbio.org/meetings). Since then, there has been a cascade of major events that should quell concerns and encourage all life scientists to embrace preprints. First, Crossref has started accepting preprints as a content type to which it will assign a DOI (Digital Object Identifier), which in turn makes preprints fully citable materials (e.g. preprints appear regularly in the reference lists of Nature and Science), enabling version tracking and linkage with the final peer-reviewed version. Second, the world’s largest research funders, namely the Wellcome Trust, UK Medical Research Council, Howard Hughes Medical Institute and the NIH (US National Institutes of Health), all announced policies allowing researchers to cite their own preprints in grant applications and reports (Luther, 2017). Third, there is a growing list of journals and publishers that no longer consider preprints as ‘prior publication’, such as Elsevier, Springer, Nature and Wiley – a list which also includes The Physiological Society’s own publications: The Journal of Physiology and Experimental Physiology. Lastly, after receiving a $1 million grant and the backing of major international funding agencies like the European Research Council, ASAPbio issued an RFA (request for applications) to create a central preprint aggregating service for the life sciences similar to PubMed Central. The Center for Open Science (COS), with the support of 12 preprint servers (e.g. arXiv, bioRxiv), 15 repositories (e.g. Mouse Phenome Database, Protein Data Bank, Figshare) and several stakeholders, responded to the RFA with a proposal to set up The Commons.

This solution would bring us closer to the reality of a ‘one-stop shop’ for all life science preprints that will both greatly accelerate research and establish preprints as the currency for determining priority of a discovery. And for those who are still not convinced by this, I’d highly recommend reading the ‘Ten simple rules to consider regarding preprint submission’ for more on this topic (Bourne et al., 2017).
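Crossref’s DOI registration is what makes this citability concrete: a preprint’s DOI can be resolved programmatically to metadata that records its status and, where the publisher has registered it, a link to the final journal version. The sketch below uses Crossref’s public REST API; the helper names are mine, and the `is-preprint-of` relation is populated only when that linkage has actually been deposited:

```python
import json
import urllib.request

CROSSREF_API = "https://api.crossref.org/works"


def crossref_url(doi):
    """Build the Crossref REST API URL that resolves a DOI to its metadata."""
    return f"{CROSSREF_API}/{doi}"


def preprint_status(doi):
    """Fetch Crossref metadata for a DOI and report its content type plus
    any registered link to the peer-reviewed journal version."""
    with urllib.request.urlopen(crossref_url(doi)) as resp:
        msg = json.load(resp)["message"]
    relation = msg.get("relation", {})
    return {
        # Crossref registers preprints under the 'posted-content' type
        "type": msg.get("type"),
        # list of related works, present only if the publisher deposited it
        "published_as": relation.get("is-preprint-of", []),
    }
```

This is the machinery behind version tracking: a reader (or a reference manager) can start from the preprint DOI and discover whether a journal version now supersedes it.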

Open research – ‘what’s mine is yours and what’s yours is mine’

Today we live in a world where separating fact from fiction has become a non-trivial task. Our senses are bombarded daily by ‘alternative facts’, experts find themselves mired in false equivalencies, and political talking heads present opinions, feelings and anecdotal evidence on an equal footing with that which is empirically true. Be it climate change denial, anti-vaxxers or the war on GMO foods, the growing trend in anti-science rhetoric is alarming and the erosion of the public trust in science must stop!

Replication and reproducibility are core to the scientific method, yet several decades of the ‘publish or perish’ mantra have eroded these foundations. According to Nature, more than 70% of surveyed researchers have been unable to reproduce other scientists’ experiments, though it is feared this barely scratches the surface of the issue. A multitude of harmful practices such as selective reporting, ‘p-hacking’, inadequately detailed methods and limited strategies to decrease bias have all culminated in an unprecedented replication crisis which feeds into the ugly spectre of science’s ‘credibility problem’ (Baker, 2016).

Moving to an open science future may be the only way to stem the tide. In order to increase scientific literacy and restore trust, we must re-engage the public as active stakeholders. Secrecy breeds suspicion, and as a community, scientists can no longer afford to conduct their affairs behind closed doors. Open research offers us an innovative way to face our problems head-on and restore public confidence in scientific research.

Open data is a key element of open research, the profound benefit of which was best demonstrated by the Human Genome Project. It was the first major initiative to encourage the free distribution of research data into the public domain, often releasing new DNA sequences within 24 hours of completion with the initial draft sequence finally published in 2001. Today there are 900+ life science data repositories which researchers can freely use for reproducing studies, hypothesis testing and data mining, provided the original source is attributed.

For a growing number of journals wishing to support reproducibility, it is a condition of publication that raw data be hosted alongside the paper or deposited in a citable public repository. Demand for the raw data underlying a paper’s figures is not the only driver for the burgeoning creation of repositories, with open citable platforms like Figshare catering for unpublished and published research outputs in any file format, from posters and presentations to datasets and code. This taps into a wealth of information that would otherwise remain locked away on hard drives and in notebooks, giving new life to miniature studies, negative results and optimisation experiments which may be critical to replicating crucial results.

While the move to open data has been generally welcomed as positive, there is mounting tension between speeding access to data and protecting the interests of those who laboured to collect them. This was evident when the team behind the groundbreaking SPRINT clinical trial learned that almost a third of their planned papers were at risk of being scooped after the NIH and New England Journal of Medicine gave competing researchers early access for a data-challenge competition. However, even if they are beaten to the punch on subsequent findings, the community will still await validation of any new results and show due deference to the confirmatory studies performed by the original discoverers. And when the morals and ethics of situations like these are weighed, especially for studies immediately pertinent to human health or publicly funded, there can be no question that the potential for open data to expedite discovery far outweighs the downsides for those who collected it.

In that same vein, open notebook science, the logical extreme of open research, holds that the benefits of transparency will always outweigh the costs. The practice involves placing raw and processed data, with any associated materials, online as they are generated, and explicitly includes making available failed, less significant or otherwise typically unpublished experiments. While most researchers get queasy at the thought of revealing their work habits and data, warts and all, scientists studying the Zika virus have been lauded for providing a daily-updated online open notebook for their research. Their brave move has paved the way for future pioneers of open research and was accompanied by a joint statement from over 30 global health bodies calling for all research data gathered during future public health emergencies to be made available as rapidly and openly as possible (Islom, 2016).

References

Baker M (2016). 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454. [DOI: 10.1038/533452a]

Bourne PE, Polka JK, Vale RD, Kiley R (2017). Ten simple rules to consider regarding preprint submission. PLOS Computational Biology 13(5), e1005473. [DOI: 10.1371/journal.pcbi.1005473]

Dingley B (2006). U.S. periodical prices – 2005. American Library Association 45, 1–16. <Available at http://www.ala.org/alcts/sites/ala.org.alcts/files/content/resources/collect/serials/ppi/05usppi.pdf> [Accessed 24 May 2017]

Enserink M (2016). In dramatic statement, European leaders call for ‘immediate’ open access to all scientific papers by 2020. Science Magazine – ScienceInsider. [DOI: 10.1126/science.aag0577]

Islom H (2016). Sharing data during Zika and other global health emergencies. Wellcome Trust Blog. <Available at https://blog.wellcome.ac.uk/2016/02/10/sharing-data-during-zika-and-other-global-health-emergencies/> [Accessed 21 May 2017]

Knappenberger B (2014). The Internet’s Own Boy: The Story of Aaron Swartz. Participant Media. <Available at https://archive.org/details/TheInternetsOwnBoyTheStoryOfAaronSwartz> [Accessed 21 May 2017]

Luther J (2017). The stars are aligning for preprints. The Scholar Kitchen. <Available at https://scholarlykitchen.sspnet.org/2017/04/18/stars-aligning-preprints/> [Accessed 21 May 2017]
