Show me the data!

I note with interest that article publication charge data from the University of Edinburgh has been released on Figshare today.

There are some fascinating numbers in there and I applaud the transparency.

One particular article that caught my eye is this one:
Paradoxical effects of heme arginate on survival of myocutaneous flaps


Page charges amounting to £1330.45 were paid for this article, and that was for ‘page charges’ alone: the journal did not make the article open access, nor was it asked to.

I also noted the research was paid for by the MRC – a top-class UK government-funded agency. As I am a full UK taxpayer, I feel especially entitled to read this research!

The MRC has a very clear policy on open access – the article must either:

1.) be made immediately open access by the publisher upon publication; ‘journal-mediated OA’ (sometimes called ‘gold’)


2.) be made publicly accessible, as some kind of copy of the work, no more than 6 months after publication; ‘repository-mediated access’ (sometimes called ‘green’)

Since the article clearly wasn’t open access at the publisher, I assume the authors have elected to use the repository-mediated route. The article was formally published on 1st January 2014, so between then and now at least 6 months have clearly elapsed: 7 months and 20 days, to be precise. So where is the full text of this article?

It’s not in PubMed (abstract-only)
Nor Europe PubMed Central (abstract-only)
Nor the University of Edinburgh institutional repository (abstract-only)
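
For anyone who wants to check the embargo arithmetic, here is a minimal Python sketch. It assumes the check is made on 21st August 2014 (a date inferred from the ‘7 months and 20 days’ figure, not one stated in the data release), and the helper names `add_months` and `months_and_days_between` are purely illustrative.

```python
import calendar
from datetime import date


def add_months(d, n):
    """Return the date n calendar months after d, clamping the day to the month length."""
    idx = d.year * 12 + (d.month - 1) + n
    year, month0 = divmod(idx, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def months_and_days_between(start, end):
    """Return (whole calendar months, leftover days) elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if add_months(start, months) > end:
        months -= 1
    days = (end - add_months(start, months)).days
    return months, days


EMBARGO_MONTHS = 6                 # MRC repository-mediated ('green') limit
published = date(2014, 1, 1)       # formal publication date of the article
checked = date(2014, 8, 21)        # ASSUMPTION: date the check was made

elapsed = months_and_days_between(published, checked)
embargo_expired = checked >= add_months(published, EMBARGO_MONTHS)

print(f"Elapsed: {elapsed[0]} months and {elapsed[1]} days")
print(f"Embargo expired: {embargo_expired}")
```

Under those assumptions the 6-month window closed on 1st July 2014, so a repository copy should have been available well before the check date.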

So it would appear to me that the rules of the funding body (MRC) may have been broken here (sincere apologies if I am wrong about this), something all too easy to do if the repository route is chosen.

Wouldn’t it have been better to spend those page charges on making the paper immediately open access?

In the meantime, I have sent the University of Edinburgh open access team an email to ask where the full text for this paper is, and I await their reply.

[Update: the conference itself will be in November, 2014 - this is just the first announcement!]

I’m super excited to announce I’m part of the international organizing committee for OpenCon 2014:

OpenCon 2014





You can read the official first press release about this event here:


Here’s an excerpt from it:

“From Nigeria to Norway, the next generation is beginning to take ownership of the system of scholarly communication which they will inherit,” said Nick Shockey, founding Director of the Right to Research Coalition. “OpenCon 2014 will support and accelerate this rapidly growing movement of students and early career researchers advocating for openness in research literature, education, and data.

The first event of its kind, OpenCon 2014 builds on the success of the Berlin 11 Satellite Conference for Students and Early Stage Researchers, which brought together more than 70 participants from 35 countries to engage on Open Access to scientific and scholarly research. The interest, energy, and passion from the student and researcher participants and the Open Access movement leaders who attended made a clear case for expanding the event in size and duration, and to broaden the scope to related areas of the Openness movement.”


Last year, I was also part of the organizing committee for the event that this has grown from – the Berlin 11 Satellite conference:






The Berlin 11 Satellite Conference was really exciting but only a 1-day event before the ‘main’ Berlin 11 event – an assemblage of students and ECRs from literally all over the world (attending with generous full funding support), including representatives from (in no particular order) China, India, Saudi Arabia, Georgia, Tanzania, Tasmania(!), Kenya, Nigeria, Ghana, Uganda, Colombia, FYR Macedonia, Mexico, Brazil, Sweden, Holland, Denmark, Poland, Portugal, Canada, the US, the UK… So don’t worry about where you are in the world – as long as you’re a student or ECR you’ll be eligible to apply for OpenCon 2014 (places are limited though!).

As a reminder, at the event last year we had Jack Andraka and Mike Taylor amongst the guest speakers. It was such a comprehensive success that it’s been expanded into a full 3-day event this year, with a broader scope too, to include Open Data and OER, not just OA (they’re all obviously inter-related problems; better to tackle the integrated set of problems rather than aspects in isolation!).

Applications for OpenCon 2014 will open in August. For more information about the conference and to sign up for updates, visit the conference website.

I promise you this – it’s going to be BIG and I’m stoked to be part of an international organizing committee helping to make this happen.

OpenCon 2014 is also looking for additional sponsorship, particularly for Travel Scholarships to ensure global representation at this meeting, so if you have a marketing budget to spend, or are feeling generous please do have a look at the sponsorship opportunities.

I’ve been invited to come in and have an informal chat about open access with the Linnean Society on March 24th, particularly with regard to what is and what is not ‘open access’ in terms of Creative Commons licences. I write this blog post to spur on other advocates to try and encourage their society journals to use proper, open access compliant article licensing that facilitates rather than prevents text & data mining.

I have Tom Simpson at LinnSoc to thank for reaching out to make this happen. Thanks Tom!

It started from some tweets I sent a few days ago about an interesting new Zoo J Linn paper by Martin Brazeau & Matt Friedman. I’d include a pretty figure from this paper if I was allowed to, but unfortunately because it’s licensed with the Creative Commons Attribution-NonCommercial-NoDerivs License (CC BY-NC-ND) I can’t. To repost just a figure from the paper would be to create a smaller derivative work which the licence does not allow – I am only allowed to repost the *whole* article with absolutely no changes which is rather impractical for a 43 page article! Wiley in particular have a history of threatening scientist bloggers for reproducing a single figure from an article (read the Shelley Batts story here).

restricted access

It’s not just bloggers and the outreach possibilities for the paper that are harmed by the use of such restrictive licenses – it also causes problems for RCUK-funded researchers. Matt Friedman is based at Oxford at the moment – if the funding for this work came from any of the UK research councils, then the choice of the CC BY-NC-ND license could cause him problems – it is NOT compliant with the RCUK’s policy on open access. Wiley should know better than to offer this license to UK-based authors, but they have a significant conflict of interest in ensuring researchers choose more restrictive licensing options so that they can continue to be the sole proprietor of glossy reprint copies (ensured by the -NC clause). Both the -NC & the -ND clauses incidentally prevent the figures from being re-used on Wikipedia, another sad restriction for the authors who must have put a lot of effort into them.

In the realm of academic science, the application of that particular license to the paper-as-a-whole-work just doesn’t make sense. Many digital research projects need to be able to excerpt, transform and translate research outputs such as academic papers, and in some cases create commercial value from this. My current BBSRC-funded research project ‘PLUTo: Phyloinformatic Literature Unlocking Tools. Software for making published phyloinformatic data discoverable, open, and reusable’ relies on being allowed to transform, excerpt and republish extracted content from scientific papers. With Peter Murray-Rust we’re using text & image mining tools to generate open, re-usable phylogenetic data directly from the published literature, often directly from PDFs. The Linnean Society have several good quality, well-respected journals which publish phylogenetic content, so they’re very much in the scope of our PLUTo work.

But clauses such as -ND stop us from using this material. It’s clear in the license terms and conditions – we are not allowed to make any derivative works from the original. So any papers using CC BY-NC-ND we will have to avoid. We cannot use them, and therefore they will not be cited by our project which is rather a shame for their authors.

Above all, the CC BY-NC-ND license simply isn’t compliant with the very definition of open access as laid down over a decade ago at the Berlin, Budapest and Bethesda meetings. Wiley are knowingly mis-labelling articles using non-compliant licences as ‘open access’ even though they are by definition NOT open access. I hope the Linnean Society can spur Wiley to do something about this as it is not good for the journal, or its authors. Other journals using non-compliant licensing use terms like ‘public access’ or ‘free access’ or ‘sponsored access’. Why can’t Wiley follow this lead? Open access is more than just free access – it enables re-use which is critical for research projects like mine. Please stop the ‘openwashing’.


Further Reading:

Hagedorn, G., Mietchen, D., Morris, R., Agosti, D., Penev, L., Berendsohn, W., and Hobern, D. 2011. Creative commons licenses and the non-commercial condition: Implications for the re-use of biodiversity information. ZooKeys 150:127-149.

Mounce, R. 2012. Life as a palaeontologist: Academia, the internet and creative commons. Palaeontology Online 2:1-10.

Klimpel, P. Consequences, Risks, and side-effects of the license module Non-Commercial – NC [PDF] 1-22.


Today I received proof that Elsevier are also sending takedown notices to UK universities – asking them to take down copies of their staff’s academic research papers, hosted on university webpages. The full text is further down this post (in red). It is not just the University of Calgary, University of California-Irvine, or Harvard University. Elsevier very probably are sending takedown notices to institutions and websites across the globe.
No-one is safe from these legal threats.

Not only that, but they seem to be encouraging universities to be pro-active and take down more than just the specific articles identified in the DMCA notice they send! They are encouraging universities to limit access to their research works. This is simply disgraceful (even though I acknowledge they are technically, legally within their rights to do this because of the way in which their copyright transfer agreements are written, which incidentally many academics are effectively forced to sign in order to get published and make progress in their careers).

For background information read:

How one publisher is stopping academics from sharing their research. The Washington Post 19/12/2013

Elsevier steps up its War On Access SVPOW 17/12/2013


Librarians and university web admins: please publicly come out with more examples like this. Researchers, readers and taxpayers desperately need to know about this. Silence and subterfuge benefit no-one; these chilling effects must be publicly revealed.

This is the email I received with certain parts redacted:

*** Sent via Email – Inappropriate postings of Elsevier’s journal articles / DMCA Notice of Copyright Infringement ***

Dear Sir/Madam,

I write on behalf of Elsevier to bring to your attention the inappropriate posting of final published journal articles to your institutional website. I am President at Attributor (A Digimarc Company), which assists some of the world’s most prominent publishers, including Elsevier, with digital content protection. Following the discussion below, a formal DMCA takedown request is included as Appendix A.

As you probably know, Elsevier journal article authors retain or are permitted a wide scope of scholarly use and posting on their own sites and for use within their own institutions. Those rights are more expansive when it comes to author preprints or accepted manuscripts than with respect to the final versions of published journal articles. Elsevier recognizes that in some cases authors or their institutions may not be fully aware of these rights and can by mistake post the final version of their articles to institutional websites or repositories. Unfortunately, it has come to our attention that copies of final published journal articles have, perhaps inadvertently, been posted for public access to one of your institutional websites.

I therefore request your cooperation to remove or disable access to these articles on your site, including but not limited to the articles identified in Appendix A. We have identified merely a sample in Appendix A, and as a publisher of close to 2,000 journals this might mean that more articles published by Elsevier could be found on your site. Please may I therefore draw your attention to Elsevier’s posting policy and ask for your attention to ensuring that your posting practices comply with this?

In particular I note that Elsevier currently doesn’t permit posting of the final published journal article, and if there is a mandate or systematic posting mechanisms in place then Elsevier asks for a cost-free agreement with the institution before accepted author manuscripts are posted.
I would also recommend considering the use of DOI links as a way to access to the version of records of a published article. This would allow authors to list their work and to provide easy access to peers.

Finally, should you need any help in properly identifying a final published article to prevent any future improper posting, please do get in touch via the email address below.

I appreciate your anticipated cooperation and if you have any questions or feedback, or if you believe you have received this message in error (as you have received permission to post this article from Elsevier), please contact:
Thank you.

Eraj Siddiqui
Attributor (A Digimarc Company)

Appendix A

Copyright Infringement Notice

This notice is sent pursuant to the Digital Millennium Copyright Act (DMCA), the European Union’s Directive on the Harmonisation of Certain Aspects of Copyright and Related Rights in the Information Society (2001/29/EC), and/or other laws and regulations relevant in European Union member states or other jurisdictions.

Please remove or disable access to the infringing pages or materials identified below, as they infringe the copyright works identified below.

I certify under penalty of perjury, that I am an agent authorized to act on behalf of the owner of the intellectual property rights and that the information contained in this notice is accurate.

I have a good faith belief that use of the material listed below in the manner complained of is not authorized by the copyright owner, its agent, or the law.

My contact information is as follows:

Organization name: Attributor Corporation as agent for [Publisher Company]
Phone: 650-340-9601
Mailing address:
400 South El Camino Real
Suite 650,
San Mateo, CA 94402

My electronic signature follows:
/E Siddiqui/
E. Siddiqui
Attributor, Inc.

***List of Works and Location of Infringing Page or Material ***

Infringing page/material that I demand be disabled or removed in consideration of the above:



Rights Holder: Reed Elsevier

Original Work: [redacted]
Infringing URL: [redacted]


Dutch universities too are receiving DMCA notices from Elsevier:


@Wowter via Twitter

I’d just like to point out to anyone who asks, particularly CRC Press (part of Taylor & Francis Group, who are in turn part of Informa PLC), that by posting the full text of my book chapter online I am *not* breaching the copyright transfer agreement I signed.

Upon receiving a copyright transfer agreement as a PDF from them via email – I edited the PDF to reword the agreement to terms that were more agreeable to me (e.g. I did NOT want to transfer my copyright to them for my work).

The bit of wording I changed is as follows:

As such, copyrights in the Work will not inure to the benefit of the Publisher, the Publisher will not own the publication, its title and component parts, and all publication rights. This does not permit the Publisher, in its name, to copyright in the Contribution, make applications to register its copyright claim, and to renew its copyright certificate.

I signed this reworded form as a PDF (displayed below, signature removed) and returned it to them. I have now kindly received a free ‘author copy’ of the printed book and my chapter has clearly been included, so it’s too late for CRC Press to exclude my chapter. I can only assume they agreed to the reworded terms of the contract I signed and sent them.

I doubt CRC Press would even be bothered by my actions, to be honest. They are allowing another of their books to be completely posted online for free, so in comparison to that, my action here is puny – but it certainly emboldens me for the next time I may have to sign a CTA form…

CRC Press are welcome to non-exclusively publish my book chapter. Thank you CRC Press for agreeing to my terms and conditions.


Lessons one might learn from this exercise:


A word of warning though… I wouldn’t recommend relying on this method of editing CTAs to get what you want. I was just lucky this time. Choosing an open access publication venue from the start is always the best option (if possible).

See also:

Mike Taylor 2010. Who Owns My Sauropod History Paper?


October 21st, 2013 | Posted by rmounce in Open Access

I handed in my thesis not long ago, on Thursday 3rd October 2013. No idea when my viva is yet. I can’t blog many of the chapters because I haven’t convinced my manuscript co-authors of the value of preprints, yet. I’m also a bit unsure as to how some of the other chapters will be received and thus I’ll wait until after the viva before I decide what to do next with it.

Given it’s open access week this week, there is one bit of my thesis I should definitely share: the acknowledgements!

I can’t possibly thank everyone enough for the help I’ve received over the past 4 years – my knowledge, skills, and connections have been vastly extended. Note in particular the bit I’ve highlighted in bold just for this blog post – I want everyone to know how absolutely reliant I’ve been on ‘alternate’ forms of literature access during my research. I fear this is the new ‘normal’ for many early career researchers: until open access is more prevalent we’ll have to continue to hunt, scavenge, beg, steal, and borrow for every PDF. My generation of researchers grew up using Napster and Isohunt. Copyright infringement is an everyday activity for many of us – WE DON’T CARE. Have you been to a conference? How many of the pictures on the speakers’ slides weren’t technically infringing someone else’s copyright? WE DON’T CARE. One can shut down or block specific portals, but doing so doesn’t really solve the basic problem: from what I’ve seen, time and time again, copyright’s only role in science is to obstruct it. My biggest hope for Open Access Week 2013 is that someone will torrent Elsevier’s back catalogue – journal/publisher torrents have been done before and will be done again! It probably won’t happen, but I can dream…


I would like to thank my supervisor, Matthew Wills for putting up with me all this time. I
have been lucky to have such accommodating and understanding support. I also must
thank my lab mates Martin Hughes, Anne O’Connor, Sylvain Gerber, Katie Davis, Rob
Sansom and everyone else in the Biodiversity Lab at the University of Bath – we had some
great times and some brilliant times together. Sincere thanks also to the University of Oslo
Bioportal computing cluster for providing me free cloud computation for my work.
Many people have helped spur my imagination along the way with ideas for different
chapters of this thesis. For this I would like to thank Ward Wheeler, Pablo Goloboff, Mark
Siddall, Dan Janies, Steve Farris and the generous financial support of the Willi Hennig
Society. I want to thank all those in the palaeontology community who have shared their
published data with me, particularly Graeme Lloyd for his sterling work in making
dinosaurian data available – I hope I have done something interesting with the data I have
used and opened eyes to new possibilities. I also want to thank all those in the open
science community – Peter Murray-Rust, Todd Vision, Heather Piwowar, Mark Hahnel,
Martin Fenner, Geoffrey Boulton, Jenny Molloy and so many more I’ve had the pleasure of
meeting in person. The energy and enthusiasm I drew from countless online discussions
on Facebook, Google+ and Twitter was truly inspirational.
For facilitating greater access to scientific literature I must heartily thank the Natural
History Museum, London library and archives, the #icanhazpdf community on Twitter,
Wikipaleo on Facebook, References Wanted on FriendFeed, and SciHub.
Without these additional literature access facilitators I would not have been able to read
half the sources I cite in this thesis.
I must thank my wife Avril for her patience with me especially during the write-up phase,
for allowing me to go away to all these amazing conferences abroad, and for tolerating all
those long nights into mid-morning when I was tapping away on my noisy keyboard.
Finally, I thank my family: Richard, Rosemary & Tara for repeatedly encouraging me to
finish my thesis – I got there in the end!

Hack4ac recap

July 9th, 2013 | Posted by rmounce in BMC | eLife | Hack days | Open Access | Open Data | Open Science | PeerJ | PLoS

Last Saturday I went to Hack4Ac – a hackday in London bringing together many sections of the academic community in pursuit of two goals:

  • To demonstrate the value of the CC-BY licence within academia. We are interested in supporting innovations around and on top of the literature.
  • To reach out to academics who are keen to learn or improve their programming skills to better their research. We’re especially interested in academics who have never coded before


The list of attendees was stellar, cross-disciplinary (inc. Humanities) and international. The venue (Skills Matter) & organisation were also suitably first-class – lots of power leads, spare computer mice, projectors, whiteboards, good wi-fi, separate workspaces for the different self-assembled hack teams, tea, coffee & snacks all throughout the day to keep us going, prizes & promo swag for all participants…

The principal organizers, Jason Hoyt (PeerJ, formerly at Mendeley) & Ian Mulvany (Head of Tech at eLife), thus deserve a BIG thank you for making all this happen. I hear this may also be turned into a fairly regular set of meetups too, which will be great for keeping up the momentum of innovation going on right now in academic publishing.

The hack projects themselves…

The overall winner of the day was ScienceGist as voted for by the attendees. All the projects were great in their own way considering we only had from ~10am to 5pm to get them in a presentable state.



This project was initiated by Jure Triglav, building upon his previous experience with Tiris. This new project aims to provide an open platform for post-publication summaries (‘gists’) of research papers, providing shorter, more easily understandable summaries of the content of each paper.

I also led a project under the catchy title of Figures → Data, whereby we tried to provide added value by taking CC-BY bar charts and histograms from the literature and attempting to re-extract the numerical data from those plots automatically using computer vision techniques. On my team for the day I had Peter Murray-Rust, Vincent Adam (of HackYourPhD) and Thomas Branch (Imperial College). This was handy because I know next to nothing about computer vision – I’m Your Typical Biologist™ in that I know how to script in R, perl, bash and various other things, just enough to get by but not nearly enough to attempt something ambitious like this on my own!
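
To give a flavour of the Figures → Data idea, here is a toy Python sketch. It is not the code from the hack day (that used OpenCV, MATLAB, C and AMI2); it assumes the chart has already been reduced to a clean binary grid in which `#` marks bar pixels, and that the y-axis scale (`px_per_unit`) is already known. The function names `bar_heights` and `to_data_values` are hypothetical.

```python
def bar_heights(pixels):
    """Find bars in a binary chart image.

    pixels: list of equal-length strings, '#' = bar ink, '.' = background,
            with row 0 at the top of the image.
    Returns a list of (first_column, last_column, height_in_pixels) per bar.
    """
    n_cols = len(pixels[0])
    # Count the ink pixels in every column.
    col_heights = [sum(1 for row in pixels if row[c] == "#") for c in range(n_cols)]

    bars = []
    start = None
    # A trailing 0 acts as a sentinel so a bar touching the right edge is closed.
    for c, h in enumerate(col_heights + [0]):
        if h > 0 and start is None:
            start = c                      # entering a bar
        elif h == 0 and start is not None:
            bars.append((start, c - 1, max(col_heights[start:c])))
            start = None                   # leaving a bar
    return bars


def to_data_values(bars, px_per_unit):
    """Convert pixel heights to data values using a known y-axis scale."""
    return [height / px_per_unit for _, _, height in bars]


# A tiny hand-made 'chart' with two bars, 2 and 4 pixels tall.
chart = [
    "........",
    "....##..",
    "....##..",
    ".##.##..",
    ".##.##..",
]

bars = bar_heights(chart)
values = to_data_values(bars, px_per_unit=2)  # ASSUMPTION: 2 pixels per data unit
print(bars)
print(values)
```

Real figures are much messier (axes, tick labels, error bars, anti-aliasing), which is where the computer vision work comes in.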

Forgive me the self-indulgence if I talk about this Figures → Data project more than I do the others but I thought it would be illuminating to discuss the whole process in detail…

In order to share links between our computers in real-time, and to share initial ideas and approaches, Vincent set up an etherpad here to record our notes. You can see the development of our collaborative note-taking using the timeslider function below (I did a screen record of it for posterity using recordmydesktop):

In this etherpad we document that there are a variety of ways in which to discover bar charts & histograms:

  • figuresearch is one such web-app that searches the PMC OA subset for figure captions & figure images. With this you can find over 7,000 figure captions containing the word ‘histogram’ (you would assume that the corresponding figure would contain at least one histogram for 99% of those figures, although there are exceptions).
  • figshare has nearly 10,000 hits for histogram figures, whilst BMC & PLOS can also be commended for providing the ability to search their literature stack by just figure captions, making the task of figure discovery far more efficient and targeted.

Jason Hoyt was in the room with us for quite a bit of the hack and clearly noted the search features we were looking for – just yesterday he tweeted: “PeerJ now supports figure search & all images free to use CC-BY (inspired by @rmounce at #hack4ac)” [link] – I’m really glad to see our hack goals helped Jason to improve content search for PeerJ to better serve the needs (albeit somewhat niche in this case) of real researchers. It’s this kind of unique confluence of typesetters, publishers, researchers, policymakers and hackers at doing-events like this that can generate real change in academic publishing.

The downside of our project was that we discovered someone’s done much of this before. ReVision: Automated Classification, Analysis and Redesign of Chart Images [PDF] was an award-winning paper at an ACM conference in 2011. Much of that work would have helped our idea, particularly the figure classification tech. Yet sadly, as with so much of ‘closed’ science, we couldn’t find any open source code associated with this project. There were comments that this type of non-code-sharing behaviour, blocking re-use and progress, is fairly typical in computer science & ACM conferences (I wouldn’t know but it was muttered…). If anyone does know of the existence of related open source code available for this project do let me know!

So… we had to start from a fairly low level ourselves: Vincent & Thomas tried MATLAB & C based approaches with OpenCV and their code is all up on our project github. Peter tried using the AMI2 toolset, particularly the Canny algorithm, whilst I built up an annotated corpus of 40 CC-BY bar charts & histograms for testing purposes. Results of all three approaches can be seen below in their attempts to simplify this hilarious figure about dolphin cognition from a PLOS paper:

The plastic fish just wasn't as captivating...

“Figure 5. Total time spent looking at different targets.” from Siniscalchi M, Dimatteo S, Pepe AM, Sasso R, Quaranta A (2012) Visual Lateralization in Wild Striped Dolphins (Stenella coeruleoalba) in Response to Stimuli with Different Degrees of Familiarity. PLoS ONE 7(1): e30001. doi:10.1371/journal.pone.0030001 CC-BY

Peter’s results (using AMI2):


Thomas’s results (OpenCV & C):


Vincent’s results (OpenCV & MATLAB & bilateral filtering):

We might not have won 1st prize but I think our efforts are pretty cool, and we got some laughs from our slides presenting our day’s work at the end (e.g. see below). Importantly, *everything* we did that day is openly available on github to re-use, re-work and improve upon (I’ll ping Thomas & Vincent soon to make sure their code contributions are openly licensed). Proper full-stack open science basically!

some figures are just awful


Other hack projects…

As I thought would happen, I’ve waffled on about our project. If you’d like to know more about the other projects, hopefully someone else will blog about them at greater length (sorry!). I’ve got my thesis to write y’know! ;)

You can find more about them all either on the Twitter channel #hack4ac or alternatively on the hack4ac github page. I’ll write a little bit more below, but it’ll be concise, I warn you!

  • Textmining PLOS Author Contributions

This project has a lovely website for itself and so needs no more explanation.

  • Getting more openly-licensed content on wikipedia

This group had problems with the YouTube API I think. Ask @EvoMRI (Daniel Mietchen) if you’re interested…

  • articlEnhancer

Not content with helping out the PLOS author contribs project Walther Georg also unveiled his own article enhancer project which has a nice webpage about it here:

  • Qual or Quant methods?

Dan Stowell & co used NLP techniques on full-text accessible CC-BY research papers, to classify all of them in an automated way determining whether they were qualitative or quantitative papers (or a mixture of the two). The last tweeted report of it sounded rather promising: “Upstairs at #hack4ac we’re hacking a system to classify research papers as qual or quant. first results: 96 right of 97. #woo #NLPwhysure” More generally, I believe their idea was to enable a “search by methods” capability, which I think would be highly sought-after if they could do it. Best of luck!
Apologies if I missed any projects. Feel free to post epic-long comments about them below ;)




This post was originally posted over at the LSE Impact blog where I was kindly invited to write on this theme by the Managing Editor. It’s a widely read platform and I hope it inspires some academics to upload more of their work for everyone to read and use.

Recently I tried to explain on Twitter in a few tweets how everyone can take easy steps towards open scholarship with their own work. It’s really not that hard and potentially very beneficial for your own career progress – open practices enable people to read & re-use your work, rather than let it gather dust unread and undiscovered in a limited access venue as is traditional. For clarity I’ve rewritten the ethos of those tweets below:

Step 1: before submitting to a journal or peer-review service upload your manuscript to a public preprint server

Step 2: after your research is accepted for publication, deposit all the outputs – full-text, data & code in subject or institutional repositories

The above is the concise form of it, but as with everything in life there is devil in the detail, and much to explain, so I will elaborate upon these steps in this post.

Step 1: Preprints

Uploading a preprint before submission is technically very easy to do – it takes just a few clicks, but the barrier that prevents many from doing this in practice is cultural and psychological. In disciplines like physics it’s completely normal to upload preprints to arXiv, and their submission to a journal in some cases has more to do with satisfying the requirements of the Research Excellence Framework exercise than any real desire to see it in a journal. Many preprints on arXiv get cited and are valued scientific contributions, even without them ever being published in a journal. That said, even within this community author perceptions differ as to the exact practice of when to upload a preprint in the publication cycle.

Within biology it’s relatively unheard of to upload a preprint before submission but that’s likely to change this year because of an excellent well-put article advocating their use in biology and the very many different outlets available for them. My own experience of this has been illuminating – I recently co-authored a paper openly on github and the preprint was made available with a citable DOI via figshare. We’ve received a nice comment, more than 250 views and a citation from another preprint. All before our paper has been ‘published’ in the traditional sense. I hope this illustrates well how open practices really do accelerate progress.

This is not a one-off occurrence either. As with open access papers, freely accessible preprints have a clear citation advantage over traditional subscription access papers.


Outside of the natural sciences the situation is also similar; Martin Fenner notes that in the social sciences (SSRN) and economics (RePEc) preprints are also common either in this guise, or as ‘working papers’ – the name may be different but the pre-submission accessibility is the same. Yet I suspect, like in biology, this practice isn’t yet mainstream in the Arts & Humanities – perhaps just a matter of time before this cultural shift occurs (more on this later on in the post…)?

There is one important caveat to mention with respect to posting preprints – a small minority of conservative, traditional journals will not accept articles that have been posted online prior to submission. You might well want to check SHERPA/RoMEO before you upload your preprint to ensure that your preferred destination journal accepts preprint submissions. There is also a growing grass-roots trend of campaigns to convince these journals that preprint submissions should be allowed, some of which have already succeeded.

If even much-loathed publishers like Elsevier allow preprints unconditionally, I think it goes to show just how uncontroversial preprints are. Prior to submission it’s your work and you can put it anywhere you wish.


Step 2: Postprints


Unlike with preprints, the postprint situation is a little trickier. Publishers like to think that they have the exclusive right to publish your peer-reviewed work. The exact terms will vary from journal to journal, depending on the copyright or licensing agreement you might have signed. Some publishers try to enforce ‘embargoes’ upon postprints, to maintain the artificial scarcity of your work and their monopoly of control over access to it. But rest assured, at some point, often just 12 months after publication, you’ll be ‘allowed’ to upload copies of your work to the public internet (again SHERPA/RoMEO gives excellent information with respect to this).

So, assuming you already have some form of research output(s) to show for your work, you’ll want these to be discoverable, readable and re-usable by others – after all, what’s the point of doing research if no-one knows about it? If you’ve invested a significant amount of time writing a publication, gathering data, or developing software, you want people to be able to read and use this output. All outputs are important, not just publications. If you’ve published a paper in a traditional subscription access journal, then most of the world can’t read it. But you can make a postprint of that work available, subject to the legal nonsense referred to above.

If it’s allowed, why don’t more people do it?

Similar to the cultural issues discussed with preprints, for some reason researchers on the whole don’t tend to use institutional repositories (IRs) to make their work more widely available. My IR at the University of Bath lists metadata for over 3300 published papers, yet, for various reasons, relatively few of those metadata records have a fulltext copy of the item deposited with them. Just ~6.9% of records have fulltext deposits, as published back in June 2011.

I think it’s because institutional repositories have an image problem: some are functional but extremely drab. I also hear of researchers full of disdain who say of their IRs (I paraphrase):

“Oh, that thing? Isn’t that just for theses & dissertations – you wouldn’t put proper research there”

All this is set to change though, as researchers are increasingly being mandated to deposit their fulltext outputs in IRs. One particularly noteworthy driver of change in this realm could be the newly-launched Zenodo service. Unlike Academia.edu or ResearchGate, which are for-profit operations and are really just websites in many respects, Zenodo is a proper repository: it supports harvesting of content via the OAI-PMH protocol, all metadata about the content is CC0, and it’s a not-for-profit operation. Crucially, it provides a repository for academics less well-served by the existing repository systems – not all research institutions have a repository, and independent or retired scholars also need a discoverable place to put their postprints. I think the attractive, modern look and the altmetrics to demonstrate impact will also add that missing ‘sex appeal’ to provide the extra incentive to upload.
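
To make the OAI-PMH point a little more concrete, here is a minimal Python sketch of what harvesting looks like from the harvester's side. It builds a ListRecords request URL and pulls identifiers and titles out of a response. The base URL (repository.example.org) and the sample response are made up for illustration and are not taken from Zenodo or any real repository; only the verb, the oai_dc metadata prefix and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented response for illustration only; not real repository data.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example postprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter("{%s}record" % OAI_NS):
        identifier = record.findtext(
            "{%s}header/{%s}identifier" % (OAI_NS, OAI_NS)
        )
        title = record.findtext(".//{%s}title" % DC_NS)
        records.append((identifier, title))
    return records


# Hypothetical endpoint; substitute the base URL your repository documents.
url = list_records_url("https://repository.example.org/oai")
records = parse_titles(SAMPLE_RESPONSE)
```

A real harvester would fetch the URL over HTTP and follow resumption tokens for large result sets; both are left out here to keep the sketch short. The point is simply that any third party can do this against an OAI-PMH repository, which is what makes the content discoverable beyond the site itself.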


Providing Access to Your Published Research Data Benefits You

A new preprint on PeerJ shows that papers with associated open research data have a citation advantage. Furthermore, other research has shown that willingness to share research data is related to the strength of the evidence and the quality of the results. Traditional repository software was designed around handling metadata records and publications, and it doesn’t tend to be great at storing or visualizing research data. But a new development in this arena is the use of CKAN software for research data management. Originally CKAN was developed by the Open Knowledge Foundation to help make open government data more discoverable and usable; the UK, US, and governments around the world now use this technology to make data available. Now research institutions like the University of Lincoln are also using it for research data management, and like Zenodo the interface is clean, modern and provides excellent discoverability.


Repositories are superior for enabling discovery of your work

Even though I use Academia.edu & ResearchGate myself, they’re not perfect solutions. If someone is looking for your papers, or a particular paper that you wrote, these websites do well in making your output discoverable from a simple Google search. But interestingly, for more complex queries, these simple websites don’t provide good discoverability.

An example: I have a fulltext copy of my Nature letter on Academia.edu, but it can’t be found from Google Scholar, whereas the copy in my institutional repository at Bath can. This is the immense value of interoperable and open metadata. Academics would do well to think closely about how this affects the discoverability of their work online.

The technology for searching across repositories for freely accessible postprints isn’t as good as I’d want it to be. But repository search engines like BASE, CORE and Repository Search are improving day by day. Hopefully, one day we’ll have a working system where you can paste in a DOI and it’ll take you to a freely available postprint copy of the work; Jez Cope has an excellent demo of this here.
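
The paste-in-a-DOI idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Jez Cope's demo or any of the search engines above actually work: it assumes you already hold a list of harvested repository records, each with a `doi` and an optional `fulltext_url` field (both field names are hypothetical), and the DOIs and URLs below are invented.

```python
import re

# Loose check for the common "10.<registrant>/<suffix>" DOI shape.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly paste along with a DOI.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

# Invented records for illustration; field names are hypothetical.
SAMPLE_RECORDS = [
    {"doi": "10.1234/example.5678",
     "fulltext_url": "https://repository.example.org/id/eprint/1/"},
    {"doi": "10.1234/example.9999", "fulltext_url": None},
]


def normalise_doi(raw):
    """Strip resolver prefixes and lowercase (DOIs are case-insensitive)."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    doi = doi.strip().lower()
    if not DOI_PATTERN.match(doi):
        raise ValueError("not a recognisable DOI: %r" % raw)
    return doi


def build_index(records):
    """Map each DOI to the fulltext URLs deposited for it."""
    index = {}
    for record in records:
        if record.get("fulltext_url"):
            key = normalise_doi(record["doi"])
            index.setdefault(key, []).append(record["fulltext_url"])
    return index


def find_postprints(raw_doi, index):
    """Return the known fulltext URLs for a pasted DOI, or an empty list."""
    return index.get(normalise_doi(raw_doi), [])


index = build_index(SAMPLE_RECORDS)
hits = find_postprints("doi:10.1234/EXAMPLE.5678", index)
```

Records that only carry metadata (no fulltext) are left out of the index on purpose, since an abstract-only record is no help to a reader who wants the paper itself.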

Open scholarship is now open to all

So, if there aren’t any suitable fee-free journals in your subject area (1), you find you don’t have funds to publish a gold open access article (2), and you aren’t eligible for an OA fee waiver (3), fear not. With a combination of preprint & postprint postings, you too can make your research freely available online, even if it has the misfortune to be published in a traditional subscription access journal. Upload your work today!