Show me the data!

Elsevier seem to have responded to my criticism yesterday and have stopped selling the article “HIV infection en route to endogenization: two cases” from their ScienceDirect website. Take what you will from that change, but I infer that they have realised that they are in the wrong.

Actually, they are still selling it from the ScienceDirect website too. It only looked freely available to me because I myself had paid for access to it & I guess a cookie remembered me. It’s still on sale at as well as

Further update: As of 2015-03-09 17:13 the articles were finally freely available, ‘unchained’(?) from behind Elsevier’s paywalls.


So I was very surprised to find, when I woke up this morning (2015-03-07), that this article and many other CC-licensed articles in that journal are still being sold via other Elsevier-owned websites, e.g. the one below:


I couldn’t believe my eyes, so just to make sure they really were still illegally selling this article that shouldn’t be sold, I made another test purchase:


I heard back from Didier (the corresponding author) yesterday. He does not know why Elsevier are selling his article, nor did he give them permission to.

Elsevier (RELX Group) have been doing this for many years now: selling open access articles that authors/funders have paid to make freely available to everyone. Peter Murray-Rust, Mike Taylor and others have written about this extensively.

It is little wonder then that Elsevier is the most boycotted academic publishing company in the world: nearly 15,000 researchers have publicly declared they want nothing to do with this company.

I am yet to receive a refund or an apology. Alicia Wise did tweet me this:

“.@emckiernan13 .@TomReller .@rmounce the journal is in transition from Wiley to Elsevier; will check on transition status” but it is of little help…

Will I get my money back? I hope so…

[Update 2015-03-13: I have blogged further about this here and provided a recap here. This post has been viewed over 10,000 times. Clearly some people want to sweep this under the carpet and pretend this is just ‘a storm in a teacup’ but it did happen and people do care about this. Thanks to everyone who spread the word.]

Today, Elsevier (RELX Group) illegally sold me a Creative Commons Attribution-NonCommercial-NoDerivatives licensed article:

Colson, P. et al. HIV infection en route to endogenization: two cases. Clin Microbiol Infect 20, 1280-1288 (2014).

I’m really not happy about it. I don’t think the research funders will be happy about it either. Especially not the authors (who are the copyright holders here).

Below is a screenshot of how the content was illegally on offer for sale, for $31.50 + tax.


To investigate whether it really was on sale, I decided to make a test purchase. Just to be absolutely sure. Why not? The abstract looked interesting. The abstract was all I was allowed to read. I wanted to know more.

Below is the email receipt I received confirming my purchase of the content. I have crudely redacted my postal address but it’s otherwise unaltered:


So what’s the problem here?

The article was originally published online by Wiley. As clearly indicated in the document, the copyright holders are the authors. The work was licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0).

The terms of this widely used license clearly state: “You may not use the material for commercial purposes.”

Wiley respect this license. They make this content freely available on their website here. The authors, or their research funder or institution probably paid Wiley money to make sure that the article could be made freely available to the world.

But tonight, Elsevier were selling it to me and all the world via their ScienceDirect platform.
This is clearly an illegal copyright infringement.

I have tweeted Elsevier employees @wisealic & @TomReller to see how I can get a refund for my purchase at the very least. This article should never have been on sale.

I have also contacted the corresponding author (Didier) to see what his thoughts are.
I do hope the authors will take legal action against Elsevier for their criminal misdeeds here.

Open Research London launch event

January 27th, 2015 | Posted by rmounce in Open Access | Open Science

Last week, on Monday 19th January, I co-organised the first ever Open Research London event at Imperial College London, with the help of local organisers Jon Tennant & Torsten Reimer.


We invited two speakers for our first meeting:

They both gave excellent talks, which were recorded on Imperial’s ‘Panopto’ recording system. We hoped to make these available for viewing/download as soon as possible. [Update: the recordings are now publicly available! CB’s talk is available to stream here & download here, JMcA’s talk is available to stream here & download here.]


We had lots of free swag to give away to attendees, including PLOS t-shirts, notebooks, USB sticks and ‘How Open Is It?’ guides, as well as SPARC and OA Button stickers & badges – they seemed to go down well. I kept some swag back for the next event too, so if you didn’t get what you wanted this time, there will be more next time!

The speakers were kind enough to publicly post their slide-decks before their talks so you can alternatively catch-up with their content on Slideshare.

Chris Banks’ slides are embedded below:

Joe McArthur’s slides are below here:

I’ll refrain from naming names for the sake of privacy but what I most enjoyed about the event was the diversity of attendees. We had people who were ‘curious’ about Open Access and wanted to know more. We had a new PhD student, we had midway PhD students, librarians, open access publishers, and more… I believe one attendee might even have travelled back to Brighton after the event! In terms of affiliations, we had attendees from Jisc, The Natural History Museum London, Imperial College (two different campuses represented!), UCL, The National Institute for Medical Research (MRC), and AllTrials.

I was also mightily impressed that nearly all the attendees, including both speakers, happily joined us in the student union (Eastside) afterwards for discussions & networking over drinks – a real sense of community here, I think.

Can we do better next time? Sure we can, we must! Attendance was lower than I had hoped for but several people kindly messaged me afterwards to let me know they wanted to be there but couldn’t. I’ve no doubt that with warmer weather we’ll be able to double our attendance.


The next ORL meetup will be in mid or late March at UCL, further details TBC. 

Keep up-to-date with ORL via Twitter @OpenResLDN or our OKFN community group page:


I’m actively in the process of trying to grow the organising/steering committee for ORL. At the moment it’s just myself, Liz I-S and Jon Tennant. If you’re passionate about open research, open access, open data, reproducible research, citizen science, diversity in research, open peer-review etc… then get in contact with me:

I would love to have an OC that more broadly represents the variety of the open research community in London :)


Until next time…



[Update: I’ve submitted this idea as a FORCE11 £1K Challenge research proposal 2015-01-13. I may be unemployed from April 2015 onwards (unsolicited job offers welcome!), so I certainly might find myself with plenty of time on my hands to properly get this done…!]

Inspired by something I heard Stephen Curry say recently, and with a little bit of help from Jo McIntyre I’ve started a project to compare EuropePMC author manuscripts with their publisher-made (mangled?) ‘version of record’ twins.

How different are author manuscripts from the publisher version of record? Or, to put it another way, what value do publishers add to each manuscript? With the aggregation & linkage provided by EuropePMC – an excellent service – we can rigorously test this.


In this blog post I’ll go through one paper I chose at random from EuropePMC:

Sinha, N., Manohar, S., and Husain, M. 2013. Impulsivity and apathy in Parkinson’s disease. J Neuropsychol 7:255-283. doi: 10.1111/jnp.12013 (publisher version) PMCID: PMC3836240 (EuropePMC version)


A quick & dirty analysis with a simple tool that’s easy to use & available to everyone:

pdftotext -layout     (you’re welcome to suggest a better method by the way, I like hacking PDFs)

(P) = Publisher-version , (A) = Author-version

Manual Post-processing – remove the header and footer crud from each e.g. “262
Nihal Sinha et al.” (P) and “J Neuropsychol. Author manuscript; available in PMC 2013 November 21.” (A)
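This manual step could probably be partly automated. Below is a minimal sketch, assuming the running headers/footers always look like the two examples quoted above; the function name strip_running_lines and the file names in the usage note are hypothetical, and other papers would need their own patterns.

```shell
# Hypothetical helper: drop running header/footer lines from pdftotext output.
# Reads text on stdin, writes the filtered text to stdout.
# The two patterns are assumptions based on the examples quoted above:
#   (P) an optional page number followed by "Nihal Sinha et al."
#   (A) the "Author manuscript; available in PMC ..." footer
strip_running_lines() {
  grep -v -E \
    -e '^[[:space:]]*([0-9]+[[:space:]]+)?Nihal Sinha et al\.[[:space:]]*$' \
    -e 'Author manuscript; available in PMC'
}
```

Usage would be something like: strip_running_lines < publisher.txt > publisher.clean.txt. A page number sitting alone on its own line is deliberately not matched, because a bare number could just as easily be table content.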

Automatic Post-processing – I’m not interested in numbers, punctuation or words of three letters or fewer, so I applied this bash one-liner:

strings "$inputfile" | tr '[A-Z]' '[a-z]' | sed 's/[[:punct:]]/ /g' | sed 's/[[:digit:]]/ /g' | tr ' ' '\n' | awk 'length > 3' | sort | uniq -c | sort -nr > "$outputfile"

Then I just manually diff’d the resulting word lists – there’s so little difference it’s easy for this particular pair.
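For anyone wanting to skip the manual diff, here is a rough sketch of how the comparison could be scripted. It is not what I actually ran: it leaves out strings and the uniq -c counts from the one-liner above, and only reports which distinct words appear in one version but not the other. The function names word_list and compare_word_lists are made up for illustration.

```shell
# Hypothetical helper: print the distinct lowercase words longer than
# 3 letters found in a text file, one per line, sorted.
word_list() {
  tr 'A-Z' 'a-z' < "$1" \
    | sed 's/[[:punct:]]/ /g; s/[[:digit:]]/ /g' \
    | tr -s '[:space:]' '\n' \
    | awk 'length > 3' \
    | LC_ALL=C sort -u
}

# Hypothetical helper: list words found in only one of the two versions.
# $1 = author-version text (A), $2 = publisher-version text (P)
compare_word_lists() {
  a_words="$(mktemp)"
  p_words="$(mktemp)"
  word_list "$1" > "$a_words"
  word_list "$2" > "$p_words"
  # comm needs both inputs sorted the same way, hence LC_ALL=C throughout.
  LC_ALL=C comm -23 "$a_words" "$p_words" | sed 's/^/(A) only: /'
  LC_ALL=C comm -13 "$a_words" "$p_words" | sed 's/^/(P) only: /'
  rm -f "$a_words" "$p_words"
}
```

Called as compare_word_lists author.txt publisher.txt (file names are placeholders), this prints one line per word that is unique to either version. Because it ignores word counts, a word that merely occurs more often in one version will not show up; comparing the uniq -c output would be needed for that.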



The correspondence line changed slightly from this in the author version:

Correspondence should be addressed to Nuffield Department of Clinical Neurosciences and Department Experimental Psychology, Oxford University, Oxford OX3 9DU, UK ( . (A)

To this in the publisher version (I’ve added bold-face to highlight the changes):

Correspondence should be addressed to Masud Husain, Nuffield Department of Clinical Neurosciences and Department Experimental Psychology, Oxford University, Oxford OX3 9DU, UK (e-mail: (P)


Reference styling has been changed. Why, I don’t know; it seems a completely pointless change. Either style seems perfectly functional to me tbh. It went from this in the author version:

Drijgers RL, Dujardin K, Reijnders JSAM, Defebvre L, Leentjens AFG. Validation of diagnostic criteria for apathy in Parkinson’s disease. Parkinsonism & Related Disorders. 2010; 16:656–660. doi:10.1016/j.parkreldis.2010.08.015. [PubMed: 20864380] (A)

to this in the publisher version:

Drijgers, R. L., Dujardin, K., Reijnders, J. S. A. M., Defebvre, L., & Leentjens, A. F. G. (2010). Validation of diagnostic criteria for apathy in Parkinson’s disease. Parkinsonism & Related Disorders, 16, 656–660. doi:10.1016/j.parkreldis.2010.08.015 (P)

In the publisher version (P) only, “Continued” has been added below some tables to acknowledge that they overflow onto the next page. Arguably the publisher has made the tables worse, as they’ve put them sideways (landscape) so they now overflow onto other pages. In the author version (A) they are portrait-orientated, and hence each fits entirely on one page.


Finally, and most intriguingly, some of the figure-text comes out only in the publisher-version (P). In the author-version (A) the figure text is entirely image pixels, not copyable text. Yet the publisher version has introduced some clearly imperfect figure text. Look closely and you’ll see in some places e.g. “Dyskinetic state” of figure 2 c) in (P), the ‘ti’ has been ligatured and is copied out as a theta symbol:

DyskineƟc state
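Oddities like this one can be hunted down semi-automatically. The sketch below is an assumption-laden starting point, not a ligature detector: it simply lists every line of the extracted text that contains a byte outside plain ASCII, so legitimate characters such as en dashes and curly quotes get flagged too and the output still needs eyeballing. The function name find_non_ascii_lines is hypothetical, and the -a flag is a GNU/BSD grep extension rather than POSIX.

```shell
# Hypothetical helper: print "line-number:line" for every line that contains
# a byte which is neither printable ASCII nor ordinary whitespace.
# Takes file names as arguments, or reads stdin if none are given.
# LC_ALL=C makes grep work byte-by-byte; -a stops it treating the input
# as binary.
find_non_ascii_lines() {
  LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$@"
}
```

Running it on both the (A) and (P) text and comparing the two outputs should narrow things down to characters that only one version introduced.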




I don’t know about you, but for this particular article, it doesn’t seem like the publisher has really done all that much aside from add their own header & footer material, some copyright stamps & their journal logo – oh, and ‘organizing peer-review’. How much do we pay academic publishers for these services? Billions? Is it worth it?

I plan to sample at least 100 ‘twinned’ manuscript-copies and see what the average difference is between author-manuscripts and publisher-versions. If the above is typical of most then this will be really bad news for the legacy academic journal publishers… Watch this space!


Thoughts or comments as to how to improve the method, or relevant papers to read on this subject are welcome. Collaboration welcome too – this is an activity that scales well between collaborators.