Show me the data!

Text mining for museum specimen identifiers

May 19th, 2015 | Posted by rmounce in Content Mining - (1 Comment)

Now I’m at the Natural History Museum, London I’ve started a new and ambitious text-mining project: to find, extract, publish, and link-up all mentions of NHM, London specimens published in the recent research literature (born digital, published post-2000).

Rod Page is already blazing a trail in this area with older BHL literature. See: Linking specimen codes to GBIF & Design Notes on Modelling Links for recent, relevant posts. But there’s still lots to be done I think, so here’s my modest effort.

 

Why?

It’s important to demonstrate the value of biological specimen collections. A lot of money is spent cataloguing, curating and keeping these specimens safe. It would be extremely useful to show that these specimens are being used, at scale, in real, recent research — it’s not just irrelevant stamp collecting.

Sometimes the NHM, London specimen catalogue has incorrect, incomplete or outdated data about its own specimens – there is better, newer data about them in the published literature that needs to be fed back to the museum.

An example: specimen “BMNH 2013.2.13.3” is listed in the online catalogue on the NHM open data portal as Petrochromis nov. sp. By searching the literature for BMNH specimens, I happened to find the paper in which this specimen was described as a new species: http://dx.doi.org/10.1007/s10228-014-0396-9 as Petrochromis horii Takahashi & Koblmüller, 2014. It’s also worth noting this specimen has associated nucleotide sequence data on GenBank here: http://www.ncbi.nlm.nih.gov/nuccore/AB850677.1

Having talked a lot about the 5 stars of open data in the context of research data recently, I wonder… wouldn’t it be really useful to make 4 or 5 star linked open data around biological specimens? From Rod Page, I gather this is part of the grand goal of creating a biodiversity knowledge graph.

For this project, I will be focussing on linking BMNH (NHM, London) specimen identifiers with publication identifiers (e.g. DOIs) and GenBank accession numbers.

 

What questions to ask?

Where have NHM, London specimens been used/published? What are the most used NHM, London specimens in research? How does NHM, London specimen usage compare to other major museums such as the AMNH (New York) or MNHN (Paris)?

Materials for Mining

1.) The PubMedCentral Open Access subset – a million papers, but mainly biomedical research.
2.) Open Access & free access journals that are not included in PMC
3.) figshare – particularly useful if nothing else, as a means of mining PLOS ONE supplementary materials (I read recently that essentially 90% of figshare is actually PLOS ONE supp. material! See Table 2)
4.) select subscription access journals – annoyingly hard to get access to in bulk, but important to include as sadly much natural history research is still published behind paywalls.

 

(very) Preliminary Results

The PMC OA subset is fantastic & really facilitates this kind of research – I wish ALL of the biodiversity literature was aggregated the way (some of) the open access biomedical literature is. You can literally just download a million papers, click, and go do your research. It facilitates rigorous research by allowing full machine access to full texts.

Simple grep searches for ‘NHMUK’ & ‘BMNH [A-Z0-9][0-9]’, two of the commonest forms in which specimens are cited, reveal many thousands of possible specimen mentions in the PMC OA subset, which I must now look through to clean up & link up. In terms of journals, these ‘hits’ in the PMC OA subset come from (in no particular order): PLOS ONE, Parasites & Vectors, PeerJ, ZooKeys, Toxins, Zoo J Linn Soc, Parasite, Frontiers in Zoology, Ecology & Evolution, BMC Research Notes, Biology Letters, BMC Evolutionary Biology, Aquatic Biosystems, BMC Biology, Molecular Ecology, Journal of Insect Science, Nucleic Acids Research and more…!
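To give a flavour of the method, the search step can be sketched in a few lines of Python; the two regular expressions mirror the grep patterns above, and the corpus directory layout (a folder of .xml full texts) is a hypothetical assumption:

```python
import re
from pathlib import Path

# Patterns mirroring the grep searches above: an institutional prefix
# followed by a catalogue number.
SPECIMEN_PATTERNS = [
    re.compile(r"NHMUK\s?[A-Z0-9][0-9.]+"),
    re.compile(r"BMNH\s[A-Z0-9][0-9.]+"),
]

def find_specimen_mentions(text):
    """Return all candidate specimen codes found in one document's text."""
    hits = []
    for pattern in SPECIMEN_PATTERNS:
        hits.extend(pattern.findall(text))
    return hits

def scan_corpus(directory):
    """Scan every .xml full text under `directory` (hypothetical layout)."""
    results = {}
    for path in Path(directory).rglob("*.xml"):
        hits = find_specimen_mentions(path.read_text(errors="ignore"))
        if hits:
            results[path.name] = hits
    return results
```

In practice the candidate hits still need the manual clean-up mentioned above: loose patterns like these will also catch near-misses that aren’t specimen codes at all.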

Specimen “BMNH 86.10.4.2” is a great example to look up and link up on the NHM Open Data Portal: http://data.nhm.ac.uk/specimen/8559613e-f2a3-447c-aa1a-d476600d3293 – the catalogue record has 7 associated images openly available under CC BY, so I can liven up this post by including an image of the specimen (below)! I found this specimen used in a PLOS ONE paper: Walmsley et al. (2013) Why the Long Face? The Mechanics of Mandibular Symphysis Proportions in Crocodiles. doi: 10.1371/journal.pone.0053873 (in the text caption for figure 1, to be precise).

© The Trustees of the Natural History Museum, London. Licensed for reuse under CC BY 4.0. Source.

 

 

Questions Arising

How to find and extract mentions of NHM, London specimens in papers published in Science, Nature & PNAS? There are sure to be many! I’m assuming the last 15 years’ worth of research published in these journals will be difficult to scrape – they would be quite likely to block my IP address if I tried. Furthermore, in these journals all the actual science is typically buried in supplementary file PDFs, not in the ‘main’ short article. Will Science, Nature & PNAS let me download all their supplementary material from the last 15 years? Is this facilitated at all? How do people actually do rigorous research when the contents of supplementary data files published in these journals are so undiscoverable & inaccessible to search?

 

It’s clear to me there are many separate divisions when it comes to discoverability of research. There’s the divide between open access (highly discoverable & searchable) and subscription access (less discoverable, less searchable, depending upon publisher-restrictions). There’s also the divide between the ‘paper’ (more searchable) and ‘supplementary materials’ (less easily searchable). Finally, there’s also the divide between textual and non-textual media: a huge amount of knowledge in the scientific literature is trapped in non-textual forms such as figure images which simply aren’t instantly searchable by textual methods (figure captions DO NOT contain all of the information of the figure image! Also, OCR is time consuming and error-prone especially on the heterogeneity of fonts and orientation of words in most figures). For example, looking across thousands of papers with phylogenetic analyses published in the journal IJSEM, 95% of the taxa / GenBank accessions used in them are only mentioned in the figure image, nowhere else in the paper or supplementary materials as text! This needs to change.

 

As should be obvious by now, this is a very preliminary post, just to let people know what I’m doing and what I’m thinking. In my next post I’ll detail some of the subscription access journals I’ve been text mining for specimens, and the barriers I’ve encountered when trying to do so.

 

Bonus question: How should I publish this annotation data?

Easiest would be to release all annotations as a .csv on the NHM open data portal with 3 columns where each column mimics ‘subject’  ‘predicate’ ‘object’ notation: Specimen, “is mentioned in”, Article DOI.
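As a sketch, writing that 3-column CSV with Python’s standard csv module might look like this (the example rows reuse the specimen-to-DOI links discussed above):

```python
import csv

# Each row mimics subject / predicate / object notation.
rows = [
    ("BMNH 2013.2.13.3", "is mentioned in", "10.1007/s10228-014-0396-9"),
    ("BMNH 86.10.4.2", "is mentioned in", "10.1371/journal.pone.0053873"),
]

with open("specimen_mentions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Specimen", "Predicate", "Article DOI"])
    writer.writerows(rows)
```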

But if I wanted to publish something a little better & a little more formal, what kind of RDF vocabulary can I use to describe “occurs in” or “is mentioned in”. What would be the most useful format to publish this data in so that it can be re-used and extended to become part of the biodiversity knowledge graph and have lasting value?
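For illustration only, here is the same assertion rendered as an N-Triples line. The predicate URI below is a made-up placeholder, precisely because choosing the right vocabulary term is the open question here:

```python
# Build an N-Triples line by hand; the predicate URI is a hypothetical
# placeholder, not an established vocabulary term.
def mention_triple(specimen_uri, article_doi):
    predicate = "http://example.org/vocab/isMentionedIn"  # placeholder
    return "<%s> <%s> <https://doi.org/%s> ." % (specimen_uri, predicate, article_doi)

triple = mention_triple(
    "http://data.nhm.ac.uk/specimen/8559613e-f2a3-447c-aa1a-d476600d3293",
    "10.1371/journal.pone.0053873",
)
```

Whatever term replaces the placeholder, keeping the subject and object as resolvable URIs (portal specimen pages, DOIs) is what would lift this from 3-star to 4/5-star linked open data.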

Making a journal scraper

May 13th, 2015 | Posted by rmounce in Content Mining - (5 Comments)

Yesterday, I made a journal scraper for the International Journal of Systematic and Evolutionary Microbiology (IJSEM).

Fortunately, Richard Smith-Unna and the ContentMine team have done most of the hard work in creating the general framework with quickscrape (open-source and available on github); I just had to modify the available journal-scrapers to work with IJSEM.

How did I do it?

Find an open access article in the target journal, e.g. James et al. (2015) Kazachstania yasuniensis sp. nov., an ascomycetous yeast species found in mainland Ecuador and on the Galápagos

In your browser, view the HTML source of the full-text page; in Chrome/Chromium the keyboard shortcut to do this is Ctrl-U. You should then see something like this, perhaps with less funky highlighting colours:

I based my IJSEM scraper on the existing set of scraper definitions for eLife because I know both journals use similar underlying technology to create their webpages.

The first bit I clearly had to modify was the extraction of publisher. In the eLife scraper this works:

but at IJSEM that information isn’t specified with ‘citation_publisher’, instead it’s tagged as ‘DC.Publisher’ so I modified the element to reflect that:

The license and copyright information extraction is even more different between eLife and IJSEM, here’s the correct scraper for the former:

and here’s how I changed it to extract that information from IJSEM pages:

The XPath needed is completely different. The information is inside a div, not a meta tag.
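The contrast can be sketched with Python’s standard-library ElementTree; the snippet below is invented, illustrative markup, not copied from either journal’s real pages:

```python
import xml.etree.ElementTree as ET

# A cut-down, illustrative IJSEM-style page fragment (not real site markup).
page = """<html>
  <head>
    <meta name="DC.Publisher" content="Society for General Microbiology"/>
  </head>
  <body>
    <div class="license">This is an Open Access article.</div>
  </body>
</html>"""

root = ET.fromstring(page)

# The publisher lives in a meta tag, addressed via its name attribute...
publisher = root.find(".//meta[@name='DC.Publisher']").get("content")

# ...whereas the licence text is the content of a div, so the XPath
# needed to reach it is completely different.
licence = root.find(".//div[@class='license']").text
```

quickscrape itself expresses these as XPath strings inside a JSON scraper definition; the point here is just that one value hides in a meta tag’s attribute while the other is the text of a div.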

 

Hardest of all, though, were the full-size figures and the supplementary materials files – they’re not directly linked from the full-text HTML page, which is rather annoying. Richard had to help me out with these by creating “followables”:

In his words:

any element can ‘follow’ any other element in the elements array, just by adding the key-value pair "follow": "element_name" to the element that does the following. If you want to follow an element, but don’t want the followed element to be included in the results, you add it to a followables array instead of the elements array. The followed array must capture a URL.
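Translating that description into a sketch (shown as a Python dict for illustration; real quickscrape definitions are JSON, and every element name and selector below is invented):

```python
# Invented element names and selectors, purely to illustrate the shape
# described above: "figures_page" captures a URL and sits in "followables"
# so it is not itself included in the results; the "fulltext_figures"
# element then follows it via the "follow" key.
scraper_sketch = {
    "followables": {
        "figures_page": {
            "selector": "//a[@class='fig-link']/@href",  # must capture a URL
        }
    },
    "elements": {
        "fulltext_figures": {
            "selector": "//img[@class='fig-full']/@src",
            "follow": "figures_page",
            "download": True,
        }
    },
}
```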

 

 

The bottom line is: it might look complicated initially, but actually it’s not that hard to write a fully-functioning journal scraper definition for use with quickscrape. I’m off to go and create one for Taylor & Francis journals now :)

 

Wouldn’t it be nice if all scholarly journals presented their content on the web in the same way, so we didn’t have to write a thousand different scrapers to download it? That’d be just too helpful wouldn’t it?

 

 

Agreements between authors and publishers

May 9th, 2015 | Posted by rmounce in Open Access - (8 Comments)

April Clyburne-Sherin asked an interesting question on the OpenCon Discussion List recently:

I am an author on a manuscript that my lab wants to publish in a subscription journal that normally retains the copyright. The manuscript is a desirable one so they are “willing” (haha) to provide it “open access” (that was my stipulation to my lab when they started speaking with the publisher). My lab is happy with this, but I do not trust the publisher and want to be able to negotiate a publishing agreement that guarantees:
  • We retain the copyright;
  • The article will be open access forever and no version will be behind a paywall at their journal ever;
  • That there are no sign-ins, registrations, DRM viewing issues, or other ‘free’ obstacles to viewing the article.

Comment: Quite rightly, April does not trust the publisher to make the published work fully open access in perpetuity, and wants to do more as an author, with the publishing agreement (a formal contract) to ensure that the publisher will actually provide the exact services she wants.

Recent events this year, whereby Elsevier, Wiley and Springer have all been caught red-handed selling access to hybrid open access articles, justify this lack of trust. It’s a sad state of affairs that authors such as April & myself no longer trust some service providers to actually provide the services we pay them for (e.g. Open Access).

Some helpful links & pointers have been provided on the discussion list, and as this may be a concern many other scholarly authors share, it’s valuable to collate, discuss and publicise possible solutions to the thorny problem of publishing agreements with legacy publishers. I certainly don’t pretend to have all the answers here, and I think organisations like SPARC might want to act on this one.

Lorraine Chuen links to the Canadian Association of Research Libraries (CARL) ‘Resources for Authors’ page which amongst other things discusses the Canadian SPARC Author Addendum. I knew about the US SPARC Author Addendum, but I never knew there was a Canadian version too!

Matt Menzenski links to the University of Kansas Authors & Copyright page. I particularly like An Introduction to Publication Agreements for Authors (Armstrong, 2009) that they link to at the very top – it’s really useful information.

My Suggested Solutions

For my part, I chipped in with four different ways that, each in their own way, partially or wholly fulfil some or all of the criteria April is looking for:

1.) Wait for them to send you their proposed publishing agreement & change the terms to ones you find agreeable

If they send you a standard CTA (Copyright Transfer Agreement) form as PDF, you can modify the wording of that PDF to terms you prefer and send it back to them and they probably won’t even notice as long as it’s signed & doesn’t look too different. It’s cheeky, but I got away with it for a book chapter once. Be careful to remove / replace the term ‘work for hire’ – it may look like an innocuous statement but apparently this is fairly key in legal terms – I neglected to remove that from my book chapter agreement.

 

2.) Transferring your copyright away to another person
Not as easy perhaps for multi-author papers but Mike Taylor has a good (successful-ish) anecdote about transferring his copyright to his spouse, thereby preventing the Geological Society from taking the copyright of the work.

 

3.) Claim that one of the authors is a US federal government employee
Use Section 105 of the US Copyright Act by pretending that at least one of the authors is an employee of the US Government. Works of the U.S. federal government cannot be copyrighted by their authors in the US – they must be public domain, which is in practice achieved by applying the Creative Commons Zero waiver to the paper. The CTA form may contain a check box asking about this. If not, just email them about it. Michael Eisen famously, successfully liberated a NASA space research paper from behind a paywall at Science (AAAS), using Section 105 as justification.
Will publishers really bother fact-checking your assertion about the employment of one of the authors? I don’t think so. It could land them in big trouble if they dare disregard the US Copyright Act.

 

4.) Simply do not sign, or do not return the unfavourable publishing agreement
Another risky approach is simply not to sign or not to return the CTA the publisher sends you after acceptance (with the obvious risk that this could delay publication). I think this is perhaps the most promising approach; there is strong evidence that many academics currently employ this practice. When you think about it, publishers actually need our papers or they’ll go bust. They need a constant stream of content to justify their existence. If you don’t sign off on their stipulated terms and conditions after acceptance, they have real pressure to get on and publish the paper anyway, especially with the increased focus on optimising submission-to-publication times these days.

 

I’ll let Reinhard Diestel (mathematician, University of Hamburg) have the last word on this post, it’s a solution I’m keenly interested in trying myself:
“I stopped signing away my copyright on journal papers in the late 1990s. Interestingly, almost all publishers reacted either positively or not at all when I did not return the copyright form signed as requested: in all cases did they print the paper in question, usually without additional delay, and sometimes with unexpected understanding and support. (Yes, there have been one or two cases where things were a little more difficult at first, but these too were resolved amicably in the end.)” — http://www.math.uni-hamburg.de/home/diestel/copyright.html

 

Roughly ten days after I first blogged about this (see: Springer caught red-handed selling access to an Open Access article), Springer have now made a curious public statement acknowledging this debacle:

Statement on Annals of Forest Science article


Berlin, 6 May 2015

A number of tweets posted by Prof. Luis Apiolaza on 27 April, and by others active on social media, suggest that Springer is charging for access to open access articles published in Annals of Forest Science. After looking into this issue, there is indeed an issue with the status of the article, but this has to do with the background of the journal itself.

Annals of Forest Science is a journal owned by INRA (Institut National de la Recherche Agronomique). In 2009, when the article in question first appeared, the journal was being published by another company that allowed readers to read the articles without paying a fee (“free access”). When Springer started working with INRA in 2011 we agreed to add the 2007-2010 archives to SpringerLink, Springer’s online platform, in order to ensure a smooth transition and to give a wider distribution to the most recent articles. Since the copyright was not assigned to the author, and since there is no mention of the licensing used, we incorrectly assumed that the article was not open access.

It is clear that this article was intended to be open access, and it will be made so on SpringerLink as quickly as possible. Anyone that has purchased the article will, of course, be reimbursed.

Please note that we support Green Open Access and we feed all articles from INRA journals to the HAL repository after the 12-month embargo, making the articles freely downloadable there (this is clearly written on the journal’s webpage, with a link to the HAL platform). The article in question can also be found there for free (since 2011).

This has been an oversight, and we apologize for not being more thorough and vigilant.

Contact

Ruth Francis | Springer | Corporate Communications
tel +44 203192 2732 | ruth.francis@springer.com

—————END———————-

I am pleased that Springer are committing to reimbursing all (reader) purchasers of wrongly-paywalled articles, and I shall check my bank balance regularly in the coming weeks to see if they honour this promise.

I am also pleased that Springer see fit to formally apologize for their careless publishing. I note that, AFAIK, neither Wiley nor Elsevier have apologised for similar incidents this year.

But I’m rather bemused by this wording they have chosen: “It is clear that this article was intended to be open access, and it will be made so on SpringerLink as quickly as possible”

Indeed it seems they chose this wording carefully, because as far as I can tell with my browser, Luis’s open access article is still on sale (see screenshot below).

Update: As of 2015-07-05 13:20 (BST) the article is now no longer paywalled. At the time of writing, as can be seen below it was clearly paywalled.

screenshot

 

Springer SBM as an entity makes nearly a billion euros per year in turnover. Despite its considerable size, wealth and ‘experience’ in publishing, Springer can’t seem to unpaywall Luis’s article. Astonishing.

Today, the author of a paid-for, ‘hybrid’ open access article published in 2009 found that it was wrongly on sale at a Springer website:

FWIW it’s still freely available at the original publisher website here.

To test if Springer really were just brazenly selling a copy of the exact same open access article, I paid Springer to access a copy myself (screenshot below) and found it was exactly the same:

my receipt

I don’t actually care whether this is technically ‘legal’ any more. That doesn’t matter. This is scammy publishing. I want a refund and I will be contacting Springer shortly to ask for this. The author also hopes I get a refund – he wanted his article to be open access, not available for a ransom:

 

Frankly, I’m getting tired of writing these blog posts, but it needs to be done to record what happened, because it keeps on happening.

I really think we need to set up a PaywallWatch.com, cf. RetractionWatch.com, to monitor and report on these types of incidents. It’s clear the publishers don’t care about this issue themselves – they get extra money from readers by making these ‘mistakes’ and face no financial penalty if anyone does spot them. Calculated indifference.

Are these known incidences just the tip of the iceberg? How do we know this isn’t happening at a greater scale, unobserved? There are more than 50 million research articles on sale at the moment. Perhaps in small part this explains the obscene profits of the legacy publishers?

It’s yet another nail in the coffin for hybrid OA – we simply can’t trust these publishers to keep this content open and paywall-free.

A recap of recent incidents of selling open access articles, without the publisher acknowledging to the reader/buyer that it is an open access article:

Springer (April, 2015) this post

Wiley (March, 2015) link

Elsevier (March, 2015) link

Elsevier (2014) link

[Update 5.30PM 2015-03-26: Wiley have now ‘freed’ the wrongly-paywalled articles in response to this. It doesn’t change the fact that these articles were wrongly on sale for 2 months and 26 days. They have also wrongly sold access to these articles.]

Wiley are currently (3PM 2015-03-26) charging for access to thousands of articles that should be free to access.

They have recently (legitimately) taken control of a journal called Limnology and Oceanography from the Association for the Sciences of Limnology and Oceanography (ASLO). The association makes clear in its guidelines for the journal that all articles are placed into Free Access after three years (source).

Yet today, I see that Wiley is selling access to articles from Limnology and Oceanography for $45.60 USD (inc. tax). I know this because I bought access to an article myself. Screenshot at the bottom. In fact, volumes 41 (1996) back to 1 (1956), consisting of thousands of articles, are currently on sale at Wiley.

Some Questions:

How many times has Wiley sold access to articles from this journal that are greater than three years old: i.e. articles that should be free to read?

Did the Association for the Sciences of Limnology and Oceanography (ASLO) give them permission to sell access to articles that are more than three years old?

What do the authors think about access to their work being sold for $45.60 per article?

What do society members of the Association for the Sciences of Limnology and Oceanography (ASLO) think about this?

Will Wiley apologise for doing this? Elsevier hasn’t yet for a similar incident.

Will the society get fiscal compensation for this mishandling of their material?

Is this acceptable? IMO it is not: it is outrageous that Wiley are selling access to thousands of articles that should be freely available. There should be a full and open investigation into this. Relevant organisations like OASPA and UKSG should step in here. This cannot keep happening.

can this be sold?

Page View Spikes on Research Articles

March 24th, 2015 | Posted by rmounce in Open Data - (1 Comment)

For those who know me as a biologist, it might come as a surprise that my most cited publication so far is on Open Access and Altmetrics (published in April 2013, 25 cites and counting…) — nothing to do with biology per se!

So I took great interest in this new publication:

Wang, X., Liu, C., Mao, W., and Fang, Z. 2015. The open access advantage considering citation, article usage and social media attention. Scientometrics, pp. 1-10. DOI: 10.1007/s11192-015-1547-0

The authors have gathered some really fascinating data measuring day-by-day altmetrics of papers at the journal Nature Communications, which at the time was hybrid: some articles were behind a paywall, others were paid-for open access at a cost of $5200 to the authors/funders. (The cost of open access here is an absolute rip-off. I do not endorse or recommend outrageously priced paid-for open access outlets like Nature Communications. PLOS ONE costs just $1350, remember! PeerJ is just $99 per author!)

The paper is by no means perfect – I’m not saying it is – but the ideas behind it are good. Many on Twitter have commented that it’s ironic that this paper on the open access advantage is itself only available behind a paywall at the publisher.

The good news is, Dr Xianwen Wang has responded to this and has made an ‘eprint’ copy (stripped of all publisher branding) freely available at arXiv as of 2015-03-19 (post-publication). The written English throughout the manuscript is not brilliant, but I feel this reflects poorly on the journal rather than the authors – it’s remarkable that Scientometrics can charge subscribers a fee if it offers no copy-editing of accepted manuscripts! Finally, technical detail on precisely how the data was obtained is rather lacking. So that’s the critique out of the way…

My tweets about this paper have been very popular e.g.

But I wanted to dig deeper into the data, so I emailed the corresponding author, Xianwen, for a copy of the data behind figure 2, and he happily and quickly sent it to me. I was fairly shocked (in a good way) that he sent the data. Most of the email requests for data I’ve sent in the past have been ultimately unsuccessful. This is well documented in the field of phylogenetics *sad face*. The ’email the author’ system simply cannot be relied upon, and is one of many reasons why I feel all non-sensitive data supporting research should be made publicly available, alongside the article, on the day of publication.

I did my own re-analysis of the raw data Xianwen sent over, and discovered there were lots of odd jumps in the data which couldn’t really be explained by peaks in social media activity, e.g. for A cobalt complex redox shuttle for dye-sensitized solar cells with high open-circuit potentials (visualized below). ~520 days after it was first published, in one single day it apparently accumulated 21,577 page views! There was also a smaller spike of 2,000 page views earlier.

Article View Spikes

Xianwen had filtered these suspicious jumps out of his figures but neglected to mention that in the methods section, so upon informing him of this discrepancy he’s told me he’s going to contact the editor to sort it out. A great little example of how data sharing results in improved science? The unfiltered data looks a little bit like the plot below:

Anyway, back to the spikes/jumps in activity – they certainly aren’t an error introduced by the authors of the paper – they can also be seen via Altmetric (a service provider of altmetrics). The question is: what is causing these one-day spikes in activity?
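Whatever the cause, such spikes are easy to flag programmatically. A minimal sketch, using made-up daily counts rather than the real dataset: treat any day whose views exceed some multiple of the median daily count as suspicious.

```python
def flag_spikes(daily_views, factor=10):
    """Return indices of days whose view count exceeds `factor` x the median."""
    counts = sorted(daily_views)
    median = counts[len(counts) // 2]
    threshold = max(1, median) * factor
    return [i for i, v in enumerate(daily_views) if v > threshold]

# Made-up daily counts with one obvious one-day spike.
views = [120, 95, 110, 130, 21577, 105, 98]
suspicious_days = flag_spikes(views)
```

The threshold factor is an arbitrary choice here; the real filtering criterion Xianwen used isn’t stated in the paper, which is exactly the methods-section gap noted above.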

I have alerted the team at Altmetric, and they have/will alert Nature Publishing Group to investigate further.

Most of the spikes are likely to be accidental in cause but it would be good to know more. A downloading script gone awry? But there is still a possibility that within this dataset there is putative evidence for deliberate gaming of altmetrics, specifically: article views. I look forward to hearing more from Altmetric and Nature Publishing Group about this… the ball is very much in their court right now.

Moreover, now that these peculiar spikes have been detected; what, if anything, should be done about it?

How to Block Readcube and Why

March 19th, 2015 | Posted by rmounce in Generation Open | Hack days | Open Science - (8 Comments)

Wiley & Readcube have done something rather sneaky recently, and it’s not escaped the attention of diligent readers of the scientific literature.

excellent facebook comment

On the article landing pages for some, if not all(?), journal articles at Wiley, in JavaScript-enabled web browsers they’ve replaced all links to download the PDF file of the article with links that direct you to Readcube instead.

This is incredibly annoying – they are literally forcing us to use Readcube. That is not cool.

Some will rush to the defence of Readcube and point out that, if it detects you have the rights, you can download the PDF from within Readcube – but that’s missing the point. No-one need waste precious time while Readcube takes ages to load in a browser tab, when all they wanted in the first place was the PDF.

What Readcube provides IS NOT EVEN PDF. It’s a mishmash of JavaScript, HTML and DRM technology. Thus when Wiley has icons saying “get PDF” they’re lying. Clicking the “get PDF” link does NOT send you to the PDF. It sends you to Readcube’s proprietary, rights-restricted mock-up of a PDF.

It doesn’t even render the figure images properly, sometimes missing important bits e.g. this figure (below):
cubeFAIL

Luckily there’s a simple solution: you can block Readcube in your browser settings and get simple, direct one-click access to PDF files again by selectively disabling JavaScript on all Readcube-infected websites e.g. onlinelibrary.wiley.com, nature.com and link.springer.com

Firefox users

Install the add-on called YesScript and ‘blacklist’ all Readcube-tainted websites.

Google Chrome / Chromium users

This browser is so clever you don’t even need to install anything new. Selective JavaScript blacklisting of websites is an in-built function:

A) Click the menu button in the top right hand corner of your browser
B) Select Settings
C) (scroll to bottom) Click Show advanced settings
D) Underneath the “Privacy” section, click the “Content settings” button.
E) Under the “Javascript” section, click “Manage Exceptions” and add at least these three Readcube-infected websites: onlinelibrary.wiley.com, nature.com and link.springer.com (example screenshot below)

javascript-chrome

Safari users

I haven’t tested this but the JavaScript Blocker extension looks like it should do the job.

Internet Explorer users

I’m tempted to say: install Chrome or Firefox but I’m well aware that some unfortunate academics have ‘university-managed’ computers on which they can’t easily install things. If so try the instructions for IE here. Let me know if you have better solutions for unfortunate IE users.

Before (left) and After (right) disabling JavaScript on the page.


Added bonus function – extra privacy!

Would you want advertisers to be collecting data on you, knowing what you’ve been reading? It’s possible, though not proven AFAIK, that the journal publishers themselves, or the advertisers they use, are recording information about what articles you’re reading. They might know you read that article about average penis length three times last week, for instance… Eric Hellman wrote quite an alarming post about the extent of this tracking at publisher websites recently. Thus blocking JavaScript at publisher websites provides extra privacy, not just protection against Readcube!

Above all I think we should #BlockReadcube not just for our own utility (easier access to the real PDF), but to send them a powerful message: we do not want the literature to be assimilated and enclosed in rights-restrictions by new technology. We do not want non-consenting ‘cubification of the research literature. We are Starfleet, and as far as I’m concerned: Readcube is the Borg.

assimilate-readcube

PS If you like some of the features of Readcube, try Utopia Docs – it’s free and it’s released under an Open Source license, and it doesn’t force you to use it!

Update 2015-03-20: This post does not indicate I’m suddenly ‘in favour’ of PDFs, by the way, as some seem to have interpreted. If Wiley wanted to do something good, they should publish their full-text XML on site like other good publishers do, e.g. PLOS, eLife, Hindawi, MDPI, Pensoft, BMC, Copernicus… If they did this then readers could choose to use innovative open source viewing software such as the eLife Lens. That kind of change would add value & choice, rather than subtracting value (& rights) as they have in this case.

Further discussion of Readcube and rights-restrictions:

http://biochemistri.es/post/104293317361/nature-and-its-extended-family-of-journals-were

http://rossmounce.co.uk/2014/12/02/beggar-access/