Show me the data!

Worldwide Dissemination

March 22nd, 2013 | Posted by rmounce in Conferences | Content Mining | Open Access | Open Data | Travel

In the last 2 weeks I’ve given talks in Brussels & Amsterdam.

The first one was given during a European Commission (Brussels) working group meeting on Text & Data Mining. There were perhaps only ~30 people in the room for that.

The second presentation was given just a few days ago at Beyond The PDF 2 (#btpdf2) in Amsterdam.

I uploaded the slides from both of these talks to Slideshare just before or after I gave each talk to help maximize their impact. Since then they’ve had nearly 1000 views according to my Slideshare analytics dashboard.

It’s not just the view count I’m impressed with. The global reach is pretty cool too (see below, created with BatchGeo):

View My Slideshare Impact 08/Mar/2013 to 22/Mar/2013 in a full screen map

Now obviously, these view counts don’t mean that every viewer went through all the slides, and a minority of the views will be bots crawling the web, but still I’m pretty pleased. Imagine if I hadn’t uploaded my Content Mining presentation to the public web: I would have travelled all the way to Brussels and back again (in the same day!) for the benefit of *just* ~30 people (albeit rather important people!). Instead, over 800 people have had the opportunity to view my slides, from all over the world (although, admittedly, mostly just the US & Europe).

The moral of this short story: upload your slides & tweet about them whenever you give a talk!
You may not appreciate just how big your potential audience could be. Something academics sceptical of Open Access should perhaps think about?

Particular thanks should go to @openscience for helping disseminate these slides far and wide. In just the first 60 minutes after release this Wednesday, thanks to @openscience and others, my PDF metadata slidedeck got over 100 views!

Next step… must work on getting these stats into an ImpactStory widget for the next version of my CV!

I just got forwarded this email. (Names, Dates & email addresses have been removed or replaced).

I’m extremely concerned about this and thus am republishing this email to draw attention to it. Wiley are really pushing their expensive hybrid Open Access option ‘Online Open’, which does not represent value for money in my opinion: it’s US$3000 for most journals, which is rather a lot.

Of course they are welcome to advertise this option but it’s rather disingenuous in my opinion to make NO mention whatsoever to authors that there are other means of compliance e.g. the ‘green’ route to open access (self-archiving). If we’re not careful some UK academics may assume they must publish via the gold open access route to be RCUK-compliant, especially if they are bombarded with emails like this from all the major corporate publishers! Not cool…

——– Original Message ——–
Subject: Journal X complies with the Open Access policies of RCUK and Wellcome Trust
From: Person at Wiley Journal X
To: Person X


Dear Person X:

As an author who has submitted a paper to Journal X we wanted to let you know that from 1st April 2013 you will be given the choice to publish under a Creative Commons ‘Attribution’ license (CC-BY) license when using OnlineOpen, the open access option for the journal. This will ensure that if you are funded by Research Councils UK (RCUK) or the Wellcome Trust you can continue to comply with their open access policies. The option to publish Online Open is offered post-acceptance, outside the peer-review process.

The Research Councils UK (RCUK) and The Wellcome Trust (WT) have recently announced new open access policies, effective from 1 April 2013. Both policies state that to be compliant, journals must offer a “pay to publish” (gold OA) option. When an article publication charge is paid the policies also mandate the use of the Creative Commons ‘Attribution’ license (CC-BY). The CC-BY license allows others to modify, build upon and/or distribute the licensed work (including for commercial purposes) as long as the original author is credited.

Are you funded by RCUK or the Wellcome Trust?

To comply with your funder’s open access policies now and beyond 1st April 2013 you can select Wiley OnlineOpen.

OnlineOpen offers:

1. Publication in your first choice journal
2. Open access to articles: freely available on Wiley Online Library, PubMed Central and UKPMC
3. Authors retain copyright and get the choice to publish under a CC-BY License
4. Compliance with requirements of the Wellcome Trust, RCUK and the other UKPMC Funders (see for list of funders.)

To keep up-to-date, please use the following link to sign up to receive future OnlineOpen emails:

Covering the cost of open access

With OnlineOpen the author, their funding agency, or institution pays a fee to ensure that the article is made open access. WT and RCUK are providing UK research institutions with funds to pay for article publication charges via a block allocation to support their new open access policies. This is in addition to the funding provided by the UK government to the top 30 UK research-intensive institutions to help cover the article publication charges associated with open access publishing.

In addition, Wiley has set up payment accounts with a number of UK institutions. Authors affiliated with, or funded by, an organization that has a Wiley Open Access Account can publish without directly paying any publication charges:

It is therefore advised that authors funded by the RCUK or the Wellcome Trust check with their institution regarding this available funding to pay for OnlineOpen.

More information about OnlineOpen, CC-BY licence and open access policies of the RCUK and the Wellcome Trust can be found on these websites:

I’ve been quoted in a Nature News story about Open Access journal licencing.

I’m a staunch defender of the use of the Creative Commons Attribution licence, as it’s a good licence for academic research.

Here’s just some of what I sent Richard Van Noorden (Nature News) by email. I don’t blame him for only using select quotes. But I do feel much of this provides additional useful context, so I am republishing it here for everyone to read:

I believe RCUK want their research publications to be made available under the CC BY licence because it allows *anyone* to re-use them. That specifically includes commercial organisations. This is a good thing. Academic researchers aren’t good at commercializing their research. I for one would be delighted if someone could make money out of my research publications. I already get paid by RCUK to do research. I don’t need more money from licensing royalties on something I could have written 50 years ago (remember copyright law in many jurisdictions has extended protection to the life of the author plus 70 years!). I do research to find new knowledge and help the scientific community and society as a whole. I know many other researchers also have this philosophy about their work. It is a privilege to be given public funds with which to perform exciting research. Furthermore as RCUK fully fund my research, why should *I* have control over access to the outputs of that research? As far as I’m concerned if they funded the work, they have the right to dictate how it is published to ensure maximum benefit as they see fit. Researchers who carry out RCUK funded research have the right to be formally acknowledged as people who made these discoveries, and this is ensured and protected by the BY module. By mandating the CC BY license for gold OA articles, RCUK are ensuring maximum benefit from the money they may pay for the publication of it (but note that not all gold OA journals charge an APC. There are many excellent high-quality fee-free gold OA journals and I would encourage authors to publish in these good outlets).

Obviously, please link to my chart if you wish, the newest version is here:

You can even republish it if you wish, without even asking my permission. All content on my blog unless otherwise indicated is made available for re-use under the Creative Commons Attribution Licence :)

(I cannot, for obvious reasons, guarantee that this plot is still correct. Prices change all the time. I have data to show that across 97 BMC journals the mean increase in APCs from 2012 to 2013 was just over 5%.)

Furthermore, journals can change the licence under which they publish. I alerted Mike Taylor that Acta Palaeontologica Polonica was not using a Creative Commons licence to publish. He in turn contacted an editor about this, and now the journal happily publishes all new articles under CC BY. Simple as that. Changing licences is an easy process that costs journals nothing.

I suspect many free access journals and authors who publish in them would see no problem in granting full open access with CC BY. I suspect they don’t currently do this only because they are not aware of the problems this causes to those that wish to re-use content. Copyright law in many countries and jurisdictions unfortunately requires permission to be sought to re-use works (e.g. textmining, format shifting, printing-off copies for educational use in the classroom) even if they are freely (gratis) available to read on the internet.

This ‘free’ only provides ocular access as Jan Velterop terms it. Open Access as defined by the original Budapest Open Access Initiative statement

permits any users to “…read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.”

This statement was recently reaffirmed …with pretty much exactly the same definition as originally given.

Thus only articles made available under licences that are compliant with that definition are truly Open Access. One such licence that is compliant with the definition of Open Access is the Creative Commons Attribution Licence (CC BY) but it may not be the only compliant licence.

The Creative Commons Attribution Non-commercial licence (CC BY-NC) is not compliant with the definition of Open Access because it prevents commercial uses of such licenced material – BOAI clearly states *any* users. Note that even non-profit companies and charities can be prevented from re-using content by this – if there is commerce involved (e.g. donations, advertising) re-use is blocked in this setting. Many people who use this licence think they are just blocking use by for-profit companies but it is much wider than this.

I have a project running at the moment to get the licencing details for the 985 journals featured in Jevin West et al’s recent cost-effectiveness of open access plot. These are a selection of just those high-quality (Thomson Reuters JCR ranked) free access journals. The vast majority of these use CC BY. This set of JCR-ranked journals seemed like a fair sample to me.

Of those 985 free journals, most of the ones scored so far (over 500) use CC BY (the survey is not complete yet; work is still in progress). Examples:

American Physical Society journals

AOSIS OpenJournals

BioMed Central journals

European Geosciences Union journals

Frontiers journals

Genetics Society of America

Hindawi journals

MDPI journals

Pensoft journals

PLOS journals

SAGE journals

Springer journals

Versita journals

Wiley journals

+ many society & very small publisher journals

Thus, whether by number of journal titles or by article volume, CC BY is the most used license. (Given that the article volume of BMC + PLOS + Hindawi + MDPI + Frontiers + Pensoft is significant, it’s also likely to absolutely dwarf the number of articles put out by the non CC BY journals. That’s a safe estimate.)

Across these journals the exclusive use of CC BY-NC is rare: only 19 journals amongst the 620 scored so far (not including those where it is offered as a choice of licence). These 19 are mostly Brazilian, which is notable and odd (even though I’ve been doing this alphabetically it’s still significant):



Revista Brasileira de Psiquiatria

Sao Paulo Medical Journal

South African Journal of Surgery

Brazilian Journal of Biology

BMJ Open

Acta Botanica Brasilica

Horticultura Brasileira


Revista Brasileira de Ciência do Solo

Revista Brasileira de Fruticultura

Brazilian Journal of Infectious Diseases

Journal of the Brazilian Society of Mechanical Sciences and Engineering

International Brazilian Journal of Urology


Jornal de Pediatria

Revista Brasileira de Política Internacional

Revista Brasileira de Farmacognosia/ Brazilian Journal of Pharmacognosy

CC BY-NC-ND users:


Journal of Toxicologic Pathology

NATL INST SCIENCE COMMUNICATION journals (Indian, 10 of them)

CC BY-NC-SA users:

CBE Life Sciences Education

Journal of Engineering Technology

Journal of Microbiology & Biology Education


Medknow journals (14 journals)

Sadly, there are also a significant number of journals that do not indicate any kind of Creative Commons license. One alarming example is the CDC journal ‘Emerging Infectious Diseases’. It is lamentable that content in this important free-to-read medical research journal requires permission to be sought to re-use and/or textmine. In these ambiguous re-use cases one must assume the default state of “All Rights Reserved”: even though the PDF is free (gratis) to view, permission must be sought for anything else.

source data:

There are many examples of such unintended problems caused by the NC license module detailed in this excellent publication recently translated from the original German by members of the Open Knowledge Foundation:

Consequences, Risks, and side-effects of the license module Non-Commercial – NC

Such NC content cannot be used in Wikipedia or newspapers

Educators that charge their pupils fees cannot use NC content without permission

CC BY-NC is incompatible with CC BY-SA content. No mashups, remixes, or combinations of these (and btw Wikipedia publishes its content under CC BY-SA so incompatibility is a BIG PROBLEM). CC BY content is compatible with CC BY-SA.

Many blogs are ad-supported. These generate income and thus, no matter how little, are classed as commerce, so NC content cannot be reused without permission here either.

“It is also commercial use if an image is printed in a book that is published by a publishing house, entirely independent of whether the author receives a remuneration or possibly even has to pay a printing fee to make the publication possible. The publishing house acts with a commercial interest in either case.”

“…NC restrictions are most minutely heeded where their consequences are least intended.”

“Am I ready to act against the commercial use of my content? If not, you should consider not to use the NC module in the first place”

See also this ZooKeys paper for problems with NC:


Just a quick post.

I happened to see @wisealic Tweet about her “new Atira/Pure colleagues” yesterday. I didn’t know what Atira was, but I’d heard of PURE.

I googled it to find out more… and soon found the official Elsevier press release, dated August 15, 2012 (so this isn’t really new news). But combined with recent rumours it does worry me. Elsevier own perhaps a fifth of the academic literature; whatever the true figure, it’s a significant share. Despite the research that went into most of those papers being publicly or charitably funded, Elsevier now rent access to this work back to us (the world) for vast sums of money each and every year.

Not to mention the fake journals they published, the arms dealings their parent company (Reed Elsevier) was involved in, their initial support for the RWA (since withdrawn), the megabundling of journals, the non-provision of open bibliographic metadata (even NPG release this!), and the obscene profit margins (to be fair, they’re not the only corporate publisher making a killing here by selling freely provided academic work). There are 1001 reasons why, and this isn’t an exhaustive list of all the evils…

So Elsevier are not a well-loved company in academia at the moment – more than 13,000 people have signed a boycott of them.

There are rumours that Elsevier are in talks to buy Mendeley at the moment. And Atira/PURE, now part of the Elsevier (Umbrella?) corporation, are I think the exclusive(?) providers of the research information ‘management’ systems that the UK will be using for its next Research Excellence Framework (REF, formerly RAE) exercise in 2014.

So… Elsevier own a significant portion of our papers, and they may soon own a significant chunk of the bibliographic metadata stored by academics (Mendeley data) and all the commercial insight and advantage that gives, AND they own the company that is managing the data that evaluates UK academics, and more around the world no doubt.

I do wonder if there isn’t a significant conflict of interest if thousands of UK academics have publicly boycotted Elsevier and now their academic work is going to be evaluated by… Elsevier. Academic jobs thoroughly depend on the results of these evaluations as I understand it, and heads will roll if the results at an institution are below expectations.

From a purely business perspective many financial analysts would rightly applaud these acquisitions as “good business moves” (good for profits no doubt). But from an ethical standpoint? Elsevier now seem to have a worrying empire of services built around academia and a significant amount of data which presumably they can pool together from each of these different services to gain additional insight? They also have a very poor record when it comes to providing open data. Why are we still giving them our data so easily – they’re only going to rent it back to us at a later date?

To me it’s clear: we’re giving up far too much of our data to this company and they do not have our best interests at heart. Shareholder profits are by definition their primary goal. They have a sizeable monopoly on academic data in all its forms, which they can and do leverage, and I suspect we’re going to be made to pay for this mistake in the future, as we have with hugely inflated journal subscription prices.

Is it just me that’s worried?