Show me the data!

This article is cross-posted from the main Open Knowledge Foundation blog, where I occasionally post. I realise I haven’t had time to post here on my own blog for over a month now(!), so I may well copy across a few more posts I’ve written for OKF.

Here at the Open Knowledge Foundation, we know Open Science is tough, but ultimately rewarding. It requires courage & leadership to take the open path in science.

Nearly a week ago on the open-science mailing list we started putting together a list of established scientists who have in some way or another made significant contributions to open science or lent their esteemed reputation to calls for increased openness in science. Our open list now has over 130 notable scientists, among whom 88 are Nobel prize winners.

In an interesting parallel development, the White House has just put out a call to help identify “Open Science” Champions of Change — outstanding individuals, organizations, or research projects promoting and using open scientific data for the benefit of society.


Anyone can nominate an Open Science candidate for consideration by May 14, 2013.

What more proof do we need that open science is both good, and valued in society? This marks a tremendous validation of the open science movement. The US government is not seeking to reward just any scientist; only open scientists actively working to change the world for the better will win this recognition.

We’re still a long way from Open Science being the norm in science. But perhaps now, we’re a crucial step closer to important widespread recognition that Open Science is good, and could be the norm in the future. We eagerly await the unveiling of the winning Open Science champions at the White House on the 20th June later this year.

My final repost today (edited) from the Open Knowledge Foundation blog. It’s a little old, originally posted on the 16th of April, 2013 but I think it definitely deserves to be here on my blog as a record of my activities…

So… it’s over.

For the past twelve months I was immensely proud to be one of the first Open Knowledge Foundation Panton Fellows, but that has now come to an end (naturally). In this post I will try and recap my activities and achievements during the fellowship.


The broad goals of the fellowship were to:

  • Promote the concept of open data in all areas of science
  • Explore practical solutions for making data open
  • Facilitate discussions surrounding the role and value of openness
  • Catalyse the open community, and reach out beyond its traditional core

and I’m pleased to say that I think I achieved all four of these goals with varying levels of success.



Outreach & Promotion – I went to a lot of conferences, workshops and meetings during my time as a Panton Fellow to help get the message out there. These included:


At all of these I made clear my views on open data and open access, and ways in which we could improve scientific communication using these guiding principles. Indeed I was more than just a participant at all of these conferences – I was on stage at some point for all, whether it was arguing for richer PDF metadata, discussing data re-use on a panel or discussing AMI2 and how to liberate open phylogenetic data from PDFs.

One thing I’ve learnt during my fellowship is that just academic-to-academic communication isn’t enough. In order to change the system effectively, we’ve got to convince other stakeholders too, such as librarians, research funders and policy makers. Hence I’ve been very busy lately attending broader policy-centred events like the Westminster Higher Education Forum on Open Access, the Open Access Royal Society workshop and the Institute of Historical Research Open Access colloquium.

Again, here in the policy-space my influence has been international not just domestic. For example, my trips to Brussels, both for the Narratives as a Communication Tool for Scientists workshop (which may help shape the direction of future FP8 funding), and the ongoing Licences For Europe: Text and Data Mining stakeholder dialogue have had real impact. My presentation about content mining for the latter has garnered nearly 1000 views on Slideshare and the debate as a whole has been featured in widely-read news outlets such as Nature News. Indeed I’ve seemingly become a spokesperson for certain issues in open science now. Just this year alone I’ve been asked for comments on ‘open’ matters in three different Nature features; on licensing, text mining, and open access from an early career researcher point-of-view – I don’t see many other UK PhD students being so widely quoted!

Another notable event I was particularly proud of speaking at and contributing to was the Revaluing Science in the Digital Age invite-only workshop organised jointly by the International Council for Science & Royal Society at Chicheley Hall, September 2012. The splendour was not just in the location, but also the attendees too – an exciting, influential bunch of people who can actually make things happen. The only downside of such high-level international policy is the glacial pace of action – I’m told, arising from this meeting and subsequent contributions, a final policy paper for approval by the General Assembly of ICSU will likely only be circulated in 2014 at the earliest!



The most exciting outreach I did for the fellowship was the set of ‘general public’ opportunities that I seized to get the message out to people beyond the ‘ivory towers’ of academia. One such event was the Open Knowledge Festival in Helsinki, September 2012 (pictured above). Another was my participation in a radio show broadcast on Voice of Russia UK radio with Timothy Gowers, Bjorn Brembs, and Rita Gardner explaining the benefits and motivation behind the recent policy shift to open access in the UK. This radio show gave me the confidence & experience I needed for the even bigger opportunity that was to come next – at very short notice I was invited to speak on a live radio debate show on open access for BBC Radio 3 with other panellists including Dame Janet Finch & David Willetts MP! An interesting sidenote is that this opportunity may not have arisen if I hadn’t given my talk about the Open Knowledge Foundation at a relatively small conference, Progressive Palaeontology in Cambridge, earlier that year – it pays to network when given the opportunity!



The fellowship may be over, but the work has only just begun!

I have gained significant momentum and contacts in many areas thanks to this Panton Fellowship. Workshop and speaking invites continue to roll in, e.g. next week I shall be in Berlin at the Making Data Count workshop, then later on in the month I’ll be speaking at the London Information & Knowledge Exchange monthly meet and the ‘Open Data – Better Society’ meeting (Edinburgh).

Even completely independent of my activism, the new generation of researchers in my field are discovering for themselves the need for Open Data in science. The seeds for change have definitely been sown. Attitudes, policies, positions and ‘defaults’ in academia are changing. For my part I will continue to try and do my bit to help steer this in the right direction: towards intelligent openness in all its forms.

What Next?

I’m going to continue working closely with the Open Knowledge Foundation as and when I can. Indeed for 6 months starting this January I agreed to be the OKF Community Coordinator, Open Science before my postdoc starts. Then when I’ve submitted my thesis (hopefully that’ll go okay), I’ll continue on in full-time academic research with funding from a BBSRC grant I co-wrote partially out in Helsinki(!) at the Open Knowledge Festival with Peter Murray-Rust & Matthew Wills, that has subsequently been approved for funding. This grant proposal, which I’ll blog further about at a later date, comes as a very direct result of the content mining work I’ve been doing with Peter Murray-Rust for this fellowship using AMI2 tools to liberate open data. Needless to say I’m very excited about this future work… but first things first I must complete and submit my doctoral thesis!

So, the new RCUK open access policy is now in play… and guess what – there’s plenty of journals out there that are not accommodating it at the moment. Perhaps this is just out of ignorance? Perhaps this is an area where a little nudge from interested parties e.g. open access advocates, RCUK-funded academics, and other concerned people might help?

With this aim I have just emailed the editorial board of the Taylor & Francis journal Systematics and Biodiversity to let them know that their journal is not currently RCUK-compliant (see screenshot from FACT below).

Systematics & Biodiversity is not compliant

Dear Elliot, (and British-based members of the Systematics and Biodiversity editorial board)

As you may know, Research Councils UK (RCUK) have instituted a new open access policy to further the dissemination and re-use of all RCUK-funded research. Included within RCUK are BBSRC, MRC, NERC & STFC.
Further details here

This policy came into force on 1st April 2013.

All papers to be published in future, arising from RCUK-funded research must now comply with this new RCUK policy or they will not be REF-able. Publishing in a non-compliant journal may also adversely affect future grant applications.

The University of Nottingham has made a helpful website to let people check before they submit their manuscripts – which journals are RCUK policy compliant, and which journals are NOT compliant.
It is called the ‘Funders & Authors Compliance Tool’ (FACT).

I was browsing this website and discovered that your journal: ‘Systematics and Biodiversity’ (ISSN: 1477-2000) is NOT compliant with the new RCUK policy, for researchers with BBSRC, MRC, NERC or STFC funding.

This may deter many British academics from submitting manuscripts to your journal…
May I suggest you discuss this at your next editorial board meeting?

The simplest route to compliance would be to talk to your publisher – Taylor & Francis, point out the issue, and convince them to either:

* allow all RCUK-funded research to be published under the Creative Commons Attribution Licence (CC BY)
* allow the Accepted version of RCUK-funded articles to appear in open access repositories after a 6 month embargo

It may also interest you to note that an entire editorial board recently resigned from a Taylor & Francis journal over a similar dispute relating to open access & licensing. I’m sure you need not necessarily take such drastic action.

All the best,


I’ll let you know as and when I get a response to this email.

Perhaps you know of a journal in your field that you’d also like to see offer RCUK-compliant publishing options? Check with the FACT tool, and let that editorial board know – they may be able to do something about it.

In the last 2 weeks I’ve given talks in Brussels & Amsterdam.

The first one was given during a European Commission (Brussels) working group meeting on Text & Data Mining. There were perhaps only ~30 people in the room for that.

The second presentation was given just a few days ago at Beyond The PDF 2 (#btpdf2) in Amsterdam.

I uploaded the slides from both of these talks to Slideshare just before or after I gave each talk to help maximize their impact. Since then they’ve had nearly 1000 views according to my Slideshare analytics dashboard.

It’s not just the view count I’m impressed with. The global reach is also pretty cool too (see below, created with BatchGeo):

View My Slideshare Impact 08/Mar/2013 to 22/Mar/2013 in a full screen map

Now obviously, these view counts don’t mean that every viewer went through all the slides, and a minority of the views are bots crawling the web, but still I’m pretty pleased. Imagine if I hadn’t uploaded my Content Mining presentation to the public web? I would have travelled all the way to Brussels and back again (in the same day!) for the benefit of *just* ~30 people (albeit rather important people!). Instead, over 800 people have had the opportunity to view my slides, from all over the world (although, admittedly mostly just US & Europe).

The moral of this short story: upload your slides & tweet about them whenever you give a talk!
You may not appreciate just how big your potential audience could be. Something academics sceptical of Open Access should perhaps think about?

Particular thanks should go to @openscience for helping disseminate these slides far and wide. During just a 60 minute period, upon first release, thanks to @openscience and others my PDF metadata slidedeck got over 100 views this Wednesday!

Next step… must work on getting these stats into an ImpactStory widget for the next version of my CV!

I just got forwarded this email. (Names, Dates & email addresses have been removed or replaced).

I’m extremely concerned about this and thus am republishing this email to draw attention to it. Wiley are really pushing their expensive hybrid Open Access option ‘Online Open’ that does not represent value for money in my opinion – it’s US$3000 for most journals which is rather a lot.

Of course they are welcome to advertise this option but it’s rather disingenuous in my opinion to make NO mention whatsoever to authors that there are other means of compliance e.g. the ‘green’ route to open access (self-archiving). If we’re not careful some UK academics may assume they must publish via the gold open access route to be RCUK-compliant, especially if they are bombarded with emails like this from all the major corporate publishers! Not cool…

——– Original Message ——–
Subject: Journal X complies with the Open Access policies of RCUK and Wellcome Trust
From: Person at Wiley Journal X
To: Person X


Dear Person X:

As an author who has submitted a paper to Journal X we wanted to let you know that from 1st April 2013 you will be given the choice to publish under a Creative Commons ‘Attribution’ license (CC-BY) license when using OnlineOpen, the open access option for the journal. This will ensure that if you are funded by Research Councils UK (RCUK) or the Wellcome Trust you can continue to comply with their open access policies. The option to publish Online Open is offered post-acceptance, outside the peer-review process.

The Research Councils UK (RCUK) and The Wellcome Trust (WT) have recently announced new open access policies, effective from 1 April 2013. Both policies state that to be compliant, journals must offer a “pay to publish” (gold OA) option. When an article publication charge is paid the policies also mandate the use of the Creative Commons ‘Attribution’ license (CC-BY). The CC-BY license allows others to modify, build upon and/or distribute the licensed work (including for commercial purposes) as long as the original author is credited.

Are you funded by RCUK or the Wellcome Trust?

To comply with your funder’s open access policies now and beyond 1st April 2013 you can select Wiley OnlineOpen.

OnlineOpen offers:

1. Publication in your first choice journal
2. Open access to articles: freely available on Wiley Online Library, PubMed Central and UKPMC
3. Authors retain copyright and get the choice to publish under a CC-BY License
4. Compliance with requirements of the Wellcome Trust, RCUK and the other UKPMC Funders (see for list of funders.)

To keep up-to-date, please use the following link to sign up to receive future OnlineOpen emails:

Covering the cost of open access

With OnlineOpen the author, their funding agency, or institution pays a fee to ensure that the article is made open access. WT and RCUK are providing UK research institutions with funds to pay for article publication charges via a block allocation to support their new open access policies. This is in addition to the funding provided by the UK government to the top 30 UK research-intensive institutions to help cover the article publication charges associated with open access publishing.

In addition, Wiley has set up payment accounts with a number of UK institutions. Authors affiliated with, or funded by, an organization that has a Wiley Open Access Account can publish without directly paying any publication charges:

It is therefore advised that authors funded by the RCUK or the Wellcome Trust check with their institution regarding this available funding to pay for OnlineOpen.

More information about OnlineOpen, CC-BY licence and open access policies of the RCUK and the Wellcome Trust can be found on these websites:

Just a quick post to say that I think Beall’s list of “predatory journals” should be expanded to include dubious subscription access journals.

I think it’s rather unfair on the open access movement to claim it’s just the open access business model that faces this kind of desperate exploitation.

It’s long been known that even big established scholarly publishers have published fake journals in the past but there are also independent, low-quality fakes out there, like the new DeNovo journal that’s recently published a “peer-reviewed” (?) paper on the Sasquatch Genome.

DeNovo journal

This paper is behind a paywall. It’s a hybrid subscription/oa journal that’s accepting submissions right now. I haven’t seen a single good word from any scientist I know about this paper. Here’s just some popular reviews of it.

Are there any other really poor quality subscription access journals out there that should be listed on this list of journals/publishers to avoid?


I’ve been quoted in a Nature News story about Open Access journal licensing.

I’m a staunch defender of the use of the Creative Commons Attribution licence, as it’s a good licence for academic research.

Here’s just some of what I sent Richard Van Noorden (Nature News) by email. I don’t blame him for only using select quotes. But I do feel much of this provides additional useful context, so I am republishing it here for everyone to read:

I believe RCUK want their research publications to be made available under the CC BY licence because it allows *anyone* to re-use them. That specifically includes commercial organisations. This is a good thing. Academic researchers aren’t good at commercializing their research. I for one would be delighted if someone could make money out of my research publications. I already get paid by RCUK to do research. I don’t need more money from licensing royalties on something I could have written 50 years ago (remember copyright law in many jurisdictions has extended protection to the life of the author plus 70 years!). I do research to find new knowledge and help the scientific community and society as a whole. I know many other researchers also have this philosophy about their work. It is a privilege to be given public funds with which to perform exciting research. Furthermore as RCUK fully fund my research, why should *I* have control over access to the outputs of that research? As far as I’m concerned if they funded the work, they have the right to dictate how it is published to ensure maximum benefit as they see fit. Researchers who carry out RCUK funded research have the right to be formally acknowledged as people who made these discoveries, and this is ensured and protected by the BY module. By mandating the CC BY license for gold OA articles, RCUK are ensuring maximum benefit from the money they may pay for the publication of it (but note that not all gold OA journals charge an APC. There are many excellent high-quality fee-free gold OA journals and I would encourage authors to publish in these good outlets).

Obviously, please link to my chart if you wish, the newest version is here:

You can even republish it if you wish, without even asking my permission. All content on my blog unless otherwise indicated is made available for re-use under the Creative Commons Attribution Licence :)

(I cannot for obvious reasons guarantee that this plot is still correct. Prices change all the time. I have data to show that on average across 97 BMC journals the mean price increase in APCs from 2012 to 2013 prices was just over 5%.)

Furthermore, journals can change the licence under which they publish. I alerted Mike Taylor that Acta Palaeontologica Polonica was not using a Creative Commons licence to publish. He in turn contacted an editor about this, and now the journal happily publishes all new articles under CC BY. Simple as that. Changing licences is a simple process that costs journals nothing – it is easy to do.

I suspect many free access journals and authors who publish in them would see no problem in granting full open access with CC BY. I suspect they don’t currently do this only because they are not aware of the problems this causes to those that wish to re-use content. Copyright law in many countries and jurisdictions unfortunately requires permission to be sought to re-use works (e.g. textmining, format shifting, printing-off copies for educational use in the classroom) even if they are freely (gratis) available to read on the internet.

This ‘free’ only provides ocular access as Jan Velterop terms it. Open Access as defined by the original Budapest Open Access Initiative statement

permits any users to “…read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.”

This statement was recently reaffirmed …with pretty much exactly the same definition as originally given.

Thus only articles made available under licences that are compliant with that definition are truly Open Access. One such licence that is compliant with the definition of Open Access is the Creative Commons Attribution Licence (CC BY) but it may not be the only compliant licence.

The Creative Commons Attribution Non-commercial licence (CC BY-NC) is not compliant with the definition of Open Access because it prevents commercial uses of such licensed material – BOAI clearly states *any* users. Note that even non-profit companies and charities can be prevented from re-using content by this – if there is commerce involved (e.g. donations, advertising) re-use is blocked in this setting. Many people who use this licence think they are just blocking use by for-profit companies but it is much wider than this.

I have a project running at the moment to get the licensing details for the 985 journals featured in Jevin West et al’s recent cost-effectiveness of open access plot. These are a selection of just those high-quality (Thomson Reuters JCR ranked) free access journals. The vast majority of these use CC BY. Remember that whilst DOAJ lists 8000+ journals there is little quality control, it is acknowledged that there are some predatory OA journals listed there, and that I certainly don’t have time to investigate 8000+ journals! Thus this set of quality-assured JCR-ranked journals seemed like a fair sample to me.

Of those 985 free journals, most of the ones scored so far (over 500 of them) use CC BY (the survey is not yet complete; work is still in progress). Examples:

American Physical Society journals

AOSIS OpenJournals

BioMed Central journals

European Geosciences Union journals

Frontiers journals

Genetics Society of America

Hindawi journals

MDPI journals

Pensoft journals

PLOS journals

SAGE journals

Springer journals

Versita journals

Wiley journals

+ many society & very small publisher journals

Thus whether by number of journal titles, or article volume – CC BY is the most used license. (Given that the article volume of BMC + PLOS + Hindawi + MDPI + Frontiers + Pensoft is significant, it’s likely to absolutely dwarf the number of articles put out by the non CC BY journals. That’s a safe estimation)

Across these journals the use of CC BY-NC exclusively is rare. Only 19 journals (not including the optional ones where it is offered as a choice of licence) amongst the 620 scored so far. These 19 are mostly Brazilian, which is notable and odd (even though I’ve been doing this alphabetically it’s still significant):



Revista Brasileira de Psiquiatria

Sao Paulo Medical Journal

South African Journal of Surgery

Brazilian Journal of Biology

BMJ Open

Acta Botanica Brasilica

Horticultura Brasileira


Revista Brasileira de Ciência do Solo

Revista Brasileira de Fruticultura

Brazilian Journal of Infectious Diseases

Journal of the Brazilian Society of Mechanical Sciences and Engineering

International Brazilian Journal of Urology


Jornal de Pediatria

Revista Brasileira de Política Internacional

Revista Brasileira de Farmacognosia/ Brazilian Journal of Pharmacognosy

CC BY-NC-ND users:


Journal of Toxicologic Pathology

NATL INST SCIENCE COMMUNICATION journals (Indian, 10 of them)

CC BY-NC-SA users:

CBE Life Sciences Education

Journal of Engineering Technology

Journal of Microbiology & Biology Education


Medknow journals (14 journals)
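As a rough sanity check on the survey figures, the proportions can be worked out in a few lines of Python. This is only a sketch: the three counts are the ones quoted in this post (985 journals in the sample, 620 scored so far, 19 using CC BY-NC exclusively), and the constant and function names are hypothetical, chosen purely for illustration.

```python
# Counts quoted in this post; nothing else is assumed about the survey data.
TOTAL_JOURNALS = 985   # free access journals in the sample
SCORED_SO_FAR = 620    # journals whose licence has been checked so far
CC_BY_NC_ONLY = 19     # journals using CC BY-NC exclusively


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# How far through the survey we are, and how rare exclusive CC BY-NC use is.
survey_progress = percentage(SCORED_SO_FAR, TOTAL_JOURNALS)
nc_only_share = percentage(CC_BY_NC_ONLY, SCORED_SO_FAR)

print(f"Survey progress: {survey_progress}% of journals scored")
print(f"CC BY-NC only: {nc_only_share}% of scored journals")
```

On these numbers roughly 63% of the sample has been scored, and exclusive CC BY-NC use accounts for about 3% of the scored journals, which is what "rare" means here. The shares will shift as the remaining journals are checked.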

Sadly, there are also a significant number of journals that do not indicate any kind of Creative Commons license. One such alarming one is the CDC journal ‘Emerging Infectious Diseases’. It is lamentable that content in this important free-to-read medical research journal requires permission to be sought to re-use and/or textmine. In these ambiguous re-use cases one must assume the default state of “All Rights Reserved”: even though the PDF is free (gratis) to view, permission must be sought for anything else.

source data:

There are many examples of such unintended problems caused by the NC license module detailed in this excellent publication recently translated from the original German by members of the Open Knowledge Foundation:

Consequences, Risks, and side-effects of the license module Non-Commercial – NC

Such NC content cannot be used in Wikipedia or newspapers

Educators that charge their pupils fees cannot use NC content without permission

CC BY-NC is incompatible with CC BY-SA content. No mashups, remixes, or combinations of these (and btw Wikipedia publishes its content under CC BY-SA so incompatibility is a BIG PROBLEM). CC BY content is compatible with CC BY-SA.

Many blogs are ad-supported, these generate income and thus no matter how little are classed as commerce and thus NC content cannot be reused without permission here either.

“It is also commercial use if an image is printed in a book that is published by a publishing house, entirely independent of whether the author receives a remuneration or possibly even has to pay a printing fee to make the publication possible. The publishing house acts with a commercial interest in either case.”

“…NC restrictions are most minutely heeded where their consequences are least intended.”

“Am I ready to act against the commercial use of my content? If not, you should consider not to use the NC module in the first place”

See also this ZooKeys paper for problems with NC:


Just a quick post.

I happened to see @wisealic Tweet about her “new Atira/Pure colleagues” yesterday. I didn’t know what Atira was, but I’d heard of PURE.

I googled it to find out more… and soon found the official Elsevier press release, dated August 15, 2012 (so this isn’t really new news). But combined with recent rumours it does worry me. Elsevier own perhaps a fifth of the academic literature; whatever the true figure, it’s a significant share. Despite the research that went into most of those papers being publicly or charitably-funded, Elsevier now rent access to this work back to us (the world) for vast sums of money each and every year.

Not to mention the fake journals they published, the arms dealings their parent company (Reed Elsevier) was involved in, their initial support for the RWA (since withdrawn), the megabundling of journals, the non-provision of open bibliographic metadata (even NPG release this!), the obscene profit margins (and to be fair they’re not the only corporate publisher making a killing here by selling freely provided academic work)… there are 1001 reasons why, and this isn’t an exhaustive list of all the evils…

So Elsevier are not a well-loved company in academia at the moment – more than 13,000 people have signed a boycott of them.

There are rumours that Elsevier are in talks to buy Mendeley at the moment. And Atira/PURE, now part of the Elsevier (Umbrella?) corporation, are I think the exclusive(?) providers of the research information ‘management’ systems that the UK will be using for its next Research Excellence Framework (REF, formerly RAE) exercise in 2014.

So… Elsevier own a significant portion of our papers, and they may soon own a significant chunk of the bibliographic metadata stored by academics (Mendeley data) and all the commercial insight and advantage that gives, AND they own the company that is managing the data that evaluates UK academics and more round the world no doubt.

I do wonder if there isn’t a significant conflict of interest if thousands of UK academics have publicly boycotted Elsevier and now their academic work is going to be evaluated by… Elsevier. Academic jobs thoroughly depend on the results of these evaluations as I understand it, and heads will roll if the results at an institution are below expectations.

From a purely business perspective many financial analysts would rightly applaud these acquisitions as “good business moves” (good for profits no doubt). But from an ethical standpoint? Elsevier now seem to have a worrying empire of services built around academia and a significant amount of data which presumably they can pool together from each of these different services to gain additional insight? They also have a very poor record when it comes to providing open data. Why are we still giving them our data so easily – they’re only going to rent it back to us at a later date?

To me it’s clear, we’re giving up far too much of our data to this company and they do not have our best interests at heart – shareholder profits are by definition their primary goal. They have a sizeable monopoly on academic data in all its forms which they can and do leverage and I suspect we’re going to be made to pay for this mistake in the future as we have with hugely inflated journal subscription prices.

Is it just me that’s worried?