Show me the data!

The new RCUK draft Open Access mandate

March 19th, 2012 | Posted by rmounce in Open Access

Research Councils UK (RCUK), a partnership of seven core UK research funding bodies (AHRC, BBSRC, EPSRC, ESRC, MRC, NERC, and STFC), has recently released a very welcome draft policy document detailing their proposed Open Access mandate for all research which they help fund.

The new proposed policies include (quoting from the draft):

  • Peer reviewed research papers which result from research that is wholly or partially funded by the Research Councils must be published in journals which are compliant with Research Council policy on Open Access.
  • Research papers which result from research that is wholly or partially funded by the Research Councils should ideally be made Open Access on publication, and must be made Open Access after no longer than the Research Councils’ maximum acceptable embargo period. [6 months for all except AHRC & ESRC for which 12 months is the maximum delay permitted].
  • researchers are strongly encouraged to publish their work in compliance with the policy as soon as possible. [added emphasis, mine]

As a researcher funded by BBSRC myself – I’m thrilled to read this document.

It shows a clear understanding of the issues, including explicit statements on the need for different types of access, both manual AND automated:

The existing policy will be clarified by specifically stating that Open Access includes unrestricted use of manual and automated text and data mining tools. Also, that it allows unrestricted re-use of content with proper attribution – as defined by the Creative Commons CC-BY licence

 

But as a strong supporter of the Panton Principles for Open Data in Science, and the Science Code Manifesto, I’m a little disappointed that the policy improvements with respect to data and code access are comparatively minor. Such underlying research materials need only be ‘accessible’, with few further stipulations as to how. AFAIK this allows researchers to make their data available via pigeon-transport (only) on Betamax tapes, 10 years after the data was generated, if there is no ‘best practice’ standard in one’s field.

The BBSRC’s data sharing policy, for example, seems to favour cost-effectiveness over transparency: “It should also be cost effective and the data shared should be of the highest quality.” Maddeningly, it also seems to give researchers ownership over data, even though the data was obtained using BBSRC funding: “Ownership of the data generated from the research that BBSRC funds resides with the investigators and their institutions.” This seems rather devoid of logic to me: if taxpayers paid for this data to be created, surely they should have some ownership of it? Finally: “Where best practice does not exist, release of data within three years of its generation is suggested.” 3 years huh? And that’s only a suggestion! Does anyone actually check that data is made available after those 3 years? I suspect not.

Admittedly, it would be hard to create a good one-size-fits-all policy, and policing it would cost more money, but I do feel that data & code sharing policies could be tightened up in places, to enable more frictionless sharing, re-using and building on previous research outputs.

So all in all this is a great step in the right direction towards Open Scholarship, particularly for BBB-compliant Open Access.

Related reactions and comments which are highly worth reading include posts by Casey Bergman, Peter Suber, and Richard Van Noorden.

This blog post is licensed under a Creative Commons Attribution 3.0 Unported License, so feel free to redistribute, remix and re-use! All that I ask for is attribution :)

Journal mega-bundles & TheCostOfKnowledge

February 12th, 2012 | Posted by rmounce in Open Access

[Rather than summarise what’s already been said about Elsevier and their for-excessive-profit practices in recent weeks, I’ll just lazily assume you’ve read it all… right then. Here’s what I have to add.]

This post is a real-world anecdote of the problems that Elsevier’s journal bundling & excessive profiteering*** cause. It is just one of the many reasons that persuaded me to sign my name along with 5,000+ other academics over at The Cost of Knowledge, to register my disapproval of what Elsevier (and other publishers) are doing with scholarly works.

Recently, I discovered to my dismay that my institutional library (University of Bath) had cut its subscription (and therefore easy access) to an important journal in my field. Literally, one week I had free access to the journal content, and then the next week I found I didn’t!

The journal is Biology Letters, a general biology journal by Royal Society Publishing [RSP from now on]. **

 

Intellectual property owned by Royal Society Publishing (taken from Wikipedia)

Interestingly RSP take a relatively enlightened stance on Open Access, and have made some interesting statements in the past, such as this gem [from a statement published way back in 2008]:

“…some companies do appear to be making excessive profits from the publication of researchers’ papers”

I think RSP is a non-profit organisation (source) and hence it doesn’t surprise me that they have such prescient criticism of Elsevier & co to offer. They aren’t in the business of excessive profiteering like some.

So… RSP’s Biology Letters has been cut from our subscriptions budget. “Why?” was the very first question I emailed to the subject librarian at my institution. To their credit, I got some wonderfully informative replies from our library staff; I have no doubt they’ve done their best, given the limited powers they have. Like all institutions, we don’t have an unlimited budget. Something had to be cut, and unfortunately it was our subscription to Biology Letters. Which, by the way, would only have cost us £852 for an institutional online-only subscription.

Why was this journal, from which I read/used at least 15 separate articles in 2011 alone, cut from our subscriptions instead of a journal like… Elsevier’s ‘International Journal of Coal Geology‘?*

I think this is a fair question to ask. Biology Letters has a higher impact factor (not that the journal Impact Factor is a particularly brilliant metric of quality) and would cost a lot less (£1107 [Biol. Lett. print version+online] vs 2540 Euros, the current institutional subscription price for the print version of the ‘International Journal of Coal Geology’). Most damningly of all, I suspect no-one at my institution ever reads this Elsevier journal. Feel free to correct me on this; I’m sure I could find plenty of other Elsevier journals that satisfy this last property.

But the answer to this question of course has nothing to do with any of the 3 rational points above (unfortunately): Biology Letters can be cut because it’s vulnerable, as it’s not part of a MegaBundle sold by a large for-profit publisher. The International Journal of Coal Geology cannot be cut because access to it comes as part of a ‘Big Deal’ bundle, in which there are some *vital* journals to which we *must* have access (and the corporation selling access knows and exploits this). So despite the fact that no one needs it here, that it’s ~2x more expensive, and that it has a lower Impact Factor, I have access to this and countless other bundled journals I DON’T need, and I DON’T have access to vital articles from another journal I *do* need for my research.

Welcome to the crazy world of academic publishing! Much of it simply doesn’t make sense in the Digital Age. Of current explanations, I’d say Mike Taylor’s parable explains this most clearly.

I can’t claim to have explained all of the problems and intricacies here – but rest assured it clearly doesn’t make sense to me. Journal mega-bundling is plainly inefficient, and we can’t let this practice continue.

Stop feeding the beast! The Cost of Knowledge

Footnotes:
* Throughout this post I use the example of the International Journal of Coal Geology, not out of disrespect for the editorial board, or the scholarly quality of the work presented therein. I’m sure it’s great if you’re into Coal Geology. I only use it because a) it’s an Elsevier journal to which Elsevier very arguably adds very little value, and b) I sincerely believe virtually no researchers at my institution make use of this journal.

** Just for the record, I don’t blame RSP or my librarians for this subscription cut happening. It’s out of their control. RSP do a great job IMO, as do my librarians.

*** I just read that one UK institution pays over £1,000,000 (yes, more than a million) every year for Elsevier’s ‘Big Deal’ bundle (source). I think this is a disgraceful ransom.

Research data should be appropriately licensed with re-use in mind

November 29th, 2011 | Posted by rmounce in Open Data | Palaeontology | Phylogenetics

I’m really pleased this new Open Access paper has just been published.

CC BY 3.0 | ZooKeys Special Issue 150

Hagedorn, G. et al. Creative Commons licenses and the non-commercial condition: Implications for the re-use of biodiversity information. ZooKeys 150, 127-149 (2011).

Some background…

After parading my Open Data t-shirt (pictured below) around the Society of Vertebrate Paleontology meeting this month, I was invited to give an impromptu pitch in front of the great and good of the Mammal AToL project & MorphoBank people. Having pointed out to MorphoBank a while ago that they should really make explicit the terms and conditions [license] under which they make their (?) data available, I naturally advocated CC-BY 3.0 and CC0 licences. I talked about this very subject and pleaded with them NOT to use the NC clause, referring to Rod Page’s & Peter Murray-Rust’s [1,2] thoughts on the matter.

Data providers vs Data re-users – need they really be in opposition?

The trouble is, a lot of (data providing) institutions seem hell-bent on ‘protecting commercial interests’, at the expense of research opportunities. So as I understand it, at the moment databases such as these face an awkward problem of either satisfying the restriction requests of data providers OR satisfying permissiveness of re-use by data re-users [such as myself!], and the needs of both camps are seldom entirely met.

Conclusion

I see this paper as an important step in persuading such restriction-minded institutions of the absolute importance of #OpenData / #PantonPrinciples and how NC clauses can genuinely obstruct and impair real academic research.
I just hope people read it and take note!

[Most of this is just a re-post of my spur of the moment G+ post here.
I’m reposting here so that this might hopefully get picked up by Research Blogging to give this paper the publicity it deserves. Much of the content is widely applicable IMO to most of scholarly communications, not just biodiversity informatics, and indeed the whole ZooKeys special issue (Open Access) is well worth a browse.]

References

[1] http://iphylo.blogspot.com/2010/12/plant-list-nice-data-shame-it-not-open.html
[2] http://blogs.ch.cam.ac.uk/pmr/2010/12/17/why-i-and-you-should-avoid-nc-licences/
[3] Hagedorn, G., Mietchen, D., Morris, R., Agosti, D., Penev, L., Berendsohn, W., & Hobern, D. (2011). Creative Commons licenses and the non-commercial condition: Implications for the re-use of biodiversity information ZooKeys, 150 DOI: 10.3897/zookeys.150.2189

Yesterday’s post about haywire RSS feeds reminded me that I should perhaps share a trick or two I know about RSS feeds.

This post assumes you know what an RSS feed is, and why they’re awesome. I still encounter researchers every day who have no idea what an RSS feed is. I have no idea how they cope with the sheer volume of literature being produced these days without RSS feeds!

1.) RSS feed filtering

why?
Some journals e.g. PLoS ONE put out a *lot* of new research articles each and every week. So much so that it’s tiresome and time-wasteful to even read just the titles, let alone the abstracts of each and every new article published in this journal.

This should not be taken as a criticism of such high-volume journals. I’m very supportive of Open Access publishing, and the higher the volume of articles in Open Access rather than closed access, the better (for science) as far as I’m concerned. All one needs to do is apply some conservative filtering criteria to such feeds so that one receives only items of interest.

how?
My interest is in phylogenetics. Therefore I filter the PLoS ONE new article (all subjects) alert feed by subject specific keywords using Yahoo Pipes (see below).

If the wildcard filters for ‘phylo*’ and ‘clad*’ work, then the other filters are probably redundant, but just in case y’know.
The resultant output of this feed (here) significantly tames the PLoS ONE deluge to a relevant and manageable trickle.
There are many other ways of filtering RSS feeds, but the graphical nature of Yahoo Pipes IMO makes it very recommendable.
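
For those who prefer code to a graphical tool, the same kind of keyword filtering can be sketched in a few lines of Python using only the standard library. This is just an illustration of the idea, not what Yahoo Pipes does internally: the function name `filter_feed`, the prefix list and the sample feed items are all made up for the example, and a real setup would fetch the journal's feed over HTTP rather than use an inline string.

```python
import re
import xml.etree.ElementTree as ET

# Keyword prefixes standing in for the wildcard filters 'phylo*' and 'clad*'.
PREFIXES = ("phylo", "clad")

# Match any word that starts with one of the prefixes, ignoring case.
PATTERN = re.compile(
    r"\b(?:%s)\w*" % "|".join(re.escape(p) for p in PREFIXES),
    re.IGNORECASE,
)

def filter_feed(rss_xml):
    """Return the titles of RSS items whose title or description matches."""
    root = ET.fromstring(rss_xml)
    kept = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        if PATTERN.search(title + " " + description):
            kept.append(title)
    return kept

# Made-up sample feed (not real articles), used only to exercise the filter.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example journal feed</title>
<item>
  <title>A phylogenetic analysis of early tetrapods</title>
  <description>Relationships among stem taxa.</description>
</item>
<item>
  <title>Protein folding kinetics in yeast</title>
  <description>A study of folding rates.</description>
</item>
<item>
  <title>New fossil bird from the Cretaceous</title>
  <description>Cladistic analysis places the taxon near the crown group.</description>
</item>
</channel></rss>"""

matching_titles = filter_feed(SAMPLE_FEED)
```

Matching on word prefixes rather than plain substrings keeps the filter conservative: a word has to start with ‘phylo’ or ‘clad’ to get through, so a word like ‘ironclad’ is not picked up.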

It’s worth noting as well that PLoS provide their own filtered feeds here, broken down by subject, but this isn’t helpful for me, as my research interests often pop up in many different subject classifications.

2.) RSS feed creation

why?
Perhaps a journal / database / website of interest to you doesn’t provide an RSS feed. So you can’t otherwise easily track updates to it. With research, I think it’s very important to keep up to date with the latest developments. Journals, databases and websites *should* of course always provide RSS feeds for you but a minority in my experience don’t.

The solution for these cases is: DIY!

how?
Again there is a huge swathe of options to help you ‘roll your own’ RSS feed. Some are reasonably complex and highly configurable, e.g. http://feed43.com/, whilst others are really simple but not so adaptable, e.g. http://page2rss.com/

The latter, simple option works very well for me, so I can keep up to date with latest additions to the MorphoBank database.
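
To give a rough idea of what ‘rolling your own’ amounts to under the hood, here is a minimal Python sketch that turns a list of updates into an RSS 2.0 document using only the standard library. It is not how feed43 or page2rss work internally; `build_rss`, the example.org URLs and the single entry are hypothetical, and the step of actually detecting new additions on a page (the hard part those services handle) is left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

def build_rss(channel_title, channel_link, entries):
    """Build a minimal RSS 2.0 document from (title, link, datetime) tuples."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 expects a title, link and description on the channel.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Hand-rolled feed for " + channel_title
    for title, link, published in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        # Reuse the link as the unique id so feed readers can de-duplicate items.
        ET.SubElement(item, "guid").text = link
        # Feed readers expect RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(published)
    return ET.tostring(rss, encoding="unicode")

# Hypothetical entry; a real script would collect these from the site being tracked.
example_feed = build_rss(
    "Example database",
    "http://example.org/",
    [(
        "New matrix added",
        "http://example.org/matrix/1",
        datetime(2011, 11, 29, tzinfo=timezone.utc),
    )],
)
```

Saving the returned string to a file on any web server gives a feed reader something to subscribe to.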

I’d be interested to know if anyone had any further recommendations for RSS feed creation tools, other RSS-related tips & tricks and/or interesting research related use-cases.

PS Should anyone wish to subscribe to my output, the RSS feed for this blog is here (and in the top right hand corner, I should probably make it a bit more obvious though!)

Further Reading:

http://iphylo.blogspot.com/2009/07/how-to-publish-journal-rss-feed.html