Show me the data!
Author Archives: rmounce

Asking for what we ALL ACTUALLY WANT

November 25th, 2014 | Posted by rmounce in Open Access | Open Data - (2 Comments)

Just a quick post to congratulate the Bill & Melinda Gates Foundation for their fabulous new research policy covering both open access & open data.

One of the key things they’ve implemented for 2017 is ZERO TOLERANCE for post-publication embargoes of research. Work MUST be made openly available IMMEDIATELY upon publication to be compliant. No ifs, no buts.

Let’s just remind ourselves why other major research funders like RCUK & Wellcome Trust allow publishers to impose an embargo on academic work before it can be made public:

 

 

Do any academics want a post-publication embargo on their work, that stops it being shared, read & re-used by the widest readership possible?

grumpy-cat-no-1

 

 

Does it benefit readers, patients, policy-makers or practitioners to have a post-publication embargo delaying their access to the very latest research?

grumpy-cat-no-2

 

 

Does it benefit research funders themselves to have a post-publication embargo on work they fund?

grumpy-cat-no-3

 

 

The only stakeholders that benefit from research funder policies that allow post-publication embargoes preventing free access to research are the legacy publishers. The fact that RCUK, Wellcome Trust and many others pander to these parasitic publishers and their laughably unfit-for-purpose business model is just WRONG and it makes me angry. JUST SAY “NO” TO POST-PUBLICATION EMBARGOES!

 

grumpy-cat-no-4

 

It’s high time that major research funders wrote policies that ask for what WE ALL ACTUALLY WANT, instead of a bullshit compromise that minimises fiscal harm to the multi-billion dollar legacy publishers.

 

I admire the Gates Foundation. They understand what we all need and they’ve implemented that in a clear and appropriate policy; optimal for readers, researchers, patients, practitioners and policy-makers. We want immediate open access, and we want it NOW! The ball is now in your court Wellcome Trust, make your move!

Just-Do-It-Now

Day 0 of OpenCon started with me missing the pre-conference drinks reception because my flight from Chicago was delayed by 2 hours. I got into Washington, D.C. (DCA) at about midnight & then had to wait half an hour for a blue line train to take me the short distance from the airport to the conference hotel – I’m a diehard for public transport! I finally arrived at the hotel past 1 o’clock in the morning. Not a great start. Sincere apologies to my excellent room mate Alfonso Sintjago, to whom I hastily introduced myself the next morning #awkward

Day 1 started with a real bang. Michael Carroll gave a short speech. Then Pat Brown gave a long but HUGELY enjoyable talk about his role in the founding of PLOS, with some excellent take-home messages:

  • Write petitions & letters for change with colleagues. Even if you fail to directly achieve all the goals or immediate aims of the petition, the act of doing it, and the publicity & thought it provokes, can have real and important positive effects.

I saw immediate parallels between this and the recent ‘Open Peer Review Oath’, Jon Tennant & co’s ‘Open Letter to AAAS’, Erin McKiernan’s ‘Open Letter to the Society for Neuroscience’, Gowers & Neylon’s ‘The Cost of Knowledge, the [ongoing] Elsevier Boycott’ and my own petition to ‘Support Palaeo Data Archiving’ (2011). All of these have made people sit up and take notice. They have ALL been worthwhile activities in my opinion.

  • Sometimes you’ve got to do odd things that might be against your ethos, to support your interests in the long term e.g. the traditional review selectivity of PLOS Biology & initially, printing paper copies of PLOS Biology.
  • Sometimes you have to fake it to make it (N.B. said in the context of collective action, not scientific research)

 

The State of the Opens

Next there was a panel with talks and discussion on the state of Open Access, Open Data and Open Educational Resources. I was giving the Open Data talk (slides here) and found it hard to give — to be authoritative on the state & practice of open research data requires significant research, and I simply didn’t have time to really do the topic justice. I guess my main points were:

 

 

I’m so glad Victoria Stodden gave the next talk after the panel, I think I was the one on the organising committee who first suggested her for a keynote slot (sorry to brag!). Victoria did not disappoint – her talk was a remarkable display of undeniable deep-thinking & scholarship. Her reminder to us all of Merton’s Scientific Norms (1942) was an excellent grounding in the basis of open research:

  • Communalism: scientific results are the common property of the community
  • Universalism: all scientists can contribute to science regardless of race, nationality, culture, or gender
  • Disinterestedness: act for the benefit of a common scientific enterprise,
    rather than for personal gain.
  • Originality: scientific claims contribute something new
  • Skepticism: scientific claims must be exposed to critical scrutiny before being
    accepted

This was clearly appreciated by the audience, and others, e.g. Lorraine, have already blogged about it. I also took home from the talk that it’s important to distinguish between the 3 different types of reproducibility: Empirical Reproducibility, Computational Reproducibility, and Statistical Reproducibility, and that the Bayh-Dole Act is an awfully strong incentive for NOT opening up research in the US (which I pointedly reflected on in a meeting at the NIH on day 3).

REAL TALK: at the end Stodden made a great point, which I hope was listened to: young academics should not be expected to martyr themselves for the cause of open scholarship, and it should be the more senior academics leading the way – hear, hear!

Don’t martyr yourself for the cause. “Martyrdom of Saint Sebastian”. By Giovanni Bassi, 1525. Public Domain

After lunch there were parallel sessions. Uvania Naidoo led a workshop on Open Access in the Context of Developing Countries. I regret I can’t report on that session because Peter Murray-Rust and I were holding a ContentMine workshop in the alternate room at the same time. The ContentMine session was really good fun, and very interactive – you can see the discussion from the session on the etherpad here. Jure Triglav had some great ideas around mining the literature for software citations, and Nic Weber chimed in that HPC citations/mentions would be great to do too. April Clyburne-Sherin was interested in clinical trials data mining etc… I could go on. The trick now is for us to explore these ideas and see if we can make them happen after the conference. The epidemiology/Ebola content mining looks like it’s definitely going to happen; many people were interested in forming a collaboration around that.

Innovative Publishing Models

I’m not going to report every session in full detail. This is one where I’m probably skimping. Meredith Niles (Harvard postdoc) moderated talks and discussion by a panel consisting of Arianna Becerril (Redalyc), Pete Binfield (PeerJ), Mark Patterson (eLife) and Martin Paul Eve (Open Library of Humanities).

Meredith Niles and myself, in my new favourite t-shirt at the evening reception, Day 1. Twitter / M. Niles. All rights reserved, copyright not mine.

Huge congratulations to the organising committee for bringing this particular panel together. These are without doubt in my mind, representatives of four of the most important, innovative organizations in academic publishing right now. They all gave excellent talks but particular kudos should go to Martin Paul Eve for delivering a swish Prezi and more importantly, a passionate, invigorating talk on the possible future of OA in the humanities.

The impact of open

The line-up alone for the next session was stellar. The conference had its first glimpse of Erin McKiernan on stage, moderating a panel consisting of Jack Andraka, Peter Murray-Rust, and Daniel DeMarte. Forgive me for a lack of detail here, it was near the end of a long day. Jack gave his usual polished speech, with humour and grace, as well as ably fielding a couple of tough but fair questions about his patent pending. As ever, a lot of people wanted to take pictures with him and he was gracious enough to oblige everyone who wanted a photo.

Four people proudly pushing boundaries. Photo: mine! Licence: All rights reserved. CC-BY of course!

Jon Tennant (pictured above) gave Jack, as promised, a copy of his new book, which I also have a copy of. Peter Murray-Rust gave a rabble-rousing talk, and showed an emotional slide of respect for the visionary pioneer of open notebook science, Jean-Claude Bradley, who sadly died this year.

The day ended with a closing keynote from John Wilbanks which was really the perfect cherry-on-top of the icing of a brilliant first day. It’s only been a few days but his talk slides, ‘Open as a Platform’, have racked up nearly 1000 views and I’m not surprised. I’d better not blather on too much, but put it this way: Wilbanks is a hero to me. I love some of the things he’s said before and I’ve really taken them to heart in my work e.g. “The best time to plant a tree is 20 years ago. The 2nd best time is NOW” from ‘Data sharing as a means to a revolution’. It was simply great to be able to chat to both Michael Carroll and John Wilbanks at the evening reception.

Miscellaneous Day 2 Highlights (If I don’t abbreviate this blog post soon, it’ll be book length)

Audrey Watters’ keynote talk ‘From “Open” to Justice’ had a clear closing message: open is necessary but it’s not enough, we need meaningful political engagement, care and justice. The word ‘open’ alone does not solve all our problems (I may have paraphrased!).

Erin McKiernan‘s keynote was an inspiration to us all. ‘Being Open as an Early Career Researcher‘ was a masterclass in DOING IT THE RIGHT WAY, with an abundance of supporting evidence. I haven’t had the privilege of seeing her speak before, and had heard lots about how good a speaker she is – I wasn’t disappointed. I completely stand with Erin when she says:

If I am going to ‘make it’ in science, it has to be on terms I can live with.

I sincerely look forward to working with Erin, Prateek, Meredith, Nick and others on future projects, most immediately, the Open Access Ambassadors meeting in Munich this December.

Project Presentations

All the project panels on day 2 were excellent. It’s great to see so many of our attendees, many of whom travelled a long way to be here, get time on stage to tell us about their work.

Open Access around the world. Twitter / Iryna Kuchma. All rights reserved, copyright not mine.

I was particularly taken by Ahmed Ogunlaja‘s clever response, to the question of how he approaches OA advocacy in Nigeria:

Open Access wins all of the arguments all of the time

That in itself got a round of applause. It’s no exaggeration to say there were a lot of earnest rounds of applause that day; no polite applause.

Another such spontaneous round of applause came when Penny Andrews took the microphone to raise a really important point/question about diversity and social mobility in research in a calm, professional, clear tone. The audience, myself included, were simply floored by how erudite it was. Stunning. This is but a small sample of what Penny brought to OpenCon:

If you only work with people who are like you, your work will only be FOR people like you. Embrace diversity, even if it’s hard #opencon2014

Late into the night at the ‘unconference’ session perhaps circa 11pm, Jure Triglav found out that his ScienceGist summaries are being used (in a good way!) by a researcher as sample data to test against a machine-based paper summary approach — I hope Jure blogs more about that, it seemed pretty cool to me. I’m also hoping ScienceGist might be used on PeerJ. Watch this space…

Mitar gave an excellent talk. PeerLibrary has come on a lot since I last looked at it, and he seemed to be literally overflowing with brilliant ideas awaiting implementation. He told me he had been considering applying for a Sloan Foundation grant to support his excellent work, but hadn’t yet applied, so without his knowledge/consent I decided to send a cheeky tweet to encourage him! If Sloan won’t fund his project(s), I’m sure Shuttleworth will!

 

Carolina Botero’s talk  was an important closer for day 2. So so important. Sharing Research Is Not A Crime!

I’ve written a long post and most of it is glowingly, sickeningly positive. What didn’t go well?

Well… this is all my fault but I do feel the ‘How to be an open researcher’ session run by Erin & myself could have been smoother. We had technical difficulties setting up the computer. BOTH our laptops only have HDMI connectors, no VGA, so we had to borrow Georgina‘s Mac & neither Erin nor I are particularly great Mac users (4-finger swiping between the browser and the presentation slides was challenging!). On Linux this is very easy to do: just Alt-Tab & cycle through to the window you want. I must also apologise to Erin for launching into a mini-rant about figshare without forewarning her – I have concerns about putting too much open data on a commercial platform that there simply isn’t enough space in this blog post to get into. Another time! But in principle I think double-teaming a lively workshop like this works really well – especially if we have slightly different viewpoints on some tool or strategy.

Day 3: On The Hill

Well, I learnt a little about Minnesota whilst sitting in Amy J. Klobuchar‘s office. In our short time with a legislative assistant of hers, we pitched hard for Open Access & Open Educational Resources.

I highlighted that US taxpayer-funded academics give their work for free to commercial publishers, other academics peer-review this content, for free, the publisher barely does anything aside from typesetting & putting the content online, and hence most of the big publishers are consistently making 30-40% profit margins on taxpayer-funded research. [Standard knowledge basically] I was also quick to allay any concern that it would harm US businesses – I pointed out that most of the large publishers were European – Elsevier (Dutch), Springer (German), Nature Publishing Group (UK). It was a little disappointing to have only 30 minutes but that apparently was a good innings as these things go.

Whilst I honestly have no idea what will come of the Minnesota Senator meeting, the meeting at NIH was seriously productive.

NIH was simply fabulous for all involved, including NIH if you ask me! Many of the younger early career researchers got to see the detailed & complicated concerns raised by their (relatively) more senior fellow attendees e.g. Prateek Malwahar, Daniel Mietchen, Lauren Maggio, Karin Shorthouse and myself. I was worried that perhaps we might have ‘dominated’ the discussion a bit too much, but after discussing it with Shannon Evans afterwards, it seems many actually really enjoyed seeing research-savvy people really dig into difficult policy issues. Natalia Norori‘s question near the end was also brilliantly appropriate, and the response rather chilling (although I should be clear, I’m not trying to shoot the messenger here!) – the USA has some deep political problems if disclosing the number of people using PubMed from outside the US is a ‘bad’ thing (those who were there will know exactly what I’m talking about!). I’m also hugely excited by the prospect of the OA_Button *potentially* getting a linkout button on PubMed – Kent Anderson’ll love that, eh?

Daniel Mietchen & I gave some valuable feedback on the packaging of the PubMed OA subset – the contention was that it wasn’t seeing much visible use, and yet Daniel & I both feel this is wrong — there are many users out there — it’s just hard to publish mining research because it’s often new/interdisciplinary and how does one ‘cite’ PubMed corpus usage anyhow? — it’s clearly going to be difficult to track users.

I was hugely flattered when Neil Thakur said he’s read my blog before! wow! Hope you like this post Neil.

Swapping shirts & the super-friendly culture at OpenCon

I gave out my 2 spare ‘Boycott Elsevier’ t-shirts at OpenCon this year, and I think I’ll make shirt-swapping a regular thing if I can! First, it was my immense pleasure to swap shirts with Daniel Mutonga at the organizing committee dinner. To his credit, Daniel was the one who suggested it: ‘like football players after a game’ , so I put on his MSAKE tee & he put on my ‘Boycott Elsevier’ tee. Fantastic. I think I should swap t-shirts with someone at every conference. Shannon (?) told me an interesting variation on this one, which also sounds like a good idea to implement: swapping pin badges.

I gave the other spare ‘Boycott Elsevier’ t-shirt to Erin McKiernan. We joked it would be hilarious to wear at SfN. Although slightly concerned about how it would be received, I did make clear that I didn’t mind if she chose not to wear it at SfN. She’s since tweeted me a picture wearing it in front of the Elsevier stand – exactly what I’d do! Every penny spent on those t-shirts has been totally worth it – such a good medium for non-violent, high profile activism!

The ‘backchannel’ discussion on twitter between OpenCon attendees & remote followers of the conference was also brilliant. Lots of lively, informative, intelligent threads of discussion sparked by lots of the talks, simply excellent.

It was also great to see Celya Gruson-Daniel again – she’s a real unsung hero of open science – if you aren’t aware of her project HackYourPhD go check it out NOW. Community building is immensely important and she’s clearly very good at it. It’s immensely & deservedly popular in the Francophone world. (I wonder if there are similar wildly successful Spanish-language open science communities? Please point them in my direction if you know of one!)

I must also thank Kurtis Baude for interviewing me about open research data in one of the breaks – his enthusiasm for spreading open science is infectious – we had a great chat together.

 

People making change for the better

Being at OpenCon, more than at any other meeting, I was truly amongst friends. I was going to list everyone here in thanks but a list of 175 names isn’t much fun to read & I wouldn’t want to miss anyone out! Sorry to anyone I didn’t mention by name!

Postscript:

Rejected. Image copied from http://www.huffingtonpost.com/leslie-goshko/rejection_b_3272718.html . All rights reserved, not my copyright

I have to admit, I went to OpenCon feeling a little bit low. My cranial / postcranial data comparison manuscript from my PhD had been recently rejected (again). Not on the basis that it was bad science, just that it wasn’t quite interesting enough for readers of the particular journal we (re)submitted it to. I gather this happens a lot with traditional impact-factor chasing publication strategies, and it can ruin or alter career paths before they even get started. To have spent 4 years doing a PhD & 3 years of that on/off trying(ish) to publish this particular chapter and STILL have nothing, not even a preprint to publicly show for it (don’t even ask why I can’t put up a preprint. I think preprints are a great idea myself…). I was a tad depressed – let’s not pretend this doesn’t happen to us all, folks. Real Talk.

Luckily, OpenCon has completely changed my mood for the better and reminded me of all the important things I did do during my PhD:

* I published *shrugs* in academic journals. I’m not even going to link to what I did manage to publish. I have an h-index, yada yada… I think all of the below were more important contributions, with more real-world impact to be honest:

* I debated Open Access live on BBC Radio 3 with MP David Willetts & others

* I gave a pretty darn good talk about content mining at the European Commission ‘Licences for Europe, Working Group 4: Text & Data Mining’ event, which helped stave off the unwanted imposition of ‘licensed’ content mining in Europe.

* I submitted well-reasoned written evidence to the UK Business, Innovation and Skills (BIS) call for information on Open Access policy

* I wrote popular & influential blog pieces for the LSE Impact of Social Sciences blog: one on simple steps towards open scholarship, and the other on the UK Hargreaves copyright exception allowing non-commercial content mining.

I write the above list, self-indulgently to convince myself I’m not stupid. I can do clever stuff. I’m pretty sharp when it comes to research policy, and I have ideas and enthusiasm to help make research more open (== better). I think I’ve proved that now, time and time again.

Next week I’m meeting up with my supervisor and we’re going to work on revising & resubmitting that manuscript again. And thanks to OpenCon 2014 I’m actually in the mood to do that. Thanks Generation Open. You’re awesome.

Stay cool. Copied from http://indulgy.com/post/4NhbJB4QK1/try-try-again Haney. All rights reserved, not my copyright.

 

I note with interest that article publication charge data from the University of Edinburgh has been released on Figshare today.

There are some fascinating numbers in there and I applaud the transparency.

One particular article that took my eye is this one:
Paradoxical effects of heme arginate on survival of myocutaneous flaps

Page charges were paid for this article amounting to £1330.45, and that’s just for page charges – the journal did not make the article open access, nor was it asked to. This was for ‘page charges’ alone.

I also noted the research was paid for by the MRC – a top-class UK government-funded agency. As I am a full UK taxpayer, I feel especially entitled to read this research!

The MRC has a very clear policy on open access – the article must either:

1.) be made immediately open access by the publisher upon publication; ‘journal-mediated OA’ (sometimes called ‘gold’)

OR

2.) be made publicly accessible, as some kind of copy of the work, via the route of ‘repository-mediated access’ no more than 6 months after publication (sometimes called ‘green’)

Since the article clearly wasn’t open access at the publisher, I assume the authors have elected the repository-access method. The article was formally published on 1st January 2014, so between then and now, clearly at least 6 months have elapsed. 7 months and 20 days to be precise. So where is the full text of this article?

It’s not in PubMed (abstract-only)
Nor Europe PubMed Central (abstract-only)
Nor the University of Edinburgh institutional repository (abstract-only)

So it would appear to me that the rules of the funding body (MRC) may have been broken here (sincere apologies if I am wrong about this), something all too easy to do if the repository route is chosen.
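The embargo arithmetic above is easy to check mechanically. Here is a minimal Python sketch, assuming a flat 6-month repository embargo counted from the formal publication date. The check date of 21st August 2014 is an assumption inferred from the "7 months and 20 days" figure, not something stated explicitly in the post, and the helper names are made up for illustration.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return d shifted forward by whole calendar months (day clamped to month end)."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def months_and_days_between(start, end):
    """Elapsed time from start to end as (whole months, leftover days)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if add_months(start, months) > end:
        months -= 1
    days = (end - add_months(start, months)).days
    return months, days


published = date(2014, 1, 1)   # formal publication date given in the post
checked = date(2014, 8, 21)    # ASSUMED date of writing, inferred from "7 months and 20 days"
embargo_months = 6             # maximum 'green' embargo described above

deadline = add_months(published, embargo_months)
elapsed = months_and_days_between(published, checked)
overdue = checked > deadline

print(f"Deposit deadline: {deadline}, elapsed: {elapsed}, overdue: {overdue}")
```

Swapping in a different publication date or embargo length gives the deposit deadline for any other article in the dataset.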

Wouldn’t it have been better to spend those page charges on making the paper immediately open access?

In the mean time, I have sent the University of Edinburgh open access team (openaccess@ed.ac.uk) an email to ask where the full text for this paper is, and I await their reply.

Steganography, phylogenetics and Flickr

July 4th, 2014 | Posted by rmounce in PLUTo - (0 Comments)

How best to link the figure, to the paper & the underlying data?

Whilst visiting EBI, Hinxton yesterday, Robert Hanson (computational chemist) reminded me of an interesting hack you can do to embed data in images.

Back in 2010 it was widely reported that people were using Flickr to transmit data (secretly) in images.

This general technique is called Steganography.

Turns out I can use this hack in my project too…

As a proof of concept, I’ve uploaded one recent PLOS ONE phylogeny figure to my ‘plosone-phylo’ flickr account:

https://www.flickr.com/photos/123621741@N08/14385231987/in/photostream/

In this special file, I’ve embedded the NEXUS, NeXML & BibTeX files from TreeBASE that correspond to the image. This website has cross-platform instructions – it’s remarkably simple.

So now if you download that special file I put on Flickr (this hack only works if you download the original, not resized versions) and ‘unzip’ the image file you’ll reveal the hidden data embedded in the image:  nexus, NeXML & Bibtex.
Try it!
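For the curious, here is a minimal Python sketch of how this kind of hack is commonly done. It assumes the usual "append a zip archive to the end of the image" trick; the linked instructions aren't reproduced here, so treat that as an assumption rather than a description of exactly what I did. Image viewers read from the start of the file and ignore trailing bytes, whereas zip tools locate the archive from the end of the file, so one file can serve both. The file names and the fake image bytes are hypothetical stand-ins.

```python
import io
import zipfile


def embed_files(image_bytes, files):
    """Append a zip archive holding `files` (name -> bytes) to the image bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    # The image data stays untouched at the front; the archive rides along behind it.
    return image_bytes + buf.getvalue()


def extract_files(combined_bytes):
    """Read the archive back out of an image+zip file; returns name -> bytes."""
    with zipfile.ZipFile(io.BytesIO(combined_bytes)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


# Hypothetical stand-in for a real figure: just the JPEG start/end markers.
fake_image = b"\xff\xd8\xff\xe0" + b"not really pixels" + b"\xff\xd9"

# Hypothetical file names and contents for the three TreeBASE files.
hidden = {
    "tree.nex": b"#NEXUS\n",
    "tree.xml": b"<nexml/>\n",
    "study.bib": b"@article{example}\n",
}

combined = embed_files(fake_image, hidden)
recovered = extract_files(combined)
print(sorted(recovered))
```

This also explains why only the original download works: any service that resizes or re-encodes the image writes a fresh file and drops the trailing archive.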

Screenshot showing what to click to download the original image

 

This certainly isn’t the ‘optimal’ way of doing things. But it is a nice way of keeping everything together in ONE file. Maybe you might have a use for this hack too?

Progress Update from Meise, Belgium

June 12th, 2014 | Posted by rmounce in Open Data | Open Science - (0 Comments)

A quick blog from Meise, Belgium at the Pro-iBiosphere wrap-up event.

Yesterday I gave a talk about my progress liberating, and making searchable, OA figures from academic literature:

 

I’ve had a lot of great feedback and interest in what I’m doing with this.

Cyndy Parr has pointed out that EOL are on Flickr too, and have been marking-up photographs of taxa with ‘machine tags‘.

I will now start to experiment with how I can incorporate taxonomic & geographical machine tags into my workflow when uploading images to Flickr. As an example I have added binomial tags to two figures from an OA Zootaxa paper on ‘Urothrips’: https://www.flickr.com/photos/121174006@N06/13379028204/in/set-72157642842813323
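Machine tags are just ordinary tags written in a namespace:predicate=value form, so they are easy to generate in a script before upload. Below is a minimal Python sketch of building and parsing them. The taxonomy namespace and its genus/binomial predicates are an assumption about the convention in use (I haven't listed the exact tags I added here), and "Genus species" is a placeholder rather than a real name.

```python
import re

# namespace:predicate=value, where the value may be wrapped in double quotes
MACHINE_TAG = re.compile(
    r"^(?P<namespace>[A-Za-z][A-Za-z0-9_]*)"
    r":(?P<predicate>[A-Za-z][A-Za-z0-9_]*)"
    r"=(?P<value>.+)$"
)


def make_machine_tag(namespace, predicate, value):
    """Format a machine tag, quoting values that contain spaces."""
    value = str(value)
    if " " in value:
        value = f'"{value}"'
    return f"{namespace}:{predicate}={value}"


def parse_machine_tag(tag):
    """Split a machine tag into (namespace, predicate, value), or None if it isn't one."""
    match = MACHINE_TAG.match(tag)
    if match is None:
        return None
    return match["namespace"], match["predicate"], match["value"].strip('"')


# Assumed namespace/predicates; the binomial value is a placeholder.
tags = [
    make_machine_tag("taxonomy", "genus", "Urothrips"),
    make_machine_tag("taxonomy", "binomial", "Genus species"),
]
print(tags)
```

Keeping tag creation in one small function means the same taxon list can later drive geographical tags too, with only the namespace and predicate changed.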

 

see bottom right hand corner for the added ‘machine tags’

Jeremy Miller from Naturalis is also very interested in OA Zootaxa content from the point of view of spiders. He gave a talk on Data Visualization on behalf of his team from the Leiden hackday. Luckily, with no prior ‘special’ mark-up, by searching ‘Araneae‘ I could show Jeremy the promise of what I’m doing on Flickr. Many phylogenies containing spider taxa came up in the search, many of which he immediately recognized as from his own open access publications! With a little bit of work to further mark-up the attributes he’s interested in, I might be able to provide something of real use – the ability to search figure images/captions across hundreds of open access journals, from many different publishers with just ONE search!

The Bouchout Declaration will be launched today at this meeting. I’m happy to say I facilitated the signing of this declaration by Open Knowledge. Many other organisations have signed this declaration and I hope it makes a splash – we need science to be open to do good science!

Finally, I’ve also potentially got a new research collaboration going (more of which later!).
It’s been well worth the trip!

[Update: the conference itself will be in November, 2014 – this is just the first announcement!]

I’m super excited to announce I’m part of the international organizing committee for OpenCon 2014:

OpenCon 2014

 

 

 

 

You can read the official first press release about this event here:

http://www.righttoresearch.org/act/opencon/announcement

 

here’s an excerpt from it:

“From Nigeria to Norway, the next generation is beginning to take ownership of the system of scholarly communication which they will inherit,” said Nick Shockey, founding Director of the Right to Research Coalition. “OpenCon 2014 will support and accelerate this rapidly growing movement of students and early career researchers advocating for openness in research literature, education, and data.

The first event of its kind, OpenCon 2014 builds on the success of the Berlin 11 Satellite Conference for Students and Early Stage Researchers, which brought together more than 70 participants from 35 countries to engage on Open Access to scientific and scholarly research. The interest, energy, and passion from the student and researcher participants and the Open Access movement leaders who attended made a clear case for expanding the event in size and duration, and to broaden the scope to related areas of the Openness movement.”

 

Last year, I was also part of the organizing committee for the event that this has grown from – the Berlin 11 Satellite conference:

 

 

 

 

The Berlin 11 Satellite Conference was really exciting but only a 1-day event before the ‘main’ Berlin 11 event – an assemblage of students and ECRs from literally all over the world (attending with generous full funding support), including representatives from (in no particular order) China, India, Saudi Arabia, Georgia, Tanzania, Tasmania(!), Kenya, Nigeria, Ghana, Uganda, Colombia, FYR Macedonia, Mexico, Brazil, Sweden, Holland, Denmark, Poland, Portugal, Canada, the US, the UK… So don’t worry about where you are in the world – as long as you’re a student or ECR you’ll be eligible to apply for OpenCon 2014 (places are limited though!).

As a reminder, at the event last year we had Jack Andraka and Mike Taylor amongst the guest speakers. It was such a comprehensive success that it’s been expanded into a full 3-day event this year, expanding scope too, to include Open Data and OER, not just OA (they’re all obviously inter-related problems; better to tackle the integrated set of problems rather than aspects in isolation!).

Applications for OpenCon 2014 will open in August. For more information about the conference and to sign up for updates, visit www.opencon.net

I promise you this – it’s going to be BIG and I’m stoked to be part of an international organizing committee helping to make this happen.

OpenCon 2014 is also looking for additional sponsorship, particularly for Travel Scholarships to ensure global representation at this meeting, so if you have a marketing budget to spend, or are feeling generous please do have a look at the sponsorship opportunities.

PLOS ONE PHYLOGENY

May 7th, 2014 | Posted by rmounce in Content Mining | Open Data | Open Science | PLoS | PLUTo - (13 Comments)

I’m proud to announce an interesting public output from my BBSRC-funded postdoc project:
PLUTo: Phyloinformatic Literature Unlocking Tools. Software for making published phyloinformatic data discoverable, open, and reusable

MOAR PHYLOGENY!

Screenshot of some of the PLOS ONE phylogeny figure collection on Flickr

I’ve made openly available my first-pass filter of PLOS ONE phylogeny figures (I’m not in any way claiming this is *all* of them).

This curated & tagged image collection is on Flickr for easy browsing: http://bit.ly/PLOStrees

As well as on Github for version control, open archiving, and collaboration (I have remote collaborators):

https://github.com/rossmounce/P1-phylo-part1

https://github.com/rossmounce/P1-phylo-part2

https://github.com/rossmounce/P1-phylo-part3

https://github.com/rossmounce/P1-phylo-part4

(Github doesn’t like repositories over 1GB so I’ve had to split-up the content between 4 separate repositories)

 

Why?

The aim of the PLUTo project is to re-extract & liberate phylogenetic data & associated metadata from the research literature. Sadly, only ~4% of modern published phylogenetic analysis studies make their underlying data available. Another study finds that if you ask the authors for this data, only 16% will be kind enough to reply with the requested data!

This particular data type is a cornerstone of modern evolutionary biology. You’ll find phylogenetic analyses across a whole host of journal subjects – medical, ecological, natural history, palaeontology… There are also many different ways in which this data can be re-used, e.g. supertrees & comparative cladistics, not to mention simple validation studies &/or analyses which extend or map new data on to a phylogeny. It’s really useful data and we should be archiving it for future re-use and re-analysis. To my great delight, this is what I’m being paid to attempt to do for my first postdoc, on a grant I co-wrote – finding & liberating phylogenetic data for everyone!

Why PLOS ONE?

  •  It’s a BOAI-compliant open access journal that publishes most articles under CC BY, with a few under CC0.
    • This means I can openly re-publish figures online (provided sufficient attribution is given) — no need to worry about DMCA takedown notices or ‘getting sued’! This makes the process of research much easier. Private, non-public, access-restricted repositories for collaboration are a hassle I’d rather do without.
  • It’s a high-volume ‘megajournal’ publishing ~200 articles per day, many of which include phylogenetic analyses.
    • Thus it’s worthwhile establishing a regular daily or weekly method for parsing out phylogenetic tree figures from this journal.
  • Killer feature: as far as I know, PLOS are the only publisher to embed rich metadata inside their figure image files.
    • This makes satisfying the CC BY licence trivially easy — sufficient attribution metadata is already embedded in the file. Just ensure that wherever you’re uploading the file to doesn’t wipe this embedded data, hence why I chose Flickr as my initial upload platform.
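To check that an upload platform hasn’t wiped the embedded metadata, one option is to look for it in the raw bytes of a figure file before upload and again after download. The sketch below is illustrative only: it assumes the metadata is stored as an XMP packet (plain XML wrapped in an `x:xmpmeta` element), which I haven’t confirmed is how PLOS embeds it, and the function names `extract_xmp` and `has_attribution` are hypothetical.

```python
import re
from typing import Optional

# ASSUMPTION: the embedded metadata is an XMP packet, i.e. plain XML
# wrapped in an <x:xmpmeta> element somewhere inside the image file.
XMP_PACKET = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def extract_xmp(image_bytes: bytes) -> Optional[str]:
    """Return the first XMP packet found in the raw bytes, or None."""
    match = XMP_PACKET.search(image_bytes)
    if match is None:
        return None
    return match.group(0).decode("utf-8", errors="replace")


def has_attribution(image_bytes: bytes, needle: str) -> bool:
    """True if an XMP packet exists and contains the given text."""
    xmp = extract_xmp(image_bytes)
    return xmp is not None and needle in xmp
```

In practice you would read the original file and the copy downloaded back from the hosting platform with `open(path, "rb").read()`, then compare the two `extract_xmp` results. If the second one comes back as `None`, the platform has stripped the metadata.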

 

What does this enable or make easier?

On its own, this collection doesn’t do much; this is still an early stage. But it gives us an important insight into the prevalence of certain visual display styles that researchers are using:

‘radial’ phylogenies

https://www.flickr.com/search?user_id=123621741%40N08&sort=relevance&text=radial

Source: Zerillo et al 2013 PLOS ONE. Carbohydrate-Active Enzymes in Pythium and Their Role in Plant Cell Wall and Storage Polysaccharide Degradation

‘geophylogeny’ (phylogeny displayed relative to a map of some sort, 2D or 3D)

https://www.flickr.com/search?user_id=123621741%40N08&sort=relevance&text=geophylogeny

Source: Guo et al 2012 PLOS ONE. Evolution and Biogeography of the Slipper Orchids: Eocene Vicariance of the Conduplicate Genera in the Old and New World Tropics

‘timescaled’ (phylogenies where the branch lengths are proportional to units of time or geological periods)
https://www.flickr.com/search?user_id=123621741%40N08&sort=relevance&text=timescaled

Source: Pol et al 2014 PLOS ONE. A New Notosuchian from the Late Cretaceous of Brazil and the Phylogeny of Advanced Notosuchians

‘splitstrees’

https://www.flickr.com/search?user_id=123621741%40N08&sort=relevance&text=splitstree

Source: McDowell et al 2013 PLOS ONE. The Opportunistic Pathogen Propionibacterium acnes: Insights into Typing, Human Disease, Clonal Diversification and CAMP Factor Evolution

Arguably it also facilitates complex searches for specific types of phylogeny

e.g. analyses using cytochrome b
https://www.flickr.com/search/?w=123621741@N08&q=%22cyt%20b%22%20OR%20%22cytochrome%20b%22
(you could use PLOS’s API to do this, particularly their figure/table caption search field — but you’d get a lot of false positives — this is an expert-curated collection that has filtered-out non-phylo figures)
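For comparison, here is a rough sketch of what the equivalent caption search against PLOS’s API might look like. It only builds the query URL and makes no network request. The endpoint `http://api.plos.org/search` and the Solr field name `figure_table_caption` are my assumptions about PLOS’s search API and should be checked against their documentation; `caption_query_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# ASSUMPTION: PLOS exposes a Solr-style search endpoint at this address.
PLOS_SEARCH_URL = "http://api.plos.org/search"


def caption_query_url(terms, rows=50):
    """Build a search URL matching any of the phrases in figure/table captions.

    ASSUMPTION: 'figure_table_caption' is the caption search field.
    """
    clause = " OR ".join(f'figure_table_caption:"{term}"' for term in terms)
    params = {
        "q": f"({clause})",   # phrase search across captions
        "fl": "id,title",     # fields to return for each hit
        "rows": rows,         # number of results per page
        "wt": "json",         # response format
    }
    return PLOS_SEARCH_URL + "?" + urlencode(params)


url = caption_query_url(["cyt b", "cytochrome b"])
```

A query like this would return every article whose captions mention the gene, whether or not the figure is actually a phylogeny, which is where the false positives come from.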

In my initial roadmap, the plan is to do PLOS ONE, the other PLOS journals, then BMC journals, then possibly Zootaxa & Phytotaxa (Magnolia Press). There will be a GitHub-based website for the project soon, lots still to do…!

Want to know more / collaborate / critique?

Conferences:

I’ve got an accepted lightning talk at iEvoBio in Raleigh, NC later this year about the PLUTo project.

I also have an accepted lightning talk at the Bioinformatics Open Source Conference (BOSC) in Boston, MA.

Otherwise, contact me via Twitter @rmounce, the comment section on this blog post, or email ross dot mounce <at> gmail dot com

Discussing Open Access with the Linnean Society

March 13th, 2014 | Posted by rmounce in Open Access - (12 Comments)

I’ve been invited to come in and have an informal chat about open access with the Linnean Society on March 24th, particularly with regard to what is and what is not ‘open access’ in terms of Creative Commons licences. I’m writing this blog post to spur on other advocates to encourage their society journals to use proper, open-access-compliant article licensing that facilitates rather than prevents text & data mining.

I have Tom Simpson at LinnSoc to thank for reaching out to make this happen. Thanks Tom!

It started from some tweets I sent a few days ago about an interesting new Zoo J Linn paper by Martin Brazeau & Matt Friedman. I’d include a pretty figure from this paper if I was allowed to, but unfortunately I can’t, because it’s licensed under the Creative Commons Attribution-NonCommercial-NoDerivs License (CC BY-NC-ND). To repost just a figure from the paper would be to create a smaller derivative work, which the licence does not allow – I am only allowed to repost the *whole* article with absolutely no changes, which is rather impractical for a 43-page article! Wiley in particular have a history of threatening scientist bloggers for reproducing a single figure from an article (read the Shelley Batts story here).

restricted access

It’s not just bloggers and the paper’s outreach possibilities that are harmed by such restrictive licences – they also cause problems for RCUK-funded researchers. Matt Friedman is based at Oxford at the moment. If the funding for this work came from any of the UK research councils, then the choice of the CC BY-NC-ND licence could cause him problems – it is NOT compliant with RCUK’s policy on open access. Wiley should know better than to offer this licence to UK-based authors, but they have a significant conflict of interest: steering researchers towards more restrictive licensing options lets them remain the sole supplier of glossy reprint copies (ensured by the -NC clause). Both the -NC & the -ND clauses incidentally prevent the figures from being re-used on Wikipedia, another sad restriction for the authors, who must have put a lot of effort into them.

In the realm of academic science, the application of that particular licence to the paper-as-a-whole-work just doesn’t make sense. Many digital research projects need to be able to excerpt, transform and translate research outputs such as academic papers, and in some cases create commercial value from this. My current BBSRC-funded research project ‘PLUTo: Phyloinformatic Literature Unlocking Tools. Software for making published phyloinformatic data discoverable, open, and reusable’ relies on being allowed to transform, excerpt and republish extracted content from scientific papers. With Peter Murray-Rust, we’re using text & image mining tools to generate open, re-usable phylogenetic data directly from the published literature, often directly from PDFs. The Linnean Society have several good quality, well-respected journals which publish phylogenetic content, so they’re very much in the scope of our PLUTo work.

But clauses such as -ND stop us from using this material. It’s clear in the license terms and conditions – we are not allowed to make any derivative works from the original. So any papers using CC BY-NC-ND we will have to avoid. We cannot use them, and therefore they will not be cited by our project which is rather a shame for their authors.

Above all, the CC BY-NC-ND licence simply isn’t compliant with the very definition of open access as laid down over a decade ago at the Budapest, Bethesda and Berlin meetings. Wiley are knowingly mislabelling articles using non-compliant licences as ‘open access’ even though they are by definition NOT open access. I hope the Linnean Society can spur Wiley to do something about this, as it is not good for the journal or its authors. Other journals using non-compliant licensing use terms like ‘public access’ or ‘free access’ or ‘sponsored access’. Why can’t Wiley follow this lead? Open access is more than just free access – it enables re-use, which is critical for research projects like mine. Please stop the ‘openwashing’.

Further Reading:

Hagedorn, G., Mietchen, D., Morris, R., Agosti, D., Penev, L., Berendsohn, W., and Hobern, D. 2011. Creative commons licenses and the non-commercial condition: Implications for the re-use of biodiversity information. ZooKeys 150:127-149.

Mounce, R. 2012. Life as a palaeontologist: Academia, the internet and creative commons. Palaeontology Online 2:1-10.

Klimpel, P. Consequences, Risks, and side-effects of the license module Non-Commercial – NC [PDF] 1-22.