Show me the data!

Wow! Where to begin… In this post I shall attempt to summarise some of OKFestival 2012.

Some Background:

I had been to the Open Knowledge Conference last year (in Berlin), where I gave an invited talk on Open Palaeontology and met lots of brilliant people in the Open Science community like Bjoern Brembs, Cameron Neylon & Peter Murray-Rust. But this year the event was even bigger, and even better – teaming up with the annual Open Government Data Camp for a mega-event.

The Event Itself:

It was a little awkward that it was held so far away from most of the conference accommodation – everyone had a 20-30 minute commute before getting to the venue, and some of the talk rooms were fairly far apart. But once the conference goers got used to that it was plain sailing from there, and the Aalto University buildings themselves were wonderfully modern and well equipped for it (inc. great WiFi). I got to Helsinki on the Tuesday, and caught the tail end of the Data Journalism session that day including an excellent, inspirational talk on amongst other things. It detailed the amazing knowledge and insight gained from tracking the movement of ships with open data. I couldn’t help thinking that academics could learn a lot from these open data visualization experts (myself included!).

An interesting example of Shippr data – ships turn off their beacons once they pass the point for fear of pirates…

Wednesday – my chance to make a difference

I really liked the way that the conference had an introductory session to the day’s parallel events in the morning from 10am – 11am. If one was unsure of which stream to go to – these Morning Plenaries gave each topic stream a chance to pitch their events in a short slot to the waiting audience. I thought this was very helpful given there were 13 separate topic streams at the conference!

I was involved in two sessions this day. Firstly the Open Access discussion panel (the video for which is here), with Tim Hubbard (Sanger Institute), Carlos Rossel (World Bank), Peter Murray-Rust (University of Cambridge / Open Knowledge Foundation) and Tom Olijhoek & Mark MacGillivray (Open Access Index):

It’s a long video; we covered many topics, with excellent contributions from the audience, including Puneet Kishor from Creative Commons and Matt Todd from the Open Source Drug Discovery team, amongst others.

Then after this there was the research data session with contributions from Mark Wainwright on CKAN, Mark Hahnel on Figshare and Joss Winn of the Orbital project.

Finally we finished with the Panton Fellowships Session with talks from myself and Sophie Kershaw on what we’d been doing in our fellowship work:

The day was rounded off with a hugely inspirational talk from Matt Todd summarising his Open Source Drug Discovery work in the main lecture theatre, with a lovely if expensive meal afterwards in Lasipalatsi Ravintola.


I spent some quality time with Peter working on a BBSRC grant proposal.
I also thoroughly enjoyed Hans Rosling’s fantastic keynote presentation, which I urge you all to watch. It was brilliant, and thrilling to be there live in the audience.


If there’s one thing that impresses me most of all about OKFestival, it’s this: it’s not just about talking – they do things here too. Lots of ‘hacking’ sessions on Friday to create new tools and collate awesome new data. Most conferences are extremely boring in that it’s just talk after talk after talk. Things get done here, new collaborations are started, fresh links across disciplinary boundaries are made connecting journalism with academia, economic development with open architectural design, and other incredible trans-disciplinary mashups. It’s a joy to behold.

I’m really glad I came to OKFestival; as ever, I got a lot out of it.

Next year it’ll be in Switzerland (?), I hope I didn’t just make that up… I seem to remember that it was announced to be there but I couldn’t find any confirmation from Google. Rest assured I’ll try and be there though!

I said I would make an update on Tuesday (today), so if I get this posted before midnight I will (just) have met that goal…

In this (minor) update I have:

added: Ubiquity Press (great low cost option!), SPIE (scored for 1-column per page), SAGE Open, Frontiers, WileyOpenAccess, OxfordOpen (OUP hybrid option), GigaScience, Open Biology (Royal Society)

added the label for: Pensoft (sincerest apologies, it is tied with Copernicus and was on the 0.1 plot, just unlabelled!)

changed the categorization of: Scientific Reports (NPG) [I have put it in a no-man’s-land between CC BY and CC BY-NC since they give authors a choice of licences. I think this is a bad idea as it allows authors to make the mistake of choosing a less open licence (are there really any common circumstances in which they might want a less open, free-to-read licence?)]


As noted elsewhere, there are actually a lot of completely fee-free Gold Open Access journals out there (I shall try and make a listing of them in a future post); they’re just not perhaps all that well known. GigaScience and Open Biology (Royal Society) are temporarily completely fee-free options that certainly look like good recommendations!


I shall endeavour to add in more of the variously priced BMC journals in the next update of the plot. Basically, I believe most of them lie in the range between BMC Research Notes and BMC Biology.

My site stats show that in just a few days v0.1 of the plot had nearly 1000 pageviews, which is HUGE for my otherwise low-key blog!

And it has had real impact already. Thanks to Mike Taylor, Acta Pal. Polonica is thinking of adopting the CC BY licence. Brilliant news! It is fee-free but not explicitly licensed to allow re-use at the moment. Hopefully this will change soon.


Anyway, I have to get off the train now, so that’ll be the end of this post.




Since Sunday afternoon I’ve been at an International Council for Science (ICSU) / Royal Society invited workshop on ‘Revaluing Science in the Digital Age’.

We’ve had a fascinating set of talks from academics, publishers (PLoS, Nature, BMC), librarians, policymakers, data managers, scientific societies…

Attendees included:
Jose Cotta (European Commission)
Mark Thorley (RCUK)
Chris Banks (University Librarian and Director, Aberdeen)
Mark Hahnel (Figshare)
Max Wilkinson (UCL, Head of Research Data Service)
Dave Roberts (ViBRANT)
Rob Frost (GSK)
Catriona MacCallum (PLoS)
Mark Forster (Syngenta)
Iain Hrynaszkiewicz (BMC)
Ruth Wilson (Nature Publishing Group)
Kaitlin Thaney (Digital Science)
Stuart Taylor (Royal Society)
Robert Simpson (Zooniverse)
Paul Groth (OpenPHACTS)
and more…


I gave a talk on content mining and the importance of full BOAI-compliant Open Access with respect to this, on behalf of the Open Knowledge Foundation:

There was lots of discussion on reproducibility, provenance of data, peer review, incentives, research misconduct and ethics.

I’ve met many new people and have learnt many new things. For example, on the subject of reproducibility I talked about Roger Peng and the journal Biostatistics in discussion, and then was soon informed that there was an analogous journal in Chemistry called Organic Syntheses whereby:

In order for a procedure to be accepted for publication, each reaction must be successfully repeated in the laboratory of a member of the Editorial Board at least twice, with similar yields (generally ±5%) and selectivity similar to that reported by the submitters.

Fantastic! We were also informed that this rigorous protocol ensures that research published in this journal is very highly regarded. I’ve suggested similar reproducibility checks for phylogenetics research before (at the Systematics Association Biennial meeting, Belfast, 2011) but this was viewed as too futuristic / infeasible…

Right now we’re working on a draft statement of outcome from this workshop that ICSU can pass to its members for possible official endorsement.

So I’d better finish here, and get back to the discussion.
I’m rather hoping they will endorse the Panton Principles rather than reinvent the wheel (policy-wise).

Exciting times!


PS I have made a Storify of the tweets from the workshop here.