Show me the data!

A visit to the BioMedCentral offices

November 7th, 2012 | Posted by rmounce in Open Access | Open Data | Phylogenetics

Recently I had the opportunity to collaborate on an extremely timely paper on data sharing and data re-use in phylogenetics, as part of the continuing MIAPA (Minimal Information for a Phylogenetic Analysis) working group project:


Additionally, to practice what we preach about data archiving, we opted (it wasn’t mandated by the journal or editors) to put the underlying data for this publication in Dryad under a CC0 waiver, so that it was immediately and freely available for re-use/scrutiny/whatever upon publication of the paper.

Dryad (and similar free services like FigShare, MorphoBank & LabArchives) allow research data to be made available pre-publication, on publication, or even post-publication with optional embargoes (access denied for up to one year after the paper is published). I’m strongly against the use of data embargoes, but Dryad allow them because embargoed data is better than no data at all! I’ve seen some recent papers that have made use of this option, and apparently the journals, editors & reviewers are ‘fine’ with this practice of proactively denying access to data. I guess it’s a generational thing? That sort of practice was understandably okay pre-Internet, when digital data was costly to distribute. But now that we can freely & easily distribute supporting data, there are a multitude of reasons why we really should, unless there are justifiable reasons not to, e.g. privacy with sensitive medical/patient data.


I haven’t had all that much experience of the publication process so far, and I’m amazed how kludgy it can be at times: far from smooth or efficient IMO. I was in charge of the Dryad data deposition for this paper, among other things. Because the journal isn’t integrated with Dryad’s deposition process, it took me quite a few emails to work out what to do and when, but it wasn’t a major difficulty. The benefits of depositing the data will almost certainly outweigh the small effort involved. Those journals with a Dryad-integrated workflow will no doubt have a smoother process.

Another thing I learnt from this manuscript was that publishers commonly outsource their typesetting to developing countries (for the cheaper labour available there). In this instance BMC sent our MS to the Philippines to be re-typeset for publication, and when the proofs came back we encountered some really comical errors, e.g. Phylomatic had been re-typeset as ‘phlegmatic’. This sparked a very serendipitous conversation on Twitter, which eventually led to Bryan Vickery (Chief Operating Officer at BMC) inviting me to visit the London office of BMC to have a chat about ‘all-things-publishing’ (and btw, serious *props* to PLOS and BMC for having such nice, helpful tweeps on Twitter):

Bryan and I arranged a time and a date (after SVP), and so I ended up visiting BMC for more than 2 hours on Wednesday 24th October. I got to meet not only Bryan but also Deborah Kahn, Shane Canning and others, including some of the editors for BMC Research Notes (thanks again for helping publish our paper!) & BMC Evolutionary Biology. Iain Hrynaszkiewicz was there too (Hi Iain!). Given our shared enthusiasm for Open Data (do read his *excellent* paper ‘Open By Default’ in the same article collection as ours), I’m sure we’ll meet again at more workshops and events in future.

I couldn’t possibly go through everything that was explained to me there, but it certainly was illuminating. I suspect many junior academics like myself have little or no clue as to the behind-the-scenes processes that manuscripts go through to get them into a state ready for publication. Perhaps a publisher visit (or even short placement?) scheme like this should be run as part of postgraduate skills training? Moreover, perhaps it could help alleviate the ‘too many PhDs, too few academic jobs’ problem by highlighting skilled sciencey jobs like STM publishing as viable and noble alternatives to the extremely overpopulated rat-race for tenure-track academic jobs. STM publishing isn’t even an ‘exit’ from academia. People like Jenny Rohn (chair of Science is Vital) have demonstrated that one can go into STM publishing and still go back into academia afterwards.

The cost of peer-review & publishing


This part of the post has sat on the backburner for a long time because it’s a complex one.

From what I was told (and I could well believe), organizing peer-review can be an immensely variable process. Sometimes it can be very simple. Automated processes such as peer2ref can be used to select appropriate reviewers for a manuscript, and if these reviewers accept and get on with it nicely and in a timely fashion, the process carries very little administrative burden. However, there are also times when maybe 10 or 12 reviewers need to be contacted before 2 agree, and then there can be further complications, leading to a very time-consuming, costly and burdensome process. So organizing peer-review costs money, but it’s difficult, or perhaps commercially sensitive (?), to put an average price on that process. I’m still in the dark on how much it should cost. If anyone knows of a reputable source for data on this, please do let me know.


What of DOIs? Why do some high-volume journals like Zootaxa & Taxon operate without DOIs? Is there really much money to be saved by dispensing with them? Well, Bryan kindly pointed me to this link here for all the salient info.

It’s just $1 per DOI. That’s nothing tbh. What’s more, it’s even cheaper to retrospectively add DOIs to older, already published content: ‘backfile’ DOIs are just $0.15 each. That means Zootaxa could retrospectively add DOIs to all ~5866 of their backfile articles (2004-2009) for just $880! There are plenty of other things that would need fixing before that happened though: Zootaxa doesn’t even have proper article landing pages, as was pointed out to me by Rod Page. No doubt there would also be some labour cost associated with getting someone to add DOIs to all those thousands of articles. Still, it looks cheap to me. I still feel justified in the annoyed rant I sent to TAXACOM a while ago about this pressing issue of DOIs and the responsibility of publishers.
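As a quick sanity check on that arithmetic, here is a minimal Python sketch. It uses only the per-DOI fees and the approximate article count quoted in this post; the function and constant names are made up for illustration, and the fees are not checked against any current fee schedule. Working in whole cents avoids floating-point rounding surprises.

```python
# Per-DOI fees in US cents, as quoted in this post (assumed, not verified
# against a current fee schedule).
CURRENT_FEE_CENTS = 100   # $1.00 per newly published article
BACKFILE_FEE_CENTS = 15   # $0.15 per 'backfile' article

# Approximate Zootaxa backfile size quoted in this post.
ZOOTAXA_BACKFILE_ARTICLES = 5866


def doi_cost_cents(n_items: int, fee_cents: int) -> int:
    """Total registration fee, in cents, for n_items DOIs at a flat per-DOI fee."""
    return n_items * fee_cents


def as_dollars(cents: int) -> str:
    """Format a whole number of cents as a dollar amount."""
    return f"${cents // 100}.{cents % 100:02d}"


backfile_total = doi_cost_cents(ZOOTAXA_BACKFILE_ARTICLES, BACKFILE_FEE_CENTS)
current_rate_total = doi_cost_cents(ZOOTAXA_BACKFILE_ARTICLES, CURRENT_FEE_CENTS)

print("At the backfile rate:", as_dollars(backfile_total))
print("At the current-content rate:", as_dollars(current_rate_total))
```

The backfile total comes out at $879.90, which is where the rounded figure of $880 comes from. This covers the registration fee only, not the labour of actually attaching DOIs to each article.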

This also has ramifications for some of the changes I’ve been pushing for now that I’m on the Systematics Association council. Our main publication is a book series, and each of the chapters *could* have a DOI issued for it, but currently doesn’t. I suggested we issue DOIs at the last council meeting, but alas it’s not up to me: we need co-operation from our publisher to make this happen (Hi, Cambridge University Press!). Book chapter DOIs cost just $0.25 each, so I think this small cost would certainly be worth it if it raises the discoverability and citeability of our publications.

Article submission

A final point of interest from my BMC visit: Bryan told me that BMC used to offer a means by which authors could submit their works directly via an XML authoring tool. It wasn’t popular, but I wonder whether this was perhaps because it was a little before its time? The whole process of biologists submitting Word files, only to have figures and text inadvertently mangled and wrongly re-typeset at the publisher, seems extremely inefficient to me. Physicists & computational scientists seem to get along fine with LaTeX submission processes, which alleviate some but not all of the typesetting shenanigans. Perhaps it is the authors and the authoring tools that need to change to enable more re-usable research in the future, and to realise the full potential of the semantic Web. It looks like Pensoft might be heading in this direction again with its Pensoft Writing Tool.

image by Gregor Hagedorn. CC BY-SA

On that note, it might be good to end with a small advert for the Pro-iBiosphere biodiversity informatics & taxonomy workshop in Leiden (NL) in February 2013.

I very much look forward to meeting taxonomists IRL!




