Month: June 2015

  • Deep indexing supplementary data files

    To prove my point that supplementary data files bury useful data, making it utterly undiscoverable to most, I decided to run a little experiment (related to text mining for museum specimen identifiers, but perhaps also relevant to the NHM Conservation Hackathon): I collected the links for all Biology Letters…

  • Progress on specimen mining

    I’ve been on holiday in Japan recently, so work on this came to a halt for a while, but I think I’ve largely ‘done’ PLOS ONE full text now (excluding supplementary materials). My results are on GitHub: https://github.com/rossmounce/NHM-specimens/tree/master/results – one prettier file without the exact provenance or in-sentence context of each putative specimen entity, and one more…