Page View Spikes on Research Articles
March 24th, 2015 | Posted in Open Data
Those of you who know me as a biologist might be surprised to learn that my most cited publication so far is on Open Access and Altmetrics (published in April 2013, 25 cites and counting…), which has nothing to do with biology per se!
So I took great interest in this new publication:
Wang, X., Liu, C., Mao, W., and Fang, Z. 2015. The open access advantage considering citation, article usage and social media attention. Scientometrics, pp. 1-10. DOI: 10.1007/s11192-015-1547-0
The authors have gathered some really fascinating data measuring day-by-day altmetrics of papers at the journal Nature Communications, which at the time was a hybrid journal: some articles were behind a paywall, while others were paid-for open access at a cost of $5200 to the authors/funders. (The cost of open access here is an absolute rip-off. I do not endorse or recommend outrageously priced paid-for open access outlets like Nature Communications. PLOS ONE costs just $1350, remember! PeerJ is just $99 per author!)
The paper is by no means perfect – I’m not saying it is – but the ideas behind it are good. Many on Twitter have commented that it’s ironic that this paper on the open access advantage is itself only available behind a paywall at the publisher.
The good news is that Dr Xianwen Wang has responded to this and has made an ‘eprint’ copy (stripped of all publisher branding) freely available at arXiv as of 2015-03-19 (post-publication). The written English throughout the manuscript is not brilliant, but I feel this reflects poorly on the journal rather than the authors – it’s remarkable that Scientometrics can charge subscribers a fee if it offers no copy-editing on accepted manuscripts! Finally, technical detail on precisely how the data was obtained is rather lacking. So that’s the critique out of the way…
My tweets about this paper have been very popular, e.g.
[Two embedded tweets: Ross Mounce (@rmounce), March 18, 2015]
But I wanted to dig deeper into the data. So I emailed the corresponding author, Xianwen, for a copy of the data behind figure 2, and he happily and quickly sent it to me. I was fairly shocked (in a good way) that he sent the data. Most of the email requests for data I’ve sent in the past have ultimately been unsuccessful. This is well documented in the field of phylogenetics *sad face*. The ‘email the author’ system simply cannot be relied upon, and is one of many reasons why I feel all non-sensitive data supporting research should be made publicly available, alongside the article, on the day of publication.
I did my own re-analysis of the raw data Xianwen sent over, and discovered there were lots of odd jumps in the data which couldn’t really be explained by peaks in social media activity, e.g. for the article ‘A cobalt complex redox shuttle for dye-sensitized solar cells with high open-circuit potentials’ (visualized below). About 520 days after it was first published, it apparently accumulated 21,577 page views in a single day! There was also a smaller spike of 2000 page views earlier.
Xianwen had filtered these suspicious jumps out of his figures but neglected to mention that in the methods section. When I informed him of this discrepancy, he told me he’s going to contact the editor to sort it out. A great little example of how data sharing results in improved science? The unfiltered data looks a little bit like the plot below:
[Embedded tweet: Ross Mounce (@rmounce), March 22, 2015]
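For anyone who wants to look for similar one-day jumps in their own usage data, here is a minimal sketch in Python. To be clear, this is not the method used in the paper or in my re-analysis: the function name `find_spikes`, the `factor` and `min_jump` thresholds, and the series itself are all assumptions made up for illustration. It assumes the data is a list of cumulative page-view totals, one per day, and flags any day whose increase is far above the median daily increase.

```python
from statistics import median


def find_spikes(cumulative_views, factor=20, min_jump=500):
    """Return (day, daily_increase) pairs for unusually large one-day jumps.

    cumulative_views: cumulative page-view totals, one entry per day,
    where index 0 is the day of publication.
    factor, min_jump: arbitrary illustrative thresholds, not from the paper.
    """
    # Convert cumulative totals into per-day increases.
    daily = [b - a for a, b in zip(cumulative_views, cumulative_views[1:])]
    if not daily:
        return []
    # A day counts as a spike if its increase exceeds both an absolute
    # floor and a multiple of the typical (median) daily increase.
    threshold = max(min_jump, factor * median(daily))
    # daily[i] is the increase recorded on day i + 1.
    return [(i + 1, inc) for i, inc in enumerate(daily) if inc > threshold]


# Synthetic series (NOT the real data): 10 views per day for 600 days,
# except for one day with a jump the size of the one described above.
views = [0]
for day in range(1, 601):
    increase = 21577 if day == 520 else 10
    views.append(views[-1] + increase)

print(find_spikes(views))  # [(520, 21577)]
```

Using the median rather than the mean as the baseline means a single enormous day does not drag the threshold up with it, though the right cut-off for real data would need some thought.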
Anyway, back to the spikes/jumps in activity – they certainly aren’t an error introduced by the authors of the paper – they can also be seen via Altmetric (a service provider of altmetrics). The question is: what is causing these one-day spikes in activity?
I have alerted the team at Altmetric, and they have alerted, or will alert, Nature Publishing Group to investigate further.
[Embedded tweet: Sara Rouhi (@RouhiRoo), March 22, 2015]
Most of the spikes probably have an accidental cause, but it would be good to know more. A downloading script gone awry, perhaps? Still, there is a possibility that this dataset contains putative evidence of deliberate gaming of altmetrics, specifically article views. I look forward to hearing more from Altmetric and Nature Publishing Group about this… the ball is very much in their court right now.
Moreover, now that these peculiar spikes have been detected, what, if anything, should be done about them?