Show me the data!

Today I received proof that Elsevier are also sending takedown notices to UK universities – asking them to take down copies of their staff’s academic research papers, hosted on university webpages. The full text is further down this post (in red). It is not just Academia.edu, it is not just the University of Calgary, University of California-Irvine, or Harvard University. Elsevier very probably are sending takedown notices to institutions and websites across the globe.
No-one is safe from these legal threats.

Not only that, but they seem to be encouraging universities to be proactive and take down more than just the specific articles identified in the DMCA notice they send! They are encouraging universities to limit access to their research works. This is simply disgraceful (even though I acknowledge they are technically, legally within their rights to do this because of the way their copyright transfer agreements are written, which incidentally many academics are effectively forced to sign in order to get published and make progress in their careers).

For background information read:

How one publisher is stopping academics from sharing their research. The Washington Post 19/12/2013

Elsevier steps up its War On Access SVPOW 17/12/2013


Librarians and university web admins: please publicly come out with more examples like this. Researchers, readers and taxpayers desperately need to know about this. Silence and subterfuge benefit no-one; these chilling effects must be publicly revealed.

This is the email I received with certain parts redacted:

*** Sent via Email – Inappropriate postings of Elsevier’s journal articles / DMCA Notice of Copyright Infringement ***

Dear Sir/Madam,

I write on behalf of Elsevier to bring to your attention the inappropriate posting of final published journal articles to your institutional website. I am President at Attributor (A Digimarc Company), which assists some of the world’s most prominent publishers, including Elsevier, with digital content protection (www.digimarc.com/guardian). Following the discussion below, a formal DMCA takedown request is included as Appendix A.

As you probably know, Elsevier journal article authors retain or are permitted a wide scope of scholarly use and posting on their own sites and for use within their own institutions. Those rights are more expansive when it comes to author preprints or accepted manuscripts than with respect to the final versions of published journal articles. Elsevier recognizes that in some cases authors or their institutions may not be fully aware of these rights and can by mistake post the final version of their articles to institutional websites or repositories. Unfortunately, it has come to our attention that copies of final published journal articles have, perhaps inadvertently, been posted for public access to one of your institutional websites.

I therefore request your cooperation to remove or disable access to these articles on your site, including but not limited to the articles identified in Appendix A. We have identified merely a sample in Appendix A, and as a publisher of close to 2,000 journals this might mean that more articles published by Elsevier could be found on your site. Please may I therefore draw your attention to Elsevier’s posting policy and ask for your attention to ensuring that your posting practices comply with this?
http://www.elsevier.com/about/open-access/open-access-policies/article-posting-policy#published-journal-article

In particular I note that Elsevier currently doesn’t permit posting of the final published journal article, and if there is a mandate or systematic posting mechanisms in place then Elsevier asks for a cost-free agreement with the institution before accepted author manuscripts are posted.
I would also recommend considering the use of DOI links as a way to access to the version of records of a published article. This would allow authors to list their work and to provide easy access to peers.

Finally, should you need any help in properly identifying a final published article to prevent any future improper posting, please do get in touch via the email address below.

I appreciate your anticipated cooperation and if you have any questions or feedback, or if you believe you have received this message in error (as you have received permission to post this article from Elsevier), please contact: UniversalAccess@Elsevier.com
Thank you.

Sincerely,
Eraj Siddiqui
Attributor (A Digimarc Company)

Appendix A

Copyright Infringement Notice

This notice is sent pursuant to the Digital Millennium Copyright Act (DMCA), the European Union’s Directive on the Harmonisation of Certain Aspects of Copyright and Related Rights in the Information Society (2001/29/EC), and/or other laws and regulations relevant in European Union member states or other jurisdictions.

Please remove or disable access to the infringing pages or materials identified below, as they infringe the copyright works identified below.

I certify under penalty of perjury, that I am an agent authorized to act on behalf of the owner of the intellectual property rights and that the information contained in this notice is accurate.

I have a good faith belief that use of the material listed below in the manner complained of is not authorized by the copyright owner, its agent, or the law.

My contact information is as follows:

Organization name: Attributor Corporation as agent for [Publisher Company]
Email: counter-notice@attributor.com
Phone: 650-340-9601
Mailing address:
400 South El Camino Real
Suite 650,
San Mateo, CA 94402

My electronic signature follows:
Sincerely,
/E Siddiqui/
E. Siddiqui
Attributor, Inc.

***List of Works and Location of Infringing Page or Material ***

Infringing page/material that I demand be disabled or removed in consideration of the above:

*** INFRINGING PAGE OR MATERIAL ***

Infringing page/material that I demand be disabled or removed in consideration of the above:

Rights Holder: Reed Elsevier

Original Work: [redacted]
Infringing URL: [redacted]

UPDATE:

Dutch universities too are receiving DMCA notices from Elsevier:

[Screenshot shared by @Wowter via Twitter]

I’d just like to point out to anyone who asks, particularly CRC Press (part of Taylor & Francis Group, who are in turn part of Informa PLC), that by posting the full text of my book chapter to Academia.edu I am *not* breaching the copyright transfer agreement I signed.

Upon receiving a copyright transfer agreement as a PDF from them via email, I edited the PDF to reword the agreement to terms that were more agreeable to me (e.g. I did NOT want to transfer the copyright in my work to them).

The bit of wording I changed is as follows:

As such, copyrights in the Work will not inure to the benefit of the Publisher, the Publisher will not own the publication, its title and component parts, and all publication rights. This does not permit the Publisher, in its name, to copyright in the Contribution, make applications to register its copyright claim, and to renew its copyright certificate.

I signed this reworded form as a PDF (displayed below, signature removed) and returned it to them. I have now kindly received a free ‘author copy’ of the printed book and my chapter has clearly been included, so it’s too late for CRC Press to exclude my chapter. I can only assume they agreed to the reworded terms of the contract I signed and sent them.

I doubt CRC Press would even be bothered by my actions to be honest. They are allowing another of their books to be completely posted online for free, so in comparison to that, my action here is puny – but it certainly emboldens me for the next time I may have to sign a CTA form…

CRC Press are welcome to non-exclusively publish my book chapter. Thank you CRC Press for agreeing to my terms and conditions.

Contract

Lessons one might learn from this exercise:

DO NOT GIVE AWAY THE COPYRIGHT TO YOUR WORK!
PUBLISHERS DO NOT ‘NEED’ ALL YOUR COPYRIGHT TRANSFERRED TO THEM TO PUBLISH.
ALL THAT IS NEEDED IS FOR YOU TO GRANT THEM A NON-EXCLUSIVE LICENSE TO PUBLISH.

A word of warning though… I wouldn’t recommend relying on this method of editing CTAs to get what you want. I was just lucky this time. Choosing an open access publication venue from the start is always the best option (if possible).

See also:

Mike Taylor 2010. Who Owns My Sauropod History Paper?
http://svpow.com/2010/10/13/who-owns-my-sauropod-history-paper/

Acknowledgements

October 21st, 2013 | Posted by rmounce in Open Access - (3 Comments)

I handed in my thesis not long ago, on Thursday 3rd October 2013. No idea when my viva is yet. I can’t blog many of the chapters because I haven’t convinced my manuscript co-authors of the value of preprints, yet. I’m also a bit unsure as to how some of the other chapters will be received and thus I’ll wait until after the viva before I decide what to do next with it.


Given it’s open access week this week, there is one bit of my thesis I should definitely share: the acknowledgements!

I can’t possibly thank everyone enough for the help I’ve received over the past 4 years – my knowledge, skills, and connections have been vastly extended. Note in particular the bit I’ve highlighted in bold just for this blog post – I want everyone to know how absolutely reliant I’ve been on ‘alternate’ forms of literature access during my research. This is the new ‘normal’ for many early career researchers, I fear: until open access is more prevalent we’ll have to continue to hunt, scavenge, beg, steal, and borrow for every PDF. My generation of researchers grew up using Napster, Isohunt, Library.nu. Copyright infringement is an everyday activity for many of us – WE DON’T CARE. Have you been to a conference? How many of the pictures on the speakers’ slides weren’t technically infringing someone else’s copyright? WE DON’T CARE. One can shut down or block specific portals, but doing so doesn’t really solve the basic problem: from what I’ve seen, time and time again, copyright’s only role in science is to obstruct it. My biggest hope for Open Access Week 2013 is that someone will torrent Elsevier’s back catalogue – journal/publisher torrents have been done before and will be done again! It probably won’t happen, but I can dream…

Acknowledgements

I would like to thank my supervisor, Matthew Wills for putting up with me all this time. I
have been lucky to have such accommodating and understanding support. I also must
thank my lab mates Martin Hughes, Anne O’Connor, Sylvain Gerber, Katie Davis, Rob
Sansom and everyone else in the Biodiversity Lab at the University of Bath – we had some
great times and some brilliant times together. Sincere thanks also to the University of Oslo
Bioportal computing cluster for providing me free cloud computation for my work.
Many people have helped spur my imagination along the way with ideas for different
chapters of this thesis. For this I would like to thank Ward Wheeler, Pablo Goloboff, Mark
Siddall, Dan Janies, Steve Farris and the generous financial support of the Willi Hennig
Society. I want to thank all those in the palaeontology community who have shared their
published data with me, particularly Graeme Lloyd for his sterling work in making
dinosaurian data available – I hope I have done something interesting with the data I have
used and opened eyes to new possibilities. I also want to thank all those in the open
science community – Peter Murray-Rust, Todd Vision, Heather Piwowar, Mark Hahnel,
Martin Fenner, Geoffrey Boulton, Jenny Molloy and so many more I’ve had the pleasure of
meeting in person. The energy and enthusiasm I drew from countless online discussions
on Facebook, Google+ and Twitter was truly inspirational.
For facilitating greater access to scientific literature I must heartily thank the Natural
History Museum, London library and archives, the #icanhazpdf community on Twitter,
Wikipaleo on Facebook, References Wanted on FriendFeed, Library.nu, and SciHub.
Without these additional literature access facilitators I would not have been able to read
half the sources I cite in this thesis.
I must thank my wife Avril for her patience with me especially during the write-up phase,
for allowing me to go away to all these amazing conferences abroad, and for tolerating all
those long nights into mid-morning when I was tapping away on my noisy keyboard.
Finally, I thank my family: Richard, Rosemary & Tara for repeatedly encouraging me to
finish my thesis – I got there in the end!

Hack4ac recap

July 9th, 2013 | Posted by rmounce in BMC | eLife | Hack days | Open Access | Open Data | Open Science | PeerJ | PLoS - (4 Comments)

Last Saturday I went to Hack4Ac – a hackday in London bringing together many sections of the academic community in pursuit of two goals:

  • To demonstrate the value of the CC-BY licence within academia. We are interested in supporting innovations around and on top of the literature.
  • To reach out to academics who are keen to learn or improve their programming skills to better their research. We’re especially interested in academics who have never coded before.


The list of attendees was stellar, cross-disciplinary (inc. Humanities) and international. The venue (Skills Matter) & organisation were also suitably first-class – lots of power leads, spare computer mice, projectors, whiteboards, good wi-fi, separate workspaces for the different self-assembled hack teams, tea, coffee & snacks all throughout the day to keep us going, prizes & promo swag for all participants…

The principal organizers, Jason Hoyt (PeerJ, formerly at Mendeley) & Ian Mulvany (Head of Tech at eLife), thus deserve a BIG thank you for making all this happen. I hear this may also be turned into a fairly regular set of meetups, which will be great for keeping up the momentum of innovation going on right now in academic publishing.

The hack projects themselves…

The overall winner of the day was ScienceGist as voted for by the attendees. All the projects were great in their own way considering we only had from ~10am to 5pm to get them in a presentable state.

ScienceGist

 

This project was initiated by Jure Triglav, building upon his previous experience with Tiris. This new project aims to provide an open platform for post-publication summaries (‘gists’) of research papers, providing shorter, more easily understandable summaries of the content of each paper.

I also led a project under the catchy title of Figures → Data, whereby we tried to provide added value by taking CC-BY bar charts and histograms from the literature and attempting to re-extract the numerical data from those plots automatically, using computer vision techniques. On my team for the day I had Peter Murray-Rust, Vincent Adam (of HackYourPhD) and Thomas Branch (Imperial College). This was handy because I know next to nothing about computer vision – I’m Your Typical Biologist™ in that I know how to script in R, perl, bash and various other things, just enough to get by but not nearly enough to attempt something ambitious like this on my own!

Forgive me the self-indulgence if I talk about this Figures → Data project more than I do the others, but I thought it would be illuminating to discuss the whole process in detail…

In order to share links between our computers in real-time, and to share initial ideas and approaches, Vincent set up an etherpad to record our notes. You can see the development of our collaborative note-taking using the timeslider function below (I did a screen record of it for posterity using recordmydesktop):

In this etherpad we document that there are a variety of ways in which to discover bar charts & histograms:

  • figuresearch is one such web-app that searches the PMC OA subset for figure captions & figure images. With this you can find over 7,000 figure captions containing the word ‘histogram’ (you would assume that the corresponding figure would contain at least one histogram for 99% of those figures, although there are exceptions).
  • figshare has nearly 10,000 hits for histogram figures, whilst BMC & PLOS can also be commended for providing the ability to search their literature stack by just figure captions, making the task of figure discovery far more efficient and targeted.
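Caption search of this kind is easy to sketch. Here's a minimal Python example, assuming you already have figure captions as plain text. The `CAPTIONS` records, the `find_chart_figures` name and the keyword pattern are all made up for illustration; this is not how figuresearch, figshare, BMC or PLOS actually implement their search.

```python
import re

# Hypothetical records: (figure id, caption text). Real captions would come
# from a source such as the PMC OA subset; these are invented for illustration.
CAPTIONS = [
    ("fig1", "Histogram of body mass across 120 specimens."),
    ("fig2", "Bar chart showing mean looking time per target."),
    ("fig3", "Phylogeny of the sampled taxa."),
    ("fig4", "Distribution of branch lengths (histograms, log scale)."),
]

# Whole-word, case-insensitive match for histograms and bar charts/graphs/plots.
PATTERN = re.compile(
    r"\b(histograms?|bar\s+(?:charts?|graphs?|plots?))\b", re.IGNORECASE
)


def find_chart_figures(captions):
    """Return the ids of figures whose caption mentions a histogram or bar chart."""
    return [fig_id for fig_id, text in captions if PATTERN.search(text)]


print(find_chart_figures(CAPTIONS))  # → ['fig1', 'fig2', 'fig4']
```

As noted above, a caption that mentions a histogram doesn't guarantee the figure contains one, so anything found this way still needs checking by eye (or by a classifier).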

Jason Hoyt was in the room with us for quite a bit of the hack and clearly noted the search features we were looking for – just yesterday he tweeted: “PeerJ now supports figure search & all images free to use CC-BY (inspired by @rmounce at #hack4ac)” – I’m really glad to see our hack goals helped Jason to improve content search for PeerJ to better serve the needs (albeit somewhat niche in this case) of real researchers. It’s this kind of unique confluence of typesetters, publishers, researchers, policymakers and hackers at doing-events like this that can generate real change in academic publishing.

The downside of our project was that we discovered someone’s done much of this before. ReVision: Automated Classification, Analysis and Redesign of Chart Images was an award-winning paper at an ACM conference in 2011. Much of that work would have helped our idea, particularly the figure-classification tech. Yet sadly, as with so much of ‘closed’ science, we couldn’t find any open source code associated with it. There were comments that this type of non-code-sharing behaviour, blocking re-use and progress, is fairly typical in computer science & ACM conferences (I wouldn’t know but it was muttered…). If anyone does know of related open source code available for this project do let me know!

So… we had to start from a fairly low level ourselves: Vincent & Thomas tried MATLAB & C based approaches with OpenCV and their code is all up on our project github. Peter tried using the AMI2 toolset, particularly the Canny algorithm, whilst I built up an annotated corpus of 40 CC-BY bar charts & histograms for testing purposes. Results of all three approaches can be seen below in their attempts to simplify this hilarious figure about dolphin cognition from a PLOS paper:

The plastic fish just wasn't as captivating...

“Figure 5. Total time spent looking at different targets.” from Siniscalchi M, Dimatteo S, Pepe AM, Sasso R, Quaranta A (2012) Visual Lateralization in Wild Striped Dolphins (Stenella coeruleoalba) in Response to Stimuli with Different Degrees of Familiarity. PLoS ONE 7(1): e30001. doi:10.1371/journal.pone.0030001 CC-BY

Peter’s results (using AMI2):

 

Thomas’s results (OpenCV & C):

 

Vincent’s results (OpenCV & MATLAB & bilateral filtering):

We might not have won 1st prize but I think our efforts are pretty cool, and we got some laughs from our slides presenting our day’s work at the end (e.g. see below). Importantly, *everything* we did that day is openly available on github to re-use, re-work and improve upon (I’ll ping Thomas & Vincent soon to make sure their code contributions are openly licensed). Proper full-stack open science basically!

some figures are just awful
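For anyone wondering what the last step of Figures → Data might look like once a chart has been cleaned up into a black-and-white mask, here is a toy Python sketch. To be clear, this is not the code any of us wrote on the day (that is on the project github). The `bar_heights` and `pixels_to_values` functions and the tiny `MASK` are hypothetical, and I'm assuming the hard bits (edge detection, removing axes and text) have already been done.

```python
def bar_heights(mask):
    """Measure bars in a binary chart mask.

    mask is a list of rows (top row first); 1 = bar pixel, 0 = background.
    Returns one pixel height per bar, left to right. Runs of adjacent
    non-empty columns are treated as a single bar.
    """
    n_cols = len(mask[0])
    # Count the filled pixels in each column.
    col_counts = [sum(row[c] for row in mask) for c in range(n_cols)]

    heights, current = [], []
    for count in col_counts + [0]:  # the trailing 0 flushes the last bar
        if count > 0:
            current.append(count)
        elif current:
            heights.append(max(current))
            current = []
    return heights


def pixels_to_values(heights, axis_max_value, axis_max_pixels):
    """Rescale pixel heights using one known point on the y-axis."""
    scale = axis_max_value / axis_max_pixels
    return [h * scale for h in heights]


# An invented 4x7 mask with three bars, 2, 4 and 1 pixels tall.
MASK = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0],
]

heights = bar_heights(MASK)
print(heights)                              # → [2, 4, 1]
print(pixels_to_values(heights, 100.0, 4))  # → [50.0, 100.0, 25.0]
```

The obvious weakness is that bars which touch each other get merged into one, and error bars or gridlines would throw the counts off. Real figures are much messier than this mask, which is rather the point of the whole exercise.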

 

Other hack projects…

As I thought would happen, I’ve waffled on about our project. If you’d like to know more about the other projects, hopefully someone else will blog about them at greater length (sorry!). I’ve got my thesis to write y’know! ;)

You can find more about them all either on the Twitter channel #hack4ac or alternatively on the hack4ac github page. I’ll write a little bit more below, but it’ll be concise, I warn you!

  • Textmining PLOS Author Contributions

This project has a lovely website for itself: http://hack4ac.com/plos-author-contributions/ and so needs no more explanation.

  • Getting more openly-licensed content on wikipedia

This group had problems with the YouTube API I think. Ask @EvoMRI (Daniel Mietchen) if you’re interested…

  • articlEnhancer

Not content with helping out the PLOS author contribs project, Walther Georg also unveiled his own article enhancer project, which has a nice webpage here: http://waltherg.github.io/articlEnhancer/

  • Qual or Quant methods?

Dan Stowell & co used NLP techniques on full-text accessible CC-BY research papers to automatically classify each one as qualitative, quantitative, or a mixture of the two. The last tweeted report of it sounded rather promising: “Upstairs at #hack4ac we’re hacking a system to classify research papers as qual or quant. first results: 96 right of 97. #woo #NLPwhysure” More generally, I believe their idea was to enable a “search by methods” capability, which I think would be highly sought-after if they could do it. Best of luck!
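I don't know what features or model their system actually used, so purely to illustrate the idea, here is about the crudest possible qual/quant classifier in Python: count cue words and compare. The cue lists and the `classify_methods` function are my own invention, not theirs.

```python
import re

# Hypothetical cue words; a real system would presumably learn its features
# from labelled papers rather than rely on a hand-picked list like this.
QUANT_CUES = {"regression", "anova", "significant", "variance", "sample", "statistical"}
QUAL_CUES = {"interview", "interviews", "ethnographic", "thematic", "narrative", "discourse"}


def classify_methods(text):
    """Label a text 'quant', 'qual', 'mixed' or 'unknown' by counting cue words."""
    words = re.findall(r"[a-z]+", text.lower())
    quant = sum(w in QUANT_CUES for w in words)
    qual = sum(w in QUAL_CUES for w in words)
    if quant > qual:
        return "quant"
    if qual > quant:
        return "qual"
    # Equal counts: 'mixed' if any cues were seen at all, otherwise no evidence.
    return "mixed" if quant > 0 else "unknown"


print(classify_methods("We ran a regression and an ANOVA on the sample."))  # → quant
print(classify_methods("Thematic analysis of interviews."))                 # → qual
```

For scale, the tweeted 96 right of 97 works out at about 99% accuracy, which a word-counting toy like this would be very unlikely to match on real papers.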
Apologies if I missed any projects. Feel free to post epic-long comments about them below ;)