Gavia Libraria

Lies and damn lies

The Loon is horrified. JISC should know better than this. It is thoroughly misleading, damagingly so.

SHERPA/RoMEO does not cover the entirety of the journal universe; JISC’s blog post is therefore subject to serious selection bias. Moreover, even for that portion SHERPA does cover, it’s easy to find weasel words, usually in blanket publisher listings, that render statistical analysis meaningless, such as “individual journals may have their own policies”. (Try Wiley-Blackwell’s default policy for an example.) SHERPA knows this; they’ve even started trumpeting it in bright red type (on green, ouch) all over the place. It’s wholly unclear to the Loon whether (much less how) JISC’s counts corrected for this.

Even this leaves out the publishers who present one face to SHERPA and another to authors. The Loon is already seeing a fair bit of this, and expects rather more of it over the coming year or so. Self-archiving was all right as long as no one actually did it. Now that permissions mandates are clearing the decks, publishers suddenly find self-archiving much less acceptable.

The ancient and cynical Loon is not surprised by this. She is, however, surprised that JISC would let a blog post escape it that so misrepresents on-the-ground realities around self-archiving. JISC is smarter and more grounded than that.

3 thoughts on “Lies and damn lies”

  1. Peter Millington

    I don’t agree that our statistics are misleading, and I would be interested to know why you think they are damaging.

    It is true that RoMEO does not cover the entirety of the journal universe, but with a sample of c.19,000 journals and covering all the major scholarly publishers, we feel that it is sufficiently representative. We believe these are the best statistics that we or anyone else can produce. Nonetheless, we hope to improve them over time.

    The weasel words you mention are a prudent caveat, given the lack of clarity in some publishers’ policies, and the fact that there are changes on a daily basis. Glad you spotted that, by the way. The jarring red on green evidently did its job!

    We are in the process at the moment of checking through individual journal titles for exceptions to the publishers’ default policies, and assigning them to the relevant special policies – see for instance Oxford University Press. It is interesting that you pick on Wiley-Blackwell, because that is the publisher that is giving us the biggest headache, as we hear it is to repository managers too.

    If you know of any publishers who “present one face to SHERPA and another to authors”, please let us know about them. SHERPA is compiled from publishers’ information – publicly available online wherever possible, but backed up with correspondence where necessary. While it is possible that a few publishers are behaving abominably, we think it more common that one hand doesn’t know what the other hand is doing. We often find publishers’ documents that contradict each other – e.g. an online policy statement versus a copyright transfer agreement. A lot of our work is in sorting out these discrepancies.

    Another thing: We process policy changes as quickly as possible, but with over 1,000 publishers to keep an eye on, it is very difficult to handle them in a timely fashion. A few publishers notify us of changes directly; sometimes we and our overseas partners spot them ourselves; but repository managers are often the first to alert us to changes. We really appreciate this, and long may it continue.

    One last point. JISC hosts the SHERPA Services Blog, but is not responsible for moderating its content. We moderate comments, but only to eliminate spam and similar abuse. The SHERPA Services Blog should not be construed as representing the official views of JISC per se.

    1. Library Loon Post author

      It is damaging because it misleads faculty and librarians into believing that self-archiving is easy and legal when it is too often neither. The Loon would vastly prefer more nuance in public statements about publisher permissions, and more groundedness in what a would-be self-archiver’s experience is likely to be. Too many faculty have come to the Loon having read such statistics as these hoping to self-archive their entire published corpus, only to be quickly discouraged by (to them) bewildering questions of article versions, embargoes, set-phrases, changes in publisher policy over time (something SHERPA rarely tracks), publisher buyouts, etc. etc. ad nauseam.

      Making self-archiving look easier than it is helps no librarian, no repository, and no researcher. It also allows publishers to pretend to more virtue than they possess; set-phrases and embargoes are barriers to self-archiving, hardly any less effective than legal barriers, and at least in part intended precisely to be barriers. (Perhaps SHERPA would consider a Hassle Index, quantifying how annoying and/or difficult it is to comply with a given publisher or journal’s requirements?)

      As for statistics quality, please address the questions regarding when blanket publisher statements were and were not understood to apply to a given journal. Controlling the journal numbers by number of articles published yearly in each journal would also be useful, where possible.

      Allow the Loon to suggest another (admittedly rather more time-consuming and certainly qualitative rather than quantitative) way to slice the information SHERPA has: pick disciplines, randomly select researchers in those disciplines, and gauge what percentage of their output they could legally self-archive, in what version, and with what formalities. Perhaps implementors of VIVO or BibApp (in the US) or the UK’s RAE could provide sample researchers and their corpora to work with.

      The Loon suspects that the results, especially the confusion and hassle involved in performing the archiving with all i’s dotted and t’s crossed, would be somewhat sobering. It should be.

  2. Peter Millington

    People interpret our statistics and our data differently. You have chosen to interpret them as dangerous and discouraging, while others see them as encouraging.

    I have observed extremes in the way people use RoMEO data. At one extreme, some people (usually application developers and some academics) take RoMEO “Green” to mean they can do anything with their article, which grossly oversimplifies matters, as there may still be non-restrictive conditions that should be complied with. Having to include a set statement is one such condition. It does not prevent immediate archiving, although it does add a tiny bit of work to the deposition process. Including an acknowledgement does not seem an unreasonable requirement to me.

    At the other extreme, there are people who refuse to trust RoMEO data (mostly librarians, it has to be said) and who insist on contacting the publishers themselves. This is disappointing, largely unnecessary, and often counterproductive. They may or may not get a response.

    The reality is that as far as I am aware, no one who uses RoMEO has yet been taken to court over illicit archiving. Some publishers even refer enquirers to RoMEO as an honest broker.

    We take your point about RoMEO not tracking changes. Actually, we do maintain a paper trail, and we are always happy to receive queries about publishers’ current and past policies. However, this is obviously not as convenient as having this historical information available online. Rest assured that we are working on it.

    It is well outside of our remit to provide statistics based on the number of articles, although they would be the ultimate ideal. Projects of the type you suggest have been done:

    We did some work for the Wellcome Trust, checking the permissions in RoMEO against a large bibliography of publications emanating from the research they funded. You can see these results in a poster we showed at OR’09:

    Similar statistics have also been generated by Symplectic Ltd, downloadable from the following page:

    I’d be interested to hear your take on these results.