Gavia Libraria

Harvard hears a hope

By Library Loon 26 April 2024 Scholarly communication

As big-pig journal publishers pull an Amazon-warehouse stunt, attempting infeasible speedups of editorial processes to publish ever-spiraling numbers of articles to make more money off author-side fees, Harvard is launching a journal-publishing program. Let no one say Harvard Library has no sense of timing!

The Loon doubts the overlay-journal option will see much uptake. For all the “why don’t they just” ballyhoo among the less rational of Old Man of Academe open-access advocates, overlay journals remain a blip on the landscape, if that. They simply don’t fit academe’s admittedly ossified mental model of what a journal is and what journal processes entail. Still, the Loon approves of Harvard offering the option, if only to provide sufficient real-world evidence of the futility of the overlay journal concept to make its deluded fanatics shut up at last.

Harvard Library is not shy about targeting its efforts partly at editorial boards disaffected with the big pigs; it is happy “to convert existing journals to open access.” The Harvard brand name should be attractive enough to have a reasonable shot at actually accomplishing this. As always with a new initiative, the first two or three journals should be the most difficult… but if Harvard Library is as canny as the Loon thinks it is, those crucial pioneers are already in the bag, or close to.

(Harvard Library has a good track record for canny OA moves, and has historically been refreshingly free of the magical thinking that infests the OA movement. It hired Peter Suber. It was the first and is still nearly the only institution to implement patchwork mandates with a well-resourced plan to gather up the bounty into its institutional repository. It was the first the Loon recalls to announce flatly that no amount of money would satiate the big pigs’ bottomless greed. All of this in the face of the usual lack of support from faculty.)

That Harvard Library understands that money is necessary to run a journal, and is willing to help editorial boards seek that money, bodes rather well. (Overlay-journal fanatics and similarly deluded Old Men of Academe rarely-to-never admit this.) The Loon’s main concern, former typesetter and XML minion that she is, is about what sort of production apparatus Harvard Library is ginning up. As many, many dusty neglected Open Journal Systems implementations can attest, faculty know nothing about editing and production processes save just enough to notice and disapprove when they are missing.

She can certainly forgive that the announcement does not explain this—it is not as though the faculty at whom the announcement is aimed would understand or care. The Loon cares, though, not because she hankers to return to production (though she would willingly dust off her old skills, were the remuneration adequate) but because not planning for and resourcing professional-grade production processes could mean the entire initiative falls at the first fence. “Why is Harvard putting its name on doublespaced Times New Roman with one-inch margins?!” would drop the final curtain at once.

She suspects Harvard Library knows this and is behaving accordingly, but she would certainly prefer to be sure.

In any case, she wishes Harvard Library well with this. If well-managed, it could signal an intriguing shift in the Great Game.

Thank you

By Library Loon 29 September 2023 Metablogging

A book came out yesterday in which the Loon’s Boring Alter Ego had been outed.

The online open-access version has already been fixed.

The Loon is deeply, pathetically grateful to the book’s editorial and production team. She had frankly expected stonewalling and pushback. Instead, she received sympathy and impressively immediate action. (The Loon, remember, has done typesetting. She knows that her ask was not an insignificant one!)

Thank you, very much, from the bottom of the Loon’s wizened avian heart.

In which the Loon soliloquizes

By Library Loon 16 June 2023 Research data, Scholarly communication

(The rhythm of the Shakespeare speech cannot
survive intact this Loonly recasting;
if you would be so kind as to forgive,
the angry Loon will soon return to prose.
She mentions only that the noble Brembs
did not originally use the word
“clerical,” but “menial” to sum the work of such
as Loons, librarians, and preservation wonks.
Oh, nor is this the first or only time
that Brembs has wantonly accused the work
of publishing, scholcomm, and data pros
of being facile, vain, and valueless.
In short, the Loon has had it with this shit.)

Serfs, IT, librarians, lend me your ears;
I come to bury our work, not to praise it.
The research that men do lives after them;
The rest of it is merely clerical;
So we are assuredly told. The noble Bjorn B
Hath told you that this work is clerical:
If it were so, it was a grievous fault,
And grievously must we peons suffer for it.
Here, under leave of Bjorn B and the rest—
For Bjorn B is an academic man;
So are they all, all academic men—
Come I to share this widespread disrespect.
I spent much time to learn the work I do:
But Bjorn B hath said that it is clerical;
And Bjorn B is an academic man.
I have salvaged many datasets from death
And shepherded them to a safer home:
Does this complex work to you seem clerical?
When that the law hath changed, I have explained:
And “clerical” is condescending dross:
Yet Bjorn B says our work is clerical;
And Bjorn B is an academic man.
You all have seen the standards wilderness
Which is my job to tame and make of use,
That, too, must change, its data too: is this clerical?
Yet Bjorn B says that it is clerical;
And, sure, he is an academic man.
I speak not to disprove what Bjorn B spoke,
But here I am to speak what I do know.
The open access movement disdains all
And greatly thins its own thin ranks thereby.
O judgment! thou art fled from learnèd jerks!
And needed work, unvalued, can’t go on,
While nasty scornful oafs proclaim it clerical,
And I must pause till someone give it worth.

Cynical in the right ways: eLife’s peer-review choice

By Library Loon 26 October 2022 Scholarly communication

So, eLife is changing its peer-review process:

eLife’s peer-review process is changing. From January 2023, eLife will no longer make accept/reject decisions after peer review. Instead, every preprint sent for peer review will be published on the eLife website as a “Reviewed Preprint” that includes an eLife assessment, public reviews, and a response from the authors (if available).

The more the Loon thinks about this, the more she finds it cynical in the best ways, the right ways.

Let us by all means be clear: peer review as presently practiced is a disaster. It’s trivially gameable, horrifically biased, trivially corruptible for ax-grinding or academic feuds or any number of other less-than-admirable ends, and utterly ineffective at keeping garbage out of The Litrachoor. The people at eLife clearly know all this—admirable by itself; too many academics and even librarians don’t—and to change their process in the ways they have, they must have decided a lot of it is tied to the high-stakes go/no-go decision. So they surgically removed that power from reviewers. Cynical and clever.

Does this mean, then, that eLife will inevitably become a dumpster fire, overrun by trash that would never make it past a proper review? Oh, no, not at all. Public reviews should quite neatly do for that—the sort of academic cockroach who games peer review to publish plagiarized or p-hacked or image-diddled or corporately-gamed or otherwise beyond-unacceptable work expects to scuttle behind confidential review. Cockroaches should flee public review like the—well, roaches they are. Review cartels and fake review scams will have at least some trouble persisting.

Moreover, public reviews also stab several styles of ax-grinding to the heart: the feuding scholar who has been handed their enemy’s work to review; the racist, sexist abomination (the Loon will not call these excrescences “scholars”); the senior scholar who no-goes anything that questions or invalidates their prior work; and the scholar of any vintage who uses peer review as a power trip. They also, in point of fact, protect junior-scholar reviewers who make honest negative assessments of more senior scholars’ work—no more who-said-what-ing in private email, the arguments can be evaluated on their merits.

So cynical and so clever, eLife. Well done.

What’s more, eLife can now experiment with building (or installing already-built) detection mechanisms (automated and human) for modern academic sins that peer reviewers rarely catch: data problems, image-diddling, p-hacking, and the like. This removes a burden peer reviewers should honestly never have had to carry and don’t have time or tools for. It’s right that this style of detection should happen at the preprint-server/journal level. The Loon will watch eLife for signs of evaluation innovation with considerable interest.

This leaves peer reviewers to do what good ones do best: assess and help improve the work. As always, the Loon dips her beak in respect to the many reviewers who have improved the Boring Alter Ego’s publications.

Cur OCLC delenda est? or, the Loon, the witch, and the audacity of this Pritch

By Library Loon 22 June 2022 Technobabble

(Apologies for the post title, which is a groaner even for the Loon. The Loon simply could not resist it.)

The Loon, as she mentioned, has quite a few reasons to want to see OCLC staked and turned to dust, but let us get one out of the way as quickly as possible. Any organization, for-profit or non-, with the naked audacity to think hiring female furniture is a dandy livener for a professional conference reception needs to be buried and the earth over its grave salted. There is no excuse for that. It was and is sexist, dehumanizing, and wrong.

Enough of that repellent subject. (It’s not as though anything will be done about it at this late date. OCLC ignored social-media outcry at the time.) A little more technobabble-inflected history for you. Now, the Loon and her Boring Alter Ego are not and have never been catalogers (digital-collections metadata is more the BAE’s métier), so it is likely she will miss or mistake some details; she apologizes in advance for such solecisms. In the main, though, she thinks she can explain this pile of guano. Oh, and the Loon should perhaps mention, just for the most transparency possible, that her Boring Alter Ego was paid once to present at an OCLC event (at which, naturally, she bit the hand that was feeding her good and hard), and has been headhunted for positions at OCLC more than once (to the horrified amusement of some of the BAE’s work colleagues). The jobs would have meant immensely more money, to be sure, but the Loon’s conscience straitly forbids, as does her unshakable need to eschew workplaces that hire female furniture.

Shared cataloging did not begin with computers; thanks to early standardization of catalog and catalog-card size (and, it must be said, the early near-monopoly of one small-l loon named Melvil Dui), the Library of Congress began printing and selling catalog cards for US libraries sometime in the earlyish 1900s (the Loon isn’t sure exactly when; 1910 or 1920 maybe?). The efficiency of centralized record creation and card production should be obvious—every single library Library Handing or typing up a card for every single commodity-published book is a fairly ridiculous notion if there’s any other way to do it.

Libraries began the process of computerizing catalog records in the 1960s—likely even earlier, considering. (There exists a recently-revised book about corporate data management that contains the unaccountable sentence “An organization without metadata is like a library without a card catalog.” The Loon just… hasn’t the fight in her to tell its publisher what steaming guano the library side of that analogy is.) The data structure called MARC (for “machine readable cataloging”) designed by the peerless Henriette Avram shaped that transition, and is still (pace encoding changes, loosening record-length restrictions, a few other smallish tweaks, and the scourge that is “local [cataloging] practice”) in use in most US, UK, and Australian library systems today. Variants on MARC and MARC-based cataloging practice exist elsewhere in the world as well.

The Loon cannot bugle loudly enough that MARC was not designed for efficient information retrieval, neither searching nor browsing nor querying nor filtering nor faceting. It’s really quite bad at all that; ask any ILS developer, if you can abide the ensuing swearing. MARC was designed as a source format for mass-producing catalog cards. (The Loon does wonder sometimes what Avram knew about SGML, if anything. She might well have eschewed it for storage inefficiency. Will some enterprising Ph.D. candidate in library history kindly get off their tail feathers and write a biography of Avram as their dissertation? It’s long past due.)
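For the code-inclined, here is a minimal Python sketch of why a MARC-shaped record is awkward for retrieval. It assumes a toy in-memory layout (the `Field` and `Record` classes and the `find_by_title_word` helper are invented for illustration and are not any ILS's API); the only real-world details are the tag numbers, where 100 is a personal-name main entry, 245 the title statement, and 650 a topical subject. Because a record is just an ordered list of tagged fields, every lookup is a walk through the whole thing.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Field:
    tag: str                           # three-digit MARC tag, e.g. "245" (title statement)
    subfields: List[Tuple[str, str]]   # ordered (subfield code, value) pairs


@dataclass
class Record:
    fields: List[Field] = field(default_factory=list)

    def values(self, tag: str, code: str) -> List[str]:
        """Return every value stored under tag/code by walking the whole record."""
        return [
            value
            for f in self.fields
            if f.tag == tag
            for c, value in f.subfields
            if c == code
        ]


def find_by_title_word(records: List[Record], word: str) -> List[Record]:
    """Unindexed search: inspect 245 $a of every record, one record at a time."""
    needle = word.lower()
    return [
        r for r in records
        if any(needle in title.lower() for title in r.values("245", "a"))
    ]


# Invented sample record; the values are placeholders, not real catalog data.
sample = Record([
    Field("100", [("a", "Example, Author")]),
    Field("245", [("a", "An invented title about loons"), ("c", "Author Example")]),
    Field("650", [("a", "Loons")]),
])

print(sample.values("245", "a"))
print(len(find_by_title_word([sample], "LOONS")))
```

Anything faster than this scan (an index on titles, facets on subjects) has to be built outside the record format, which is roughly what every discovery layer ends up doing.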

Mass production of catalog cards, easier record correction and updating, and sharing the cataloging load were the whole point of MARC. Remember, at this time photocopiers were not a thing, or at least not a thing within reach of most libraries. Unfortunately, libraries were already accustomed to mostly-centralized cataloging and computers were not common enough or well-enough networked at the time to build a viable peer-to-peer system, so OCLC swept right in to take the Library of Congress’s place as central record provider, swiftly becoming a de facto monopoly across most of the English-language-cataloging world.

It is no mystery how monopolies behave; economists are rather tiresomely repetitive on the subject. We find ourselves today in a situation where purported “non-profit” OCLC pays Skippy the Audacious Pritch over a million and a half a year (per the ICOLC report, which the Loon strongly recommends that you read), and sues everyone in sight, from a hotel with the temerity to paint Dewey numbers on its wall (OCLC also controls DDC) to a potential competitor (SkyRiver; long story the Loon won’t tell except to point out the anti-competitive behavior) to Clarivate/ExLibris. Charming people, OCLC. Just amazingly gracious and collaboration-minded folks, not threatening or self-dealing or absurdly entitled witches at all.

In whatever limited fairness the Loon can muster about all this, she will say that by and large she admires the work of OCLC Research. OCLC didn’t build the unit from scratch, though; it bought Research Libraries Group sometime or other in the late oughts. Even so, a lot of good and worthwhile work has come out of that particular think tank, though the Loon will never understand why they employed the rather awful Jackie Dooley for a time. In fairness to the fairness, OCLC also has something of a track record of abandoning useful projects it can’t work out how to make money from, sometimes ones originating in OCLC Research; the way it dumped the PURL(.org) permalink scheme without a word beforehand to anyone relying on it was simply slipshod.

So OCLC’s main reason for existing is the tooling (“OCLC Connexion”), aggregation, correction, enhancement, sale (via subscription/membership model), and presentation (via WorldCat) of bibliographic and holdings records created by its member libraries. Yes, you read that right—unlike the Library of Congress back in the day, OCLC doesn’t itself catalog anything. Enjoy the parallels with scholarly communication! And yes, this also means that WorldCat’s name is a lie; it is not a truly global union catalog because many libraries, especially those where English is not a first or common national language, are not OCLC members. This is not to say that OCLC does not do real work; like even the laziest, most entitled of the big-pig publishers, it does. It is absolutely to say that OCLC, like the big pigs, holds libraries (collectively) to grossly elevated ransom hugely disproportionate to the actual work it does. The surplus seems to go to Skippy the Overpaid Pritch and lawsuits, mostly… and remember that the said surplus is extracted from libraries.

(There is a rant the Loon may yet rant on how often libraries blunder into this type of exploitation, partly due to inability to initiate, much less maintain, a proper commons. OCLC, big-pig publishers, the ILS, the institutional repository, many kinds of proprietary software, more… but not today.)

The key to this whole system, computationally, is an OCLC-specific identifier for bibliographic records called the “OCLC number.” (If any OCLC numbers turn up in MetaDoor, the Loon thinks Clarivate/ExLibris will be wholly unable to make any case in court that the records containing them didn’t come from OCLC, one way or another.) The OCLC number ties together record creation, record merging (i.e. of records from different catalogers/libraries/vendors describing the same “thing,” and please do not ask the Loon to define “thing” here because she will only weep copious linked identified FRBR-shaped res-and-nomen-flavored tears from her beady red eyes and no one wants that), record corrections and updates, and connections to other library systems and processes such as interlibrary loan. For probably-obvious reasons, this number, though it properly identifies only a bibliographic record, is often used as shorthand for whatever thing (see above about defining “thing”) the record describes.
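A minimal Python sketch of what "ties together" means computationally, under loud assumptions: the record numbers, library names, and titles below are invented placeholders (not real OCLC numbers), and `cluster_by_number` is a hypothetical helper, not a description of how OCLC's systems actually work. The point is only that one shared identifier lets contributions from different libraries collapse into a single cluster that also carries holdings.

```python
from collections import defaultdict

# Invented contributions: each names a record number, a contributing library,
# and that library's version of the title.
contributions = [
    {"number": "1001", "library": "Library A", "title": "An invented title"},
    {"number": "1001", "library": "Library B", "title": "An invented title."},
    {"number": "1002", "library": "Library A", "title": "Another invented title"},
]


def cluster_by_number(contribs):
    """Group contributed records under their shared record number."""
    clusters = defaultdict(lambda: {"titles": set(), "holdings": []})
    for c in contribs:
        cluster = clusters[c["number"]]
        cluster["titles"].add(c["title"])         # variant descriptions to reconcile
        cluster["holdings"].append(c["library"])  # who owns a copy
    return dict(clusters)


clusters = cluster_by_number(contributions)

# Which libraries hold the item described by record 1001?
# This is the question interlibrary loan needs answered.
print(clusters["1001"]["holdings"])  # → ['Library A', 'Library B']
```

Note that the two variant titles under 1001 still need a human or a heuristic to reconcile them; the identifier only tells you they are supposed to describe the same item.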

Rather like—and if the Loon had hands she would be jazz-handsing—a linked-data URI. Like URIs for RDF, the OCLC number is the lynchpin of OCLC’s bibliographic enterprise. Indeed, if not for OCLC playing dragon-on-the-hoard, the OCLC number might have near-seamlessly evolved into the linked-data identifier for the biblioverse, for the objects (not to say “things”) within its purview. As it is, any competing system, especially one with an eye to linked-data friendliness, will have to whomp up a whole new record identifier. The Library of Congress can’t easily step in here; its holdings and recordset aren’t nearly as extensive as OCLC’s. Wikidata, for all its curious and generally delightful boldness, is not a bibliographic-record database, and (from what little the Loon understands about Wikibase) might well not scale to one without the servers falling over dead.

With that for background, what is it that MetaDoor is supposed to be, and how is it supposed to compete? Huge caveat for the ensuing discussion: the Loon has no insider knowledge, and Clarivate/ExLibris is playing its cards close to its chest at present. The Loon is of necessity making some educated guesses here.

Like OCLC, MetaDoor is intended to be a database of bibliographic records contributed by library catalogers. Unlike OCLC, MetaDoor is (to start, anyway) not playing dragon-on-hoard; the records will be open-licensed such that any given record up to the entire database is takeable and forkable. A later enclosure play is quite possible—“records contributed until now are CC0; henceforth, we are pulling an OCLC, such that we own whatever you put in, and you buy it back from us.” There would be outcry, but Clarivate/ExLibris need simply bet that libraries are too foresightless, cheap, and fighty to work out how to fork the database and collectively maintain and add to it—and the Loon must say, that is a very smart bet on C/ExL’s part.

Conspicuously missing from what the Loon has seen about MetaDoor is any mention at all of the sort of record deduplication, correction, and enhancement processes that OCLC routinely performs on contributed records. Bluntly: MetaDoor will be a wild abyss of near-total chaos. The Loon doesn’t think Clarivate/ExLibris (which, after all, builds a major ILS) harbors any delusions that the quality of contributed records will be high, or even uniform. Instead, she suspects that libraries will be gently encouraged toward a sort of peer-to-peer copy-cataloging system, in which catalogers look for libraries that do good work and set up their systems to adopt those libraries’ records. If MetaDoor is thinking toward linked data, another way to approach this would be to start breaking down MARC records into granular datapoints that could be queried to suit, or built up into decent-enough MARC records. (If the FRBRoids had actually known anything about real-world relational database design, which they did not, this breaking-down and reconstitution could have begun two decades ago, but once again, here we are. Some days the Loon just despairs of librarians, or at least librarian standardistas.) The other consequence of this free-for-all is that MetaDoor will not easily, or perhaps at all, be able to build an analogue to WorldCat.
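The breaking-down and building-up idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: nothing public says MetaDoor works this way, the flat statement layout is invented, and `decompose`, `rebuild`, and `query` are hypothetical names. Each subfield becomes one self-describing statement that can be queried on its own, and the position numbers let a record be reassembled in its original order.

```python
def decompose(record_id, fields):
    """Flatten (tag, [(code, value), ...]) fields into one statement per subfield."""
    statements = []
    for f_pos, (tag, subfields) in enumerate(fields):
        for s_pos, (code, value) in enumerate(subfields):
            statements.append((record_id, f_pos, tag, s_pos, code, value))
    return statements


def rebuild(statements, record_id):
    """Reassemble one record's fields from its statements, preserving order."""
    fields = {}
    for rid, f_pos, tag, s_pos, code, value in sorted(statements):
        if rid == record_id:
            fields.setdefault(f_pos, (tag, []))[1].append((code, value))
    return [fields[pos] for pos in sorted(fields)]


def query(statements, tag, code):
    """Pull one kind of datapoint out of the pool, across all records."""
    return [value for _, _, t, _, c, value in statements if t == tag and c == code]


# Invented records; the values are placeholders, not real catalog data.
record_1 = [("245", [("a", "An invented title"), ("c", "Author Example")]),
            ("650", [("a", "Loons")])]
record_2 = [("245", [("a", "Another invented title")]),
            ("650", [("a", "Cataloging")])]

pool = decompose("r1", record_1) + decompose("r2", record_2)

print(query(pool, "650", "a"))
print(rebuild(pool, "r1") == record_1)
```

A cataloger who trusts one library's subject headings and another's title transcription could, in principle, rebuild a record from the statements of each; that is the peer-to-peer copy cataloging above, done at the datapoint level rather than the whole-record level.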

Other developers might, however, if they are willing and able to take on MetaDoor’s chaos and withstand the probably-inevitable lawsuit from OCLC. Other developers might do a lot of things, possibly quite useful and attractive things, with the data in the MetaDoor database. At least to start, Clarivate/ExLibris will be happy to let them! Any win for MetaDoor chips away at OCLC’s de facto monopoly. Beware the day OCLC folds, however; the logical business thing for Clarivate/ExLibris to do then is pull a Twitter, destroying useful APIs or charging through several available orifices for access to them.

In the Loon’s trawl through the abovementioned corporate data management book, she learned that there is a business term for the kind of chaotic mess MetaDoor is likely to be: “data lake.” Throw all the data in, forget about quality, just dump it in and see what falls out. Over time, if the data in the lake is at all useful, busy IT beavers will start cleaning up and organizing the data they’re interested in, prodding data creators into better data-quality practices, separating out coherent chunks of data into data marts, building data marts up into data warehouses, and so on. Is Clarivate/ExLibris cynically hoping that library developers will cheerfully fix up MetaDoor’s data lake at no cost to it? Seems likely. Also a decent bet, the Loon thinks—cheerful librarian fixers are a big part of how OCLC became what it is, and ExLibris constantly dumps a ton of uncompensated quality-control, usability-testing, accessibility, assessment, development-strategy, and other work on systems librarians as it is.

If both OCLC and MetaDoor sound like grossly exploitative and unfair systems, well, this is why the Loon wishes a pox on both their houses. Is there a path out?

If the Loon ruled libraryland, she would pull together a bunch of library CIOs (they’re not often called that, but they do exist) and knock their heads together until they agreed to cough up the collective funding and development/systems effort to mirror (well, mirror plus delta) MetaDoor data on a regular basis, and provide value-add services such as APIs at low (including sweat-equity) or no cost. She would then pull together a bunch of library cataloging luminaries and knock their heads together until they agreed to build record pipelines into the CIOs’ record commons alongside whatever other pipelines they have—transparently-documented pipelines that do not invoke the wrath of Skippy the Litigious Pritch and his horde of slavering lawyers.
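A minimal Python sketch of the mirroring step, reading "mirror plus delta" as one full copy followed by periodic incremental updates (that reading is an assumption, as are the record IDs, the dict-of-records layout, and the `compute_delta` and `apply_delta` names; no real MetaDoor API is implied). The one design choice worth noticing is that the commons keeps records that vanish upstream by default.

```python
def compute_delta(mirror, upstream):
    """Compare the local mirror with a fresh upstream snapshot, keyed by record ID."""
    added = {k: v for k, v in upstream.items() if k not in mirror}
    changed = {k: v for k, v in upstream.items() if k in mirror and mirror[k] != v}
    removed = [k for k in mirror if k not in upstream]
    return {"added": added, "changed": changed, "removed": removed}


def apply_delta(mirror, delta, keep_removed=True):
    """Return an updated copy of the mirror.

    keep_removed=True means records that disappear upstream stay in the
    commons, which is the behavior wanted if upstream ever tries enclosure.
    """
    updated = dict(mirror)
    updated.update(delta["added"])
    updated.update(delta["changed"])
    if not keep_removed:
        for key in delta["removed"]:
            updated.pop(key, None)
    return updated


# Invented snapshots; IDs and titles are placeholders.
mirror = {"r1": "An invented title", "r2": "Another invented title"}
upstream = {"r1": "An invented title, corrected", "r3": "A third invented title"}

delta = compute_delta(mirror, upstream)
commons = apply_delta(mirror, delta)

print(sorted(commons))  # → ['r1', 'r2', 'r3']
```

Here `r2` has disappeared upstream but survives in the commons, `r1` picks up the upstream correction, and `r3` arrives as new.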

That way, if MetaDoor tries enclosure, a replacement record commons will already exist (and not have to be built from scratch in a tearing hurry) and the cutover for libraries and their catalogers will be a lot less painful than it would otherwise be. As that commons becomes cleaner and more sophisticated, even if MetaDoor doesn’t try enclosure, it will become an increasingly viable, likely less-expensive alternative to both MetaDoor and OCLC. Virtuous circle!

But the Loon does not rule libraryland, so we shall all have to wait and see.
