Gavia Libraria

Ramblings on FRBR Group 1

The Loon is currently swimming around and around RDF, RDA, FRBR, and the whole library-linked-data thing, trying to bootstrap a mental model that makes sense to her mad avian brain.

She has been intrigued by some of the objections to FRBR Group 1 entities. (Quick refresher: Group 1 entities are work, expression, manifestation, and item, often abbreviated as WEMI. FRBR Group 2 entities are persons and corporate bodies, that is, people and groups of people, and Group 3 entities are subject abstractions that don’t belong in Group 1 or Group 2: concepts, objects, events, and places.)

Karen Coyle’s objection, if the Loon is reading it right, is that several Group 1 entities are very far from being standalone. It’s rather rare to have an Item, for example, that makes sense on its own, sans information from the Manifestation(s), Expression(s), and Work(s) it is related to. The Loon, who’s rather shaky on RDF-style data modeling to begin with, isn’t sure how necessary it is that modeled abstractions be standalone, so she must admit she has no opinion on how telling this objection is.

Ross Singer adds several objections. One is that this atomization makes pulling together an intelligible user display quite a bit more difficult. Possibly true, but (as Singer himself says) not a deal-breaker. (The Loon herself wonders how caching will work in an RDF-based catalog. There will have to be quite a lot of it, that seems clear, else libraries will essentially be turning their catalogs into metasearch engines—sending queries to umpteen SPARQL endpoints in order to pull a single record display together—and metasearch didn’t go well the first time it was tried. This means that catalogs will have to implement cache-updating measures, and… sorry, the Loon is swimming off-track.)
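The caching worry can be made a little more concrete. What follows is only a minimal sketch, assuming a simple time-to-live policy (one of several possible cache-updating measures, and not necessarily the one a real catalog would choose). The `TTLCache` class, the `fake_endpoint` function standing in for a real SPARQL query, and the example.org URI are all invented for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Cache remote lookups and refetch them once they are older than ttl_seconds."""

    def __init__(self, fetch: Callable[[str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical remote lookup, e.g. a SPARQL query
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the example can fake the passage of time
        self._store: Dict[str, Tuple[float, dict]] = {}

    def get(self, uri: str) -> dict:
        now = self._clock()
        hit = self._store.get(uri)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # fresh enough: no network round trip
        value = self._fetch(uri)  # missing or stale: ask the remote source
        self._store[uri] = (now, value)
        return value


# Stand-in for a remote endpoint; it records every call it receives.
calls: List[str] = []


def fake_endpoint(uri: str) -> dict:
    calls.append(uri)
    return {"uri": uri, "label": "El burlador de Sevilla"}


fake_now = [0.0]
cache = TTLCache(fake_endpoint, ttl_seconds=60, clock=lambda: fake_now[0])

work_uri = "http://example.org/work/burlador"  # invented URI
cache.get(work_uri)   # first request goes to the endpoint
cache.get(work_uri)   # second request is served from the cache
fake_now[0] = 61.0
cache.get(work_uri)   # entry has expired, so the endpoint is asked again
print(len(calls))     # 2
```

Three record displays cost two remote queries here rather than three. The price is that a correction published elsewhere can go unseen for up to a minute, which is exactly the cache-updating problem in miniature.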

Another is that FRBR Group 1 maps poorly onto other bibliographic linked-data models that throw together WEMI information willy-nilly. As best the Loon can tell, the RDA Vocabularies work that Diane Hillmann, Karen Coyle, and others are doing helps considerably with this issue; it’s entirely possible (indeed, encouraged) to use an RDA property without reference to FRBR.

A third is that FRBR/RDA doesn’t seem to manage well with what RDF would call “blank nodes” in the WEMI hierarchy. Imagine, for example, that J. Random Cataloger is cataloging Mozart’s Don Giovanni and José Zorrilla’s Don Juan Tenorio. Imagine then that Golden Age Spanish literature is not J. Random’s specialty, such that he does not know that the original is Tirso de Molina’s play El burlador de Sevilla. How is he to associate the Mozart and Zorrilla works—since the association fairly leaps from the page, but neither work directly adapts the other—without considerable literary sleuthing? RDA seems to include no way to say “this is an Expression of an unknown Work” or, more usefully, “these two Works appear to be Expressions of the same third, unknown, Work.”
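For readers who have not met blank nodes, here is a rough sketch of what such a statement could look like if it were allowed. It uses plain Python tuples as triples rather than any real RDF library, and the `ex:` URIs and the `ex:derivedFrom` predicate are invented for illustration; they are not RDA or FRBR properties. Only the `_:` prefix is borrowed from how RDF serializations conventionally write blank nodes.

```python
from itertools import count
from typing import Set, Tuple

Triple = Tuple[str, str, str]
_ids = count(1)


def blank_node() -> str:
    """Return a fresh placeholder identifier, written _:bN."""
    return f"_:b{next(_ids)}"


# Hypothetical identifiers, not drawn from any real vocabulary.
DON_GIOVANNI = "ex:work/don-giovanni"
DON_JUAN_TENORIO = "ex:work/don-juan-tenorio"
DERIVED_FROM = "ex:derivedFrom"

# "These two Works derive from the same third Work, which I cannot name."
unknown_source = blank_node()
graph: Set[Triple] = {
    (DON_GIOVANNI, DERIVED_FROM, unknown_source),
    (DON_JUAN_TENORIO, DERIVED_FROM, unknown_source),
}


def share_a_source(triples: Set[Triple], a: str, b: str) -> bool:
    """True if a and b are both recorded as deriving from a common node."""
    sources_a = {o for s, p, o in triples if s == a and p == DERIVED_FROM}
    sources_b = {o for s, p, o in triples if s == b and p == DERIVED_FROM}
    return bool(sources_a & sources_b)


def identify(triples: Set[Triple], placeholder: str, uri: str) -> Set[Triple]:
    """Replace a placeholder with a real URI once somebody knows what it is."""
    return {
        (
            uri if s == placeholder else s,
            p,
            uri if o == placeholder else o,
        )
        for s, p, o in triples
    }


print(share_a_source(graph, DON_GIOVANNI, DON_JUAN_TENORIO))  # True
resolved = identify(graph, unknown_source, "ex:work/el-burlador-de-sevilla")
```

The point of the placeholder is that J. Random can record the association he does see without asserting a direct adaptation he cannot vouch for.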

Now that the Loon considers this difficulty from a comparative-literature perspective, it seems both bursting with opportunity and fraught with peril. The opportunity comes with the open-world assumption: J. Random won’t know the filiation of the many adaptations of El burlador (whyever should he?), but the Loon (mostly) does, and once the Loon publishes that portion of the graph in RDA terms, every library catalog everywhere can take advantage of it. Relationship modeling, in other words, need only be done once (pace the occasional need for corrections), not redundantly at every single library with catalogers.

The peril, of course, is that without blank nodes, J. Random and his cataloger colleagues are liable to put a lot of incorrect filiations out there, at least at first. Returning to the Loon’s example, why wouldn’t J. Random just look at the dates and assume that the Zorrilla is an adaptation of the Mozart? And who is to correct this, since it is entirely wrong? And does FRBR/RDA truly require catalogers to be comparative-literature scholars?

Goodness, what a lengthy sidetrack. (Though one not without interest!) Back to business.

The Loon was introduced to FRBR as a downy loon-chick taking cataloging. Even then, she thought that some of WEMI felt a bit squishy. She still thinks that. She thinks, though, that there may be a frame in which WEMI makes sense: cataloging workflows. Where is J. Random doing URI lookups, where is he adding information, and where is he relying on the existing linked-data cloud? The Loon thinks this plays out relatively neatly along WEMI lines.

El burlador de Sevilla will have a Work URI one of these fine days. With a new edition thereof in hand, J. Random will not transcribe title or author, nor need he start an authority search; he will use title and perhaps author as search terms to locate the Work’s URI, and once he does so, all the Work information (including a URI for Tirso’s authority record) magically fills in. (Imagine all the saved typing! The Loon thinks this is marvelous.)

“Expression” appears to be shorthand for a web of relationships, as Singer hints. Chances are that J. Random Cataloger won’t have to bother with it; that kind of filiation is done by subject experts, and is pressed into service via clever developers following URI link trails. (The Loon has trouble imagining patron-facing UI for this, but that’s due to the Loon’s poor visualization skills. Someone will sort out how to make patrons happy with this information, the Loon is quite sure.)

Manifestation information will be J. Random Cataloger’s next problem. Lookups, again, will often solve it (add Publisher, Date, and/or Editor, and a Manifestation URI should result), but not always, in which case J. Random Cataloger will have to transcribe some information and (the Loon believes) transmit it to a central clearinghouse for URI minting. (Perhaps this will never happen! Perhaps the publisher will mint a URI, linking to information on publisher, date, editor, contributed frontmatter, etc., that libraryland will adopt for Manifestation-level cataloging. The Loon doesn’t know how all the business relationships will shake out, much less the linked-data linkages.)
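The look-up-first, mint-only-if-absent step might be sketched as follows. Everything here is an assumption made for illustration: the `Clearinghouse` class, its in-memory storage, the choice of (Work URI, publisher, year) as the matching key, the example.org URIs, and the publisher name. No real clearinghouse or publisher API is being described.

```python
from typing import Dict, Optional, Tuple

# Matching key: (Work URI, publisher, year). A real service would surely need more.
ManifestationKey = Tuple[str, str, int]


class Clearinghouse:
    """Hypothetical central registry of Manifestation URIs."""

    def __init__(self, base: str = "http://example.org/manifestation/") -> None:
        self._base = base
        self._by_key: Dict[ManifestationKey, str] = {}

    def lookup(self, key: ManifestationKey) -> Optional[str]:
        return self._by_key.get(key)

    def mint(self, key: ManifestationKey) -> str:
        # Number URIs sequentially; this toy registry never deletes entries.
        uri = f"{self._base}{len(self._by_key) + 1}"
        self._by_key[key] = uri
        return uri


def find_or_mint(registry: Clearinghouse, key: ManifestationKey) -> Tuple[str, bool]:
    """Return (uri, was_minted): reuse an existing URI when there is one."""
    existing = registry.lookup(key)
    if existing is not None:
        return existing, False
    return registry.mint(key), True


registry = Clearinghouse()
key = ("http://example.org/work/burlador", "Example Press", 2011)

first = find_or_mint(registry, key)    # nothing on file yet, so a URI is minted
second = find_or_mint(registry, key)   # the next cataloger simply finds it
print(first)   # ('http://example.org/manifestation/1', True)
print(second)  # ('http://example.org/manifestation/1', False)
```

Only the first cataloger to handle an edition does any transcription in this arrangement; everyone afterward gets a lookup hit.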

Item-level information is as close to holdings as makes no nevermind, as best the Loon can tell. This will be where the library itself mints one or more URIs, which it doubtless ships off to OCLC to be included in WorldCat. (Imagine WorldCat being able to display holdings and even query circulation status on-the-fly for all the items in all the libraries! It’s quite feasible in a linked-data world.)

The Loon isn’t a cataloger, by the way, so it’s quite possible her sense of all this is thoroughly wrongheaded. The thought occurs, for example, that she may have started her workflow on the wrong level of WEMI: if J. Random Cataloger inputs the M information, it may be dead easy for the cataloging software to pull in both W and E information from the cloud, without any additional effort whatever from J. Random. Certainly no more than one or two datapoints from the W level should need to be added!
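Starting from the Manifestation and letting software climb the rest of the way could look something like the sketch below. The `CLOUD` dictionary is a toy stand-in for the linked-data cloud, and the `ex:` URIs and predicate names are invented for illustration (they merely echo the FRBR idea that a Manifestation embodies an Expression, which realizes a Work).

```python
from typing import Dict

# Toy stand-in for "the cloud": each URI maps predicate -> value.
CLOUD: Dict[str, Dict[str, str]] = {
    "ex:manifestation/1": {
        "ex:embodies": "ex:expression/1",
        "ex:publisher": "Example Press",
    },
    "ex:expression/1": {
        "ex:realizes": "ex:work/burlador",
        "ex:language": "es",
    },
    "ex:work/burlador": {
        "ex:title": "El burlador de Sevilla",
        "ex:creator": "ex:person/tirso-de-molina",
    },
}


def gather_description(manifestation_uri: str,
                       cloud: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Follow Manifestation -> Expression -> Work, merging the data found at each level."""
    description: Dict[str, str] = {}
    uri = manifestation_uri
    # The link to follow upward from each level; None marks the top (the Work).
    for link in ("ex:embodies", "ex:realizes", None):
        node = cloud.get(uri, {})
        for predicate, value in node.items():
            if predicate != link:
                description[predicate] = value
        if link is None or link not in node:
            break  # reached the Work, or hit a missing level
        uri = node[link]
    return description


record = gather_description("ex:manifestation/1", CLOUD)
print(record["ex:title"])    # El burlador de Sevilla
print(record["ex:creator"])  # ex:person/tirso-de-molina
```

Note that the walk simply stops if a level is missing, leaving J. Random with a partial record. That is the same gap that the blank-node objection worries about.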

One area where the Loon is deeply uncertain surrounds “local cataloging practice.” She believes that quite a bit of “local practice” is camouflaged usability engineering of rigid and poorly-thought-through OPAC record displays. With any luck at all, that work will move out of cataloging practice (where it truly, truly does not belong; catalogers should not have to be bothered with this!) and into UX design and software engineering. The rest of “local practice?” Well, the Loon doesn’t really understand it, so she doesn’t know how much of it will no longer be needed, nor how much of what’s left will create WEMI issues.

Do please explain to the Loon all the bits of this she’s wrong about in the comments. Even if that’s all the bits, as may well be. She’s got to build up that mental model somehow.

2 thoughts on “Ramblings on FRBR Group 1”

  1. LibraryLoon Post author

    Re filiation: On consideration, the Loon isn’t sure that WEMI captures all the possibilities. It’s stretching matters somewhat to say that everything touching on Don Juan in every genre in every language everywhere traces back to El burlador de Sevilla; at some point, Don Juan took on a life of his own, wholly divorced from Tirso’s morality play. And what about myths whose origins are lost? Or folklore, where origin is rarely clear to begin with, and whose scholars do not (insofar as the Loon understands folkloric practices) model direct influence anyway?

    Eh. Perhaps these concerns aren’t strictly bibliographic, and the Loon is overintellectualizing. Nonetheless.

  2. Ross Singer

    You raise even more perils in adhering strictly to the WEMI model! I hadn’t even thought about the lack of expertise in relating endeavors to one another (and how many surrogate resources might need to be created in order to describe that two directly unrelated endeavors derive from a third ur-Work). Ugh.

    My concern around the blank nodes could be circumvented by having properties that shortcut us around the missing pieces. A way to link Manifestations to their Works without contriving an Expression, etc.

    Of course, I’m not sure J. Random even knows or cares what a Manifestation is and I am pretty certain he doesn’t know (or care!) what properties are associated with it vs. the Expression or Work (sacre bleu! You associated the author to the Manifestation! No! Language? Are you mad? Technically speaking, a Manifestation shouldn’t even have a title — unless it differs from the Work for some reason). I am fairly confident that J. Random’s citation manager (or whatever) will not be interested in these distinctions, instead modeling “Book”. How do we link to that?

    Mind you, far from being a cataloger, I’m not even a librarian, so my wrongheadedness may be even more prominent (it rears itself far more often than I would like, unfortunately), but I do think what we’re talking about here are legitimate concerns and I don’t think they’re really being addressed.