Gavia Libraria

JSTOR, reader privacy, and slippery slopes

Sledding hills aren’t the only slippery slopes around just now. JSTOR just stepped onto a rather worrisome one.

For one thing, they’re courting a lawsuit from accessibility advocates. From what the Loon has heard, “Register & Read” articles will be available only as pictures of text, not actual text. Visually impaired independent scholars are out of luck.

Equally worrisome is the privacy precedent, though; the Loon wonders if reactions from librarians and faculty would have been quite so positive if the American Chemical Society or Elsevier’s Scopus had announced such a program. Folks tend to think more kindly of JSTOR.

Let’s have a glance at JSTOR’s privacy policy, shall we? (All quoted verbiage is verbatim from the just-linked page as of today, 13 January 2012. The Loon obviously can’t control JSTOR changing its policy, so read with caution, Readers from the Future.) Here’s what they collect:

To facilitate your order through the Publisher Sales Service (a service through the JSTOR archive that facilitates the purchase of articles from publishers), the Current Scholarship Program, and Individual Access, as well as your purchase of ITHAKA and its services’ brand merchandise, we collect your name, postal address, email address, credit card details, and transactional information. In connection with your creation of a MyJSTOR account, we collect your name, email address, username, and password, as well as other information you may provide. We also may collect certain non-personally identifiable information, such as the type of browser you are using (e.g., Netscape, Internet Explorer), the type of operating system you are using (e.g., Windows or Mac), and the domain name of your Internet service provider.

Let’s assume for a moment that JSTOR won’t make ugly use of names and emails, or alternately, that Register & Read users use pseudonyms and throwaway email accounts. What’s left is plenty enough, in these days of reidentification, to uniquely identify a lot of JSTOR users, perhaps nearly all of them, as it’s a fairly rarefied group. (Don’t be too impressed by those 150 million annual turndowns. The number of unique individuals that number represents is considerably smaller. The Loon would also bet panfish aplenty that most turndowns come from people who have licit access from their libraries but didn’t configure a browser proxy for off-campus access.) Browser fingerprint plus ISP plus time of access (easily available from either JSTOR or ISP server logs, even if JSTOR nobly deletes IP addresses, which the Loon doubts) identifies almost anybody who isn’t on coffeehouse wifi.
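To make the reidentification arithmetic concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the log records, the field names, and the `anonymity_sets` and `unique_fraction` helpers are invented, and nothing here reflects JSTOR's actual log format. It only shows the general mechanism: each extra "non-personally identifiable" field shrinks the crowd a given reader can hide in.

```python
from collections import Counter

# Hypothetical access-log records with names and emails already stripped.
# Field names and values are invented for illustration only.
LOG = [
    {"browser": "Firefox/Windows", "isp": "cable-a.example", "hour": "2012-01-13T09"},
    {"browser": "Firefox/Windows", "isp": "cable-a.example", "hour": "2012-01-13T14"},
    {"browser": "Firefox/Windows", "isp": "dsl-b.example", "hour": "2012-01-13T09"},
    {"browser": "Safari/Mac", "isp": "cable-a.example", "hour": "2012-01-13T09"},
    {"browser": "Safari/Mac", "isp": "dsl-b.example", "hour": "2012-01-13T14"},
    # Same browser, ISP, and hour as the first record: two readers who
    # still blend together, like patrons sharing coffeehouse wifi.
    {"browser": "Firefox/Windows", "isp": "cable-a.example", "hour": "2012-01-13T09"},
]

QUASI_IDENTIFIERS = ("browser", "isp", "hour")


def anonymity_sets(records, fields):
    """Count how many records share each combination of the given fields."""
    return Counter(tuple(r[f] for f in fields) for r in records)


def unique_fraction(records, fields):
    """Return the fraction of records whose field combination occurs exactly once."""
    sets = anonymity_sets(records, fields)
    singletons = sum(1 for r in records if sets[tuple(r[f] for f in fields)] == 1)
    return singletons / len(records)


# Add one quasi-identifier at a time and watch the singled-out share grow.
for n in range(1, len(QUASI_IDENTIFIERS) + 1):
    fields = QUASI_IDENTIFIERS[:n]
    print(fields, round(unique_fraction(LOG, fields), 2))
```

On this toy data, nobody is singled out by browser alone, half the records are unique once ISP is added, and two-thirds are unique once the hour of access joins in. Real logs carry far more detail per field (full browser fingerprints, timestamps to the second), so the real-world effect would be expected to be stronger, not weaker.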

How does this compare to the information a library collects? (Remember also that personal information collected by most public and academic libraries enjoys legal protection. Not so JSTOR.) The Loon has yet to hear of a library holding onto credit-card info (use? yes. hold? no), much less individual browser fingerprints (check aggregate browser usage? sure. individual? no way) or, for pity’s sake, ISP!

So what does JSTOR say they will and won’t do with this information?

We use the personal information collected in ways that are compatible with the purposes for which it was intended to be used: to enable your use of ITHAKA Websites and ITHAKA’s services; to facilitate the Publisher Sales Service; to create your MyJSTOR account; to facilitate your purchase of ITHAKA-branded merchandise; to respond to your inquiries; for system administration, customer support, and troubleshooting purposes; for targeted marketing and product service announcements; for sending newsletters; to improve the design of ITHAKA Websites; to enable us to enforce our Terms and Conditions of Use; and in aggregate form, to track and analyze site usage.

“For targeted marketing and product service announcements.” Oh, dear. That’s a few inches down the slippery slope already. No opt-in, JSTOR?

Even worse:

Further, in connection with postings you may make that are available to the general public on any ITHAKA-related page on social networking sites, including but not limited to statements made on the JSTOR page on Facebook and on Twitter, we may use your name, statement, comment, and affiliation in conference presentations, JSTOR’s newsletters, and marketing and announcement materials.

The Loon seems to recall that holy hell has been raised about such practices before, Facebook. But it’s all right for JSTOR? With no opt-in? (JSTOR’s Facebook page, all right, it can be argued that anyone posting there opted in. But all of Twitter?)

Quite a bit worse still:

ITHAKA does not sell or share personal information about or the purchasing history of individual users, except as set forth above and in the following circumstances:

  • if required to do so by law or if we believe in good faith that such action is necessary to comply with the law or a legal proceeding; to protect against violations of our Terms and Conditions of Use; or to protect and defend our rights and property or the rights and property of rights holders whose content is made available through ITHAKA Websites;
  • with service providers with whom we have entered into agreements to assist us with our business operations;
  • if you are accessing JSTOR through a token provided by your institution, on request from the institution solely for user verification purposes by the institution that provided the token; and
  • other third parties, such as an institution with which you are affiliated, where you explicitly consent to our sharing your information.

The last clause of the first bullet-point speaks to l’affaire Swartz, and once again, contrasts unfavorably with standard library practice. If a library catches you mass-downloading, or is reliably informed by the content provider that someone is doing so and finds out it’s you on checking (temporary, of course) logs, the library will stop you doing it and may impose other penalties such as loss of privileges. They may even remand you to an honor-code court or other such organization-internal higher authority, if your behavior broke your organization’s rules.

What the library won’t do, ever, is rat you out to the publisher. Guess what? JSTOR says they can and will. Should you worry about this? Well, do you want to end up in jail like Swartz? Sure, you’re not mass-downloading, independent scholar, not least because JSTOR won’t let you. Given that the publishers, not JSTOR, are calling the shots here, though—do you trust the same publishing industry that’s bankrolling and jawboning the Research Works Act not to define deviancy down until it catches you? Do you trust the same FBI that arrested Swartz despite JSTOR and MIT refusing to press charges not to come after you? The Loon doesn’t.

The second bullet point, though? That’s a killer. That’s every flavor of the awful the-user-is-the-product e-commerce culture that libraries have steadfastly resisted joining. JSTOR can sell you to Facebook, as @alexscat pointed out on Twitter. JSTOR can sell you to DoubleClick. JSTOR can sell you to Elsevier. JSTOR can sell you to anybody. Who wouldn’t want a highly educated demographic with money to burn on scholarly articles?

Now, it’s possible that “service providers [who] assist us with our business operations” is a stricter standard than the Loon is suggesting; “random marketers who will throw cash at us in exchange for our userlist” may not qualify. If that’s so, JSTOR’s lawyers would do well to make that abundantly clear right now.

The last scary bit, from a reidentification-savvy standpoint:

In addition, ITHAKA shares general usage data in aggregated form so that no personal information is identifiable to participating institutions, content providers, researchers, and the general public.

Oh, dear, JSTOR. This was a bad idea when AOL did it. It was a bad idea when Netflix did it. It was even a bad idea when Harvard researchers did something like it. It’s still a bad idea.

Libraries, of course, do not do this; libraries crunch numbers internally (for values of “internal” that sometimes include “consortium-internal”) and sometimes publish results (with graphs/charts rather than raw data) that point to potential library-service improvements, but that’s the extent of it. (HathiTrust, take note. You may also face this issue.)

Now, then. Perhaps all this doesn’t seem so bad to you. Perhaps it seems a fair trade. Perhaps you trust JSTOR. Fine.

JSTOR isn’t doing this to play nice. JSTOR is doing this because they see dollar signs; it’s a classic “freemium” play. Part of the reason it’s grabbing attention is that JSTOR is the first scholarly aggregator to make a freemium play for individuals (that the Loon knows of, anyway). The first—but, the Loon suspects, not the last, and therein lies another slippery slope. What happens when EBSCO tries this? Or, heaven help us all, the ACS? Do you think their privacy policies will echo JSTOR’s? Don’t you think there’s some chance they’ll be even worse? Perhaps quite a lot worse?

The Loon doesn’t like this. At all. She doesn’t believe librarians should like it, for reasons far beyond potential library disintermediation. Could it be done ethically? Likely so, but if JSTOR is the white-hat player in the aggregator race, and this is the best they can do, the Loon wouldn’t bet a single fishbone on ethical implementation industry-wide.

One last thing. Mellon Foundation? You wrought this, with your sententious insistence on “sustainability” at any cost, and you should be ashamed. Please ponder the ethics of certain forms of “sustainability,” then write some guidelines into your grants. The Loon will thank you; librarianship will thank you; the world will thank you.

2 thoughts on “JSTOR, reader privacy, and slippery slopes”

  1. Jason Baird Jackson

    Thank you for looking at this closely and sharing your insights. I hope that ITHAKA leadership addresses your questions carefully and, ideally, here on your website.

    1. LibraryLoon Post author

      The Loon would rather see them address it in that privacy policy, but thank you!