Patron Data, Patron Peril

Dorothea Salo
November 10, 2020

Given for the University of Iowa Libraries.

Transcript

  1. Patron Data, Patron Peril: Keeping ourselves and our patrons safe

    Dorothea Salo, Information School, University of Wisconsin-Madison
  2. Apologies, first

    • This talk is whomped up out of an amalgam of
      • course slidedecks
      • earlier talks
      • a forthcoming article
      • ongoing research (datadoubles.org, and for clarity, I do not represent this project or its other investigators today)
    • It won’t hold together as well as I like my talks to do. It certainly doesn’t have pretty slides!
    • I’m sorry. I ask for and appreciate your patience.
    • Silver lining: I don’t mind tangents! They can’t interrupt a flow that doesn’t exist! So ask all the questions you like whenever you like.
  3. Pivot, second

    • The request for this talk came from a learner in my Information Security and Privacy course. I was originally asked to catalog privacy dangers and demonstrate threat models.
    • I don’t want to do that right now, though. I’m raw and tired, and I know I’m not the only one.
    • Recommended, if you want this: Morrone et al.’s https://dataprivacyproject.org/learning-modules/risk-assessment/
    • So, instead, here’s my plan:
      • Foundations: why privacy in libraries?
      • Situation report: what are today’s threats to library privacy specifically? (spoiler: there are lots!)
      • Blameless post-mortem: how did we let this happen?
      • Testing a heuristic: “physical-equivalent privacy.” How can we think differently so that this stops happening?
  4. Physical-equivalent privacy?

    • Yes. Article with this title forthcoming in a privacy-themed issue of Serials Review. I don’t know exactly when.
    • I can’t make it open-access until publication. Honestly, I’m chewing my fingernails about that. But as soon as it goes live, I’ll put my accepted manuscript in MINDS@UW.
    • I also have no room to criticize the publication schedule, because I turned in my manuscript a month late! (Love you, SR editors!)
    • But if you want a preview (beyond this talk)…
    • … go look at the slides from my NASIG 2015 keynote, especially the slide about video surveillance, because that’s where the idea began.
      • http://www.bl-litho.com/dsalo/aint-nobodys-business-if-i-do-read-serials-with-notes
  5. 1. Foundations

  6. Ethics codes

    • IFLA: “… respect for personal privacy, protection of personal data, and confidentiality in the relationship between the user and library…”
      • https://www.ifla.org/publications/node/10056 and it’s excellent, the best and most situationally-aware document libraries have
    • ALA: “We protect each library user's right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.”
    • ACRL: “The privacy of library users is and must be inviolable.”
  7. The thing about ethics codes is…

    • … they’re largely deontological. Here Are Your Principles, Go Forth And Observe Them.
    • Fine as far as it goes… but doesn’t explain why why WHY these are the principles!
    • Much less how to operationalize them. (Which, fair: operationalization changes constantly, but ethics codes shouldn’t.)
    • Or what to do when principles collide. Which principle wins?
      • I mention this because in my estimation, privacy has been taking a back seat to several other principles lately. I don’t approve.
    • Allows empty lip service.
  8. So why privacy, then?

    • Really excellent read, highly recommended:
      • Steve Witt. “The Evolution of Privacy within the American Library Association, 1906–2002.” Library Trends 65:4 (2017).
      • My next five slides derive entirely from this piece.
    • Turns out to be pragmatic consequentialism: without privacy, patrons got in trouble… and so did libraries.
    • 1906: Immigrant Henry Melnek, suspected of anarchism, arrested. Chief Librarian helped with the arrest, even testified against Melnek in court, disclosing his library information habits!
    • Russian czarist agents were also involved (weird echoes today, right?). And a newspaper called libraries “schools of anarchism” for having anarchist materials available. Criticism of libraries went on for years!
  9. ALA president Arthur Bostwick, 1911:

    • “In [the library’s] registration files it has a valuable selected list of names and addresses which may be of service in various ways either as a MAILING-LIST or as a DIRECTORY.
    • “Probably there are no two opinions regarding the impropriety of allowing the list to be used for COMMERCIAL PURPOSES along either line.
      • (Me, today: … really? I wish there weren’t!)
    • “The use as a directory may occasionally be legitimate and is allowable after investigation and report to someone in authority.
      • (Me, today: really? when? what investigation? which authorities?)
  10. Arthur Bostwick, 1911:

    • “I have known of recourse to library registration lists
      • by the police, to find a fugitive from justice;
      • by private detectives, ostensibly on the same errand;
      • by a wife, looking for her runaway husband;
      • by persons searching for lost relatives;
      • and by creditors on the trail of debtors in hiding.
    • (Take a moment. How many of these scenarios matter today? Which do you trust? Not trust?)
    • (Definitely notice Bostwick’s “ostensibly.” Today I’d extend this to the other points too! People and organizations LIE OFTEN and CHANGE THEIR STORIES about why they want data and how they use it!)
  11. Arthur Bostwick, 1911:

    • “One thing is certain: except in obedience to an order of court, it is not only unjust, but ENTIRELY INEXPEDIENT from the library’s standpoint to betray to anyone a user’s whereabouts against that user's wishes or even where there is a mere possibility of his objection.
      • (Me, today: just whereabouts? much more is knowable!)
    • “If it were clearly understood that such consequences might follow the holding of a library card, we should doubtless LOSE MANY READERS that we especially desire to attract and hold.”
      • (Me, today: Is this still true? I believe it is, but I don’t have an all-encompassing answer. That’s part of why I signed on to Data Doubles.)
  12. 1939: Code of Ethics

    • Why? Because it was the Great Depression, and librarian labor was suffering.
    • Response: demonstrate that not just anybody could be a librarian!
      • This gets deeper into questions of how professions work than I want to get, fascinating though I find labor history.
      • But ethics codes were definitely a step toward professioning up.
    • I mention this because protecting one another as workers is also deeply salient today! Can we use privacy as something that sets us apart?
      • To do so, we’d have to be actively protecting it, of course! It won’t help us to trumpet promises we aren’t keeping.
  13. Privacy: not a slam dunk!

    • As you can imagine, drafting the Code was not a one-and-done thing. Editing by committee!
    • Privacy and confidentiality all but disappeared from some drafts.
    • There was debate within the profession over privacy! Many librarians believed turning in anarchists was the right thing to do, for example.
    • Relevant to today? Yes, absolutely.
      • Privacy versus security
      • Privacy versus “customer relationship management”
      • Privacy versus assessment and analytics
      • Privacy versus improved (?) service
    • I’m hardcore about this: PRIVACY SHOULD WIN, hands down and without question. But not every librarian today is me!
  14. We value privacy BY CHOICE. Some of us wish to

    choose otherwise. Some of us have already chosen otherwise, with words and actions.
  15. 2. Situation report: where are we?

  16. Libraries are largely not protecting patron privacy at present.

  17. That’s a very big statement. Let’s see some reasons I

    say that. (this will be very incomplete; see also the work of e.g. DLF Privacy and Ethics in Technology group, Digital Shred, Alison Macrina/Library Freedom Project, Yasmeen Shorish, Sarah Lamdan, Scott Young, Heather Shipman, Melissa Morrone, Kyle M. L. Jones and collaborators, and so many, many more)
  18. Failures of infrastructure

  19. “PACKET SNIFFER” COPIES-AND-SAVES NETWORK TRAFFIC WORKS ON LOCAL NETWORKS, WIFI

    ALL TEXT, IMAGES FROM INSECURE (HTTP, NOT HTTPS) WEBSITES
  20. Imagine turning Wireshark loose in a library whose website and

    OPAC are insecure.
  21. Okaaaaaay…

    • What can we do about this?
      • Serve all library websites and services over HTTPS, not HTTP.
      • Prefer wired to wifi access on in-library patron and staff machines. Secure wifi as best we can.
    • What have we done about this?
      • Breeding 2018: 7.9% of academic libraries and 18.3% of public libraries serve HTTP websites, not HTTPS. WE ARE BEHIND.
      • Wifi protection in libraries: no systematic investigation I know of, so we don’t know much, but I’m not sanguine.
      • (It doesn’t help that wifi protocols leak privacy like sieves presently. This will change, but not as quickly as I’d like.)
  22. (we would be here all day if I started in

    on Google’s milliard privacy failures) Our ol’ pal Google
  23. From a library website…

  24. None
  25. (but Facebook is worse!) Our ol’ pal Facebook

  26. Okaaaaaay…

    • What could we do about this?
      • Default our in-library browsers away from Google toward DuckDuckGo or Qwant or searX.
      • Stop using other Google services, especially YouTube and Google Analytics (use Matomo or another privacy-aware alternative instead).
      • Dump Facebook. (At least stop advertising it!)
      • Educate and advocate.
    • What are we doing about this? Nothing.
  27. None
  28. Okaaaaaaay…

    • What could we do about this?
      • Install tracker-blockers in browsers on in-library machines.
      • Refuse facial recognition and other biometrics outright.
      • Academic libraries: refuse ID-card tracking outright.
      • Refuse the Internet of Things outright. It’s not secure! It’s not private!
      • Educate and advocate.
    • What are we doing about this? Nothing.
      • (with the exception of a few advocates and educators: too few!)
  29. Hall of Shame

  30. Failures of data minimization* *DATA MINIMIZATION: collecting and storing only

    data absolutely required for unquestionably necessary operations** ** I do not believe assessment is unquestionably necessary. I am, however, unusual in that.
  31. None
  32. None
  33. Okaaaaaay…

    • What could we do about this?
      • Don’t collect data! Don’t store data! Don’t keep data! Delete data!
      • Privacy policies with teeth, fully enforced. I dig San Francisco Public Library’s: https://sfpl.org/about/privacy-policy
      • ALA privacy audits. This is what they’re designed for!
      • Riding herd on ILS vendors, content vendors, etc.
    • What are we doing about this? Not a lot!
      • I have a friend who is a programmer for an ILS. Horror stories about libraries asking (asking!) to store e.g. driver’s-license image scans.
      • When was the last time you deleted your proxy-server logs?
      • The UW-Madison Libraries do not have a comprehensive privacy policy. The only unit that does is the Digital Collections Center.
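For the proxy-log question above, here is a minimal sketch of what routine deletion could look like. Everything specific in it is an assumption for illustration: the log format (one entry per line, starting with a timezone-free ISO 8601 timestamp), the function name `purge_old_entries`, and the seven-day window. A real proxy server will have its own log format and its own rotation settings, which should be preferred where they exist.

```python
from datetime import datetime, timedelta


def purge_old_entries(lines, now, max_age_days=7):
    """Return only the log lines recent enough to keep.

    Assumes (hypothetically) that each line begins with a naive ISO 8601
    timestamp followed by a space, e.g. "2020-11-09T10:00:00 ...".
    """
    cutoff = now - timedelta(days=max_age_days)
    kept = []
    for line in lines:
        stamp, _, _rest = line.partition(" ")
        try:
            recent = datetime.fromisoformat(stamp) >= cutoff
        except (ValueError, TypeError):
            # Unparseable or mismatched timestamps are dropped, not kept:
            # when in doubt, delete.
            continue
        if recent:
            kept.append(line)
    return kept


if __name__ == "__main__":
    sample = [
        "2020-11-09T10:00:00 patron42 GET /login",
        "2020-09-01T08:00:00 patron7 GET /login",
        "not a log line",
    ]
    for line in purge_old_entries(sample, now=datetime(2020, 11, 10)):
        print(line)
```

The design choice worth noticing is the default: anything the function cannot date is discarded rather than retained.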
  34. Failures of confidentiality

  35. None
  36. None
  37. One LA project, identified (!) data on all undergraduates:

    • # library (e)book checkouts
    • # and date(s) of library-computer logins
    • # library databases accessed
    • # academic journals accessed
    • Appointments with peer tutors
    • Chat reference transactions
    • Interlibrary loan transactions
    • # of classes attended with library instruction
  38. Okaaaaaaay…

    • What could we do about this?
      • Be very, very clear about what “confidential” means. I see too many librarians extending it past all sense: “patron data are still confidential because I decided they could have it!” for many values of “they.”
      • (Several privacy interpretations of library ethics codes fall into this trap. I’d like to see that fixed. Simple heuristic, for starters: if the data’s seeing use outside the library, IT AIN’T CONFIDENTIAL!)
      • Train our people better. All our people. It’s not enough for me to yell at my students (though I do!). Not all library employees have ALA-accredited degrees, and “not having the degree” is no excuse for this.
      • Stop letting unethical patron-data use in research, both internal and for publication, slide by.
      • Refuse to add patron data to campus or municipal data warehouses.
    • What are we doing about this? Not half enough.
  39. Failures of enforcing discipline on vendors

  40. —Eric Hellman spoiler: nah.

  41. https://www.snsi.info/ —Sarah Lamdan

  42. Okaaaaaaaay…

    • What could we do about this?
      • Guidelines. License terms. Model licenses, model license language.
      • Stop letting NISO write these! Stop letting NISO say it speaks for libraries! NISO is not a library organization; it is also underwritten by vendors. This is an inherent, structural conflict of interest.
      • Audit vendors. They have to do accessibility VPATs; why don’t we have a privacy analogue to VPATs?
      • Educate and advocate.
    • What are we doing? Nothing.
  43. Failures of enforcing discipline on ourselves

  44. Patrons schooling their library on privacy? What has happened to

    my profession.
  45. If you admit that privacy is an obstacle to what

    you’re doing, consider… not doing it! Oakleaf, Megan. 2018. “Library integration in institutional learning analytics.” https://library.educause.edu/-/media/files/library/2018/11/liila.pdf
  46. Council of UW Libraries strategic plan, 2019 (I think)

  47. Me and Minnesota…

    • (my paraphrase, obviously, and I am obviously biased)
    • Me: *gives keynote at MnLA Annual 2019*
    • Me: *brings evidence of poor privacy practices in specific libraries/consortia in Minnesota*
    • Keynote: *goes over like lead balloon* (they can’t all be winners)
    • WiLS: “Hey, Dorothea, favor? Would you give this talk as a webinar for us?” Me: “Sure.”
    • A Minnesota librarian: “Hey, WiLS, Dorothea brought evidence! It was awful!”
    • WiLS: “Hey, Dorothea… no evidence from specific libraries/consortia in your webinar, plzkthx.”
    • Me: “I withdraw the webinar.”
    • Me: *posts slides to SpeakerDeck anyway, because why not*
  48. Bluntly: This ain’t it, librarians. We can’t fix what we

    won’t even discuss. We can’t do right if pointing out wrong is worse than doing wrong.
  49. Libraries are largely not protecting patron privacy at present.

  50. 3. (as) Blameless (as I can make it) post-mortem

  51. This space is hard to parse.

    • It took us literal, actual DECADES to figure out privacy around physical libraries and materials.
      • We’re not even done figuring it out yet! Though we have a (curiously implicit, often) shared understanding of best practices.
    • No surprise we haven’t figured it out for online yet. It’s a lot to get our heads around!
      • That said, I could wish we’d put a lot more effort toward it, as a profession… but that’s water under the bridge.
    • I have an idea about how to make it more tractable. Hold that thought; I’ll get to it.
  52. We’re not being told what we need to know.

    • “Dark [design] patterns” underlie a lot of privacy dangers, online and off-, in and outside libraries.
      • Intentionally misleading/deceptive/untransparent design choices
    • Secrecy and outright lies from Big Tech
    • Secrecy and outright lies from Big Data pushers
    • Secrecy and outright lies from Big Content
      • among whom I count many library content and service vendors
    • Secrecy and outright lies from government agencies
    • It’s a complicated environment! Transparency would sure help!
  53. We don’t have enough experts.

    • We do have some! Becky Yoose, in addition to folks I’ve previously mentioned.
      • LDH Consulting Services: https://ldhconsultingservices.com/
    • I’m trying. So are Alison Macrina, Digital Shred, Melissa Morrone, ALA OIF/Erin Berman, DLF…
    • But the intersection of privacy, technology, and libraries is hideously complicated. “Expert” is a legitimately hard place to reach!
      • I’m not sure I’m there, and I both research and teach this stuff!
      • I do know I can’t get somebody there in the fourteen weeks of a three-credit no-tech-prereqs course. Don’t come at me with “it’s all LIS education’s fault!” You will not like my answer.
  54. Our environments are “Big Data? READY, FIRE, AIM!”

    • I feel this especially hard as an educator right now. The situation with pandemic exam proctoring is just appalling.
      • All praise to Z Smith Reynolds Library at Wake Forest University!
    • Real thing I heard from a real librarian once about patron-data analytics: “Finally I can speak to my administrators in language they understand!”
    • The environments libraries exist in do not usually share or even understand library ethics!
    • The people and services libraries rely on (IT, vendors, standards bodies) do not usually share or even understand library ethics!
  55. We’re scared.

    • The Library Value Agenda, the CRM movement… they come from a place of (real, justified) fear.
      • We are afraid of being disintermediated, erased and made invisible… and let’s be blunt: fired.
      • We’re grasping at anything and everything to prevent that… and surveillance / data analysis is hot right now.
    • This is one place clash of deontological principles turns up.
      • Accountability is also a principle we believe in! What happens when that appears to mean compromising on privacy?
  56. We want to do right by patrons…

    • … and that can be a trap.
    • Deontological principle clash, again!
      • (with an apologetic nod to Scott Young, who points out that “service” is not actually an ethical principle, but a practice)
    • If we posit that surveilling patron behavior and analyzing patron data are the best/only ways to learn how to serve them… how do we decide not to do that?
    • Now, that’s a really big “if” there. I don’t actually believe it for an instant! The evidence base for service interventions based on surveillance and Big Data is absolutely ABYSMAL.
    • But that still leaves “if it DOES work, does that mean we should?”
  57. We’re being used.

    • RA21 / Seamless Access / SSO
      • very, very “about us without us” (RA21: zero librarians until the comment stage. Seamless Access: tokenized librarians)
      • very, very dangerous (to more than privacy!)
      • some very, very untrustworthy people and organizations involved
    • the Sci-Hub wars
      • I do not like what I see out of this SNSI thing.
    • CRM: OrangeBoy, OCLC WISE, Gale Analytics…
    • Open access → patron data exploitation
      • Sam Popowich has a devastating piece on this. Recommended.
      • https://journals.library.ualberta.ca/jcie/index.php/JCIE/article/view/29410
  58. 4. SO NOW WHAT? “Physical-equivalent privacy,” maybe?

  59. Here’s my idea.

    • Online privacy dangers tend to be out-of-sight, out-of-mind… unlike (most) physical privacy dangers.
    • Libraries have fairly solid best practices around the privacy of using information in physical carriers.
      • I’m not claiming perfection! I’m claiming thought and procedure.
    • So… maybe it makes sense to figure out what the physical analogue to online patron-data capture/storage/use looks like?
    • To make it easier to evaluate whether we’re okay with it?
  60. Or, formalized:

    • [T]he PRIVACY of an e-resource may be considered PHYSICAL-EQUIVALENT only when a patron using an information-equivalent physical resource would enjoy no more privacy than the same patron using the e-resource.
    • (The distinction is really online/offline, not physical/digital. I know this, okay? I wanted the alliteration. Nitpickers step off, please.)
  61. Warning: PEP is messy.

    • I don’t pretend it’s ironclad, waterproof, or free of weird edge cases. It’s not!
    • That’s okay, though. I’m not trying for that!
      • In my Twitter bio: “Ethicists are scalpels. I am a buster sword.”
    • I’m trying for a quick-and-dirty thought process (based on long-standing, time-tested practices) that librarians can use as a handy yardstick.
    • Term of art for this, from psychology and neuroscience: “HEURISTIC.”
  62. Okaaaaaaay, so how…?

    • Step 1: Figure out what patron data is captured/stored/analyzed/used/shared/sold around a given online information use.
      • This is definitely the hard part, not least because of all the secrecy and lies around it.
      • I suggest methods in my forthcoming article, but for today’s exercises I’ll just be giving you this up-front!
    • Step 2: What would have to happen for this amount of data to be captured (etc.) about a patron using an analogous physical object?
    • Step 3: Is that scenario okay? If not, the analogous online scenario probably isn’t either.
  63. Three examples! (if we have time)

    • Insecure (non-HTTPS) OPAC
    • Adobe 2014
    • University of Minnesota learning analytics
      • Which I called all the way out in the aforementioned keynote.
      • Was I right? Was I wrong? You make the call.
      • (I’ve been wrong before. I think I’m also Data Doubles’s biggest privacy hawk; even my co-investigators don’t always agree with me!)
  64. Insecure OPAC

    • Makes available to anyone packet-sniffing (e.g. with Wireshark) on the same local network:
      • Full content of all OPAC pages browsed, including search-results pages and individual-item pages
      • All URLs browsed (this is actually true of securely-served OPACs too! it makes me rethink OPAC item permalinks…)
      • All search terms entered into search forms (or in URL query strings, which frankly no library web tool should be using in 2020)
      • All items requested via holds, delivery, or save-this-for-later features
    • Easily traceable to the device being used (including devices belonging to and used by only one patron, like a phone).
    • Okay. Capture this amount of info about a patron browsing the card catalog and library shelves. Go!
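To make the plaintext-HTTP case concrete, here is a minimal sketch of how little work it takes to pull patron-revealing details out of one captured request. It uses only Python’s standard library and does no sniffing itself. The request bytes, the hostname `opac.example.org`, the `q` and `type` parameters, and the function name `what_a_sniffer_sees` are all made up for illustration; a real OPAC’s URLs will differ.

```python
from urllib.parse import parse_qs, urlsplit


def what_a_sniffer_sees(raw_request: bytes) -> dict:
    """Extract a few patron-revealing fields from one plaintext HTTP request."""
    text = raw_request.decode("iso-8859-1")
    head, _, _body = text.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    _method, target, _version = request_line.split(" ", 2)

    headers = {}
    for header in header_lines:
        name, _, value = header.partition(":")
        headers[name.strip().lower()] = value.strip()

    parts = urlsplit(target)
    return {
        "host": headers.get("host"),             # which library site
        "path": parts.path,                      # which page or item
        "search_terms": parse_qs(parts.query),   # what the patron typed
        "cookie": headers.get("cookie"),         # ties requests to one session
    }


if __name__ == "__main__":
    # Hypothetical capture of an unencrypted catalog search.
    captured = (
        b"GET /search?q=bankruptcy+law&type=keyword HTTP/1.1\r\n"
        b"Host: opac.example.org\r\n"
        b"Cookie: session=abc123\r\n"
        b"\r\n"
    )
    print(what_a_sniffer_sees(captured))
```

When the same request travels over HTTPS, the request line, headers, and cookie are encrypted in transit, so this parsing has nothing readable to work on.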
  65. That seem okay to you? If not, HTTPS your website

    and OPAC.
  66. Adobe 2014

    • “Adobe Digital Editions”: common ebook-reading software, including for library ebooks.
    • In 2014, caught sending the following user information across the Internet, sniff-vulnerable:
      • user and device identifiers
      • each ebook accessed
      • length of time spent reading the ebook
      • percentage of ebook read
      • exact pages viewed
    • Capture this information about a patron reading a physical book. Leak the info equally broadly.
      • Wherever the patron does the reading! In-library or out of it!
  67. What did Adobe do?

    • Encrypted communication between Adobe Digital Editions and Adobe servers.
      • No more sniffing!
    • That’s it.
      • As far as we know, they’re still collecting the data.
      • We still don’t know what they did or are doing with it.
    • Did I mention that Adobe is a major data broker?
      • And an Adobe partner/subsidiary (Mobilewalla) published a report geolocating and tracking George Floyd protesters?
  68. That seem okay to you? It terrifies me.

  69. University of Minnesota

    • Remember that list of undergraduate library-use data points I had up earlier? It was from…
    • UMinnesota’s library learning analytics project.
      • I based the list on their public publications! No inside intel!
    • They did not notify students. There was no opt-out, much less actual informed consent.
    • The library-use data was combined with identified demographic, GPA, transcript, and other university data.
    • And in C&RL, some of the published statistics are for very low-n populations, raising the chances of individual reidentification. (I’m pretty sure I could do it, and I’m not experienced at reidentification.)
      • C&RL was told of this and chose to do nothing. NOT OKAY, C&RL.
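On the low-n problem: one common mitigation is small-cell suppression, where any published count below a floor is withheld. The sketch below assumes a threshold of 10 and made-up group labels; neither comes from the Minnesota publications, and the function name `suppress_small_cells` is hypothetical. Suppression alone is not sufficient, since a withheld cell can sometimes be recovered by subtracting the published cells from a published total.

```python
def suppress_small_cells(counts, threshold=10):
    """Return a copy of counts with every value below threshold replaced by None.

    None marks a cell that should not be published. The default threshold
    is an illustrative assumption, not a standard.
    """
    return {
        group: (n if n >= threshold else None)
        for group, n in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical counts of students per published subgroup.
    table = {"group A": 412, "group B": 57, "group C": 3}
    for group, n in suppress_small_cells(table).items():
        print(group, "suppressed" if n is None else n)
```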
  70. Collect this (identified!!!) data on physical library users.

    • # library book checkouts
    • # and date(s) of library-computer logins
    • # library databases accessed
    • # academic journals accessed
    • Appointments with peer tutors
    • Reference transactions
    • Interlibrary loan transactions
    • # of classes attended with library instruction
  71. That seem okay to you? It still seems wrong to

    me.
  72. Last thought

  73. ALL OF US

  74. I just did. Pretty loudly, I thought. Join me.

  75. This slidedeck copyright 2020 by Dorothea Salo. It is available

    under a Creative Commons Attribution 4.0 International license. Reach me at salo@wisc.edu.