Name Authority Records and Personally Identifying Information


20 Sep, 2022 9 min read Jot

The IFLA-LRM model and FRAD (Functional Requirements for Authority Data) before it call for, from the perspective of library catalogers, an expanded collection of personal data about the contributors to works. This was brought to my attention via the work of Billey (2019) and Serra, Schneider & Segundo (2020). Billey encourages a thoughtful response of caution and abstinence from the onslaught of the RDA practice, while Serra and colleagues discuss the technical details of how personal identifiers are entered into MARC records.

Institutional Morals

The work of Amber Billey is especially thought-provoking at several levels because of the various kinds of assumptions that it makes and how it situates its arguments in the socio-political context. For example, at one level it argues from the ethics statement of the American Library Association. The author makes reference to statements number 1 and number 7:

We provide the highest level of service to all library users through appropriate and usefully organized resources; equitable service policies; equitable access; and accurate, unbiased, and courteous responses to all requests.

We distinguish between our personal convictions and professional duties and do not allow our personal beliefs to interfere with fair representation of the aims of our institutions or the provision of access to their information resources.

I am particularly interested in statement seven because, in this case, institutions become the moral conviction holders rather than individuals. Institutions within the American legal context are designed to absorb legal liability, and as such they are able agents of action: they shield individuals from legal liability and socio-political blow-back, and they negotiate interpersonal relationships at the moral level. The sociology of institutions changes, and to the degree that institutions have an open, member-driven articulation of their moral convictions, the representation of institutional morals appears to represent the collective of institutional staff. However, homogeneity should never be assumed. The fact remains that catalogers hired by organizations are workers-for-hire and are not given the liberty to express (all) their opinions on (all) topics. In capitalistic societies such as the United States, it is legally required that these institutions (both for-profit and non-profit) make decisions which sustain the organization. This may at times come at the cost of individuals within the organization, but talk to HR: talent is replaceable. So as I read this article, the framing of the argument is that organizations such as the Library of Congress and IFLA members have determined that, for the benefit of their future sustainability, recording personally identifying information within Name Authority Records is important. To limit their liability and blow-back they conveniently allow the American Library Association (ALA) to steward Resource Description and Access (RDA), and then the ALA tells information professionals that they should not interfere with these policies or practices.

Content in the Name Authority Record

The attributes under consideration and provided for by the Resource Description and Access guidelines include the following:

Name of the Person
Dates Associated with the Person
Title of the Person
Fuller Form of Name
Other Designation Associated with the Person
Gender
Place of Birth
Place of Death
Country Associated with the Person
Place of Residence, Etc.
Address of Person
Affiliation
Language of Person
Field of Activity of the Person

As a privacy policy and privacy law researcher, I see some definite overlaps with personally identifying information as defined in a variety of legal frameworks (GDPR, CCPA, etc.).

While I am interested in wider applications of RDA and Name Authority Files, my thoughts immediately pull me in the direction of so-called “language archives” or institutions which steward language resources. There are ample examples where personally identifying information is kept out of the record within the domains of anthropological linguistics, language documentation, along with records produced for resource preservation and stewardship supporting these areas of praxis (including language development—language revitalization). There is some degree of overlap in the reasons for not identifying these attributes of named entities in stewarded resources with those articulated by Billey, but there are others as well.

From a personal perspective, I don’t see how the following are necessary for library purposes:

Gender
Place of birth
Place of death
Country associated with the person
Place of residence
Address of person
Language of person
Field of activity

For example, the language of a person may change over time, and who is to define (limit) a person’s field of activity by the use of discrete categories? My initial reaction is that adding these things to the library record in a definitive way is over-reach and invasive. This is not the same as adding these details to a collection description within the context of an archival collection or a corpus in which the contributors are described. The contrast is this: in a Name Authority Record, the fields are permanent and describe the entity (person) across the entire range of their existence, whereas in an archival collection or a corpus, they describe the person at the point (or range) of the collection’s creation. Name Authority Records are unbounded with regard to the number of resources they are applied to, whereas a collection description or a corpus description scopes the number of resources to which the description applies.

Within the context of language science research many have voiced the need to be able to search across collection descriptions and corpus descriptions which include things like the language used in a resource, or people of a place, or an age range of a broader population (for examples of dynamics of corpora sorting see some of the topics discussed in my 2012 post Metadata Dynamics for Linguistic and Sociolinguistic Corpora). That is, re-users of language resources do want these types of constructs available to them to search by, even if coverage is inconsistent or low. However, Billey does not discuss attaching these attributes to a resource rather than to a person in this article.
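The contrast between person-level and collection-level description can be sketched as a small data model. This is only an illustration under my own assumptions: the class names, field names, identifiers, and sample values are all hypothetical and do not come from RDA, MARC, or any archive's schema. The person-level record carries only a name and an identifier, while language and place are recorded per collection and searched at that level.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PersonAuthority:
    """Person-level record: applies to every resource linked to the person."""
    authority_id: str  # hypothetical local identifier
    name: str


@dataclass
class ContributorDescription:
    """Collection-level description: valid only within one collection."""
    person: PersonAuthority
    languages: tuple   # languages used in this collection's resources
    place: str         # place associated with this collection's resources


@dataclass
class Collection:
    title: str
    years: tuple       # (start, end) of the collection's creation
    contributors: list = field(default_factory=list)


def collections_with_language(collections, language_code):
    """Return titles of collections in which any contributor used the language."""
    return [
        c.title
        for c in collections
        if any(language_code in d.languages for d in c.contributors)
    ]


# Hypothetical sample data: the same person described in two collections.
speaker = PersonAuthority("local-0001", "Speaker A")
early = Collection(
    "Early recordings", (1998, 2001),
    [ContributorDescription(speaker, ("hau",), "Place X")],
)
later = Collection(
    "Later recordings", (2015, 2018),
    [ContributorDescription(speaker, ("hau", "eng"), "Place Y")],
)

print(collections_with_language([early, later], "eng"))
```

In this sketch the search by language still works, yet nothing about the speaker's language or place is asserted permanently on the person-level record; each statement is bounded by the collection it describes.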

Records, User Interfaces, and Linked Data

While libraries continue to build the traditional authority file, there are other external identity management sources of linked data that could be used instead, thereby fulfilling the promise of linked data. Both Billey and Serra et al. discuss this, but from different perspectives. Billey discusses the distribution and social proliferation of authority data as linked data, while Serra and colleagues discuss the technical implementation in, and impact on, MARC records.

Billey writes in favor of including less information in Name Authority Records, supporting her argument with the following statement (page 11):

With linked data, this information travels far beyond the servers at OCLC and the Library of Congress. Regardless of data in the authority records, we still do not have a catalog or discovery layer that facilitates a search or browse experience that utilizes the RDA elements being recorded about persons. … Only a fraction of authority records contain the new elements, so the query results would not be accurate or reliable enough to be helpful to users.

Rather than arguing against the inclusion of new RDA-supported MARC fields, it seems that this argues for an abstraction layer in resource discovery which uses only part of the record in deciding relevance for IFLA-recognized end-user discovery and exploring tasks. In this context privacy is not really an issue because other data providers in the linked data “cloud” can provide additional facets by which information can be sorted. I find the argument for not providing data in a record due to the lack of User Interface support for the data field to be circular, as User Interface development often looks at which data is present to determine User Interface development goals. Additionally, both Billey and Serra et al. seem to assume that URIs related to a person (such as ORCID, VIAF, or ISNI identifiers) are stable. They seem to easily forget that even PURLs have had interruptions to resolving and DOIs suffer from link rot (Habibzadeh, 2013).
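The abstraction-layer idea can be sketched in a few lines. This is a minimal illustration, not a description of any existing catalog or discovery system: the field names are hypothetical labels loosely modeled on the RDA attribute list earlier in this post, and the choice of which fields count as discovery fields is my own assumption.

```python
# Hypothetical field names, loosely following the RDA attribute list above.
# Which fields a discovery layer exposes is a policy decision; this set is
# only an assumption made for the sake of the example.
DISCOVERY_FIELDS = frozenset({"name", "dates", "fuller_form_of_name"})


def discovery_view(record, allowed=DISCOVERY_FIELDS):
    """Return a copy of the record containing only the fields in `allowed`."""
    return {key: value for key, value in record.items() if key in allowed}


# Fictional authority record used only for illustration.
record = {
    "name": "Example, Author",
    "dates": "1950-",
    "gender": "not recorded here",
    "place_of_birth": "Example City",
    "field_of_activity": "Linguistics",
}

public = discovery_view(record)
print(sorted(public))  # ['dates', 'name']
```

The full record is left untouched; only the view handed to the discovery interface is narrowed, which is the sense in which the record and the user-facing layer can be decoupled.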

Privacy Considerations

I find Billey’s privacy discussion interesting but uncompelling. It is not that I think the author’s assertions are invalid, but rather that they are not well supported. The author asks catalogers and policy makers to think about the following by quoting questions from Thompson (2016):

Is there potential for this information to harm the [person] through outing or violating the right to privacy?

Yet the author has not identified any legal rights framework under which this question may be answered. Within the United States privacy is not a universal right acknowledged and granted by a government, but rather a transactionally regulated issue in consumer law. Only some organizations and some types of transactions at certain volumes qualify. Within the European Union the right to privacy is inalienable, but there is no private right of action to “sue” violators, hence the consequences and liability of violations are limited by the capacity of the prosecutors who would have legal authority to bring action. For libraries or organizations to assume that individuals have these rights is errant. Organizations may see it as beneficial to their ongoing relationships to treat their constituencies with respect and to codify that respect in policy, but these are not rights. However, with regard to the RDA-based fields, an interesting question arises as to whether the death of a person affects what ought to be included in the Name Authority Record.

The author goes on to quote Thompson, asking:

Is there an indication that the [person] consents to having this information shared publicly?

The problem here is the reliance on positive evidence. Requiring positive evidence is not necessarily in the best interest of society at large. For example, if Google had waited to scan books rather than seeking out fair-use options in new social contexts, we might not have the rich set of resources we know as HathiTrust. Waiting for positive evidence is the best way to argue for inaction. Inaction may not be the best way to provide privacy in a linked data world, as holes in the network of data can be just as revealing as actual data, or even more so.

Bibliography

Billey (2019): Billey, A. (2019). Just Because We Can, Doesn’t Mean We Should: An Argument for Simplicity and Data Privacy With Name Authority Work in the Linked Data Environment. Journal of Library Metadata, 19(1-2). 1–17. https://doi.org/10.1080/19386389.2019.1589684
Habibzadeh (2013): Habibzadeh, P. (2013). Decay of References to Web sites in Articles Published in General Medical Journals: Mainstream vs Small Journals. Applied Clinical Informatics, 4(4). 455–464. https://doi.org/10.4338/ACI-2013-07-RA-0055
Serra, Schneider & Segundo (2020): Serra, L., Schneider, J. & Segundo, J. (2020). Person Identifiers in MARC 21 Records in a Semantic Environment. Cataloging & Classification Quarterly, 58(5). 505–519. https://doi.org/10.1080/01639374.2020.1771499
Thompson (2016): Thompson, K. (2016). More Than a Name: A Content Analysis of Name Authority Records for Authors Who Self-Identify as Trans. Library Resources & Technical Services, 60(3). 140–155. https://doi.org/10.5860/lrts.60n3.140

Tags: Metadata Models IFLA-LRM Library Science GDPR Privacy Names Language Documentation Personally Identifying Information in Language Documentation
Categories: Jot

Hugh Paterson III

Collaborative Scholar

I specialize in bespoke research at the intersection of Linguistics, Law, Languages, and Technology; specifically utility and life-cycle management for information products in these spaces.
