Supporting Diversity Through a Typology of Data Providers: What is an Archive?


Recent work (Bird & Simons 2021, 2022) suggests that all data providers to the Open Language Archives Community (OLAC) aggregator are best referred to as “archives”. This inclusive terminology serves to draw in new potential data providers: every data provider is, or has, an archive, and no collector of language resources can be told that they do not have, or are not, an archive. The same inclusive view is taken in the analysis of language resource stewards by Yi et al. (2022), who review different kinds of institutions and their websites as if they were the same. However, a review of stewardship practices and the online presence of language resources shows that not all presentations of resources are intended to be equal. Through its terminology, the OLAC community leaves several issues open to assumption: the nature of institutional support for a set of language resources, the purpose of an online index of resources, and the stewardship mandate an institution may have regarding specific resources.

This raises a question: are all the data providers to the Open Language Archives Community aggregator actually archives? We present an analysis of the 63 data providers to OLAC and suggest a more nuanced typology which includes: Archive, Repository, Library, Special Collection, Personal Portfolio, Lab or Department Portfolio, Project Portfolio, Typological Database, and Bibliography. Additionally, we situate the terms Museum, Gallery, Networks, Centers, Publishers, Institutions with collection-specific management practice, Institutes, Historical Societies, Registries, and Services. The terms in our vocabulary have function-based definitions and fall into three broad categories: access institutions, exhibits, and reference resources. We believe that by acknowledging diversity in the types of data providers, the OLAC community can gain valuable insights into the kinds of providers that currently find value in participating in OLAC. Two other results of acknowledging the diverse nature of data providers include:

1. The ability to determine where outreach efforts may need further work for greater inclusion and participation with OLAC.
2. A clearer professional discourse around the provisioning of language resource access points (e.g., Ferreira et al. 2021).

In this presentation we use the terminology of diversity and inclusion to illustrate how inclusive terminology can gloss over important elements of diversity, and how diversity can make communities stronger and more resilient. By providing more options for identification, we allow a more just and equitable representation of language documentation resource stewards.

2 Mar, 2023 17:00
University of Hawai‘i at Mānoa


Bird & Simons (2022). The Open Language Archives Community: A 20-year update. The Electronic Library.
Bird & Simons (2021). Towards an Agenda for Open Language Archiving. University of North Texas.
Ferreira, Lukschy, Watyam, Ungsitipoonpor & Seyfeddinipur (2021). A Website Is a Website Is a Website: Why Trusted Repositories Are Needed More Than Ever. University of North Texas.
Yi, Lake, Kim, Haakman, Jewell, Babinski & Bowern (2022). Accessibility, Discoverability, and Functionality: An Audit of and Recommendations for Digital Language Archives. Journal of Open Humanities Data, 8(10). 1–19.
Hugh Paterson III
Collaborative Scholar

I specialize in bespoke research at the intersection of Linguistics, Law, Languages, and Technology; specifically utility and life-cycle management for information products in these spaces.