OAI-Sets for OLAC

In a passing conversation, Steven Bird and I were discussing the options for displaying collection metadata in OLAC. For a long time the assumption has been that OLAC records are flat because of Dublin Core constraints. This is not completely true, given the DCMIType value Collection and the relations hasPart and isPartOf. That is, through these mechanisms a hierarchy can be inferred.
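To make the idea of an inferred hierarchy concrete, here is a minimal sketch in Python. It assumes records have already been harvested and reduced to plain dictionaries. The identifiers, the dictionary keys, and the `build_hierarchy` helper are all hypothetical and only serve the illustration; they are not part of OLAC or of any library.

```python
from collections import defaultdict

# Hypothetical, simplified records. A real OLAC record carries far more
# metadata; only the fields needed to infer the hierarchy are kept here.
records = [
    {"id": "oai:example.org:coll-1", "type": "Collection", "isPartOf": None},
    {"id": "oai:example.org:item-1", "type": "Sound", "isPartOf": "oai:example.org:coll-1"},
    {"id": "oai:example.org:item-2", "type": "Text", "isPartOf": "oai:example.org:coll-1"},
]


def build_hierarchy(records):
    """Map each parent identifier to the list of its child identifiers."""
    children = defaultdict(list)
    for record in records:
        parent = record.get("isPartOf")
        if parent is not None:
            children[parent].append(record["id"])
    return dict(children)


hierarchy = build_hierarchy(records)
for parent, parts in hierarchy.items():
    print(parent, "has parts", parts)
```

The same mapping could be built from hasPart statements on the collection record instead; which direction a data provider actually populates is something a consumer would have to check.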

While brainstorming other options, Steven mentioned OAI sets (and also the historical documentation in OAI version 1.0), which are an interesting part of the OAI protocol. Basically, sets are a way to establish classes that the OAI verbs will respect. How these classes are set up is completely dependent on the data source manager. So, unless an OLAC recommendation document established a set of classes which it expected OLAC data providers to implement, I don’t see how this actually resolves or lowers the complexity of setting up an OAI service for OLAC to harvest from. However, classes can be useful if an OAI provider is serving several different clients but only wants to provide a single OAI endpoint.
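As a rough illustration of what a single endpoint serving several clients looks like from the harvester side, the sketch below builds OAI-PMH request URLs with and without the optional `set` argument, which ListRecords and ListIdentifiers accept. The base URL, the set name `journals`, and the `list_records_url` helper are assumptions made for the example, not values taken from any real repository.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the base URL of a real data provider.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build a ListRecords request URL, optionally restricted to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Selective harvesting: only records in this set are returned.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# A client that wants everything, as the OLAC harvester generally does.
everything = list_records_url(BASE_URL, "oai_dc")

# A client that only wants one class of records ("journals" is assumed).
journals_only = list_records_url(BASE_URL, "oai_dc", set_spec="journals")

print(everything)
print(journals_only)
```

Both requests go to the same endpoint; the only difference is the extra argument, which is why sets help a provider with mixed clients but add little for a harvester that takes everything.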

OpenEdition, the publisher in France, has an example of documentation where the OAI provider defines the sets they make available. They chose books, journals, blogs, and events as their sets. In some ways this could be navigated with DCMITypes and complex queries, but maybe some of their clients’ OAI harvesters are only interested in certain sets (for example, DOAJ is only interested in the journals). In the OLAC case, the harvester generally wants everything, so these specific classes don’t make much sense. I don’t readily see how OAI classes fit the proposed use case of establishing hierarchical structures for the description of collections. But perhaps another useful application can be thought of. The one case I can think of is an OLAC data provider that indexes some resources it doesn’t actually hold a copy of alongside items it does hold. It might be of interest to divide these into two sets, but OLAC would harvest them both anyway.
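For anyone who wants to see which sets a provider advertises, the ListSets verb returns them as XML. The sketch below parses such a response with Python’s standard library. The sample document is hand-written for this example and trimmed to the elements used; the set names in it are assumptions loosely modelled on the categories mentioned above, not a copy of any provider’s real response.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical, trimmed ListSets response (not real provider output).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>books</setSpec><setName>Books</setName></set>
    <set><setSpec>journals</setSpec><setName>Journals</setName></set>
  </ListSets>
</OAI-PMH>"""


def parse_sets(xml_text):
    """Return a mapping of setSpec to setName from a ListSets response."""
    root = ET.fromstring(xml_text)
    return {
        s.findtext("oai:setSpec", namespaces=OAI_NS): s.findtext("oai:setName", namespaces=OAI_NS)
        for s in root.findall(".//oai:set", OAI_NS)
    }


available_sets = parse_sets(SAMPLE)
print(available_sets)
```

A provider splitting held items from indexed-only items would simply show up here as two more entries, which a harvester like OLAC’s could ignore and still collect everything.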

Hugh Paterson III
Collaborative Scholar

My research interests include typological patterns in articulatory phonetics; User Experience design in language tools; and graph theory applied to language and linguistic resource discovery.