Some questions on the theory presented in the Dublin Core


12 Oct, 2022 7 min read Research Note, Jot


It seems that the various communities that use Dublin Core have understood its possibilities within their own traditions of record making. For example, when dealing with Qualified Dublin Core, librarians suggest not using both dc:description and dcterms:abstract in the same record. That is, dcterms:abstract ought to be reserved for semantically valid abstracts as understood in print media. However, what is to say that dcterms:abstract cannot be applied to an item qualified with any value of the DCMIType vocabulary? That is, what is the abstract of a sound, software, event, or physical object? This may seem absurd in some cases, but is a sound clip or a movie trailer not the same kind of thing as an abstract?
Dublin Core has several design principles intended to guide its use. Two of them are the 1-to-1 Principle and the Repeatability Principle. The question I wish to pose is: how can these two principles interact in predictable ways? The Repeatability Principle says that a record can contain multiple instances of the same element describing a resource. For example, two or three instances of dc:subject in one record make sense. The 1-to-1 Principle says that each record should describe only one object. The 1-to-1 Principle seems rather straightforward until one faces constraints imposed by technology platforms or seeks to describe aggregate works.

Setting aside the issue of describing aggregate works, how do these principles interact when working with Qualified Dublin Core? That is, does the 1-to-1 Principle constrain the application of the Repeatability Principle? For example, dc:type elements qualified with the DCMIType vocabulary ought to be used only once per record. Presumably, a record with qualified types of both “text” and “sound”, each relating to a different file in a “record bundle”, would violate the 1-to-1 Principle. For a different, crazier example, consider: how can a resource typed dc:type.dcmitype.sound also be dc:type.dcmitype.event? That is, the Repeatability Principle seems to be limited in two recurring ways: by the 1-to-1 Principle, and by the internal, term-specific semantics of the qualified terms themselves.

Perhaps no terms are more semantically incompatible than “collection” and any other term in the DCMIType vocabulary. How can something be a “collection” and also a “text”, unless we are arbitrarily assigning the type “text” to each item within the collection? However, doing this violates the descriptive practice that a record should apply to only a single layer of the description (presumably the entire collection at this point), so the DCMIType “Collection” should not co-occur with any other DCMIType qualifier. Note that I am speaking only of situations in which the type element is qualified from the dcterms namespace. Obviously, dc:type can be unqualified, and it can also be qualified differently from different namespaces, and all of these can co-exist.
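
The co-occurrence constraint argued for above can be sketched as a small check. This is only an illustration of the proposed rule, not anything DCMI defines: the function name `check_dcmi_types` and the rule that “Collection” excludes every other value are assumptions taken from this note. The twelve vocabulary terms themselves are the published DCMI Type Vocabulary.

```python
# The twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}


def check_dcmi_types(types):
    """Return a list of problems found in one record's DCMIType-qualified values.

    Hypothetical rule set: it encodes only the constraint discussed in this
    note (Collection excludes every other DCMIType value). It is not a DCMI rule.
    """
    problems = []
    unknown = sorted(set(types) - DCMI_TYPES)
    if unknown:
        problems.append(f"not in DCMIType vocabulary: {', '.join(unknown)}")
    distinct = set(types) & DCMI_TYPES
    if "Collection" in distinct and len(distinct) > 1:
        others = sorted(distinct - {"Collection"})
        problems.append(f"Collection co-occurs with: {', '.join(others)}")
    return problems


print(check_dcmi_types(["Collection", "Text"]))  # ['Collection co-occurs with: Text']
print(check_dcmi_types(["Sound"]))               # []
```

A fuller rule set would also have to decide what to do with pairs such as “Sound” and “Event”, which this sketch deliberately leaves unconstrained.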

Some thought has gone into a Dublin Core Constraint Language. See the linked proposal. However, this seems to be taking constraints in yet another direction.

Since a dct:license value is only valid as long as copyright is also valid, and since copyright length varies from jurisdiction to jurisdiction, how is the jurisdiction of the copyright registration (claim/standing) indicated? For further discussion on this question see: https://github.com/dcmi/usage/issues/104
What qualifier on the “date” element should be used on a record which also contains a dc:type element whose DCMIType-qualified value is “Event”? It seems that date elements qualify a number of the types of resources describable with the DCMIType vocabulary. However, it is not clear which specific date refinement expresses the relationship between a date and an Event. The general date element is available, and the duration syntax could be used. But is this sufficient for the kinds of dates needed for Event descriptions?
How is one to indicate the script a resource is in via Dublin Core? One could use the language element with a valid BCP-47 tag, but this is not mentioned in the usage guide. (One would need to indicate either RFC4646 or RFC5646 as the refinement of language.) If I understand the situation correctly, new approved syntaxes have crept into DC/DCT usage over time via the language tag recommendations, as the IETF approves new language-tag-related RFCs.

To prevent this from happening, maybe the DC Usage Board can just define a syntax for BCP-47 which will remain a stable pointer without needing to invoke a specific RFC-defined syntax. BCP-47 has in the past pointed to each of RFC1766, RFC3066, RFC4646, and RFC5646. Each successive standard in the BCP-47 series does not invalidate the previous set of codes but rather adds specification. All RFC1766 tags are valid RFC5646 tags, but not all RFC5646 tags are valid RFC1766 tags. However, all RFC1766 tags and all RFC5646 tags are valid BCP-47 tags. As I understand things, pointing to RFC5646 rather than BCP-47 does not allow for the implementation of RFC6067 and RFC6497 within the narrowly interpreted syntax of RFC5646. RFC5646 allows for 35 single-character extension subtags (singletons) within the BCP-47 space; RFC6067 and RFC6497 define two of these without changing the syntax of the BCP-47 tag scheme. It’s just that RFC5646, when narrowly understood, now doesn’t include all of BCP-47.

My thought is that it might be possible to just define BCP-47 as a syntax, so that if additional subtags are registered there is no need on the part of DCMI to update the number of syntaxes DC/DCT count as valid.
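
To make the script and extension points concrete, here is a minimal sketch of how a tag breaks down into subtags. It assumes a well-formed, non-grandfathered tag and does not consult the IANA subtag registry; the function name `describe_tag` is made up for this illustration. The `u` singleton is the one defined by RFC6067 and `t` is the one defined by RFC6497, and both fit the same tag syntax.

```python
def describe_tag(tag):
    """Split a language tag into its language, script subtag, and extensions.

    Simplified sketch: assumes a well-formed, non-grandfathered tag and
    performs no registry lookup, so it cannot tell whether a subtag is
    actually registered.
    """
    subtags = tag.split("-")
    script = None
    extensions = {}
    current = None  # singleton whose subtags are currently being collected
    for subtag in subtags[1:]:
        lowered = subtag.lower()
        if len(lowered) == 1:
            if lowered == "x":
                break  # private-use section: everything after it is opaque
            current = lowered
            extensions[current] = []
        elif current is not None:
            extensions[current].append(lowered)
        elif len(lowered) == 4 and lowered.isalpha() and script is None:
            # A four-letter subtag before any singleton is the script subtag.
            script = lowered.title()
    return {
        "language": subtags[0].lower(),
        "script": script,
        "extensions": extensions,
    }


print(describe_tag("sr-Cyrl-RS"))           # script subtag present
print(describe_tag("de-DE-u-co-phonebk"))   # 'u' extension (RFC6067)
print(describe_tag("und-Cyrl-t-und-latn"))  # 't' extension (RFC6497)
```

The same splitting logic handles all three tags, which is the sense in which newly defined extensions add to the tag scheme without changing its syntax.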

Dublin Core does not prescribe a particular syntax. Therefore it must get its syntax from an application profile, a specific instantiation of defined community agreements. When the syntax is XML, the exact nature of the allowable elements and attributes may vary based on the namespaces declared in the header. Therefore I have a general question about the use of the xsi:type="" attribute in XML, as the xsi namespace is often declared in Dublin Core application profiles. Do values of xsi:type="" need to be namespaced? The Open Language Archives Community (OLAC) application profile (based on Dublin Core and OAI, and therefore XML) uses namespaced values such as xsi:type="olac:role". So what happens if I just use role without namespacing it? I had this issue come up in another Dublin Core project and I couldn’t get a clear answer. That project doesn’t namespace xsi:type="" values, and I suggested that those non-namespaced values shouldn’t validate, but then the project owners said: “oh the code validates just fine”. I’m thinking that it passes as well-formed XML but does not validate against a schema. The second thing I was wondering was: what does the xsi:type="" value validate against when there is no namespace? An OLAC (or any OAI-based aggregator) security concern would be to close this gap and require all xsi:type="" values to be namespaced, or else fail a required validation.
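
The difference between well-formedness and schema validity can be illustrated with a short sketch. A plain XML parser treats the value of xsi:type as an ordinary string, so both the prefixed and the unprefixed form parse without complaint. Only a schema-aware step reads the value as a qualified name and resolves its prefix against the namespace declarations in scope. The function `resolve_xsi_types` and the sample record below are made up for illustration, and the sketch does no schema validation at all; it only shows which namespace, if any, each value would resolve to.

```python
import io
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Illustrative record only; it is not a complete OLAC record.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:olac="http://www.language-archives.org/OLAC/1.1/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:contributor xsi:type="olac:role">A. Speaker</dc:contributor>
  <dc:contributor xsi:type="role">B. Speaker</dc:contributor>
</record>"""


def resolve_xsi_types(xml_text):
    """Return (element tag, raw xsi:type value, resolved namespace URI or None).

    None means the prefix (or the default namespace, for an unprefixed
    value) has no declaration in scope.
    """
    scopes = [{}]  # stack of prefix -> URI maps, one per open element
    pending = {}   # declarations reported before the next start event
    results = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
        if event == "start-ns":
            prefix, uri = payload
            pending[prefix] = uri
        elif event == "start":
            scope = {**scopes[-1], **pending}
            pending = {}
            scopes.append(scope)
            value = payload.get(XSI_TYPE)
            if value is not None:
                # An unprefixed value yields the empty prefix, i.e. the default namespace.
                prefix, _, _local = value.rpartition(":")
                results.append((payload.tag, value, scope.get(prefix)))
        else:
            scopes.pop()
    return results


for tag, value, namespace in resolve_xsi_types(SAMPLE):
    print(value, "->", namespace)
```

In the sample, the parser accepts both contributor elements, but only olac:role resolves to a namespace in which a schema could define the type. The bare role resolves to nothing, because the record declares no default namespace.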

The book XML in a Nutshell says the following:

xsi:type: The xsi:type attribute may be used in instance documents to indicate the type of an element, even when a full schema is not available. For example, this length element has type xs:decimal: <length xsi:type="xs:decimal">23.5</length> More importantly, the xsi:type attribute enables a limited form of polymorphism. That is, it allows you to make an element an instance of a derived type where an instance of the base type would normally be expected. The instance of the derived type must carry an xsi:type attribute identifying it as an instance of the derived type. For example, suppose a schema says that a ticket element has type TicketType. If the schema also defines BusTicketType and AirplaneTicketType elements as subtypes of TicketType, then a ticket element could also use the BusTicketType and AirplaneTicketType content models provided it had an xsi:type="BusTicketType" or xsi:type="AirplaneTicketType" attribute.

But here again what if AirplaneTicketType is not defined in the base schema?

This sort of question seems to have been asked on Stack Exchange, where it is pointed out that, according to the XML specification, all elements need to be defined. See also Type Definition and this Stack Exchange explanation for the XML header.

Tags: Semantics Metadata Dublin Core OLAC
Categories: Research Note Jot

Hugh Paterson III

Collaborative Scholar

I specialize in bespoke research at the intersection of Linguistics, Law, Languages, and Technology; specifically utility and life-cycle management for information products in these spaces.
