Thursday, January 21, 2010
Notes on Pragmatic Language Design
If you are Ed, and you are reading this post: no, it isn't the post I promised (yet). But it is a precursor, and you should read it before you read the subsequent one in the series. Not that you have any choice, since I haven't actually written the next post yet.
I had a nice conversation with a coworker yesterday, in which she asked me a question I hear a lot in library-land: "Is it an identifier for the abstract thing, or just for this manifestation of the thing?" I reminded her that I don't understand what "abstract thing" means in that context.
Contingent on the acceptance of "thing" as a valid concept, an identifier either identifies that thing or does not. We can (and at libraries, often do) argue about whether a particular identifier goes with a particular thing, but this is a specific argument rather than an abstract one. It is an argument about language design for some particular identifier.
If I substitute "word" for "identifier" it brings the problem with my way of thinking into stark relief. A word either identifies a thing, or it does not. If my coworker and I agree about the association between the word and the thing, there's no problem. But if we disagree, we are not using the same word (even though it is spelled the same and pronounced the same). We are faced with a multiple dispatch scenario: if she understands what I mean by my word and I understand what she means by her word, we need some way to disambiguate which one we are using when we speak. We sometimes do that by adding other words: pen-in-the-sense-I-mean-it versus pen-in-the-sense-she-means-it. If we often need to perform this kind of disambiguation, we probably develop a lingo or some jargon that people outside our small subgroup might not immediately grok.
In this way, we are pragmatic language designers. We are using words to communicate about things, and a word is an identifier for all the things it identifies. This is a definition, in the words of my brother Daniel, that probably "dissolves into mush" if examined too closely.
But I'm convinced it's right, despite its fragility. I think it's even more right in library-land. When we use identifiers, we need clear criteria for what they do-and-do-not identify. When we get close to the edge of the definition, we should discuss whether a particular thing is in or out rather than trying to speak in abstracts. And when it becomes evident we need to disambiguate the-thing-that-I-mean from the-thing-that-you-mean, we should carefully consider adding some words (identifiers) to help us with that task. If we do it repeatedly, we should design them into our language.
And every once in a great while, we should go over the whole language and see if it could benefit from a little refactoring. See if there are similarities in the places where jargon and lingo are cropping up, see if we can't make it into something that's easy for us all to remember.
Labels: language design, libraries
Tuesday, November 17, 2009
Memento and Persistence
Following a long conversation with a coworker, I wrote down some thoughts about persistent identification schemes (including ARK, DOI, Handle, PURL). I had the post in the can, ready to go, when it was rudely interrupted by a really interesting presentation, which completely changed my thinking. I should recap that ill-fated blog post in one sentence before moving on: adding a layer of identifiers doesn't make an existing identifier more persistent; it makes it less so.
Now, that being said, there's a real problem. If I want to point at something as it exists on a certain date, it's often quite unwieldy to do so. Maybe I can cache it locally, maybe I can use one of the persistent identifiers mentioned in that first paragraph, or maybe I'm just out of luck. I point to someone's Geocities site, and it's just fricken gone. You see, the problem isn't that information moves to a different location, it's that the information at a given location changes. Or that it disappears entirely. That's a use case I care about.
Enter Memento. Herbert Van de Sompel and Michael Nelson gave a presentation about it at the Library of Congress yesterday, and I'm convinced it's a better way to think about persistence. Basic gist is that you specify a date with a URI, and a combination of clients, servers, proxies, and services try to give you back the thing you were pointing at, rather than the thing that's there now. I don't love all their terminology or even their implementations, but those are details. Memento is still a work in progress, and I like the approach.
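To make "specify a date with a URI" a little more concrete, here is a rough Python sketch of the two halves of the idea: the client states the date it cares about, and something on the other end picks the stored copy nearest to that date. The `Accept-Datetime` header name is the one the Memento specification (RFC 7089) later settled on, and may not match what was shown in the presentation. The function names, the snapshot table, and the example.org URIs are all made up for illustration, and "nearest in time" is only one possible selection policy.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_headers(when: datetime) -> dict[str, str]:
    """Build the request header asking for a resource as it was at `when`."""
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    # HTTP dates are expressed in GMT, so normalise before formatting.
    when_utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}


def closest_snapshot(snapshots: dict[datetime, str], when: datetime) -> str:
    """Return the URI of the stored copy nearest in time to `when`.

    `snapshots` is a hypothetical table mapping capture time to archived URI;
    a real archive or proxy would apply its own policy here.
    """
    if not snapshots:
        raise LookupError("no snapshots available")
    nearest = min(snapshots, key=lambda captured: abs(captured - when))
    return snapshots[nearest]


# Hypothetical archive contents for one original page.
SNAPSHOTS = {
    datetime(2008, 5, 1, tzinfo=timezone.utc): "http://archive.example.org/2008/page",
    datetime(2009, 10, 30, tzinfo=timezone.utc): "http://archive.example.org/2009/page",
}

wanted = datetime(2009, 11, 17, tzinfo=timezone.utc)
headers = accept_datetime_headers(wanted)
chosen = closest_snapshot(SNAPSHOTS, wanted)
```

The point of the sketch is that the original URI never changes; the date travels alongside it in the request, and whatever sits between me and the content decides which copy best answers "what was here then?"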