Tuesday, June 10, 2008

Linked Data, FOAF, and OWL DL

So I spent a little time a while ago looking through all the different ways ontologies support linked data. Some of my data I wish to link together is not RDF but documents that define a subject. For example, a protein will have peer reviewed documents that define it. It's not RDF but it is important.

The tutorial on linked data has a little bit of information: "In order to make it easier for Linked Data clients to understand the relation between http://dbpedia.org/resource/Alec_Empire, http://dbpedia.org/data/Alec_Empire, and http://dbpedia.org/page/Alec_Empire, the URIs can be interlinked using the rdfs:isDefinedBy and the foaf:page property as recommended in the Cool URI paper."

The Cool URIs paper, Section 4.5 says: "The rdfs:isDefinedBy statement links the person to the document containing its RDF description and allows RDF browsers to distinguish this main resource from other auxiliary resources that just happen to be mentioned in the document. We use rdfs:isDefinedBy instead of its weaker superproperty rdfs:seeAlso because the content at /data/alice is authoritative."

There is also some discussion about linking in URI-based Naming Systems for Science.

Now my use case is linking things to documents that define that thing. So rdfs:seeAlso is not appropriate as it "might provide additional information about the subject resource". And rdfs:isDefinedBy is also out as it is used to link RDF documents together. I need a property that defines a thing, is authoritative but isn't linking RDF (it's for humans). I also would like to keep my ontology within OWL DL.

FOAF has a page property. I've used the OWL DL version of FOAF before and FOAF cleaner (or should that be RDFS cleaner). So it seemed like a good match. However, its inverse is topic which isn't good. Because I'm linking the thing to the page - it's not a topic. So scrub that.

RSS has a link property which extends Dublin Core's identifier. This seems more like it. However, I'd like to extend my own version of link and I'm stuck because as soon as you use RDFS vocabularies in OWL DL you're in OWL Full territory. It'd be nice to stay in OWL DL. There is an OWL DL version of Dublin Core. All of the Dublin Core properties are nicely converted to annotation properties. However, you're still stuck because you can't make sub-properties without going into OWL Full. I like the idea of annotation and semantically Dublic Core seems to be a suitable vocabulary of annotation properties. Extending Dublin Core is out of OWL DL - which is shame because it's probably the closest match to what I wanted.

As an aside, annotation properties are outside the reasoning engine. The idea is that you don't want an OWL reasoner or RDF application necessarily inferring over this data or trying to look it up in order for the document to be understood. So the way they do it in OWL DL is to have annotation properties that are outside of/special to the usual statements. Sub-properties require reasoning, so limiting them makes some sense but it does hamper extensibility - it'd be nice to express them and turn on the reasoning only when asking about those properties (I think Pellet has this feature but I didn't look up the details).

The other vocabulary I looked at was SIOC's link. Again, this seems like a close match but again it's RDFS.

In the end, I just created another annotation property called link.

In summary:

  • For my requirements, the suggestions for linking data seems to only work for RDF and RDFS ontologies. Reusing RDFS from OWL DL or OWL DL from RDFS doesn't look feasible as one isn't a subset of the other (an old problem I guess).

  • Current, popular Semantic Web vocabularies are in RDFS. Why aren't there more popular OWL DL versions of these things? Is the lack of extensibility holding it back?

  • Is my expectation wrong - should I stick within OWL DL or is an RDFS and OWL DL combination okay?

  • Why not allow annotation properties to have sub-properties?

  • Maybe the OWL DL specification does have suitable properties for linking certain data but I don't understand which is the right one.



Update: The Neurocommon's URI documentation protocol is quite similar as well. Except that, it seems to be too specific as it ties the name with a single thing that defines it. All the parts of Step 5 could potentially be eliminated with what I'm thinking of.

No comments: