Thursday, August 05, 2004

Is RDF a graph at all?

This follows from discussions by Paul and Andrae about RDF being a 3-uniform directed hypergraph (or whatever the syntax is for a directed hypergraph where all the elements in the set of edges have a cardinality of 3).

There are two properties that I think prevent RDF being a hypergraph or more generally a graph:
* The edge set cannot be empty.
* That all RDF nodes are vertices.

A graph with an empty set of edges is a 3-uniform hypergraph due to the elements in the empty set having any properties. However, an RDF graph has no way of expressing a set of nodes outside of them being part of a statement. So there's no such thing as a set of RDF nodes (vertices) and an empty set of statements (edges). RDF/XML and N3 serialization and programming APIs like Jena have no way of creating a graph that only consists of a set of nodes.

The other problem, also stems from the fact that nodes don't exist outside statements, and that's blank nodes. These blank nodes don't exist in the set of nodes unless they exist in a statement first. A blank node is specifically there to be a place holder in a statement. What you seem to do in RDF is take a set of set of statements (edges) and place all the unique items into a set of nodes (vertices). This is the opposite approach you take with a graph. The definition of a hypergraph is: "an ordered pair (V, {E}) where V is a set of vertices and {E} is a set of edges such that {E} is a subset P(V) (power set of V)."

So RDF seems to really be a network - an application of a graph. I think this is part of the confusion people have when they try to visualize RDF. If I was to characterise it: RDF is statement/edge centric and graphs are node/vertex centric. So a good visualization is one of statements rather than nodes - so the approach should be more faceted than ball-and-stick.

No comments: