Sunday, March 02, 2008

Algebra A and SPARQL

I've been reading, "Logic and Databases: The Roots of Relational Theory" and more importantly the chapter 10 which is about "Why is it called a Relational Algebra?". He defines what an algebra is such as identities, idempotence, absorption and so on with respect to Algebra A. I first came across Algebra A in the 3rd Manifesto which is an untyped relational algebra that defines a relationally complete system in about three operations: REMOVE, NOR or NAND and TCLOSE (transitive closure). I say, "about three" because I'm not sure TCLOSE is part of a relationally complete system and NOR and NAND are made up of an untyped OR or AND plus NOT. It also gets rid of WHERE, EXTEND and SUMMARIZE (which can be used for aggregate functions) by creating "relational operators" which are special relations that perform an operation (like COUNT).

Anyway, one of the more interesting points is that on page 260-261 of "Logic and Databases" he talks about identities such as: A + 0 = A * U = A, A + U = U and A * 0 = 0. Where A is any relation, 0 is the empty relation (DUM) and U is the universal relation (DEE). These match the tables I created for JOIN and UNION for SPARQL - and likewise I think are correct for OPTIONAL.

There is also a chapter on the closed world assumption and why Date dislikes the open world assumption which I'm still trying to digest. It seems that to get around 3VL Date uses strings - which seems like a massive hack.

It also occurred to me when reading this that SPARQL and query languages in general are non-monotonic - that is as you add more information the results you get from a query can be different - which is different to RDF. It made me wonder what a monotonic query language would look like but not for too long.

No comments: