Monday, December 11, 2006

The Difference is Antijoin

I made a mistake about making a mistake. I got caught up in the fact that SPARQL's diff is not the same as relational difference, but plain difference is not how left outer join is defined; it is defined using antijoin. So I was correct that relational set difference is not compatible with SPARQL's difference as defined in "The SPARQL Algebra", but wrong that this means the definitions of left outer join are not equivalent.

So antijoin is:
( R1 difference (project(R1)(R1 join R2)) )

So using the previous example:
{{ (?x = 2, ?y = 3) }} \ {{ (?y = 3) }} is {} in SPARQL.

Now, as I said, using plain difference it's:
{{ (?x = 2, ?y = 3) }} - {{ (?y = 3) }} is {{ (?x = 2, ?y = 3) }} in relational algebra.
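To make the contrast concrete, here is a minimal Python sketch of the two operations. It is not JRDF code: the representation of a solution as a frozenset of (variable, value) pairs and the names `mapping`, `compatible` and `sparql_diff` are my own assumptions, and `sparql_diff` follows my reading of the definition as "keep the left-hand solutions that are compatible with nothing on the right".

```python
# Illustrative sketch only; names and representation are assumptions, not JRDF's API.

def mapping(**bindings):
    """Build a solution as a hashable set of (variable, value) pairs."""
    return frozenset(bindings.items())

def compatible(m1, m2):
    """Two solutions are compatible if they agree on every variable they share."""
    d1, d2 = dict(m1), dict(m2)
    return all(d1[v] == d2[v] for v in d1.keys() & d2.keys())

def sparql_diff(r1, r2):
    """Keep the solutions of r1 that are compatible with no solution of r2."""
    return {m for m in r1 if not any(compatible(m, n) for n in r2)}

r1 = {mapping(x=2, y=3)}
r2 = {mapping(y=3)}

print(sparql_diff(r1, r2))   # set(): (?x = 2, ?y = 3) is compatible with (?y = 3)
print(r1 - r2 == r1)         # True: plain set difference compares whole tuples
```

Plain difference only removes a tuple when an identical tuple appears on the right, which is why the two results disagree here.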

But using antijoin the right hand side is really:
  • project ({{ (?x = 2, ?y = 3) }}) ({{ (?x = 2, ?y = 3) }} join {{ (?y = 3) }})
  • project ({{ (?x = 2, ?y = 3) }}) ({{ (?x = 2, ?y = 3) }})
  • {{ (?x = 2, ?y = 3) }}, which matches the left hand side, so the difference is {}, the same result as SPARQL's diff.
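The decomposition above can be sketched in Python as well. Again this is an illustration under my own assumptions rather than how JRDF implements it: solutions are frozensets of (variable, value) pairs, `heading` stands in for "project(R1)" by collecting the variables that appear in R1, and `join` is an untyped join that merges any two compatible solutions.

```python
# Illustrative sketch only; all names here are hypothetical, not JRDF's API.

def mapping(**bindings):
    """Build a solution as a hashable set of (variable, value) pairs."""
    return frozenset(bindings.items())

def compatible(m1, m2):
    """Two solutions are compatible if they agree on every variable they share."""
    d1, d2 = dict(m1), dict(m2)
    return all(d1[v] == d2[v] for v in d1.keys() & d2.keys())

def join(r1, r2):
    """Merge every compatible pair of solutions."""
    return {m | n for m in r1 for n in r2 if compatible(m, n)}

def heading(r):
    """The variables that appear anywhere in the relation."""
    return {var for m in r for var, _ in m}

def project(variables, r):
    """Restrict every solution to the given variables."""
    return {frozenset((var, val) for var, val in m if var in variables) for m in r}

def antijoin(r1, r2):
    """R1 difference (project(R1)(R1 join R2))."""
    return r1 - project(heading(r1), join(r1, r2))

r1 = {mapping(x=2, y=3)}
r2 = {mapping(y=3)}

print(join(r1, r2) == r1)   # True: the join adds nothing new to (?x = 2, ?y = 3)
print(antijoin(r1, r2))     # set()
```

If the right-hand side were {{ (?y = 4) }} instead, the join would be empty and the antijoin would return the left-hand side unchanged.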

This is why JRDF looked like it was producing the right result: it was doing the same thing, just with the operations further decomposed. I'm sure I've done this before, but for some reason I was certain I was wrong the other day. I was told that antijoin and SPARQL's diff were equivalent when it was changed, but I had to question it.

There still remains an outstanding issue around null joins, but maybe that's not needed either. I know that the NULLs in "SPARQL RULES!" are required for Logic Programs, but they don't need to be there for relational algebra as long as you have the null accepting/untyped join.

So SPARQL in JRDF looks fine again.
