Thursday, December 15, 2005

Outer Joins aren't Primitive

Optional data in SPARQL seems to be equivalent to left outer join in SQL. As it turns out, outer joins can be composed of disjunctions. This is similar to the original MAYBE function suggested to be added to Kowari (although that suggestions is quite a deal simpler). The below paper outlines algorithms to do outer queries more efficiently. They require computing the anti-join of certain relations - an antij-oin being the set difference between two tables (or MINUS operation). Here is a good explanation of semi-joins and anti-joins.

Outerjoins as Disjunctions "The outerjoin operator is currently available in the query language of several major DBMSs, and it is included in the proposed SQL2 standard draft. However, “associativity problems” of the operator have been pointed out since its introduction. In this paper we propose a shift in the intuition behind outerjoin: Instead of computing the join while also preserving its arguments, outerjoin delivers tuples that come either from the join or from the arguments. Queries with joins and outerjoins deliver tuples that come from one out of several joins, where a single relation is a trivial join. An advantage of this view is that, in contrast to preservation, disjunction is commutative and associative, which is a significant property for intuition, formalisms, and generation of execution plans.Based on a disjunctive normal form, we show that some data merging queries cannot be evaluated by means of binary outerjoins, and give alternative procedures to evaluate those queries. We also explore several evaluation strategies for outerjoin queries, including the use of semijoin programs to reduce base relations."

Also related, Outer Join in Edutella where each part of the outer query is done individually.

No comments: