Friday, January 21, 2005

Repairing Databases

Coherent Integration of Databases by Abductive Logic Programming
"It is well-known that in general, the task of repairing a database is not tractable, as there may be an exponential number of different ways of repairing it."

"One important aspect of data integration systems is how concepts in the independent (stand-alone) data-sources and those of the unified database are mapped to each other. A proper specification of the relations between the source schemas and the schema of the amalgamated data exempts the potential user from being aware where and how data is arranged in the sources. One approach for this mapping, sometimes called global-centric or global-as-view (Ullman, 2000), requires that the unified schema should be expressed in terms of the local schemas. In this approach, every term in the unified schema is associated with a view (alternatively, a query) over the sources. This approach is taken by most of the systems for data integration, as well as ours. The main advantage of this approach is that it induces a simple query processing strategy that is based on unfolding of the query, and uses the same terminology as that of the databases...The other approach, sometimes called sourcecentric or local-as-view (used, e.g., in Bertossi et al., 2002), considers every source as a view over the integrated database, and so the meaning of every source is obtained by concepts of the global database. In particular, the global schema is independent of the distributed ones."

"When the set of integrity constraints is given in a clause form, methods of dynamic logic programing (Alferes et al., 2000, 2002) may be useful for handling revisions. As noted in (Alferes et al., 2002), assuming that each local database is consistent (as in our case), dynamic logic programing (together with a proper language for implementing it, like LUPS (Alferes et al., 2002)) provides a way of avoiding contradictory information, and so this may be viewed as a method of updating a database by a sequence of integrity constraints that arrive at different time points."

On deciding which statements to keep and which to reject as invalid when integrating:
"Among the common approaches are the skeptical (conservative) one, that is based on a ‘consensus’ among all the elements of R(UDB, ?) (see Arenas et al., 1999; Greco & Zumpano, 2000), a ‘credulous’ approach, in which entailments are determined by any element in R(UDB, ?), an approach that is based on a ‘majority vote’ (Lin & Mendelzon, 1998; Konieczny & Pino Pérez, 2002), etc."

ASystem homepage.
