Sunday, February 15, 2004

GiST and MTrees

MTree Project "The M-tree is an index structure that can be used for the efficient resolution of similarity queries on complex objects to be compared using an arbitrary metric, i.e. a distance function d that satisfies the positivity, symmetry, and triangle inequality postulates. For instance, with the M-tree you can index a set of strings and organize them according to their edit distances (the minimal number of character changes needed to transform one string into another)."

MTree Applet (also part of XXL).

GiST: A Generalized Search Tree for Secondary Storage "The GiST is a balanced tree structure like a B-tree, containing pairs. But keys in the GiST are not integers like the keys in a B-tree. Instead, a GiST key is a member of a user-defined class, and represents some property that is true of all data items reachable from the pointer associated with the key. For example, keys in a B+-tree-like GiST are ranges of numbers ("all data items below this pointer are between 4 and 6"); keys in an R-tree-like GiST are bounding boxes, ("all data items below this pointer are in Calfornia"); keys in an RD-tree-like GiST are sets ("all data items below this pointer are subsets of {1,6,7,9,11,12,13,72}"); etc. To make a GiST work, you just have to figure out what to represent in the keys, and then write 4 methods for the key class that help the tree do insertion, deletion, and search."

Report on implementing GiST in Java (framed).

Other trees (or multi-dimensional access methods).

No comments: