"High-performance software for information retrieval research. Emphasis on semi-structured text retrieval, especially for HTML and XML. The goal is to facilitate information retrieval research by providing an interchangable toolkit of functions." Written in Perl and C/C++.
Telcordia Demo Machine "...By using statistical algorithms, LSI can retrieve relevant documents even when they do not share any words with a query. LSI uses these statistically derived 'concepts' to improve search performance by up to 30%."
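The claim that LSI can match documents sharing no words with the query can be illustrated with a minimal sketch. It assumes a toy three-document corpus, raw term counts, and a rank-2 truncated SVD computed with NumPy; the corpus, the function name `lsi_scores`, and the choice of `k` are hypothetical and are not taken from the Telcordia system, whose implementation is not described here.

```python
import numpy as np

# Hypothetical toy corpus: documents 0 and 1 share "engine" but not "car".
terms = ["car", "automobile", "engine", "flower", "petal"]
docs = [["car", "engine"], ["automobile", "engine"], ["flower", "petal"]]

# Term-document count matrix (rows are terms, columns are documents).
A = np.array([[doc.count(t) for doc in docs] for t in terms], dtype=float)


def lsi_scores(A, query_vec, k):
    """Cosine similarity between a query and each document in a rank-k concept space."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    # Documents projected onto the top-k left singular vectors: U_k^T A = S_k V_k^T.
    doc_vecs = (s_k[:, None] * Vt_k).T
    # The query is projected onto the same k directions.
    q_hat = query_vec @ U_k
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat)
    return doc_vecs @ q_hat / denom


# Query consisting of the single word "car".
query = np.array([1.0 if t == "car" else 0.0 for t in terms])

raw_overlap = query @ A          # plain word matching per document
scores = lsi_scores(A, query, k=2)

print("word overlap:", raw_overlap)
print("LSI cosine:  ", np.round(scores, 3))
```

In this toy setup the second document contains "automobile" but not "car", so plain word matching gives it zero overlap, yet it scores highly in the reduced space because both documents co-occur with "engine". The flower document stays unrelated. This shows the mechanism only; it says nothing about the "up to 30%" figure quoted above.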