Thursday, December 12, 2002

Search Engine Technologies

"A report issued by Forrester Research in September 2002 concluded soberly, 'Most companies already own a search engine—one that doesn't work.'"

"Questions about accessing information from different locations and devices have proved to be among the most vexing, forcing employees to waste time bouncing between company intranets and browsers, with no interface. Among companies making inroads in that realm is Divine (www.divine.com), which in June 2002 began offering SinglePoint Search, a tool with an open architecture that enables users to search all of their resources, in any format, simultaneously."

The article goes through different approaches to searching, including clustering, linguistic analysis, natural language processing, ontology, probabilistic, taxonomy, and vector-based. It also talks briefly about image searching. RichLink, the software that does the linking in InfoWorld articles, is highlighted, and even Zoe gets a mention (go email search engines).
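
For a rough idea of what "vector-based" means, here is a minimal sketch in Python. It is my own illustration and not code from the article or from any of the products mentioned: each document and the query become a vector of word counts, and documents are ranked by the cosine of the angle between their vector and the query's. The function names (tokenize, cosine, search) and the sample documents are made up for the example, and real engines would add weighting such as TF-IDF, stemming, and an index.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs):
    """Rank document ids by similarity to the query, best match first."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(text))), doc_id)
        for doc_id, text in docs.items()
    ]
    # Drop documents that share no terms with the query.
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score > 0]


# Hypothetical sample documents, invented for this sketch.
docs = {
    "a": "email search engine for personal mail",
    "b": "image search by color and shape",
    "c": "taxonomy of company documents",
}
print(search("email search", docs))
```

Document "a" shares two terms with the query and "b" only one, so "a" ranks first; "c" shares none and is dropped.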

On page 3 there's a sidebar at the bottom talking about who owns the metadata. According to the sidebar, using trademarked or copyrighted materials as metadata is fair use. This is similar to the article on Feist Publications Inc. vs. Rural Telephone Service Co. Inc., where the Supreme Court ruled that copyright does not protect the facts themselves, even though the compiler of the collection may have invested substantial funds, labor, or both in collecting and compiling them.

Apparently, MS is in the probabilistic camp and Autonomy is in both the probabilistic and clustering camps.

http://www.newarchitectmag.com/documents/s=7766/na0103a/index.html
