Friday, July 09, 2004

More metadata than data (again)

Anyone who has worked with RDFS/OWL won't be surprised by this passage from Behind the Scenes at Yahoo Labs, Part 2:

"I would claim that there is more implied data (or inferable meta-data) than "raw" data on the web, and that we are barely scratching the surface of it. Today, all search engines are scraping for some simple forms of implied data: language, locality, etc. What's missing from this list is a nearly infinite collection of relationships that are obvious to most any human reader but extremely difficult to infer from a single document. The reason why implied data is so hard to identify is because, in the aggregate, it forms our collective cultural wisdom."

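For readers who haven't touched RDFS, here is a rough sketch of why the "more implied than raw" claim feels familiar. It is a toy forward-chainer in plain Python, not a real reasoner. The `ex:` names and the `RAW` and `infer` identifiers are made up for illustration, and only two RDFS entailment rules are applied (subclass transitivity and type inheritance through subclasses).

```python
# Toy RDFS-style inference over (subject, predicate, object) triples.
# Everything named ex:... is a hypothetical example, not real data.

RAW = {
    ("ex:Terrier", "rdfs:subClassOf", "ex:Dog"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:rex", "rdf:type", "ex:Terrier"),
}


def infer(triples):
    """Return only the triples entailed by two RDFS rules, excluding the input."""
    closure = set(triples)
    while True:
        new = set()
        for (a, p, b) in closure:
            for (c, q, d) in closure:
                if b != c:
                    continue
                # rdfs11: subClassOf is transitive
                if p == "rdfs:subClassOf" and q == "rdfs:subClassOf":
                    new.add((a, "rdfs:subClassOf", d))
                # rdfs9: an instance of a class is an instance of its superclasses
                if p == "rdf:type" and q == "rdfs:subClassOf":
                    new.add((a, "rdf:type", d))
        if new <= closure:
            # Fixed point reached: nothing further can be derived
            return closure - set(triples)
        closure |= new


if __name__ == "__main__":
    implied = infer(RAW)
    print(len(RAW), "raw triples,", len(implied), "implied triples")
    for triple in sorted(implied):
        print(triple)
```

Even with four stated facts and two rules, the implied triples outnumber the raw ones, and that is the easy, machine-readable kind of implication. The relationships the quote is talking about are the ones no schema spells out.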