Friday, August 13, 2004

Heuristic Database Integration

IBM gets heuristic in database wars: "IBM is preparing to launch enterprise database technologies that can more effectively link together related information from multiple data sources, potentially eliminating some of the quality problems which can plague large data warehousing projects.

The technology, codenamed mineLink and developed at the company's research centre in Almaden, uses heuristic techniques to identify data fields which contain related information even though they may be labelled differently. For instance, a field labelled 'Surname' in one database may be labelled as 'First Name' in another, which can cause problems in integrating the data. While that example is fairly simplistic, matching fields often requires complex analysis of their contents, especially if businesses want to drill further into the collected data.

A prototype of mineLink for use in the life sciences field was demonstrated by IBM researchers as long ago as 2002. That project used the existing DiscoveryLink analytic technologies in DB2, but added additional data mining features in order to provide a unified view of complex information."

They presumably mean 'Surname' and 'Last Name'; a surname is not a first name.
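
As a rough illustration of what matching fields by their contents rather than their labels could look like, here is a minimal sketch. It is not IBM's method; the article gives no detail on how mineLink works. The function names (jaccard, match_fields), the 0.3 threshold, and the sample tables are all made up for the example. It scores each pair of columns by how many values they share and pairs each column with its best-scoring counterpart.

```python
def jaccard(a, b):
    """Jaccard similarity of two value collections, compared case-insensitively."""
    sa = {str(v).strip().lower() for v in a}
    sb = {str(v).strip().lower() for v in b}
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def match_fields(table_a, table_b, threshold=0.3):
    """Pair each column of table_a with the table_b column whose values overlap most.

    Tables are plain dicts mapping a column label to a list of values.
    The threshold is an arbitrary cut-off chosen for this illustration.
    """
    matches = {}
    for name_a, values_a in table_a.items():
        best_name, best_score = None, 0.0
        for name_b, values_b in table_b.items():
            score = jaccard(values_a, values_b)
            if score > best_score:
                best_name, best_score = name_b, score
        if best_name is not None and best_score >= threshold:
            matches[name_a] = (best_name, best_score)
    return matches


# Hypothetical sample data: two sources that label the same fields differently.
crm = {
    "Surname": ["Smith", "Jones", "Garcia", "Chen"],
    "Given": ["Anna", "Bob", "Carla", "Dan"],
}
billing = {
    "First Name": ["anna", "bob", "carla", "erin"],
    "Last Name": ["smith", "jones", "garcia", "patel"],
}

result = match_fields(crm, billing)
for column, (partner, score) in result.items():
    print(f"{column!r} looks like {partner!r} (overlap {score:.2f})")
```

On this toy data the overlap pairs 'Surname' with 'Last Name', not 'First Name', without looking at the labels at all. Real data would need much more than set overlap (formats, value distributions, fuzzy matching), which is presumably where the "complex analysis" the article mentions comes in.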
