Wednesday, December 31, 2003

Statement vs Stating

This comes up from time to time (at work and on mailing lists), so here is a useful summary of the difference between an RDF statement and a stating:
Statements/Statings, from the ILRT Semantic Web technical reports at a glance. Two other references: Does the model allow different statements with the same subject/predicate/object? and part of the Reification section in the RDF Semantics document.
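
The distinction can be sketched in a few lines of Java. This is only an illustration under my own assumptions, not code from any of the references: the class names Statement and Stating, the source field and the ex: identifiers are all hypothetical. A statement is identified purely by its subject, predicate and object, so two equal triples are the same statement. A stating is one particular occurrence of a statement (here tagged with the document it came from), so two statings of the same statement stay distinct.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class StatingDemo {

    /** A statement is a value: same subject, predicate and object means the same statement. */
    static final class Statement {
        final String subject;
        final String predicate;
        final String object;

        Statement(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Statement)) {
                return false;
            }
            Statement s = (Statement) other;
            return subject.equals(s.subject)
                    && predicate.equals(s.predicate)
                    && object.equals(s.object);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, predicate, object);
        }
    }

    /** A stating is one occurrence of a statement. It keeps Java's default identity equality. */
    static final class Stating {
        final Statement statement;
        final String source; // hypothetical: the document in which the statement was asserted

        Stating(Statement statement, String source) {
            this.statement = statement;
            this.source = source;
        }
    }

    /** Collapses a list of statings into the set of distinct statements they assert. */
    static Set<Statement> distinctStatements(List<Stating> statings) {
        Set<Statement> result = new HashSet<>();
        for (Stating stating : statings) {
            result.add(stating.statement);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Stating> statings = new ArrayList<>();
        statings.add(new Stating(new Statement("ex:page", "dc:creator", "ex:alice"), "ex:doc1"));
        statings.add(new Stating(new Statement("ex:page", "dc:creator", "ex:alice"), "ex:doc2"));

        System.out.println("statings:   " + statings.size());
        System.out.println("statements: " + distinctStatements(statings).size());
    }
}
```

The same triple asserted in two documents gives two statings but only one statement, which is why anything you want to say about "who said it" has to hang off the stating rather than the statement.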

Commercial Ontologies

TeraView, Level 5 and BioWisdom starting 2004 in style "Ontology specialist BioWisdom also plans to make a “big announcement” early in the New Year.

Ontology is a branch of science that deals with knowledge capture and representation. BioWisdom’s approach involves the development of specialised knowledgebases and the software tools for managing them.

Chief executive Gordon Smith Baxter said: “2003 has been a great year for BioWisdom. In January we secured a £2.5m investment from MB Venture Capital II and Merlin Ventures Fund IV."

BioWisdom and Network Inference on using ontologies for drug discovery.

Friday, December 26, 2003

Some Holiday Links

* Improved Topicalla Screenshot
* Weedshare
* XML 2003 Conference Diary - Notes the continual rise in interest in the Semantic Web.
* SnipSnap 0.5 - Now with the snips available as RDF.

Friday, December 19, 2003

Quintuples

Trust, Context and Justification While I'm not sure about using 5-tuples (we use 4-tuples and make statements about the fourth element in order to do things like security), it's still an interesting paper with some good references.
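
For what it's worth, the 4-tuple approach mentioned above might look roughly like the sketch below. It is a hypothetical illustration, not the paper's model or any real store's API: the Quad class, the ex: names and the ex:readableBy predicate are all made up for the example. The fourth element names a group of statements, and access control is expressed as ordinary statements about that name.

```java
import java.util.ArrayList;
import java.util.List;

class QuadDemo {

    /** A triple plus a fourth element naming the group (graph) the triple belongs to. */
    static final class Quad {
        final String subject;
        final String predicate;
        final String object;
        final String graph;

        Quad(String subject, String predicate, String object, String graph) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.graph = graph;
        }
    }

    /** True if the store contains the statement (graph, ex:readableBy, user). */
    static boolean canRead(String user, String graph, List<Quad> store) {
        for (Quad q : store) {
            if (q.subject.equals(graph)
                    && q.predicate.equals("ex:readableBy")
                    && q.object.equals(user)) {
                return true;
            }
        }
        return false;
    }

    /** Returns only the quads whose fourth element the user is allowed to read. */
    static List<Quad> visibleTo(String user, List<Quad> store) {
        List<Quad> result = new ArrayList<>();
        for (Quad q : store) {
            if (canRead(user, q.graph, store)) {
                result.add(q);
            }
        }
        return result;
    }

    /** Hypothetical sample data: two data graphs and one graph holding the security statements. */
    static List<Quad> sampleStore() {
        List<Quad> store = new ArrayList<>();
        store.add(new Quad("ex:alice", "ex:salary", "50000", "ex:hrGraph"));
        store.add(new Quad("ex:alice", "foaf:name", "Alice", "ex:publicGraph"));
        store.add(new Quad("ex:hrGraph", "ex:readableBy", "ex:bob", "ex:securityGraph"));
        store.add(new Quad("ex:publicGraph", "ex:readableBy", "ex:bob", "ex:securityGraph"));
        store.add(new Quad("ex:publicGraph", "ex:readableBy", "ex:carol", "ex:securityGraph"));
        return store;
    }

    public static void main(String[] args) {
        List<Quad> store = sampleStore();
        System.out.println("bob sees:   " + visibleTo("ex:bob", store).size() + " quads");
        System.out.println("carol sees: " + visibleTo("ex:carol", store).size() + " quads");
    }
}
```

Nobody is granted read access to ex:securityGraph itself in this sample, so the security statements are hidden from both users while still governing what they can see.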

Google Searching for Relevance

A Quantum Theory of Internet Value "When "the Internet" was unveiled to a doughnut-eating public a decade ago, we were promised unlimited access to vistas of encyclopedic knowledge. Every body would be connected to every thing, and we would never be short of an answer. What with the abundance of information, and the costs of transporting information approaching zero, the world would never be the same again.

Of course, a decade on, we know that real economics have prevailed. Information costs money. Those transport costs certainly aren't zero. And faced with a choice of a million experts, people gravitate towards experts with a good track record: i.e., for better or worse, paid journalists, qualified doctors or other centers of expertise.

Taxonomies also have been proved to have value: archivists can justify a smirk as manual directory projects dmoz floundered - true archivists have a far better sense of meta-data than any computerized system can conjure. If you're in doubt, befriend a librarian, and from the resulting dialog, you'll learn to start asking good questions. Your results, we strongly suspect, will be much more fruitful than any iterative Google searches. "

"At a convivial dinner recently, John Perry Barlow asked me why no one had written a story about how the most powerful organisations in the world were dependent on the most awful, antiquated and dysfunctional technology. Well, I ventured (to a deafening silence), maybe they were making ruthless choices, and really weren't too slavish about following techno-fads. Maybe the answer is in the question."

Wednesday, December 17, 2003

Commercial RSS

How to make RSS commercially viable "Without full content no aggregator can add much value by categorizing and filtering information, so no purely RSS based aggregator can make much money.

Despite all of the interest around web based syndication, people like Lexis Nexis will still make all the money unless this problem is solved."

Does it? Will it? Must it?

Interview: David Weinberger "What Shelley calls "the semantic web" is the Web itself. She puts it beautifully. And I agree 100% that the Web consists of meaning; it has to because we created this new world for ourselves out of language and music and other signifiers. But that meaning is as hard to systematize and capture as is the meaning of the offline world and for precisely the same reasons. The Semantic Web, it seems to me, often underplays not only the difficulty of systematizing human meaning (= the world) but also ignores the price we pay for doing so: making metadata explicit often is an act of aggression. Human meaning is only possible because of its gnarly, tangly, implicit, unlit, messy context. That's the real reason the Semantic Web can't scale, IMO.

If by "The Semantic Web" you merely mean "A set of domain-specific taxonomies some of which can be knit together to provide a greater degree of automation and improved searching," then I've got no problem with it. It's the more ambitious plans -- and the use of the definite article in its name -- that ticks me off when it comes to The Semantic Web."

Exceptions (again)

13 Exceptional Exception Handling Techniques notes "Declare Unchecked Exceptions in the Throws Clause" and "Soften Checked Exceptions" (always use RuntimeExceptions). This leads on to JDO and its JDO Transaction class, which uses runtime exceptions (although it does document them) instead of checked exceptions as JDBC does. Similarly, for the Spring Framework, in Chapter 4 of Expert One-on-One J2EE Design and Development the author discusses the usual reasons given to avoid checked exceptions:
"Checked exceptions are much superior to error return codes...However, I don't recommend using checked exceptions unless callers are likely to be able to handle them. In particular, checked exceptions shouldn't be used to indicate that something went horribly wrong, which the caller can't be expected to handle...Use an unchecked exception if the exception is fatal."

With both JDO and Spring the contract offered by the framework tells the client what they can and cannot handle. In my experience, this is not an either/or situation. For example, JDO uses "CanRetryException" and "FatalException", but an exception that can be retried could actually be fatal depending on the context, and vice versa. This often occurs when large frameworks are used in conjunction with one another, at the system integration level. Denying the developer the choice of which exceptions can and cannot be caught, when integrating into larger frameworks, often leads to unexpected exceptions tunneling through layers.
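
As a rough illustration of that point, the sketch below softens a checked SQLException into an unchecked exception at a framework boundary and then lets the integrating code decide how often a "retryable" failure is worth retrying before it is treated as fatal. The names RetryableException, FatalException, Query, execute and runWithRetry are hypothetical stand-ins invented for this example; they are not the JDO or Spring APIs.

```java
import java.sql.SQLException;

class SofteningDemo {

    /** Hypothetical unchecked exception: the framework thinks the operation may be retried. */
    static class RetryableException extends RuntimeException {
        RetryableException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical unchecked exception: the caller has decided there is no point retrying. */
    static class FatalException extends RuntimeException {
        FatalException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical lower-level operation that still throws a checked exception. */
    interface Query {
        String run() throws SQLException;
    }

    /** Framework boundary: soften the checked exception but keep the original as the cause. */
    static String execute(Query query) {
        try {
            return query.run();
        } catch (SQLException e) {
            throw new RetryableException("query failed", e);
        }
    }

    /** Integration layer: the caller, not the framework, decides when "retryable" becomes fatal. */
    static String runWithRetry(Query query, int maxAttempts) {
        RetryableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return execute(query);
            } catch (RetryableException e) {
                last = e; // remember the failure and try again
            }
        }
        throw new FatalException("gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice with a checked exception, then succeeds.
        Query flaky = () -> {
            if (++calls[0] < 3) {
                throw new SQLException("connection timed out");
            }
            return "ok";
        };
        System.out.println(runWithRetry(flaky, 3) + " after " + calls[0] + " calls");
    }
}
```

The same underlying failure ends up retryable in one context and fatal in another, and that only works because the softened exception is documented and the original cause is preserved for whichever layer finally handles it.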

Tuesday, December 16, 2003

Drools in Groovy

Drools (an augmented implementation of Forgy's Rete algorithm) is now available in Groovy.

RDF Matures

Resource Description Framework (RDF) Is a W3C Proposed Recommendation and OWL Web Ontology Language Is a W3C Proposed Recommendation; the next step for both is Recommendation.

More On Practical RDF

Practical RDF Town Hall "Next, xmlhack editor Edd Dumbill explored how he applies RDF to his personal data integration problems, running personal information through the Friend-of-a-Friend (FOAF) RDF vocabulary, using the Redland framework as a foundation for processing...which has sprouted context features and Python bindings to support this work. "

"In the last presentation, Norm Walsh explained how he was using RDF to make better use of information he already had. Walsh explained that he had lots of data in various devices about a lot of people and projects, but no means of integrating it. Thanks to various RDF toolkits - "just by dumping it into RDF, it just kind of happens for free." Aggregation and inference are easy - and Walsh can get convenient notifications of people's birthdays without duplicating information between a file on a person and a calendar entry noting that."

Monday, December 15, 2003

Corporate Taxonomies

Verity provides standard ways to categorise content "Traditionally, taxonomies have been time-consuming and expensive to set up. A Taxonomy needs to be unambiguous and cover all topics of interest to the organisation. In other words, it has to be Collectively Exhaustive and Mutually Exclusive. Few individuals, not even the company librarian, have the breadth of knowledge of the organisation and its information assets to construct a set of categories that encompasses all information and meets all needs."

"Because a taxonomy reflects the most important knowledge categories of an organisation, organisations that carry out the same business activities need similar taxonomies. (In the same way that such organisations share similar core business processes). This fact and the rising importance of taxonomies to organisations has led Verity to make six tailorable taxonomies available to jump-start the development of an organisation's taxonomy. Verity's six taxonomies suit a range of business activities covering Pharmaceuticals, Defence, Homeland Security, Human Resources, Sales and Marketing, and Information Technology. Organisations that start with these predefined taxonomies can then tailor them to their specific needs. "

Sunday, December 14, 2003

The Winner Takes It All

Power Laws, Discourse, and Democracy "Well, inevitable inequality is one way to characterize the effects of power laws in social networks. But is it the most useful way? Drawing on the same body of research on power laws in social networks, and using similar methods, Jakob Nielsen chose to emphasize instead that, as he put it in a piece published on AlertBox (03.06.16): Diversity is Power for Specialized Sites:"

"Winner-takes-all networks may follow Pareto's Law (the 80/20 rule) with regard to the cumulative distribution of links. But, according to Barabasi in Linked, the distinctive distribution hierarchy of scale free networks will have been broken. Instead, the network takes on what Barabasi describes as a "star topology," in which a single hub snarfs nearly all the links, dwarfing its competitors. "

"It's the dynamics of emergent systems being formalized in open source. It's the fragile and turbulent architecture of democracy.

By contrast, winner-takes-all networks wipe out the middle ground connecting leaders to the network's other players. With this, winner-takes-all networks strip away the architecture that supports the productivity of local niches."

Saturday, December 13, 2003

More Practical RDF

Practical RDF "There are two features of RDF that I find particularly practical: Aggregation [and] Inference".
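
A toy sketch of those two features, under my own assumptions rather than anything from the article: the Triple class and the ex: identifiers are hypothetical, foaf:name and foaf:birthday are used here as plain string labels, and the "inference" is a single hard-coded rule rather than a real reasoner. Aggregation is just the union of triples from two sources that happen to use the same identifier for a person; the rule then joins facts that neither source held on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

class AggregationDemo {

    /** A triple with value equality, so duplicates across sources collapse in a set. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Triple)) {
                return false;
            }
            Triple t = (Triple) other;
            return subject.equals(t.subject)
                    && predicate.equals(t.predicate)
                    && object.equals(t.object);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, predicate, object);
        }
    }

    /** Aggregation: the union of two sources; identical triples are kept once. */
    static Set<Triple> aggregate(List<Triple> first, List<Triple> second) {
        Set<Triple> merged = new LinkedHashSet<>(first);
        merged.addAll(second);
        return merged;
    }

    /** One hard-coded rule: a resource with both a name and a birthday yields a reminder. */
    static List<String> birthdayReminders(Set<Triple> graph) {
        Map<String, String> names = new HashMap<>();
        Map<String, String> birthdays = new HashMap<>();
        for (Triple t : graph) {
            if (t.predicate.equals("foaf:name")) {
                names.put(t.subject, t.object);
            } else if (t.predicate.equals("foaf:birthday")) {
                birthdays.put(t.subject, t.object);
            }
        }
        List<String> reminders = new ArrayList<>();
        for (Map.Entry<String, String> entry : birthdays.entrySet()) {
            String name = names.get(entry.getKey());
            if (name != null) {
                reminders.add(name + ": " + entry.getValue());
            }
        }
        Collections.sort(reminders);
        return reminders;
    }

    public static void main(String[] args) {
        // Hypothetical data: names live in an address book, birthdays in a separate file.
        List<Triple> addressBook = Arrays.asList(
                new Triple("ex:alice", "foaf:name", "Alice"),
                new Triple("ex:bob", "foaf:name", "Bob"));
        List<Triple> personFile = Arrays.asList(
                new Triple("ex:alice", "foaf:name", "Alice"),
                new Triple("ex:alice", "foaf:birthday", "03-14"));

        Set<Triple> merged = aggregate(addressBook, personFile);
        System.out.println(merged.size() + " triples after merging");
        System.out.println(birthdayReminders(merged));
    }
}
```

Neither source needs to know about the other; the merge works only because both use ex:alice for the same person, which is the part that tends to be hard in practice.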

Not Influential or Famous

Myths Open Source Developers Tell Ourselves

Friday, December 12, 2003

Groovy is Out

"Groovy is a powerful new high level dynamic language for the JVM combining lots of great features from languages like Python, Ruby and Smalltalk and making them available to the Java developers using a Java-like syntax."

GPath "When working with deeply nested object hierarchies or data structures, a path expression language like XPath or Jexl absolutely rocks."

The SQL and Markup example also looks interesting.

New Java Tools

Algernon-J is a rule-based reasoning engine written in Java. It allows forward and backward chaining across Protege knowledge bases. In addition to traversing the KB, rules can call Java functions and LISP functions (from an embedded LISP interpreter).

JRDF "A project designed to create a standard mapping of RDF to Java."

Google 2005

Searching With Invisible Tabs "Doesn't the future of search look great? Whatever type of information you're after, Google and other major search engines will have a tab for it!"

Highlights that people can suffer from "tab blindness" and why one UI doesn't suit all (fairly obvious).

Greed is Good for Data Emergence

The Age of Reason: The Perfect Knowing Machine Meets the Reality of Content "In brief, the concept of "data emergence" that is central to this knowledge Nirvana is best summed up by James Snell as "the incidental creation of personal information through the selfish pursuit of individual goals." From Snell's perspective, content value is shackled by dumb Web browsers that are used to share information about individuals with Web sites that then try to "personalize" their content - an experience that must be repeated at each and every Web site visited, since this knowledge about individual interests and preferences is not shared site-to-site. Instead of this, the perfect world would have a "smart" content service, probably on one's PC, that would retain knowledge of all of one's personal profile and interests in accessing content; content providers would then be "dumb" sources pumping information into the smart service, not having any detailed knowledge of who is using their services and how. No more nasty Web site publishers, just one perfect tacit machine that knows exactly what you're thinking and allows you to obtain and share thoughts with others."

"Aggregation can happen anywhere to the satisfaction of many."

Kowari Already Out There

Kowari for RDF developers - Early Release It's already out there - found this when doing a Google search on Kowari. The real site will be Kowari.org, but with open source the source is the real thing, I guess.