Monday, November 29, 2004

IBM boosts Oncology Ontology

IBM and Massachusetts General Hospital Announce Effort to Improve Information Sharing Among Cancer Researchers ""Effective tools for information management, integrated tightly with underlying computing and data infrastructures, are key to life sciences researchers gaining new insights into complex problems," said David Grossman, Distinguished Engineer, IBM Internet Technology Group. "In addition, the use of semantic web technologies to integrate many sources and formats of data with advanced modeling algorithms is particularly helpful for this type of large-scale collaborative project.""

""There is an urgent need to develop a common, unifying infrastructure that enables the integration and sharing of knowledge about cancer -- both in terms of disparate data and distinct computational tools -- with the goal of modeling cancer as a complex dynamic system," said Dr. Deisboeck. "While advances in cancer research and new technologies have generated a wealth of new data and insight, all too often the lack of shared systems and standards makes integration of this crucial knowledge difficult or impossible.""

Python and PHP

The Next Language "The vast majority of J2EE deployments (over 80% according to Gartner) are simply Servlet/JSP to JDBC applications. Basically HTML front-ends to relational databases. It is ironic that much of what makes Java complicated today is all of its numerous band-aid extensions, such as generics and JSP templates, which were added to make these types of simple applications easier to develop."

"Apparently what is needed is a language/environment that is loosely typed in order to encapsulate XML well and that can efficiently process text. It should be very well suited for specifying control flow. And it should be a thin veneer over the operating system."

To be clear, the idea that the future of application development lies in turning applications into "a big text pump" seems rather foreign, and completely opposite to where application development seems to be going.

Requirements and Architecture

Requirements guru shares 'cosmic truths' "Wiegers' list of cosmic truths also includes:

* "Customer involvement is the most critical factor in achieving software quality."
* "The customer is not always right, but the customer always has a point."
* "Change happens."
* "If it’s not in the requirements specifications, don’t expect to find it in the product."
* "Even the best requirements document cannot replace human dialog."
* "You are never going to have perfect requirements.""

Architected RAD gets an A in Gartner study "The Gartner survey of development teams, completed in the past month, found the ARAD approach reduces training time and increases productivity of coders regardless of the vendor tool used.

"We have gotten consistently positive feedback from users of Computer Associates' Advantage:Plex, Compuware's OptimalJ and IBM's Rational Rapid Developer offerings," writes Michael Blechar, one of the Gartner analysts who worked on the survey."

"Of the newest tool technology, the Gartner report says: ARAD methods and tools are just beginning to achieve recognition by mainstream Java 2 Platform, Enterprise Edition (J2EE) and .NET developers. The tools provide development teams with pre-built J2EE and .NET frameworks as well as pre-built technical components, which Gartner says can be customized by technical architects and used to generate 60 to 85 percent of the code. Then the programmers on the development team can add the business logic specific to the application."

Sunday, November 28, 2004

Learning with the Semantic Web

Reasoning and Ontologies for Personalized E-Learning in the Semantic Web "Adaptive educational hypermedia systems are able to adapt various visible aspects of the hypermedia systems to the individual requirements of the learners and are very promising tools in the area of e-Learning: Especially in the area of eLearning it is important to take the different needs of learners into account in order to propose learning goals, learning paths, help students in orienting in the e-Learning systems and support them during their learning progress...We propose a framework for such adaptive or personalized educational hypermedia systems for the semantic web. The aim of this approach is to facilitate the development of an adaptive web as envisioned e.g. in (Brusilovsky and Maybury, 2002). In particular, we show how rules can be enabled to reason over distributed information resources in order to dynamically derive hypertext relations. On the web, information can be found in various resources (e.g. documents), in annotation of these resources (like RDF-annotations on the documents themselves), in metadata files (like RDF descriptions), or in ontologies. Based on these sources of information we can think of functionality allowing us to derive new relations between information. "

Friday, November 26, 2004

Free trade that isn't free

Patently yours "We quite understand that (the title) How to Kill a Country may sound alarmist...We use the parallel experience of Canada to buttress some of these points. Canada is now being described by leading author, Mel Hurtig, as a "Vanishing country"...By the mid 1980s, about half of the major US corporations in Canada were 100-percent American-owned. Ten years later, some 85 per cent had no Canadian shareholders...As Canadian shareholders were eliminated, corporate boards were substantially reduced in size and more American directors were added, as were more U.S. CEOs and board chairmen. As external directors were eliminated, there was no longer a force to influence policy decisions which would be beneficial to Canada. Gone too was the ability to scrutinise the payment of dividends, management fees, and content costs paid to the parent company."

"But the Australian negotiators overlooked the point that Australia is a net importer of IPRs...As a whole, Australian industry has everything to gain by moving away from the Microsoft stranglehold and towards an Open Source mode - rather like governments in Germany and Taiwan are currently doing in earnest...local firms would do well to shift towards the Open Source model, and utilise open source programs such as Linux..."

"...frequently the actions are entirely justified, and entirely in the spirit of competition - as when an importer of copyright-protected CDs seeks them out in a third market and imports them, entirely legally, at a lower cost than is stipulated by the IPR-holder. The FTA makes this action much more difficult - in the name of placing severe restrictions on parallel imports. Another name for this is placing restrictions on free trade in IPR-protected goods - all within a "free trade" agreement!"

Dangling Databases

Why Relational Databases And Semantics Don't Mix "Jarg Corporation, which takes its name from "jargon", is about the next evolution of search. Actually, it is about more than that since semantics has wider applicability than search, but we will stick with search as an easily understood example of this technology."

"So, the question is: how does it do that?

The first answer is that it doesn't - at least in any general sense - only where it has already built an ontology which, in this case, is within healthcare. Indeed, its first customer is a hospital medical library. However, industry knowledge bases are becoming widely available and Jarg reckons that about 75% of the work involved in creating an ontology can be automated, so extending its product for new customers should not be a big issue."

"The second answer is that it achieves this sort of performance by refusing to use a relational database. Instead, it has patented its own approach, which involves storing semantic fragments. A semantic fragment is either two elements and the relationship that joins them (for example, "Waterloo is a station") or it can store an element with a "dangling" relationship. This latter concept is especially important. The whole point about searching is that you want to be able to discover relationships that you didn't know existed."

MEST Architecture

The MEST architectural style "We have both agreed that we shouldn’t call our architectural style ProcessMessage after all. Instead, we decided to call it MEST (MESsage Transfer) so as to recognise the big influence REST had in this work and our thinking in general. So, after all the blog entries and the discussions with the community, we have finally arrived to the MEST architectural style of which ProcessMessage is part."

WWW2005 Tutorial: Architecting and Developing Message-Oriented Web Services "Savas and I have been accepted to present a tutorial at WWW2005 in Chiba next year. We're going to be talking about message-orientation and the MEST architectural style. Our approach is going to be very interactive: We'll be doing head-to-head live coding and will have the audience involved right the way through.

Broadly speaking, we're going to introduce a simple problem domain (probably a simple game), get the audience to work through the domain with us, identifying the services and message exchanges involved, then we'll code up a solution. Once we've got a solution in place we'll break it in various interesting ways and show how various WS-* protocols can help prevent such breakages from occurring."

RDF on Lambda

RDF and Databases "Some RDF research dropped me to a nice paper (PDF) from IBM discussing RDF with relational databases. This combination can replace half-baked application data mechanisms. These crop up regularly in my consulting work. Think nested directories of Windows INI files and brittle, binary files breaking on minor design iterations. The pain, the pain."

"There are several projects in this domain. My favorite so far is OpenRDF Sesame. It supports querying at the semantic level. It seems more mature than others, having derived from previous efforts, and works with both PostgreSQL and MySQL as well as Oracle. An abstraction layer called SAIL makes Sesame database-agnostic. Sesame even sports a stand-alone b-tree system, or in-memory operation, if you don't want an external database."

Thursday, November 25, 2004

One step closer to making a lighter Kowari

An initial port of Sesame 1.1's Rio RDF/XML parser to JRDF has been checked into JRDF's CVS repository. It's basically the same, with a few changes: modifications to the constructor, a SAXFactory that explicitly asks for a reader that will use namespaces, and fewer dependencies on other Sesame classes.

Kowari already uses JRDF to do N3 and RDF/XML exporting. A recent requirement that I was asked about is providing RDF/XML from the result of an iTQL query. The client-side JRDF API allows a JRDF graph to be created from an iTQL answer, so it might be possible to plug this into the exporter classes.

Instructions for getting it are here.

A simple example of using it:

import java.io.InputStream;
import java.net.URL;
import java.util.Iterator;
// The JRDF imports (Graph, GraphImpl, the node types, RdfXmlParser and
// StatementHandler) are omitted; their packages depend on the JRDF version.

public class RdfXmlParserExample {
    public static void main(String[] args) throws Exception {
        String baseURI = "http://slashdot.org/index.rss";
        URL url = new URL(baseURI);
        InputStream is = url.openStream();
        try {
            final Graph jrdfMem = new GraphImpl();
            RdfXmlParser parser = new RdfXmlParser(jrdfMem);
            // Add every parsed statement to the in-memory graph.
            parser.setStatementHandler(new StatementHandler() {
                public void handleStatement(SubjectNode subject,
                        PredicateNode predicate, ObjectNode object) {
                    try {
                        jrdfMem.add(subject, predicate, object);
                    }
                    catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });

            parser.parse(is, baseURI);

            // Print every statement now held in the graph.
            Iterator iter = jrdfMem.find(null, null, null);
            while (iter.hasNext()) {
                System.err.println("Graph: " + iter.next());
            }
        }
        finally {
            is.close();
        }
    }
}


While on the subject of Kowari, one of the resolvers to be included in 1.1 will generate statements based on the latitude and longitude of two points. We used the DAML Geofile linked from Semantic Web Application Integration: Travel Tools.
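The post doesn't say which statements the resolver derives. Distance between the two points is one obvious candidate, so, purely as an assumption, here is a small Java sketch of great-circle distance using the haversine formula. The class name, method name and the mean Earth radius constant are chosen for the example and are not taken from Kowari.

```java
// Hypothetical sketch: not Kowari code, just the haversine formula in plain Java.
class GeoSketch {

    // Mean Earth radius in kilometres (an approximation; the Earth is not a perfect sphere).
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance in kilometres between two points given in decimal degrees.
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // Two points one degree of longitude apart on the equator.
        System.out.println(distanceKm(0.0, 0.0, 0.0, 1.0) + " km");
    }
}
```

A resolver could then turn the returned number into a statement relating the two points, but how Kowari actually models that is not covered here.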

Tuesday, November 23, 2004

15 parts of Classification Theory

Another JOT link, this time the 15 parts on "The Theory of Classification". Here they are: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, and 15.

Creating Requirements

Generating Complete, Unambiguous, and Verifiable Requirements from Stories, Scenarios, and Use Cases "Although very valuable as requirements elicitation, analysis, and initial validation tools, stories, scenarios, and use case path specifications are typically inadequate for specifying requirements because they are incomplete, ambiguous, and therefore unverifiable. For example, they usually do not address preconditions and postconditions, which have a huge influence on the meaning of the requirements. Similarly, they do not tend to state the triggering events that cause them to be true. They also do not typically clarify the distinctions between requirements (i.e., what the system must do and what postconditions it must ensure) and ancillary information (e.g., triggering events produced by actors and preconditions that may or may not be ensured). This column has provided examples and guidance on how to transform stories, scenarios, and use case path specifications into complete, unambiguous, and verifiable textual requirements."

Anti-metadata Google

Dear Mr. Bosworth "I can't believe that a smart guy like you could advocate Podcasting as a solution to the complexity of the semantic web and all that enterprise, corporate complexity - without a hint of meta-data - anywhere.

How's the work? Inference? Osmosis?

I know that Google is known as an anti-meta-data sort of place - but PLEASE oH LORd - get over that!

This is NOT about the religious wars of RSS 1.0 vs RSS 2.0. I couldn't give a dam about rdf, T B-L or any of that semantic web hooey.

I'd just like to see folks standardize on attributes, properties and meta-data around these new, burgeoning forms of micro-content."

Monday, November 22, 2004

Third SIGSEMIS

Volume 1, Issue 3 (PDF) "For the third time AIS SIGSEMIS bulletin is in your hands. Many interesting articles, a featured interview with Tom Gruber, our regular columns as well as several interesting announcements are waiting for your attention."

Interview with Tom Gruber: "In fact, the World Wide Web is based on a semiformal ontology, and it shows how ontological commitment works in software interoperability. At its core, the concept of the hyperlink is based on an ontological commitment to object identity. In order to hyperlink to an object requires that there be a stable notion of object and that its identity doesn’t depend on context (which page I am on now, or time, or who I am). Most of the machinery of the early Web standards are specifications of what can be an object with identity, and how to identify it independently of context. These standards documents serve as ontologies – specifications of the concepts that you need to commit to if you want to play fairly on the Web. If one built a system with these commitments, all of the web infrastructure works well."

"Intraspect was designed on the assumption that it is more valuable to get evidence of human knowledge into a collective memory than to add structure to existing online material. So we created technology that helped people work together on-line, and as a byproduct their work became available for discovery using information retrieval technology."

Other interesting articles: "Component Requirements for a Universal Semantic Web Framework", "Elements of a First Visual Rule Language for the Semantic Web" (about REWERSE), "Response Management in Multidimensional Web Information Systems", "The Semantic Web Trends in Brief", "Using e-business Registry / Repository for E-Health Semantics" and many others.

Your blog is boring and other links

* RSS, Blogs on a Roll, But How Extensive is Their Use? and a response.
* Swebok All you need to know about Software Engineering.
* Best Software Essays of 2004 via Danny.
* ISCOC04 Talk on how RSS 1.0 failed, with lots of other comments that seem to support a RESTful system like RDF. With followups Fielding Bosworth and Quick Reactions.
* The Many Faces of J2EE, v5.0 Another annotations for J2EE 1.5 piece as well as Google and JBoss join SE/EE Java Council.
* Domain Specific Language and Domain Specific Modelling. Is UML really the best tool for the job? The end of UML?

Friday, November 19, 2004

Metadata does Matter

In a similar vein to I.T. does Matter, a recent column entitled Does Meta Data Matter? highlights the competitive advantage that metadata provides in the enterprise.

"Is meta data strategic in nature? Yes. The reality is that information technology continues to get more complex. Our ability to manage these technologies and solutions requires a higher degree of knowledge and management skills. In the dynamic environment we see emerging, command and control style of management fails to deliver a competitive advantage. As a wise man once said, "All great things have been done in spite of management." Our ability to adapt within the technology community may be dependent on our ability to handle multiple tasks, objectives and strategies which can then change on a dime. Meta data plays a central role in your organization's ability to become an agile organization. Moving to common infrastructures, software platforms and even systems does not negate the competitive advantages that technology and meta data can bring."

Keeping abreast of the Semantic Web

Revolutionising breast cancer treatment through knowledge management "The system uses Semantic web technologies, enabling information from X-ray mammograms, MRI images, biopsy results and data from the clinician to be made available when the practitioners meet for their weekly Triple Assessment Procedure. Semantic web technologies allow information to be linked in such a way that it can be easily processed by machines. Practitioners can then view different types of images and scans, call up patient information, and automatically generate reports. It is also possible to investigate, annotate and analyse the data using web and Grid services.

‘This research draws on technologies in which the UK is a world leader,’ says Professor Nigel Shadbolt of the School of Electronics and Computer Science at the University of Southampton. ‘Eventually, e-health will be delivered using the web and incredibly powerful networks of computers. Medical practitioners will have the information and evidence at their fingertips to support decision-making that has a direct impact on us all.’"

More Working Notes

Andrae is outdoing both Paul and me with two blogs: Etymon, which has links to papers, interesting articles and the like, and Circumlocute, which is similar to Working Notes.

When two worlds combine

"Gnowsis and Fenfire have met and another time, great Semantic Web developers (aka us) have proven that using ontologies, RDF and web protocols RULEZ. In just a day, two open source projects made substantial integration work."

An early screenshot of Fenfire and Gnowsis combining their efforts.

Google Scholar

Hits for the Semantic Web gives you that piece in Scientific American. Another interesting one is citations to "The Description Logic Handbook: Theory, Implementation and Applications".

Sesame 1.1

Sesame 1.1 released "Highlights of this release are:

* The Graph API, an extension of Sesame's access APIs, allows fine-grained manipulation of RDF models directly from Java.
* The Native Disk Store is a new storage backend that works directly on the file system, without need for a DBMS. It uses B-Tree indexing on binary files for fast, efficient and scalable storage.
* SeRQL revision 1.1 is a syntax revision that makes SeRQL queries even easier to read and write, and makes embedding in XML easier.
* Blank node handling has dramatically improved compared to 1.0.x.
* Lots of issues related to full Unicode support have been fixed.
* RDF Schema inferencing has been updated to be fully compliant with the W3C RDF Semantics Recommendation.
* Support for MS SQL Server as storage backend RDBMS. Thanks to Adam Skutt for providing fixes and suggestions for this.
* The Rio parser now supports the Turtle serialization format.
* Partial OWL reasoning support through Sesame's custom inferencer.
* Fully updated and extended User Documentation, including code examples for use of the Sesame APIs and a new Troubleshooting and FAQ chapter."

Thursday, November 18, 2004

What's new with Java 6

Sun invites outside involvement with Java 6 "The new version will be easier to manage, exposing information that outside management software can use to make control decisions, said Mark Reinhold, chief J2SE engineer. And it will be easier to find problems, with an "attach-on-demand" feature that can let debugging software graft onto software while it's running instead of just before it's launched.

Another item on the list is support for a basic set of Web services called WS-I, Hamilton said. That basic set, standardized through the Web Services Interoperability organization, had been scheduled for the Tiger release.

And Mustang will have better integration with graphical user interfaces, including Microsoft's upcoming Longhorn version of Windows, Reinhold said. "

Details here.

Tuesday, November 16, 2004

The Problem with Ontologies

The Ontology Problem: A Definition with Commentary "If the Ontology Integration Problem is not solved it will not be possible to answer a semantic search query across the open Web for a question such as "find all software products that work with Linux and are open-source and are endorsed by people or companies I trust." Why not? Because while there could be tons of raw RDF and OWL instance data out on the Web that is relevant from various ontologies, unless it either all uses the same ontology or all the ontologies that various instances refer to are integrated, the query agent will have no way of making sense of or normalizing the results. Of course, the query agent could simply run the query on all data from all ontologies it knows about, and then just present the results in a single list, sorted by ontology -- but as we've seen above, different ontologies might mean different things by classes with the same names -- and thus the results returned may not really be relevant or well-ordered."

"I believe the solution will ultimately stem from a solution to the Upper Ontology Problem -- if we can solve that problem, then much of the Ontology Integration Problem will go away as most ontologies will automatically be inter-mapped at the Upper-Ontology Level at least. If we had a standard Upper Ontology and furthermore, if this standard were to include concepts for mapping between ontologies and expressing shades of semantic mapping and structure between ontological definitions, then mapping would be even easier."

Metadata Driven Component Model

BEA Announces Open-Source Milestones for Apache Beehive "BEA Systems, Inc. (Nasdaq: BEAS - News), a world leader in enterprise infrastructure software, today announced significant milestones in the company's open-source efforts including code release milestones, updated tools and additional platform support for Apache Beehive -- the lightweight metadata-driven component model which is designed to help accelerate the development of service-oriented architectures (SOAs)."

"The M1 code release for the Apache Beehive project is now publicly available for use in both Beehive open-source development and by BEA WebLogic Workshop 8.1 users, and can help developers to begin developing and collaborating on SOA-based applications."

The Apache Beehive Project "This is the project working on making J2EE easier by building a simple object model on J2EE and Struts. The goal is to take the new JSR-175 metadata annotations and use them to reduce the coding necessary for J2EE. The initial Beehive project has three pieces."

Related Pollinate Project "Pollinate is an Eclipse technology project slated to build an Eclipse-based IDE and toolset that leverages the open source Apache Beehive application framework." The PDF slides have some more information on NetUI Page Flows.

Via BEA announces Apache Beehive Milestone

Monday, November 15, 2004

HAVING

One of the features we recently added to iTQL was HAVING. This is practically identical to SQL's use of HAVING. For example:
SELECT $foo COUNT (SELECT $bar
                   FROM ...
                   WHERE $bar <-> <->)
FROM ...
WHERE $foo <-> <->
HAVING $k0 <tucana:occurs> '1.0'^^<xsd:double> ;

There are a few things that bother me about this. The first is the implicit column names. All aggregate functions in iTQL are implicitly given the name $kn, where n is an integer. The user should be able to set the variable name, with something like "SELECT $no_people=COUNT ..." or by copying SQL-92's use of AS.

Another is caused by copying SQL. Putting constraints in the WHERE clause that were meant for the HAVING clause produces an error from an SQL interpreter explaining that certain constraints must be in the HAVING. What this really means is that the constraint could have been written in the WHERE clause and automatically extracted when necessary.

Also, the use of double should really be changed to nonNegativeInteger.

A good summary of the first two points is given in the presentation "The Importance of Column Names". It is part of the web site for the new version of "The Third Manifesto".

Here's an example:
SELECT D#, COUNT(*)
FROM EMP
GROUP BY D#
HAVING COUNT(*) >= 50

In SQL:1992 this is equivalent to:
SELECT *
FROM ( SELECT D#, COUNT(*) AS NUMBER_OF_EMPS
       FROM EMP
       GROUP BY D# ) AS TEETH_GNASHER
WHERE NUMBER_OF_EMPS >= 50

The Tutorial D version:
( SUMMARIZE EMP BY { D# } ADD COUNT ( ) AS NUMBER_OF_EMPS )
WHERE NUMBER_OF_EMPS >= 50
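
The equivalence above, that HAVING is just an ordinary restriction applied to a named summary column, can also be sketched in Java. This is only an illustration: the Emp class, the department numbers and the threshold are made up for the example, and it uses plain collections rather than any database API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch: names and data are illustrative, not from any real schema.
class HavingSketch {

    // One row of a made-up EMP relation; only the department number matters here.
    static final class Emp {
        final String deptNo;
        Emp(String deptNo) { this.deptNo = deptNo; }
    }

    // Step 1 (SUMMARIZE ... ADD COUNT AS NUMBER_OF_EMPS): group rows by department and count them.
    static Map<String, Long> numberOfEmps(List<Emp> emps) {
        return emps.stream()
                .collect(Collectors.groupingBy(e -> e.deptNo, TreeMap::new, Collectors.counting()));
    }

    // Step 2 (WHERE NUMBER_OF_EMPS >= min): a plain restriction over the summary.
    static Map<String, Long> atLeast(Map<String, Long> counts, long min) {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            if (entry.getValue() >= min) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Emp> emps = new ArrayList<>();
        for (int i = 0; i < 60; i++) emps.add(new Emp("D1"));
        for (int i = 0; i < 10; i++) emps.add(new Emp("D2"));
        // Only departments with at least 50 employees survive the restriction.
        System.out.println(atLeast(numberOfEmps(emps), 50));
    }
}
```

Because the count has a name of its own after step 1, the filter in step 2 needs no special HAVING construct, which is the point the presentation makes about column names.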

Closing the World

'Closed world' assumptions in RDF "Could we not use the XML declaration attribute standalone to determine that a document is self-contained? As in:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>


That assumption would possibly eliminate the usefulness of that document in a Semantic Web, but could potentially make the 'open world' issue controllable."

RDF/XML is just one serialization of RDF, so modifying the XML doesn't change the RDF contained within. While I don't think you can make RDF closed world, things like ontologies do let you express "all pigs don't fly", or you can perform a query such as "do any pigs in this graph fly?", which may have been what you wanted anyway.
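
To make the distinction concrete, here is a minimal Java sketch. The Triple class and the ex: names are invented for the example, and it is not tied to any RDF library. The same lookup is read two ways: a closed-world reading treats a missing statement as false, while an open-world reading can only report that the graph does not say.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a toy list of triples, not a real RDF API.
class WorldAssumptionSketch {

    static final class Triple {
        final String subject, predicate, object;
        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    enum Answer { YES, NO, UNKNOWN }

    // "Do any pigs in this graph fly?": true only if some subject is typed as a pig
    // and the same subject has an explicit ex:canFly "true" statement.
    static boolean anyPigFlies(List<Triple> graph) {
        for (Triple type : graph) {
            if (type.predicate.equals("rdf:type") && type.object.equals("ex:Pig")) {
                for (Triple t : graph) {
                    if (t.subject.equals(type.subject)
                            && t.predicate.equals("ex:canFly")
                            && t.object.equals("true")) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    // Closed world: what is not stated is taken to be false.
    static Answer closedWorld(List<Triple> graph) {
        return anyPigFlies(graph) ? Answer.YES : Answer.NO;
    }

    // Open world: what is not stated is simply not known.
    static Answer openWorld(List<Triple> graph) {
        return anyPigFlies(graph) ? Answer.YES : Answer.UNKNOWN;
    }

    public static void main(String[] args) {
        List<Triple> graph = Arrays.asList(
                new Triple("ex:babe", "rdf:type", "ex:Pig"),
                new Triple("ex:babe", "ex:name", "Babe"));
        System.out.println("closed world: " + closedWorld(graph));
        System.out.println("open world: " + openWorld(graph));
    }
}
```

Querying a single graph, as suggested above, amounts to choosing the closed-world reading for that one question without changing what the RDF itself means.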

Manufacturing Semantics

A question of semantics "“I believe the big trend right now in terms of integration of business systems is the movement toward the semantic Web,” says Steven Ray, division chief for the Manufacturing Systems Integration Division (MSID), a division within the National Institute of Standards and Technology (NIST). “The semantic web gives meaning to information—it makes that information formal and acceptable to a computerized system to allow truly intelligent searching.”
According to Ray, the semantic Web is more than a structured database. Today, companies rely on programmers to give meaning to information. For example, if an enterprise resource planning (ERP) company wants to work closely with suppliers to monitor inventory levels, a programmer must complete integration for each new supplier. Over time, this can become quite costly."

"“These standards are the primary building blocks to intelligent and integrated manufacturing,” says Pat Snack, on executive loan from General Motors for AIAG. “We will see a growing need for semantic standards in coming months. The closer we get to end-to-end integrated manufacturing, the more we need to minimize our cost to complexity. It is really a delicate balancing act.”"

Sunday, November 14, 2004

Shortcomings of Spotlight

Tiger's Spotlight - Simplicity with Room for Improvement "While I'm not doubting the success of Apple marketing this as a world-class search tool, from what I know today, there are some shortcomings:
* Only one Importer registrable per File type (.extension) system-wide.
* No daisy-chaining of Importers.
* Very limited amount of data to be stored in dictionary per file.
* No built-in capability to index the content of compressed files like jars, zip, tar etc."

This follows a recent article, Apple details plans to Spotlight desktop search. If you want to try something similar, there's also Quicksilver, which runs on 10.3 and comes with a Spotlight plugin called Flashlight.

Annotations and EJB 3.0

EJB 3.0 Preview " The EJB 3.0 specification uses annotations so that you can declare your EJB metadata directly within the bean class.


import javax.ejb.*;

@Stateful
public class ShoppingCartBean implements ShoppingCart
{
    @Tx(TxType.REQUIRED) @MethodPermission({"customer"})
    public void purchase(Product product, int quantity) {...}

    @Remove void emptyCart() {...}
}

The @Stateful annotation marks the ShoppingCartBean as a stateful session bean. @Tx denotes transaction demarcation, while @MethodPermission defines role-based security for the bean method. EJB 3.0 provides annotations for every type of metadata so that no XML descriptor is needed and you can deploy your beans simply by deploying a plain old JAR into your application server."
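
As a rough illustration of how a container could pick up metadata like this, the following sketch defines its own annotation and reads it back with reflection. Role and Cart are invented for the example and are not the EJB 3.0 annotations; only the java.lang.annotation and java.lang.reflect calls are standard.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical sketch: @Role stands in for a security annotation; it is not part of EJB.
class AnnotationSketch {

    // RUNTIME retention keeps the annotation visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Role {
        String value();
    }

    static class Cart {
        @Role("customer")
        public void purchase(String product, int quantity) { }

        public void emptyCart() { }
    }

    // Returns the role declared on the named method, or null if it has none.
    static String requiredRole(Class<?> type, String methodName) {
        for (Method method : type.getDeclaredMethods()) {
            if (method.getName().equals(methodName)) {
                Role role = method.getAnnotation(Role.class);
                return role == null ? null : role.value();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("purchase requires: " + requiredRole(Cart.class, "purchase"));
        System.out.println("emptyCart requires: " + requiredRole(Cart.class, "emptyCart"));
    }
}
```

The metadata lives next to the code it describes, and whatever deploys the class can discover it without a separate XML descriptor.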

Friday, November 12, 2004

Never trust a company with a fish tank

EA: The Human Story "...all along the way there were deceptions, there were promises, there were assurances -- there was a big fancy office building with an expensive fish tank -- all of which in the end look like an elaborate scheme to keep a crop of employees on the project just long enough to get it shipped. And then if they need to, they hire in a new batch, fresh and ready to hear more promises that will not be kept; EA's turnover rate in engineering is approximately 50%. This is how EA works. So now we know, now we can move on, right? That seems to be what happens to everyone else. But it's not enough. Because in the end, regardless of what happens with our particular situation, this kind of "business" isn't right, and people need to know about it, which is why I write this today."

That goes for virtual ones too.

The Floggings Will Continue Until Morale Improves "The list could be much longer, but it boils down to:
* Give them the tools to do their job efficiently
* Remove potential interruptions or distractions
* Make sure they’re motivated"

Always the last to know

NetKernel + Kowari "NetKernel has a great pipes/filters framework for composing services. Kowari is the only non-rdbms backed RDF triple-store with support for queries against datetime data types.

Sadly, building a web application or service in Kowari kinda sucks. But, theoretically, Kowari is embeddable. So I set out to verify this assertion and to gain more knowledge about the inner workings of NetKernel."

"Fourth, NetKernel is really just that, a kernel for managing the interaction and scheduling between components executing in the vm, called modules. But I haven't found support for explicit life-cycle management. That is, there is no init(), start() or stop() type methods on a module. It seems to rely on finalizers to clean up resources. Thus, shutting down a NetKernel instance corrupts the Kowari database."

As I said in the comments, previously committed data will be there whether you kill the process, turn off the power or whatever. However, if you stop it during a load, or if autocommit is off, the uncommitted data won't be there when restarting.
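
One common way to get explicit clean-up in plain Java, without relying on finalizers, is a JVM shutdown hook. The sketch below is only that: Store is a stand-in for any resource that needs closing, not a Kowari or NetKernel class. Note that a hook runs on normal exit or an interrupt such as Ctrl-C, not on a hard kill or power loss.

```java
// Hypothetical sketch: Store stands in for any resource that must be closed cleanly.
class ShutdownHookSketch {

    static final class Store implements AutoCloseable {
        private boolean open = true;

        boolean isOpen() { return open; }

        @Override
        public void close() {
            // A real store would flush and release its files here.
            open = false;
            System.out.println("store closed");
        }
    }

    // Registers a hook that closes the store at JVM shutdown and returns the hook thread.
    static Thread closeOnShutdown(Store store) {
        Thread hook = new Thread(store::close, "store-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        Store store = new Store();
        closeOnShutdown(store);
        System.out.println("working, store open: " + store.isOpen());
        // When main returns, the JVM begins shutdown and runs the hook.
    }
}
```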

New from Google

Coincidentally, Google's Index Size Jumps and Search Engine Size Wars V Erupts "On the eve of Microsoft's long anticipated launch of MSN Search, Google is reporting on its home page that its index size has nearly doubled. Google now claims that it is now "Searching 8,058,044,651 web pages." Earlier today, a search for the word "the" returned nearly 11 billion results, a far larger number than officially reported on the home page. No matter which numbers you believe, it's a significant expansion of Google's web database."

"Microsoft had planned to seize the title of biggest search engine by announcing 5 billion pages indexed today."

And the people at Google "abusing" their power Tweaking the tiger's tail. It's still there. A search for kowari only brings up references to the RDF triple store and not the animal. Google brings up Kowari the animal as the second result.

Thursday, November 11, 2004

Too many or not enough

The Ontological Challenge "There are several big missing pieces right now in making the semantic web. Certainly the lack of ontologies is a major issue. There are, I guess Deborah would say thousands of ontologies. So there maybe isn't a lack; there may be too many from one perspective. When you start looking at these ontologies, what you find is that some of them are overly specialized; maybe they are focused, for example, on particular niches of interest to DARPA, not particularly of great use to consumers unless you live in New York (with the paranoia that we all experience there)."

"Currently, there is no good human-readable mid-level ontology that's covering common-sense concepts. Cycorp has probably the most impressive ontology. The only problem is it's so big and complex and requires such a high, steep learning curve to actually do anything with it that it's not really targeted at the needs of normal developers and regular end users. The lack of the good, open ontology that covers common-sense concepts is a big problem. That's something we're working on, too. I think that ultimately there ought to be at least something like that that comes out of the W3C or is handed to the W3C at some point to at least provide a basis for describing certain types of entities and relationships that we all have to use in our applications."

"So associating data with ontologies is a problem. Building ontologies, I come from the school of thought of top down. I've never seen a bottom-up ontology that I liked. There aren't many. Having built much of ontologies, I think that the amount of thinking that goes into it is just so intensive that to do it well, I just don't think that, at least without great AI, we'll be able to do it anytime in the next couple of decades."

Wednesday, November 10, 2004

Welkin

Welkin: A General-Purpose RDF Browser "Many consider the Semantic Web to be vaporware and others believe it's the next big thing. No matter where you stand, a question always pops up: Where is the RDF browser? The SIMILE Project, a joint project between W3C, MIT and HP to implement semantic interoperability of metadata in digital libraries, released today the first beta release of a general purpose graphic and interactive RDF browser named Welkin (see a screenshot), targeted to those who need to get a mental model of any RDF dataset, from a single RSS 1.0 news feed to a collection of digital data."

Welkin Homepage.

Creative RDF in Queensland

Creative Commons taking shape in Australia "The Australian branch of the Creative Commons is taking shape with the Queensland University of Technology being the lead agency, according to Professor Brian Fitzgerald, head of the university's law school.

In February this year, QUT became the Australian institutional affiliate for the project and over the last few months it has worked closely with the legal firm Blake Dawson Waldron to set up the platform for the project in Australia.

The University is holding a conference in January next year on Open content licensing, and has invited Stanford University Law Professor Lawrence Lessig, one of the directors of the Creative Commons, as its keynote speaker."

6 months with Java 5

"I have been writing JDK 5.0 code for over six months now, so I thought I would take some time to reflect on my experience and draw a few conclusions on the features that were introduced."

Surprisingly, the favourite is the enhanced for loop:
"The undisputed winner. I can't even begin to describe how good it feels to use the new for loop everywhere (well, almost everywhere). I mentally cringe the few times when I am forced to use the old for loop, typically when I need the index or that I want the Iterator to be visible outside the loop."

Unsurprisingly, annotations get a rave:
"Obviously, I am partial to annotations since they are at the heart of TestNG but I am a firm believer that annotations are going to change the way we build software in Java. We have been relying for far too long on reflection hacks to introduce meta-data in our programs, and annotations are finally going to provide an excellent solution to this problem.

Also, I haven't felt the need to use some of the predefined annotations such as @Override, so I haven't formed an opinion on them yet.

It seems inescapable to me that in a couple of years, most of the Java code that we will be reading and writing will contain annotations. "
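
To make the "meta-data without reflection hacks" point concrete, here is a rough sketch of a custom annotation read back at runtime. The @Check annotation and the classes around it are hypothetical; this is not TestNG's API, just the general mechanism Java 5 provides.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker annotation; not TestNG's actual annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Check {
    String group() default "default";
}

class SampleChecks {
    @Check(group = "fast")
    public void parsesInput() { }

    @Check
    public void savesOutput() { }

    public void helper() { }    // not annotated, so the scanner skips it
}

class AnnotationScanner {
    // Returns "method:group" for every method carrying @Check, sorted by name.
    static List<String> annotatedMethods(Class<?> type) {
        List<String> names = new ArrayList<String>();
        for (Method m : type.getDeclaredMethods()) {
            Check check = m.getAnnotation(Check.class);
            if (check != null) {
                names.add(m.getName() + ":" + check.group());
            }
        }
        Collections.sort(names);    // reflection order is unspecified
        return names;
    }

    public static void main(String[] args) {
        System.out.println(annotatedMethods(SampleChecks.class));
    }
}
```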

And generics:
"In a nutshell, I have this to say about Java generics: my code feels more robust, but it's harder to read."

JDK 5 in Practice.
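
The "more robust, but harder to read" verdict on generics is easy to illustrate. The method below is my own small example, not taken from the article: the signature takes a moment to parse, but the compiler now rejects collections whose elements cannot be compared.

```java
import java.util.Collection;

class GenericsDemo {
    // Bounded type parameter plus wildcards: safer, but not pretty.
    static <T extends Comparable<? super T>> T max(Collection<? extends T> items) {
        T best = null;
        for (T item : items) {
            if (best == null || item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;    // null when the collection is empty
    }
}
```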

Also an interview with one of the authors of Hibernate:
"Well, we are a bit stuck. We can't use many of the new features, because Hibernate needs to stay source-level compatible with older JDKs. The annotations stuff is okay, because we can provide it as an add-on package.

Certainly, annotations are the most significant new feature of Java 5, and it's very likely that they will completely change the way we write code."

The problem with SOAP

Web Services - The SCRAM Generation "SOAP - the bedrock of our industry's plans for universal interoperability. The cornerstone of enterprise mission-critical integration."

"SOAP - a communications protocol that does not guarantee delivery of messages, does not guarantee what order those messages will be delivered in, and does not guarantee not to throw in extra messages, just for the fun of it. Now to be fair, SOAP is only half of the problem. The other half is the communications channel itself, which is typically HTTP."

"SCRAM stands for Secure, Coordinated, Reliable Asynchronous Messaging"

SPARQL Protocol

SPARQL Protocol for RDF "This document describes SPARQL, a protocol for accessing RDF data and for conveying RDF queries from query clients to query processors. The SPARQL Protocol has been designed for compatibility with the SPARQL Query Language for RDF but is designed to convey queries from other RDF query languages as well."

"The SPARQL abstract protocol has three parts: types, operations, and responses."

"The abstract protocol uses the following types, which fall into three categories: W3C XML Schema types, which are relevant to all abstract protocol operations and borrowed from W3C XML Schema Datatypes; protocol types, which are relevant to abstract protocol operations; and query types, which are relevant to query operations."

"The SPARQL Abstract Protocol defines three kinds of response: success, fault, and informational. Success responses indicate that the protocol operation was successfully executed; fault responses indicate that the protocol operation was not successfully executed; informational responses either provide additional information about an abstract protocol operation or describe specific conditions relevant to particular abstract protocol operations."

"The SPARQL Abstract Protocol is made up of seven orthogonal operations: query, getGraph, getOptions, makeGraph, dropGraph, addTriples, and deleteTriples."

Monday, November 08, 2004

The time for change

J2EE5, EJB3 and microcontainers "Today, java has support for annotation since JDK5.0. We are already using it to transform the way we think about, design, implement and indeed standardize middleware in the next generations of J2EE. Where J2EE has a big edge over .NET is first in a large installed base but most importantly in the quality of the services."

"We have been doing the right thing in the spec committees, namely simplifying the programming models to support POJO and annotations at the specification level by leveraging JDK5.0. Across the board in EJB3 for example, we have completely revamped and simplified the way developers interface with and program to middleware. Instead of complex API's and tons of XML, developers can tag their objects with annotations. Developers have already adopted this approach."

"EJB3 will have a long life, as it is the first to introduce this long awaited microcontainer, lightweight programming view of the world. EJB services can now be used on J2SE with these microcontainers."

"All of this is already used in production, today. Many vendors are on the market, many packages await branding standardization for further penetration of the market. J2EE will remain strong."

No-Nonsense Semantic Web Part II

A No-nonsense Guide to Semantic Web Specs for XML People "I was going to talk about RDQL in this article, but on Oct. 12, the Data Access Working Group (DAWG, pronounced so that it rhymes with 'dog') released the first working draft of Sparql, the query language for RDF."

"So, as a result, the query will return the title and the price of items where the price is less than 30.

Big deal, I hear you saying. I can do that today in SQL.

True, you can. If your data is local and you control it. But what if you want a software agent to do the queries for you? How are you going to find out across different databases how to adapt your query to their own internal logic, to their tables and to the way they modeled the information in their relational model?"
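
What the quoted query does (match title and price statements about the same item, then keep the cheap ones) can be mimicked over a handful of in-memory triples. This is a toy sketch with made-up data and made-up predicate names ("title", "price"); it is not SPARQL and not any RDF library's API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A triple of plain strings; real RDF terms are URIs and typed literals.
final class Triple {
    final String s, p, o;

    Triple(String s, String p, String o) {
        this.s = s;
        this.p = p;
        this.o = o;
    }
}

class PriceQueryDemo {
    // Joins "title" and "price" statements on their subject and keeps
    // the rows whose price is below the limit.
    static List<String> titlesCheaperThan(List<Triple> data, double limit) {
        Map<String, String> titles = new LinkedHashMap<>();
        Map<String, Double> prices = new LinkedHashMap<>();
        for (Triple t : data) {
            if (t.p.equals("title")) titles.put(t.s, t.o);
            if (t.p.equals("price")) prices.put(t.s, Double.valueOf(t.o));
        }
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> e : titles.entrySet()) {
            Double price = prices.get(e.getKey());
            if (price != null && price < limit) {
                rows.add(e.getValue() + " " + price);
            }
        }
        return rows;
    }
}
```

The point of the article stands: the filter is trivial, and the hard part is agreeing on what "title" and "price" mean across data you don't control.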

"So, in short: should you care about RDF? For now, you are safe if you care about keeping your own data valid and coherent. The semantic web is trying hard to unlock the chicken-egg problem of "no killer app until data, no data until killer app" and automatic trasnformation of existing data into RDF is what I think is going to unlock it. Also, the fact that we are building tools that you can now use to operate on your RDF data, for example to browse and search it, will show you what you can gain by making those relationships explicit."

Thursday, November 04, 2004

Parallel Countries

The similarities between the Australian and American political situations were recently highlighted by "Bush & Howard: parallel lives":

"As John Howard ponders the likely return of George Bush, he will no doubt be struck by the parallels between the President's probable re-election and his own emphatic victory last month."

Similarities:
* Both have control in the upper and lower houses to pass laws as long as they can maintain party lines.
* Running trade deficits - Australian Trade Deficit Widens as Imports Increase
* Banning abortion. Sarah Maddison: Minister mistakes opinion for fact.
* Increasing surveillance powers. Ruddock relaunches terror laws.
* Tax cuts for the higher tax brackets. Howard flags new tax cuts.
* Reducing gay rights. THE COALITION PLANS TO REINTRODUCE LEGISLATION BANNING SAME-SEX COUPLES FROM OVERSEAS ADOPTION NOW IT HAS CONTROL OF THE SENATE.
* Environment. We won't Sign Kyoto Treaty, Says Australia

SWOOP 2.2 beta 3

SWOOP

New Features:
"1. Integrated Annotea Client for writing and sharing annotations on
any ontology/class/property/individual. (see Advanced->Annotea) - you
can attach and distribute ontology change sets as well
2. Ontology debugging support (check Debug) through better
explanations via Pellet, and highlighting inconsistent classes / class
expressions.

There are a bunch of minor new features (e.g. improved lookup, manual
checkpoints, more XSD support etc), UI improvements (esp. w.r.t adding
new entities, restrictions and Abstract Syntax/Turtle renderers), and
plenty of bug fixes, all noted in readme.txt."

SWOOP v2.2 beta 3 released

Screenshots of versioning and inline Koalas.

Martin Fowler on OOPSLA 2004

OOPSLA2004 "Four patterns were voted off. Factory Method (due to the confusion over what it means - the pattern is different to the more common usage of the term), Bridge, Flyweight, and Interpreter. Two patterns, Singleton and Chain of Responsibility, were split decisions.

I found the votes and discussions interesting. Of course it's hard to really consider this without looking at alternatives. I was surprised that Singleton got away with a split decision, considering how unpopular it's become amongst my friends. Most of the others were voted off because people felt they were sufficiently uncommon and that other patterns would probably take their place, sadly we didn't have time to consider new members."

"Steve McConnell opened up day two. His talk focused on the ten year gap between the first and second edition of his excellent book Code Complete. Brian Foote summed it up as a "litany of things I agreed with". My sum up would be that the industry has made real progress in this area during the last decade, despite the usual views of skies falling in around us."

"Attacking software-as-manufacturing was particularly timely since every OOPSLA attendee got a copy of Software Factories in their goody bag. I've been aware of this work for a while, and my instinctive allergy to this metaphor was only reinforced by the "we need industrialization" motivations. Dig deeper, however, and there are good ideas in here - particularly the approach of integrated DomainSpecificLanguage capabilities."

It's good to see this was discussed, as the two books I recently acquired were "Code Complete" and "Software Factories".

Mobile Semantic Web

A Mobile Web That Knows All About You "MyCampus consists of several task-specific agents that automatically capture contextual information. Each MyCampus user has a database, called a "Semantic eWallet," which is a repository for users' personal information, such as class schedules, list of friends and classmates, and lifestyle and event preferences. Location data is generated using Pango's WiFi access-point triangulation. All the data is marked with Semantic metadata so that MyCampus agents can make use of it. Users can set access privileges to allow certain people to know where they are at any given moment, or what their schedule for the upcoming week is."

It uses some projects I hadn't heard of before: "OWL inference engine using XSLT and JESS" and "ROWL: Rule Language in OWL and Translation Engine for JESS".

Via Semantic Web on mobile devices

PsychiatryOnline to use Silverchair Content Manager

American Psychiatric Publishing Partners With Silverchair to Develop DSM-IV-TR and PsychiatryOnline "With the early 2005 launch of PsychiatryOnline, the complete content of DSM-IV-TR® -- the world's most widely used psychiatric reference -- will be integrated with other essential psychiatry resources through Silverchair Content Manager(TM)(SCM) -- Silverchair's semantic Web platform for clinical references. PsychiatryOnline will provide mental health professionals with comprehensive, intuitive access to the world's leading references..."

That's the little "s" semantic Web: it's not using RDF, just XML-based technologies.

A bit of left thinking

As conservative politicians continue to hold power across the world, another left vs right debate has been brought up.

Driving on the umm, Right "If human society is of predominantly right-handed people and right-handers tend to the right, then I think Britain got it wrong; we should all drive on the right."

Why do the English drive on the "wrong" side of the road? "In days of old logic dictated that when people passed each other on the road they should be in the best possible position to use their sword to protect themselves. As most people are right handed they therefore keep to their left. This practice was formalised in a Papal Edict by Pope Benefice around 1300AD who told all his pilgrims to keep to the left."

As usual, America copied off the French:
"The connection with the USA is thought to be General Lafayette who recommended a keep right rule as part of the help that he gave the Americans in the build up to the war of Independence. The first reference to keep right in USA law is in a rule covering the Lancaster to Philadelphia turnpike in 1792."

"Which side of the road do they drive on?" has a map of left vs right, a section on pedestrian movement, and lists a place in America where they drive on the left: "There is a rather dramatic segment of Interstate 5 where one drives on the left. It is on the Five Mile Grade coming into the Los Angeles area from the north. Because there are four lanes going in each direction, the separation is several miles long, and the two roadways are on opposite sides of a canyon, the effect is quite impressive."

Wednesday, November 03, 2004

AntFlow

AntFlow: Hotfolder Driven Workflow and Automation based on Ant "AntFlow builds upon Apache Ant to provide a new approach to simplifying system automation that uses pipelines of hot folders chained together to perform a given task...Hot Folder Triggers - AntFlow allows users to automate processes based on a file being added or changed within a folder on disk. This provides an intuitive and user-friendly way for end-users to interact with applications and can greatly simplify integration between applications that can output and ingest files."

A related article, Patterns and Strategies for Building Document-Based Web Services.
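
The hot-folder trigger AntFlow describes boils down to noticing files that are new or changed in a directory. Below is a minimal polling sketch of that idea; the HotFolder class is a hypothetical name of mine and says nothing about how AntFlow itself is implemented.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical polling watcher; AntFlow's own implementation may differ.
class HotFolder {
    private final Path dir;
    private final Map<Path, FileTime> seen = new HashMap<>();

    HotFolder(Path dir) {
        this.dir = dir;
    }

    // Returns the files that are new or modified since the previous poll.
    List<Path> poll() throws IOException {
        List<Path> changed = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path file : entries) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                FileTime stamp = Files.getLastModifiedTime(file);
                // put() returns the previous timestamp, or null on first sight.
                if (!stamp.equals(seen.put(file, stamp))) {
                    changed.add(file);
                }
            }
        }
        return changed;
    }
}
```

Each path returned by poll() would be handed to the next step in the pipeline. The JDK's java.nio.file.WatchService offers an event-driven alternative to polling.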

The Greatest Democracy of All

What the U.S. can learn from India's electronic voting machines. "While we in the United States agonize over touch screens and paper trails, India managed to quietly hold an all-electronic vote. In May, 380 million Indians cast their votes on more than 1 million machines. It was the world's largest experiment in electronic voting to date and, while far from perfect, is widely considered a success. How can an impoverished nation like India, where cows roam the streets of the capital and most people's idea of high-tech is a flush toilet, succeed where we have not?"

"Unlike the machines used in the United States, the Indian machines are not networked. Each one has to be physically carried to a central counting center. This takes more time, of course, but reduces the opportunities for mischief. Someone who wanted to throw the election would have to fiddle with thousands of machines, one at a time."

"Or, as the Russians might put it: Why build a million-dollar pen when a pencil will do?"

Another site, E-Voting News and Analysis, from the Experts, lists the current problems occurring in the various US states.

It seems weird that the US votes on a weekday. In Australia they picked the election day so that it didn't conflict with the football finals.

Monday, November 01, 2004

Predicate Dispatch

JPred: Practical Predicate Dispatch for Java "With predicate dispatch, a method implementation may specify an arbitrary predicate as a guard. A method m1 overrides another method m2 if m1’s predicate logically implies m2’s predicate. Ernst et al. provide a number of examples illustrating how predicate dispatch unifies and generalizes several existing language concepts, including ordinary OO dynamic dispatch, multimethod dispatch, and functional-style pattern matching."
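
The guard idea can be emulated in plain Java, though only roughly. In the sketch below, PredicateDispatcher and GuardedCase are hypothetical names of mine, and registration order stands in for the logical-implication check that decides overriding in the paper: the most specific guard has to be registered first by hand, whereas JPred works that out statically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// One guarded implementation: a predicate plus the body it protects.
final class GuardedCase<A, R> {
    final Predicate<A> guard;
    final Function<A, R> body;

    GuardedCase(Predicate<A> guard, Function<A, R> body) {
        this.guard = guard;
        this.body = body;
    }
}

// Hypothetical emulation; not JPred syntax and not statically checked.
class PredicateDispatcher<A, R> {
    private final List<GuardedCase<A, R>> cases = new ArrayList<>();

    // Register cases from the most specific guard to the least specific.
    PredicateDispatcher<A, R> when(Predicate<A> guard, Function<A, R> body) {
        cases.add(new GuardedCase<>(guard, body));
        return this;
    }

    // Runs the first case whose guard holds for the argument.
    R invoke(A arg) {
        for (GuardedCase<A, R> c : cases) {
            if (c.guard.test(arg)) {
                return c.body.apply(arg);
            }
        }
        throw new IllegalArgumentException("no applicable method for " + arg);
    }
}

class PredicateDispatchDemo {
    static PredicateDispatcher<Integer, String> classifier() {
        return new PredicateDispatcher<Integer, String>()
            .when(n -> n == 0, n -> "zero")
            .when(n -> n > 0 && n % 2 == 0, n -> "positive even")
            .when(n -> n > 0, n -> "positive")
            .when(n -> true, n -> "negative");
    }

    public static void main(String[] args) {
        System.out.println(classifier().invoke(4));
    }
}
```

Here "positive even" implies "positive", so it plays the role of the overriding method and is listed first.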

"JPred augments the Java language by allowing each method declaration to optionally include a clause of the form when pred, just before the optional throws clause. The predicate expression pred is a boolean expression specifying the conditions under which the method may be invoked."

"While a traditional multimethod is expressed in JPred as a predicate consisting of a conjunction of specializer expressions on formals, JPred also allows arbitrary disjunctions and negations."

"Finally, a notion of resend [13, 37], which generalizes Java’s super to walk up JPred’s method-overriding partial order, could be useful to allow predicate methods within a class to easily share code."