Thursday, June 30, 2005

Semantic Web Fast, SOAP not that Slow and other links

* The Semantic Web In One Day "...syntactic aspects of data integration turned out to be tedious. Often, output from tool A can’t be used directly as input for tool B, although both have the same language capabilities. For example, both tools can handle RDF for input and output, but the resulting data is syntactically incompatible to the extent that the tools can’t communicate." Full article here.
* SOAP Performance Considered Really Rather Good points to a number of people studying the speed of SOAP. An interesting paper is "An Evaluation of Contemporary Commercial SOAP Implementations" which says that "SOAP and non-SOAP implementations continued to widen with .NET Remoting offering 280 msgs/sec at peak while most SOAP implementations were only handling from 30 to 60 msgs/sec. Even the leading Product A Document/Literal implementation only gave a maximum throughput of 67 msgs/sec. The two lowest performing RPC/Encoded implementations only handled 15 msgs/sec, the binary/TCP alternative." Not quite the "speed of light is the limiting factor".
* Secrets of the A-list bloggers: Technorati vs. Google "If Google favors indexing more popular sites more often, a clear opportunity for world-live-web search engines like Technorati would be in the long tail of less-often-indexed sites but Technorati seems to ignore that opportunity and concentrate on the top sites. What that will translate into is a direct reproduction of the power laws when it comes to indexing of blogs."
* A conversation with Jeff Nielsen about agile software development "I was particularly interested to hear about Jeff's use of FIT, Ward Cunningham's Framework for Integrated Test. This technique first appeared on my radar in an outtake from our 2003 story on test-driven development. A more recent development is Fitnesse, a Wiki that supports the use of FIT... pains me to say so but, according to Jeff, XML-oriented tools have so far failed to cut the mustard in this environment." XML is not agile!
* Managing Component Dependencies Using ClassLoaders "Java's class loading mechanism allows for more elegant solutions to this problem. One such solution is for each component's authors to specify the dependencies of their component inside of its JAR manifest."

It's somehow fuzzy here

Playing with Google Earth, the US area has fast food places and monuments, while the rest of the world is all fuzzy and you mostly just get rivers drawn in. You can tell where the closest KFC is to the Washington Monument, but I don't even know if there's a Krispy Kreme Doughnuts anywhere in Australia. Is this a realistic view of the rest of the world from an American point of view? I haven't checked, but is Iraq highly detailed, Afghanistan all blurry (especially around the borders), Canada just like the US, etc.?

Tuesday, June 28, 2005

Northrop Buys Tucana and Continues Kowari

Northrop Grumman Buys the Tucana Knowledge Server "Stunningly for a company their size, Northrop has not only agreed to support Kowari but rushed to do so. I certainly didn't expect a US federal systems integrator to "get" Open Source Software, but times have clearly changed. Their senior managers have made a legitimate effort to figure out the licensing and how to make it work within their business model. I have confidence that we can figure out a way to make it work for both the Kowari community and Northrop Grumman."

A Grumman employee? Or a Thoughtworker with inside knowledge? Building an Agile Enterprise "I have confidence that we can figure out a way to make it work for both the Kowari community and Northrop Grumman."

Update: Paul's doing the handover

Wednesday, June 22, 2005

Jazzed by Jackrabbit

Catch Jackrabbit and the Java Content Repository API "If the Java Content Repository (JCR) API expert group's vision bears out, in five or ten years' time we will all program to repositories, not databases, according to David Nuescheler, CTO of Day Software [4], and JSR 170 spec lead. Repositories are an outgrowth of many years of data management research, and are best understood as fancy object stores especially suited to today's applications."

"The Jackrabbit code base contains not only the JCR API reference implementation, but also a fully functional repository as well as several contributed libraries for tasks, such as accessing a remote repository via RMI. There is even a JDBC persistence manager to allow plugging in a relational database as a persistent store, and an object-relational mapping tool that allows Hibernate applications to use the repository."

"The default Jackrabbit repository is based on the file system. However, Jackrabbit provides a JDBC persistence manager that relegates data storage to a relational database. As any JCR-compliant repository, Jackrabbit can be accessed through any protocol such as WebDAV or RMI. Examples for different repository access modes are included in the Jackrabbit source distribution."

"Still, for a public blogging "superstore" to have real value to application developers, for instance, some agreement on the node types supporting a blogging data model would be helpful. The repository community has so far avoided the politically sensitive pitfalls of trying to initiate agreement about such information models. The jury is still out whether truly universal data "superstores" can emerge in the absence of such shared data models, or if they will remain a dream befitting Utopia."

Tuesday, June 21, 2005

Scripting the Semantic Web

Semantic Web, meet Ruby on Rails "OWL is capable of defining a rich object model with classes, properties and instances, which led me to consider the possibility of a separating domain model maintenance from the implementation of services that leverage the model. I've seen attempts to realize this vision, but the only ones I can remember involve code generation, particularly from UML models and have not been particularly successful. Ruby, being a dynamic language with rich metaprogramming facilities presents some interesting possibilities for using shared domain models since it allows runtime definition of classes. It should be possible to write a library for Ruby that loads entire object models and instances on the fly, much in the same manner that Ruby on Rails ActiveRecord loads its class definitions for persistent classes directly from database metadata."
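The runtime class definition described in that quote can be pictured in any dynamic language. Below is a minimal Python sketch, not the quoted author's Ruby library: the dict-based `model` format and the `load_model` helper are assumptions made up for illustration, and a real OWL loader would read classes and properties from an ontology file instead. It also assumes parent classes are listed before their children.

```python
def load_model(model):
    """Create one Python class per entry in a dict-based class model.

    The model format here is hypothetical; it stands in for an ontology.
    """
    classes = {}
    for name, spec in model.items():
        # Fall back to `object` when no parent is named.
        parent = classes.get(spec.get("parent"), object)
        # Each declared property becomes a class attribute defaulting to None.
        attrs = {prop: None for prop in spec.get("properties", [])}
        # type(name, bases, namespace) defines a class at runtime.
        classes[name] = type(name, (parent,), attrs)
    return classes


model = {
    "Person": {"properties": ["name"]},
    "Author": {"parent": "Person", "properties": ["blog"]},
}
classes = load_model(model)
author = classes["Author"]()
author.name = "Example Author"
```

No code generation step is involved: `Author` exists only because the model said so when it was loaded, which is the same idea as ActiveRecord reading its class definitions from database metadata.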

Also related, Deep Integration of Scripting Languages and Semantic Web Technologies "Instead of loading an ontology, the notion of importing one leads to a different association for the programmer. The difference may be regarded as pedantic or subtle, but it is an important one: the programmer, instead of regarding the ontology as mere data she has to load and access via an API, the imported ontology behaves like a library, extending her possibilities like only code does."

The second paper also covers the problem of the Open World Assumption: the authors state that in programming we need a third value, "unknown", rather than just true and false.
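One way to picture that third value is Kleene's strong three-valued logic. The sketch below is only an illustration under that assumption, not the paper's own design; the names `Truth`, `not3`, `and3`, `or3` and `holds` are made up for this example.

```python
from enum import Enum


class Truth(Enum):
    FALSE = 0
    UNKNOWN = 1
    TRUE = 2


def not3(a):
    # TRUE and FALSE swap; negating UNKNOWN stays UNKNOWN.
    return Truth(2 - a.value)


def and3(a, b):
    # Kleene conjunction: the "least true" operand wins.
    return Truth(min(a.value, b.value))


def or3(a, b):
    # Kleene disjunction: the "most true" operand wins.
    return Truth(max(a.value, b.value))


def holds(facts, denials, statement):
    """Open-world lookup: a statement we have never seen is UNKNOWN, not FALSE."""
    if statement in facts:
        return Truth.TRUE
    if statement in denials:
        return Truth.FALSE
    return Truth.UNKNOWN


facts = {("alice", "type", "Professor")}
denials = {("bob", "type", "Professor")}
carol = holds(facts, denials, ("carol", "type", "Professor"))
```

Here `carol` comes back as `Truth.UNKNOWN`, where a closed-world program would have answered false simply because the statement is missing.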

Somewhat related, Hitting reload is the framework job "What's the point in designing tables for a webapp when an RDF-backed store will manage the data for you and RDF queries will come back as tabular data anyway? There are RDF triple stores that will handle in the order 10^6 statements - Leigh Dodds is doing some research on that, up to 10^8 by the looks of things. If I need queries instead of hacking out iterators+filters I'll use versa/itql/rdql. Now, saying I never want to design another relational schema again is not to say I don't want to use a database. Most of these RDF triple stores are in fact using an RDBMS in the background, as the filesystem and indexer, it's just that the relational schema in use is not exposed to the application."
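To make the "queries come back as tabular data" point concrete, here is a toy in-memory sketch. It is not how Versa, iTQL or RDQL work internally; the `match` function and the `?name` variable convention are assumptions for illustration only.

```python
def match(triples, pattern):
    """Match one (subject, predicate, object) pattern against the triples.

    Terms starting with '?' are variables; each result row maps
    variable names to the values they were bound to.
    """
    rows = []
    for triple in triples:
        row = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if row.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # No term failed to match, so keep this row.
            rows.append(row)
    return rows


store = [
    ("post1", "title", "Drools 2.0"),
    ("post1", "date", "2005-06-06"),
    ("post2", "title", "JRDF for Learning"),
]
titles = match(store, ("?post", "title", "?title"))
```

Each row in `titles` has the columns `?post` and `?title`, yet no table was ever designed for posts: adding a new kind of statement to `store` needs no schema change.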

Sunday, June 19, 2005

Merry Links

* Web Services with WebObjects.
* Why do physicists want to kill their father/grandfather? They didn't prove that you can't sleep with your grandmother though.
* Abraham Bernstein on users "The regular user is not able to cope with strict inheritance."
* x86 OSX is all part of Steve's 10 year master plan.
* MicroSpring "It is a JDK 1.5 IOC implementation in a 30k jar file! It is compatible with the Spring 1.1.3 XML format, but only the IOC elements, and best practice is followed according to the Spring documentation."

10 Minute Commits for Better Code

The old way of building software "However an alternative does exist, and it's not that hard to achieve. This alternative involves short development cycles, doing a small amount of tested fully refactored work, and regular commits. There's no magic involved, and it lends itself to fully tested code, constant integration, knowledge transfer and robust code that is easy to update as requirements change (which they always do). Of course if you throw pairing in on top, you get even more of the benefits, however judging by recent experiences this is something that managers and most developers are not ready for, even if it produces excellent results."

In my recent refactoring of JRDF, if it's more complicated than 10 minutes' work it doesn't get done. Thinking about how to achieve something in 10 minutes that gets to where you want to go is as powerful a programming tool as anything I've come across. Luckily, running the existing tests in JRDF doesn't take more than a few seconds. Of course, some of this requires a rather powerful IDE too.

Thursday, June 16, 2005

Two Semantic Webs

Bronze from anear, by gold from afar? points to Semantic Web Architecture: Stack or Two Towers? "Features such as closed world assumption and negation as failure (NAF) can be supported by powerful query languages—queries already have a closed world flavour (because distinguished variables can only bind to named individuals), and it is natural to extend this with NAF by way of query subtraction (e.g., the answer to the query “faculty(?x) and NAF professor(?x)” can be computed by subtracting the answer to the query “professor(?x)” from the answer to the query “faculty(?x)”). These features are already supported in query languages such as SPARQL [14] and nRQL [8] (the query language implemented in the Racer system). Moreover, recent work on integrating rules with OWL suggests that future versions of this framework could include, e.g., a decidable subset of SWRL, and a principled integration of OWL and Answer Set Programming [5, 12, 13].

On the other hand, adopting Datalog rules (and DLP with Datalog semantics) would effectively establish two Semantic Webs, with little or no semantic interoperability between the rules based Semantic Web and the ontology based Semantic Web, even at the RDF level. These two versions of the Semantic Web would inevitably be in competition with each other, and this would make the Semantic Web much less appealing: new users would be presented with a difficult choice as to which part to choose, and in choosing would sacrifice semantic interoperability with the other part."
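The query subtraction mentioned in that excerpt is easy to sketch with sets. This is a toy illustration of the "faculty(?x) and NAF professor(?x)" example, not SPARQL or nRQL; the `query` and `subtract` helpers and the sample data are made up.

```python
def query(triples, predicate, obj):
    """Return the set of subjects that have the given predicate and object."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def subtract(positive, negative):
    # NAF by subtraction: keep only the bindings for which the negated
    # query produced no answer in the data we actually hold.
    return positive - negative


triples = {
    ("alice", "type", "faculty"),
    ("alice", "type", "professor"),
    ("bob", "type", "faculty"),
}

faculty = query(triples, "type", "faculty")
professors = query(triples, "type", "professor")
non_professors = subtract(faculty, professors)
```

`non_professors` contains only `"bob"`. The closed-world flavour is visible here: bob is excluded from the professors because no statement says he is one, not because anything says he isn't.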

Tuesday, June 14, 2005

JRDF for Learning

In the past I used another project to get practical experience with trendy new things like patterns, XML, Swing, etc. I'm going to use JRDF for the same purpose. Kowari is a bit too big for things like going over to Java 1.5, IoC, mocking (real unit tests), lock-free algorithms (including B-Trees) and a few other things that I want to try. I'm not sure it's possible to have a system that doesn't have transactions but it would be interesting to find out. So basically this is just to let people know to expect some changes in JRDF.

Practically, it might mean an RDF/XML based pull parser, persistent JRDF, and more interesting APIs. I'm convinced that developing web services is too expensive and may implode under its own weight - so maybe something based on netKernel or a REST based framework would be a good idea. At the moment I'm just using it to see how much I can get out of IntelliJ.

Friday, June 10, 2005

A Reminder About Incremental and Test Driven Development

Iterative and Incremental Development: A Brief History "Project Mercury ran with very short (half-day) iterations that were time boxed. The development team conducted a technical review of all changes, and, interestingly, applied the Extreme Programming practice of test-first development, planning and writing tests before each micro-increment. They also practiced top-down development with stubs."

"We were doing incremental development as early as 1957...where the technique used was, as far as I can tell, indistinguishable from XP...All of us, as far as I can remember, thought waterfalling of a huge project was rather stupid, or at least ignorant of the realities...I think what the waterfall description did for us was make us realize that we were doing something else, something unnamed except for “software development.”"

Other notable references include Boris Beizer and Bill Hetzel in "Introduction to Test Driven Development of Embedded Systems Software": "The value of writing tests early in the design process was first mentioned in 1980 by Boris Beizer, a noted software testing expert. He described the benefit to the testing process of thinking about testing earlier, and then elaborated this point to include the idea that tests developed before the targets of the test may provide additional value in guiding the design effort. Decades later, we have come to believe this is true for a variety of reasons."

Monday, June 06, 2005

Drools 2.0

Drools 2.0 Released "Drools is designed to allow pluggable language implementations. Currently rules can be written in Java, Python and Groovy. Drools also enables Domain Specific Languages (DSL) via XML using a Schema defined for your problem domain. DSLs consist of XML elements and attributes that represent the problem domain. An XML Authoring tool provides a semi-rapid development environment with a drag and drop type interface based on the provided Schema."

Examples: House Example and Semantics Module Framework.

Searching around for a screenshot, I came across SEMANTIC CONFLICT DETECTION IN META-DATA – A RULE BASED APPROACH.

Friday, June 03, 2005


* "The SEMANTIC Knight always triumphs! Have at you! Come on, then. I have a battalion of KR theorists on my side". Via Too Close To Home.
* Google Sponsored Semantic Web API abstraction project "While a lot of semantic web APIs are available (Jena, Sesame, Kowari etc..) especially for the java language, there is no standard set of interfaces and wrappers so that middle ware RDF toolkits or higher level RDF based API can be built regardless of the underlying api/srdf storage. The project will certainly not start from scratch, but rather from earlier discussions and code to build upon (See jrdf, classes in the Simile projects etc..)."
* Tetris Shelving.