Saturday, January 01, 2005


In the same week that Tucana goes under, we get probably the busiest week on the Kowari mailing list and 45 downloads of Pre-release 2. Considering three quarters of these are the 50MB version, there's either a lot of good bandwidth out there or great patience.

Before the holidays there was great progress being made. We had been attempting to meet the "10,000 triple/second challenge" and had succeeded, at least for the first 1 million triples (it degraded to a few thousand a second after 200 million). This is on the same 1.6 GHz Opteron system that we use for all our tests.

Andrae was working on a large refactoring of the transaction operations so that any JRDF, Jena, or iTQL operation (or anything new) could be put within a transaction or use an existing one. These operations were already in a transaction, but they couldn't tell, for example, whether they already had the write phase. From what I can remember, Simon was working on the problems associated with using the file and HTTP protocols in the FROM clause (external resolvers). David M was working on speeding up deleting triples as well as loading speed. Paul had been working on SOFA and OWL. Robert had started looking at JDO, EJB and mapping relational databases to RDF.

Our commercial focus had been on speed and scalability. With Tucana gone, some of this focus is probably going to change. It takes a lot of time, effort and resources to keep everything changing and to continue to perform at a commercial level. Between 1.0 and 1.1 there practically isn't a portion of code that wasn't modified - more often than not it was completely changed. This kind of development, on multiple fronts, is something that probably can't occur in the future.

So the future is probably smaller and simpler, with an eye on standards compliance. I hope some things like multiple writes, phase holding, pluggable datatypes, inferencing Hotspot, SPARQL support etc. will be developed, but I doubt many of these things will see the light of day now. I'm not discounting them entirely, but there are many things that had to occur in parallel and that need a team of developers. A lot of the new features depended on the multiple writers feature. Multiple writers is not an easy feature to describe, but suffice it to say it's more like Lucene than a normal relational database.

So I think that means improving RDF, RDFS and OWL support. We made some good improvements in Pre-Release 2 with better datatype support. David M added the functionality to allow literal and URI prefix matching - someone just has to add it into the query layer. Combined with matching nodes based on type (literal, URI or bnode) and trans/walk queries, this makes for a very powerful set of features. In the background there's also been a focus on languages (I know we were talking about RFC 3066bis support). Paul and I were looking at inference models (there seem to be similarities with pseudo models and datalog vs tableau). Paul's Masters will probably push some of this into reality.

So all in all there do seem to be some good opportunities in the future. I know that Tucana had customers who were paying lots of money for TKS and I know that Kowari did help in the risk assessment. So my intention, at least in the short term, is to see that good support occurs, bugs get fixed and the like, although there isn't anyone left to do TKS releases.

I don't really understand why this happened (or what's going on) but to quote Marge Simpson: "One person can make a difference. But most of the time they probably shouldn't."