Saturday, January 01, 2005

Crisis

In the same week Tucana goes under we get probably the busiest week on the Kowari mailing list and 45 downloads for Pre-release 2. Considering three quarters of these are the 50MB version, there's either a lot of good bandwidth out there or great patience.

Before the holidays there was great progress being made. We had been attempting to meet the "10,000 triple/second challenge" and had succeeded (at least for the first 1 million triples; it degraded to a few thousand a second after 200 million). This is on the same 1.6 GHz Opteron system that we use for all our tests.

Andrae was working on a large refactoring of the transaction operations so that any JRDF, Jena, or iTQL (or anything new) could be put within a transaction or use an existing one. They were already in a transaction but they couldn't tell whether they already had the write phase for example. From what I can remember Simon was working on the problems associated with using file and HTTP protocols in the FROM (external resolvers). David M was working on speeding up deleting triples as well as loading speed. Paul had been working on SOFA and OWL. Robert had started looking at JDO, EJB and mapping relational databases to RDF.

Our commercial focus had been on speed and scalability. With Tucana gone some of this focus is probably going to change. It takes a lot of time, effort and resources to keep everything changing and to continue to perform at a commercial level. Between 1.0 and 1.1 there practically isn't a portion of code that wasn't modified - more often than not it was completely changed. This kind of development, on multiple fronts, is something that probably can't occur in the future.

So the future is probably smaller, simpler, with an eye on standards compliance. I hope some things like multiple writes, phase holding, pluggable datatypes, inferencing Hotspot, SPARQL support etc. will be developed but I doubt many of these things will see the light of day now. I'm not discounting them entirely, but there are many things that had to occur in parallel that need a team of developers. A lot of the new features depended on the multiple writers feature. Multiple writers is not an easy feature to describe but suffice to say it's more like Lucene than a normal relational database.

So I think that means improving RDF, RDFS and OWL support. We made some good improvements in Pre-Release 2 with better datatype support. David M added the functionality to allow literal and URI prefix matching - someone just has to add it into the query layer. Combine this with matching nodes based on type (literal, URI or bnode) and trans/walk queries and you have a very powerful combination of features. In the background there's also been a focus on languages (I know we were talking about RFC 3066bis support). Paul and I were looking at inference models (seems to have similarities with pseudo models and datalog vs tableau). Paul's Masters will probably push some of this into reality.

So all in all there do seem to be some good opportunities in the future. I know that Tucana had customers who were paying lots of money for TKS and I know that Kowari did help in the risk assessment. So my intention, at least in the short term, is to see that good support occurs, bugs get fixed and the like. Although there isn't anyone left to do TKS releases.

I don't really understand why this happened (or what's going on) but to quote Marge Simpson: "One person can make a difference. But most of the time they probably shouldn't."

4 comments:

Danny said...

Thanks for the update.

Right now I'd suspect Kowari's main niche would be in small-medium enterprise/orgs as a Webbish RDBMS substitute, emphasising the agile. But right now, I dunno, at 50MB that might be a symptom of bloat. When you say : "So the future is probably smaller, simpler, with an eye on standards compliance." it sounds about right. Personally I think Sparql is a must-have (the other query language(s) are usable, but no-one wants to learn more than they have to).

Keep up the good work, and Happy New Year!!

Andrew said...

50MB includes 32MB worth of test scripts and test data. A simple server that answers SPARQL queries via HTTP could be made to be around 5-7MB. Removing Barracuda for the WebUI and moving to JSPs would be another good change to make. Also, instead of including Jetty we could deploy Kowari inside any servlet container. This is relatively simple stuff that's never been focused on.

Anonymous said...

As someone who did what he could (unpaid, of course) to see you guys flourish, I took the closure news kinda hard. Put simply, it sucks. I've got to know too many good folks not to be bummed about it all.

Kendall Clark

Anonymous said...

Wow, I'm sorry to hear the news... I have been following Kowari for a while now and have been impressed with the product and the thought that has gone into it. Best of luck to you all in your future endeavors.

Chris Wilper
Cornell University