Thursday, November 30, 2006
Different Strokes
In that post and in my thesis I've said that the definitions of OPTIONAL by Pérez et al (and used by Andy Seaborne) are compatible with the relational definitions I used. I'm now about as certain as I can be that that's not the case.
The definition of set difference in Pérez et al is (in ASCII):
O1 \ O2 = {u <- O1 | for all u' <- O2, u and u' are not compatible}.
The key here is the definition of compatibility. I didn't find it clear in the original paper (my fault, and I'll explain a bit more about this later), but it's explicit in "Semantics of SPARQL" where it says, "Two mappings with disjoint domains are always compatible, and the empty mapping u0 is compatible with any other mapping. Intuitively, u1 and u2 are compatibles if u1 can be extended with u2 to obtain a new mapping, and vice versa."
A relational join is defined as the set union of the headings and matching values. The heading consists of the attributes (X, Y, Z), where X are the attributes of the left hand side relation, Y are the attributes that match between the left and right hand side relations and Z are the attributes of the right hand side relation. The body consists of tuples with the values matching X, Y, and Z (Date writes it as { X x, Y y, Z z}). I think that this is compatible with the Pérez et al definition.
But it's different for difference. The difference operator in relational algebra requires the two relations to be of the same type before any tuples are removed (this is the definition I've used). I'll use "\" to represent SPARQL's set difference and "-" to represent the relational difference.
For example:
{{ (?x = 2, ?y = 3) }} \ {{ (?y = 3) }} is {} in SPARQL
{{ (?x = 2, ?y = 3) }} - {{ (?y = 3) }} is {{ (?x = 2, ?y = 3) }} in the modified Galindo-Legaria relational algebra that JRDF uses.
It is unchanged in relational algebra because the definition of equality requires the tuples to be the same type (the same attributes). So a binding with one value can't be equal to a binding with two values. The Pérez et al definition is much looser, so many more matches are made.
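To make the contrast concrete, here is a minimal Java sketch of the two operators applied to the example above. It is only an illustration under my own assumptions: bindings are modelled as plain Map<String, Integer> values, and the names DifferenceSketch, compatible, sparqlDifference and relationalDifference are hypothetical, not taken from JRDF, ARQ or the Pérez et al paper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch only: mappings and tuples are both modelled as Map<String, Integer>.
class DifferenceSketch {

    // Compatible: every variable bound in both mappings has the same value.
    // Mappings with disjoint domains are therefore always compatible.
    static boolean compatible(Map<String, Integer> u1, Map<String, Integer> u2) {
        for (Map.Entry<String, Integer> binding : u1.entrySet()) {
            if (u2.containsKey(binding.getKey())
                    && !u2.get(binding.getKey()).equals(binding.getValue())) {
                return false;
            }
        }
        return true;
    }

    // O1 \ O2: keep u from O1 only if it is incompatible with every u' in O2.
    static List<Map<String, Integer>> sparqlDifference(
            List<Map<String, Integer>> o1, List<Map<String, Integer>> o2) {
        List<Map<String, Integer>> result = new ArrayList<>();
        for (Map<String, Integer> u : o1) {
            boolean compatibleWithSome = false;
            for (Map<String, Integer> other : o2) {
                if (compatible(u, other)) {
                    compatibleWithSome = true;
                    break;
                }
            }
            if (!compatibleWithSome) {
                result.add(u);
            }
        }
        return result;
    }

    // O1 - O2: remove a tuple only if an equal tuple (same attributes,
    // same values) exists in O2. Map.equals gives exactly that test.
    static List<Map<String, Integer>> relationalDifference(
            List<Map<String, Integer>> o1, List<Map<String, Integer>> o2) {
        List<Map<String, Integer>> result = new ArrayList<>();
        for (Map<String, Integer> tuple : o1) {
            if (!o2.contains(tuple)) {
                result.add(tuple);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> left = List.of(Map.of("?x", 2, "?y", 3));
        List<Map<String, Integer>> right = List.of(Map.of("?y", 3));
        System.out.println(sparqlDifference(left, right).size());     // prints 0
        System.out.println(relationalDifference(left, right).size()); // prints 1
    }
}
```

The only difference between the two methods is the test used to decide whether a row is removed: compatibility in the first, tuple equality in the second.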
As an aside, I was fairly sure that set difference was a requirement for OWL inferencing. I might be wrong here too. If it is the case, though, I'd think it'd be nice (an efficient use of operations) if SPARQL operations could be reused for OWL ones. This would allow the difference operator by itself to be expressed in SPARQL too.
So I think I understand the issue now. At the moment I'm trying to work out how JRDF gets what seems to be the right results because it doesn't really look like it should. The other thing that was in my paper was a mapping to SQL and I'm not sure how a SQL based SPARQL implementation would work either (based on the definitions I used). Lots of questions anyway.
I am trying to work out whether this definition of compatibility is a good one. At the moment I don't like it or dislike it because I don't understand it yet.
The trigger was the removal of the term antijoin from the SPARQL algebra - that is a good idea because I understood it to mean it was compatible with relational algebra (using the same terms I guess). Up until that point I'd assumed difference in SPARQL was compatible with difference in relational algebra.
Update: Paul mentions in the comments that the SPARQL difference operator is the one needed for OWL. Another win for that definition then too.
It's apparent that the relational model really isn't appropriate to model SPARQL in. I don't think there's a single definition of an operation in relational algebra that hasn't been changed.
Update 2: I've since come to realize that joins are not compatible either. This is due to the suggestion that you can join on NULL values. This contradicts relational algebra (taking either the Date or Codd approach on NULL) and SQL joins (not that I think that's worth much but it's easy to demonstrate).
Update 3: Last update for this blog I hope. While relational difference is not equivalent, antijoin is equivalent, which is what I was told when antijoin was renamed diff in the first place. More information in "The Difference is Antijoin".
| 4 comments | Link me |
Labels: jrdf, rdf, relational model, sparql
Tuesday, November 28, 2006
B, b, b boca gives you an enterprise ready RDF store
The users guide lists some other interesting features such as a client stack with "a fair amount of compatibility with HP's Jena API" and text indexing using Lucene. Based on the configuration it looks like it requires DB2 or Apache Derby as well as Java 5.
Via, IBM SLRP Release.
| 0 comments | Link me |
Labels: ibm, rdf, semantic web, triple store
Thursday, November 23, 2006
The Web is to Blame
While I'm posting YouTube links, I also appreciated U2 & Green Day - The Saints are Coming.
| 2 comments | Link me |
Labels: broken promises, flying cars, green day, humour, music, political, u2
Wednesday, November 22, 2006
Another SPARQL Algebra
This seems influenced by Jorge Pérez's work (or at least it shares a common basis as the definitions are very similar) and looks like it's compatible with the stuff that I did (it even has antijoin in there).
And in other SPARQL news, Danny Ayers has published (based on discussions between Max Völkel and Richard Cyganiak) a SPARQL Update Language for insertion and deletion of triples.
Update: Andy Seaborne has posted about his work on this algebra; a version of ARQ is available which implements this algebra (it's post 1.4). The use of the FILTER command is also discussed. Via SPARQL will be formalized as an algebra.
| 2 comments | Link me |
Tuesday, November 21, 2006
The Search for the Levitating Super-Turtle
"Ask a deeply religious Christian if he’d rather live next to a bearded Muslim that may or may not be plotting a terror attack, or an atheist that may or may not show him how to set up a wireless network in his house. On the scale of prejudice, atheists don’t seem so bad lately."
A good preview of Dawkins' book is here.
I've recently finished reading both "The God Delusion" and "The Goldilocks Enigma" and I found both books quite good. Seeing as so many people commented last time I thought I'd post a bit more about what I think this time.
Much like the disappointment of revisiting old television shows of your youth, Dawkins' book is great for pointing out how truly bad those stories taught at Sunday school were, including Noah, Lot and Abraham. Although I must admit, even as a child I found the story of the flood and "The A-Team" both rather unbelievable. There are other interesting topics, like coming up with morals without religion, but I think these are better covered elsewhere.
One of the main things I got out of this book is that progress is about consciousness raising. Most improvements have come about when a society becomes aware of a problem and goes about trying to solve it. Historically this includes human rights, more recently global warming, and ones yet to fully take hold like animal rights. The other thing I got out of it is that I don't have as many problems with religion as Dawkins.
I found Dawkins the least convincing when he diverges from his areas of expertise, especially when he tries to cover cosmology. This is especially apparent when you compare his counter arguments against teleology (things look like they were designed therefore there must be a designer) in biology and in cosmology. Dawkins' explanation in relation to biology is clear and concise but for cosmology it's rather glossed over and there seems to be a bit of hand waving. He doesn't provide a good argument why evolution on a universal scale is well founded. This is where Paul Davies' book provides some better arguments for a rational creation of the universe.
Davies is actually a little bit more open to the idea of God than Dawkins which, when he chooses a different explanation, makes his arguments more convincing. The possible explanations of the universe he discusses include: absurd (no real cause), unique (there are no free parameters for the universe to be the way it is), the multiverse (String theory), intelligent design (God or Gods), the life principle and the self explaining universe. He says he prefers the latter two explanations. I found the most interesting explanation given is the self explaining universe. It uses quantum mechanics, causal loops and the requirement for the universe to understand itself.
The last chapter of the book is certainly the best and I wish he spent the whole book on the ideas in it instead. His description of the infinite regress as the levitating super-turtle is great. He also describes how Platonism is incorrect, especially at the beginning of the universe, and how the laws of physics have emerged over time.
| 2 comments | Link me |
Labels: atheist, dawkins, god delusion, religion
Thursday, November 16, 2006
JRDF GUI 0.3 Released
| 0 comments | Link me |
RDF and OWL are Yahoo's Secret Weapons
| 0 comments | Link me |
Labels: owl, rdf, semantic web, yahoo
Web 2.0 is the new Applets
Nova has had a bunch of interesting postings recently including: "Web 3.0 Versus Web 2.0", "Does the Semantic Web = Web 3.0?" and "New York Times Article About the Emerging Semantic Web" (all about the recent NY Times article on the Semantic Web, the hype and misconceptions), "What is the Semantic Web, Actually?" and "The Meaning and Future of the Semantic Web".
| 0 comments | Link me |
Labels: nova spivack, semantic web, web 2.0, web 3.0
Wednesday, November 15, 2006
Simplicity, Good Design and Refactoring
Video of interview of Paul Graham and original article Taste for Makers.
| 0 comments | Link me |
Labels: design, paul graham, programming
Tuesday, November 14, 2006
The Classpath Exception
Tim Bray has the answer: "Unmodified GPL2 for our SE, ME, and EE code. GPL2 + Classpath exception for the SE libraries. Javac and HotSpot and JavaHelp code drops today. The libraries to follow, with pain expected fighting through the encumbrances. Governance TBD, but external committers are a design goal. No short-term changes in the TCK or JCP."
| 0 comments | Link me |
Labels: david wood, gpl, java, open source
Friday, November 10, 2006
Mocking the Inspector
"The Inspector - A unit test that violates encapsulation in an effort to achieve 100% code coverage, but knows so much about what is going on in the object that any attempt to refactor will break the existing test and require any change to be reflected in the unit test."
| 0 comments | Link me |
Labels: anti-patterns, design, programming, tdd
Semantic Web 2.0
"This is where it became evident that there is a deep disconnect between the traditional database community and the semantic web community. Mårten’s response was rather vague, that this wasn’t as broad as the semantic web and that the semweb includes unstructured data so wasn’t appropriate."
CEO of MySQL "Invents" the Semantic Web! "I have to say, his talk was both a validation of what we have all been working towards, and as Ian Davis explains, it is also a clear sign that the W3C and the Semantic Web community have not found a way to get the message accross."
Moving data around and owning your data seem to be another overlooked aspect of the Semantic Web.
WEB 2.0: Google CEO: Take your data and run "The more we can, for example, let users move their data around, never trap the data of an end user, let them move it if they don't like us, the better."
And my mind boggles at the idea of taking the proposed Australian Access card and their integration problems and fusing it with RDF. Some interesting points: "...the Access Card will be owned by the cardholder and not by the issuer...The effect of the issuer retaining ownership is that they control the card and the purpose for which it is used."
"In Centrelink alone we have a massive 275 kilometres of files...Medicare has to measure its records in a similar way. They have more than 3 square kilometres of storage space for forms with signatures."
"We collect, and almost never reuse, this information."
"The new card will finally put an end to this waste of time. We will be able to reuse the information that you have given us before, but only for the purposes for which you gave it to us. We can then pre-populate forms and take a lot of the pain out of the claim process."
| 0 comments | Link me |
Labels: australia, google, joe hockey, nova spivack, personal metadata, semantic web
Thursday, November 09, 2006
Mr Sparkle
Direct link (PDF ~500K).
BTW, if anyone notices any grammatical errors please let me know and I'll fix them right up. I get to the point where I can't read what I've written anymore so there's bound to be a few errors.
Update: I've updated it: some small typos, the formatting and the example relations (some conversions of S1 -> Supplier1 that I missed) were fixed up.
Update 2: The HTML version is now available. Many thanks to Eric Prud'hommeaux for the work he did converting the original PDF.
| 0 comments | Link me |
Labels: jrdf, relational model, sparql
Wednesday, November 08, 2006
Languages - Open vs Closed World
"Javascript by its design is fundamentally messy, however that is its advantage over Java. The path to any sanity may just be what Google has shown, when building silo apps, hide the messy details under a clean well defined Java facade. Never forget though, that these facades are abstractions that leak. Always afford your applications the ability to escape into the Web and Javascript when necessary."
Bruce Tate has recently had a new article published, "Crossing borders: Delayed binding": "The more you dig into type and binding strategies, the more you find that waiting until run time to bind to an invocation or type fundamentally changes the programming process, opening a whole new world of possibilities. True, you find less safety. But you also find less repetition, more power, and more flexibility with fewer lines of code."
| 0 comments | Link me |
Labels: design, google, java, javascript, programming
Why Semijoin is Better than Join
This is one of the first times in many years that I've had to resort to a search engine other than Google (Yahoo) to find this paper.
| 0 comments | Link me |
Labels: relational model
Tuesday, November 07, 2006
ISWC 2006
- Semantics and Complexity of SPARQL. The early version of this paper was used in my thesis.
- A Model Driven Approach for Building OWL DL and OWL Full Ontologies.
- MultiCrawler: A Pipelined Architecture for Crawling and Indexing Semantic Web Data
- Provenance Explorer -- Tailored Provenance Views Using Semantic Inferencing
- Can OWL and Logic Programming Live Together Happily Ever After?
The best paper nominees are here, including one that I hope to read in the near future, "Querying the Semantic Web with Preferences".
| 0 comments | Link me |
Labels: conferences, iswc, semantic web
Monday, November 06, 2006
Tupelo, Tupelo
It uses Kowari, 3Store, and Jena as backing stores. It also has a rather interesting API for dealing with triples called Context (more information in the cookbook). There are some smaller things too like using 1.5's varargs to add triples, transitive closures on queries, object to resource mapping and something that I'd meant to do for JRDF a while ago, have a ResourceVisitor that actually visits resources.
The presentation lists similar projects: SRB, Slide/SAM, Fedora, DSpace and Jackrabbit. All of which (except for SRB) I'm pretty familiar with.
And ActiveRDF, a Ruby API for accessing RDF, went 1.0!
Sunday, November 05, 2006
A Perfectly Functional Blog
Update: The first bit of content is up now, "Have you ever wanted to do this?", which is all about how easy it is to redeclare equality in Haskell vs Java or .NET.
| 0 comments | Link me |
Labels: .net, functional programming, haskell, java, tony morris, working mouse
Quick links
- Monad page on the Haskell wiki. Including the spacemen and space ship tutorial. Another interesting metaphor for monads, there's a monster in my Haskell!
- Mozilla to completely remove RDF support "Brendan Eich wants to remove RDF completely for Mozilla Firefox 3.0 aka Mozilla Platform 2.0."
- E.O. Wilson + Daniel Dennett "Shortsightedness is natural. Hypertension is natural. Obesity is natural if you eat too much. There are many things that have deep evolutionary roots that are natural. But one of the glories of civilizations is we've learned to adjust things that may be natural but we don't like them. So I think natural cuts two ways."
- Tim Berners-Lee Announces Web Science Initiative - Studying the Social Web "Tim Berners-Lee is leading the program, which is essentially about formalizing a new kind of scientific discipline called Web Science. The goal is to understand the deeper structure of the social Web and how people are using it. But as well as studying the Web, they also hope to shape the future of the Web."
| 0 comments | Link me |
Labels: escience, functional programming, haskell, mozilla, rdf, semantic web
The Fifth Element
In the manner of the Agile Manifesto, whilst there is value in delivering a list of features, I value achieving the outcomes of the project higher. This feels complementary to the other four agile manifesto values."
Via, Valuing Outcomes over Features.
| 0 comments | Link me |
Labels: agile, programming
Saturday, November 04, 2006
Decertification
Certifications are losing value because employers are looking for more in their workers than the ability to pass an exam; they want business-articulate IT pros. "
| 0 comments | Link me |
Labels: certification, software industry
Friday, November 03, 2006
Win Some, Lose Some
The other claim that seems to have worked out though was, "That future languages and platforms will probably be deployed on .NET and Java VMs. The competition between the two seems to have a positive impact on both - locking out any competitors. That means, there's something to look forward to in Java 7 and .NET 3." This follows the news about JRuby: "JRuby has been getting more and more attention from folks within Sun, Rubyists around the world, and especially from Java developers anxious to escape from their Java-only prisons."
| 0 comments | Link me |
Labels: continuations, java, jruby, jvm, ruby
Thursday, November 02, 2006
The Four Parts of Future Languages
1. "The inner layer is a strict functional language. All four projects start with this layer."
2. "The second layer adds deterministic concurrency. Deterministic concurrency is sometimes called declarative or dataflow concurrency."
3. "The third layer adds asynchronous message passing. This leads to a simple message-passing model in which concurrent entities send messages asynchronously."
4. "The fourth layer adds global mutable state. Three of the four projects have global mutable state as a final layer, provided for different reasons, but always with the understanding that it is not used as often as the other layers. In the Erlang project, the mutable state is provided as a persistent database with a transactional interface. In the network transparency project, the mutable state is provided as an object store with a transactional interface and as a family of distributed protocols that is used to guarantee coherence of state across the distributed system."
| 0 comments | Link me |
Labels: design, functional programming, programming languages
When Garbage Collection isn't Enough
I've previously advocated using a non-final assignment and then checking in the finally block if object is not null and then closing it. The better way is:
final Connection conn = ...;
try {
    ...
} finally {
    close(conn);
}
This idiom is detailed here. It gives reasons behind using this idiom and the rule of thumb: "place one "try-finally" directly after each resource allocated".
This was written in 2005 based on a previous Javalobby thread. I continue to see resource leaks caused by not closing resources correctly. I know of people who have made considerable money consulting to fix these kinds of bugs in large systems which I find fairly depressing.
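As a rough illustration of the rule of thumb with more than one resource, here is a small self-contained sketch. The Resource class, the log and the method names are all hypothetical stand-ins I've made up for the example (a real program would be closing connections, statements or streams); the point is only the shape: each allocation is followed immediately by its own try-finally.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Resource is a made-up stand-in that records open/close events.
class TryFinallySketch {

    static final class Resource {
        private final String name;
        private final List<String> log;

        Resource(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        void close() {
            log.add("close " + name);
        }
    }

    // One try-finally directly after each resource allocated.
    static void work(List<String> log, boolean fail) {
        final Resource in = new Resource("in", log);
        try {
            final Resource out = new Resource("out", log);
            try {
                if (fail) {
                    throw new IllegalStateException("simulated failure");
                }
                log.add("work done");
            } finally {
                out.close();
            }
        } finally {
            in.close();
        }
    }

    // Runs work() and returns the event log, noting whether it failed.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try {
            work(log, fail);
        } catch (IllegalStateException e) {
            log.add("failed");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [open in, open out, work done, close out, close in]
        System.out.println(run(true));  // prints [open in, open out, close out, close in, failed]
    }
}
```

Because both references are final and each try block starts only after its allocation has succeeded, there is no null check in either finally block, and both resources are closed in reverse order whether or not the work fails.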
| 0 comments | Link me |
Labels: design, java, jvm, tony morris