Saturday, December 23, 2006

Thank Goodness

THANK GOODNESS! About Dan Dennett's recent brush with death.

"Yes, I did have an epiphany. I saw with greater clarity than ever before in my life that when I say "Thank goodness!" this is not merely a euphemism for "Thank God!" (We atheists don't believe that there is any God to thank.) I really do mean thank goodness! There is a lot of goodness in this world, and more goodness every day, and this fantastic human-made fabric of excellence is genuinely responsible for the fact that I am alive today. It is a worthy recipient of the gratitude I feel today, and I want to celebrate that fact here and now."

"Do I worship modern medicine? Is science my religion? Not at all; there is no aspect of modern medicine or science that I would exempt from the most rigorous scrutiny, and I can readily identify a host of serious problems that still need to be fixed. That's easy to do, of course, because the worlds of medicine and science are already engaged in the most obsessive, intensive, and humble self-assessments yet known to human institutions, and they regularly make public the results of their self-examinations."

"One thing in particular struck me when I compared the medical world on which my life now depended with the religious institutions I have been studying so intensively in recent years. One of the gentler, more supportive themes to be found in every religion (so far as I know) is the idea that what really matters is what is in your heart: if you have good intentions, and are trying to do what (God says) is right, that is all anyone can ask. Not so in medicine! If you are wrong—especially if you should have known better—your good intentions count for almost nothing. And whereas taking a leap of faith and acting without further scrutiny of one's options is often celebrated by religions, it is considered a grave sin in medicine. A doctor whose devout faith in his personal revelations about how to treat aortic aneurysm led him to engage in untested trials with human patients would be severely reprimanded if not driven out of medicine altogether."

"In other words, whereas religions may serve a benign purpose by letting many people feel comfortable with the level of morality they themselves can attain, no religion holds its members to the high standards of moral responsibility that the secular world of science and medicine does!"

Wednesday, December 20, 2006

Some Design Issues

Semantic Web Road map "It is clearly important that the query language be defined in terms of RDF logic. For example, to query a server for the author of a resource, one would ask for an assertion of the form "x is the author of p1" for some x. To ask for a definitive list of all authors, one would ask for a set of authors such that any author was in the set and everyone in the set was an author. And so on."
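The "x is the author of p1" query in the quote can be sketched as pattern matching over a set of triples. This is only a toy illustration: the triples, the `authorOf` predicate name and the `match` helper are all made up here, not part of any RDF query language.

```python
# Hypothetical data: (subject, predicate, object) triples.
TRIPLES = {
    ("alice", "authorOf", "p1"),
    ("bob", "authorOf", "p1"),
    ("carol", "authorOf", "p2"),
}


def match(pattern, triples):
    """Return one binding dict per triple that matches the pattern.

    Terms starting with "?" are variables; anything else must match exactly.
    """
    results = []
    for triple in sorted(triples):
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


def authors_of(resource, triples):
    """The set of every x such that "x is the author of resource" is asserted."""
    return {b["?x"] for b in match(("?x", "authorOf", resource), triples)}


print(authors_of("p1", TRIPLES))
```

Note that this only returns the authors the store knows about; the "definitive list" in the quote is a stronger, closed-world claim that a plain pattern match cannot make on its own.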

Relational Databases on the Semantic Web "Is the RDF model an entity-relationship model? Yes and no. It is great as a basis for ER-modelling, but because RDF is used for other things as well, RDF is more general. RDF is a model of entities (nodes) and relationships. If you are used to the "ER" modelling system for data, then the RDF model is basically an opening of the ER model to work on the Web. A typical ER model involves entity types, and for each entity type there is a set of relationships (slots in the typical ER diagram). The RDF model is the same, except that relationships are first class objects: they are identified by a URI, and so anyone can make one. Furthermore, the set of slots of an object is not defined when the class of an object is defined."

Linked Data "The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data."

"So statements which relate things in the two documents must be repeated in each. This clearly is against the first rule of data storage: don't store the same data in two different places: you will have problems keeping it consistent. This is indeed an issue with browsable data. A set of completely browsable data with links in both directions has to be completely consistent, and that takes coordination, especially if different authors or different programs are involved."

Tuesday, December 19, 2006

A Sick Industry

Respecting the whole person "There's a discussion going on about women and IT, starting with a post by Richard Jones: Why there's few women in IT. Phillip Eby responded in Is porn driving women away from the computer industry?, The Real Reasons There Are Few Women In IT -- And What YOU Can Do About It and Why (Most) Men Don't Get It -- and I personally find his analysis much more useful.

Richard's original post talks about a case where someone used porn in a presentation, and he's basically saying that is "why there's few women in IT". I don't buy it."

Links to HOWTO Encourage Women in Linux.

The real question is why people (men and women) aren't going into IT - it's not just women. The IT Workforce Conundrum "...college student enrollment in computer science programs is down substantially from the 1990s -- around 20 percent according to the Computing Research Association. In fact, the peak numbers reached in the last decade reflected the Internet boom and represented twice as many CS students as there were in the 1970s. Outsourcing is helping to meet the demand of some types of IT jobs, but many companies are still hard-pressed to find qualified tech workers.

A recent PricewaterhouseCoopers (PwC) report predicts that competition for high-tech talent is going to become even more severe over the next few years as globalization absorbs the remaining technology workers around the world. The report says that while companies have been forced to look offshore in order to gain access to larger pools of talent, even this resource is not bottomless. European and Asian executives anticipate a severe shortage of tech talent within the next three years. And, according to the report, worker compensation is being equalized globally; even India and China are not considered low-cost anymore."

The PwC report says, "In an era of scarcity, technology companies will need to focus more acutely on their talent management. But today, in many areas of human capital management, executives’ self-assessments show little or no confidence in their companies’ abilities."

X for Vista

Windows Vista Vista is not a copy of OS X. The search box is in the completely opposite corner (bottom left as opposed to top right), it has gadgets not widgets, and the 3D chess game in Vista has porcelain as a board type (innovative).

Vista Wins on Looks. As for Lacks ... suggests that the copying wasn't quite perfect, "And then there’s that Sidebar, the floating layer of mini-programs. If you close one of the gadgets, you lose its contents forever: your notes in the Post-it Notes gadget, your stock portfolio in the Stocks gadget, and so on. You couldn’t save them if you wanted to. How could Microsoft have missed that one?"

Thursday, December 14, 2006

Relational OWL

As a sequel (not SQL) to relational SPARQL there's relational OWL, "In this paper we analyzed the similarities and differences between relational databases and description logics, in particular with respect to the role of schema constraints. Our analysis reveals more similarities than differences since, in both cases, constraints are just (restricted) first-order theories. Furthermore, reasoning about the schema is in both systems performed under standard first-order semantics, and it even employs closely related reasoning algorithms. The differences between relational databases and description logics become apparent only if one considers the problems related to reasoning about data. In relational databases, answering queries and constraint satisfaction checking correspond to model checking, whereas, in description logics they correspond to entailment problems."

Via All Problems Solved.

No Chance

Yangtse dolphin 'is extinct', a victim of economic explosion "Instead, the dolphin was driven to destruction by the noise of the traffic, which disrupted the sonic waves it used to navigate — it was virtually blind, its tiny eyes useless in the murky, sedimented water of the river.

In addition, overfishing had cut its food supplies by half, while the huge Three Gorges Dam had interfered with current and sandbank patterns. The World Conservation Union will only declare a species formally extinct once there is "no reasonable doubt", but Mr Pfluger said it was too late to hold out any more hope."

There's actually a blog that covers these species, based on Douglas Adams's "Last Chance to See", called "Another Chance to See". As he said, "There is one last reason for caring, and I believe that no other is necessary. It is certainly the reason why so many people have devoted their lives to protecting the likes of rhinos, parakeets, kakapos, and dolphins. And it is simply this: the world would be a poorer, darker, lonelier place without them."

The whole expedition is available in blog form at Blog Baji.org, where the race goes on to stop the extinction of another Yangtze inhabitant, the finless porpoise. They say, "The baiji is Functionally extinct. Lipotes vexillifer is the first species of cetacean – whales, dolphins and porpoises – to disappear from our globe in modern times…the first large mammal to go extinct as a result of man’s destruction of their natural habitat and resources."

At least there's still the Kakapo to be seen.

Monday, December 11, 2006

The Difference is Antijoin

I made a mistake about making a mistake. I got caught up with the fact that SPARQL's diff is not the same as relational difference, but that's not how left outer join is defined. So I'm correct that relational set difference is not compatible with SPARQL's difference as defined in "The SPARQL Algebra", but wrong that this means the definitions for left outer join are not equivalent.

So antijoin is:
( R1 difference (project(R1)(R1 join R2)) )

So using the previous example:
{{ (?x = 2, ?y = 3) }} \ {{ (?y = 3) }} is {} in SPARQL.

Now, as I said using plain difference it's:
{{ (?x = 2, ?y = 3) }} - {{ (?y = 3) }} is {{ (?x = 2, ?y = 3) }} in relational algebra.

But using antijoin the right hand side is really:
  • project ({{ (?x = 2, ?y = 3) }}) ({{ (?x = 2, ?y = 3) }} join {{ (?y = 3) }})
  • project ({{ (?x = 2, ?y = 3) }}) ({{ (?x = 2, ?y = 3) }})
  • {{ (?x = 2, ?y = 3) }}, which matches the left hand side, so the difference is {}, the same as SPARQL's.
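The decomposition above can be checked with a small sketch. It is plain Python, not JRDF's actual code: relations are sets of rows, each row a frozenset of (attribute, value) pairs, and the helper names are made up for illustration.

```python
def row(**bindings):
    """Build one row, e.g. row(x=2, y=3)."""
    return frozenset(bindings.items())


def attributes(rel):
    """All attribute names used anywhere in the relation."""
    return {name for r in rel for name, _ in r}


def join(r1, r2):
    """Natural join: merge every pair of rows that agree on shared attributes."""
    out = set()
    for a in r1:
        for b in r2:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out


def project(attrs, rel):
    """Keep only the named attributes of each row."""
    return {frozenset((k, v) for k, v in r if k in attrs) for r in rel}


def antijoin(r1, r2):
    """R1 difference (project(R1)(R1 join R2))."""
    return r1 - project(attributes(r1), join(r1, r2))


r1 = {row(x=2, y=3)}
r2 = {row(y=3)}

print(r1 - r2 == r1)             # plain difference leaves R1 untouched: True
print(antijoin(r1, r2) == set())  # antijoin removes the joinable row: True
```

Plain set difference compares whole rows, so the one-attribute row never cancels the two-attribute row; the antijoin first widens the right hand side through the join and projection, which is why it ends up empty.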

This is why JRDF looked like it was producing the right result - it was doing the same thing - it's just the operations were further decomposed. I'm sure I've done this before but for some reason I was certain I was wrong the other day. I was told that antijoin and SPARQL's diff were equivalent when it was changed but I had to question it.

There still remains an outstanding issue around null joins but maybe that's not needed either. I know that the NULLs in "SPARQL RULES!" are required for Logic Programs but it doesn't need to be there for relational algebra as long as you have the null accepting/untyped join.

So SPARQL in JRDF looks fine again.

JSF to XHTML to SVG or PDF

I haven't looked at JSF since 2004, when I thought it would be cool to combine Swing and JSF. Coming back to JSF after nearly two years, I came across Combine JSF Facelets and the Flying Saucer XHTML Renderer, which takes XHTML content and renders it using Swing; this is then transformed into an image, PDF or SVG. In the past I've used iReport, and iText directly, to generate reports (PDF and Excel). The thing at the back of my mind is always that this seems to be reproducing a lot of what HTML does, and people only really want PDF because it prints well and Excel because it has macros. That is, until there are macros built into browsers to do manipulation of data.

Thursday, December 07, 2006

Microsoft does RDF

Microsoft Connected Services Framework "The Profile Manager component provides profile management services. Connected Services Framework uses Profile Manager to store custom information about users and their preferences. Profile information is held in a Resource Description Framework (RDF) store, which is implemented by a Microsoft SQL Server database. Profile Manager provides facilities for creating and managing user profile information and for propagating profile information to Web services that cooperate in a service-oriented application."

Behold MS SPARQL: "
string sparqlQuery =
@"PREFIX ex:
CONSTRUCT {
ex:fullName ?Name
}
WHERE {
ex:fullName ?Name
}
";
"

Via, links for 2006-12-07. Microsoft using RDF and SPARQL, "...according to timbl Vista will be using XMP (RDF inside) to store metadata about photos etc."

Also, I recently noticed their implementation of EXCEPT (SQL's set difference): EXCEPT and INTERSECT (Transact-SQL).

Marx Smurf

A followup to, Papa Smurf is a Communist. Wikipedia has an article about The Smurfs and communism "Papa Smurf has a wide beard, which some feel looks like Karl Marx's. He also wears red slacks and a red cap, displaying the stereotypical color of Communism throughout the world. Despite the society's communal nature, Papa Smurf does have the ultimate authority, often overruling Brainy Smurf when he oversteps his boundaries. In several episodes when Papa Smurf is not present, the Smurf Village's utopian system destabilizes entirely."

Also, "Communism fell in Russia around the time that The Smurfs were lost from TV syndication and comic publication."

Gummi Bears, what was their game?

Wednesday, December 06, 2006

JRDF GUI Takedown

The other day I decided to remove the JRDF GUI from Sourceforge. Subsequently, there has been a need to download it again. :-) It's here for the time being:
http://jrdf.sf.net/jrdf-gui-0.3.jar

Like I've said, none of the operations implemented in JRDF match SPARQL's. In order not to throw away the work I've done I'm going to spend sometime soon renaming it and changing the syntax (to be more like Tutorial D). The name might be UQL (Unknown Query Language), RRQL (Relational RDF Query Language) or DRQL (tutorial D like RDF Query Language). Any suggestions?

Wired on Religion

So. Many. Letters "We would have sworn that our November cover story on New Atheism was going to generate a firestorm of reader criticism, delivered with a side order of brimstone just for good measure. Wrong. You all just started calmly talking. And talking. And talking. We got more responses to this article than to any piece in memory. Brimstone quotient: low."

"So we posted every last letter to our website."

Sunday, December 03, 2006

SPARQL Favours the Brave

SPARQL RULES! "Now, as opposed to [17], we define three notions of compatibility between substitutions:
  • Two substitutions O1 and O2 are bravely compatible (b-compatible) when for all x ∈ dom(O1) ∩ dom(O2) either xO1 = null or xO2 = null or xO1 = xO2 holds, i.e., when O1 ∪ O2 is a substitution over dom(O1) ∪ dom(O2).
  • Two substitutions O1 and O2 are cautiously compatible (c-compatible) when they are b-compatible and for all x ∈ dom(O1) ∩ dom(O2) it holds that xO1 = xO2.
  • Two substitutions O1 and O2 are strictly compatible (s-compatible) when they are c-compatible and for all x ∈ dom(O1) ∩ dom(O2) it holds that x(O1 ∪ O2) ≠ null.
"

They make the conclusion that only c-joins operate correctly with idempotency for join and, "Following the definitions from the SPARQL specification, in fact, the b-joining semantics is the only admissible definition, which is why [17] does not consider null values at all. There are still advantages for gradually defining alternatives towards traditional relational algebra treatment. On the one hand, as we have seen in the examples above, the brave view on joining unbound variables might have partly surprising results, on the other hand, as we will see, the c- and s-joining semantics allow for a more effective implementation in terms of Datalog rules."
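To keep the three notions straight, here is a rough sketch of the quoted definitions. It assumes a substitution is a Python dict from variable name to value, with None standing in for null; the function names are mine, not the paper's.

```python
def shared(s1, s2):
    """dom(O1) intersected with dom(O2)."""
    return s1.keys() & s2.keys()


def b_compatible(s1, s2):
    """Brave: on every shared variable, one side is null or both agree."""
    return all(
        s1[x] is None or s2[x] is None or s1[x] == s2[x]
        for x in shared(s1, s2)
    )


def c_compatible(s1, s2):
    """Cautious: b-compatible and both sides agree (null only matches null)."""
    return b_compatible(s1, s2) and all(s1[x] == s2[x] for x in shared(s1, s2))


def s_compatible(s1, s2):
    """Strict: c-compatible and no shared variable is null."""
    return c_compatible(s1, s2) and all(
        s1[x] is not None for x in shared(s1, s2)
    )


alice = {"name": "Alice"}
unbound = {"name": None}

for left, right in [(alice, alice), (alice, unbound), (unbound, unbound)]:
    print(
        b_compatible(left, right),
        c_compatible(left, right),
        s_compatible(left, right),
    )
```

In the strict check it is enough to test one side for null, because c-compatibility has already established that both sides hold the same value on every shared variable.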

I'm surprised at how badly I've understood the proposed SPARQL algebra; I didn't think that joining on nulls was being proposed (see Example 2.4 on page 9 of the PDF). Again, this conflicts with relational algebra and JRDF's implementation.

A join in relational algebra does match the suggested s-compatible one: you can't join on NULLs (because they are considered unbound values; they aren't "real", they don't exist and can't be compared for equality). As shown in the paper, it wouldn't successfully join a null name and would return only the row containing "Alice".

In SQL's 3VL logic, "NULL literally means that the value is unknown or indeterminate. One side effect of the indeterminate nature of NULL value is it cannot be used in a calculation or a comparison." This means you can't join across NULLs in SQL, and aggregate functions skip them (only COUNT(*) counts rows containing them). NULLs are also handled differently across SQL implementations. One example that I've come across before is the handling of strings and NULL values between DB2 and Oracle.
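The NULL behaviour is easy to see with an in-memory SQLite database. This is only a sketch: the table and column names are invented, and SQLite is just a convenient engine to hand, so other implementations may differ in the details mentioned above.

```python
import sqlite3


def null_join_demo():
    """Return (join result, value of NULL = NULL, (COUNT(*), COUNT(name)))."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE people (id INTEGER, name TEXT);
        CREATE TABLE nicknames (name TEXT, nick TEXT);
        INSERT INTO people VALUES (1, 'Alice'), (2, NULL);
        INSERT INTO nicknames VALUES ('Alice', 'Al'), (NULL, 'Nobody');
        """
    )
    # The NULL name on each side never satisfies p.name = n.name.
    joined = conn.execute(
        "SELECT p.id, n.nick FROM people p JOIN nicknames n ON p.name = n.name"
    ).fetchall()
    # Comparing NULL with NULL is unknown, which sqlite3 hands back as None.
    null_eq = conn.execute("SELECT NULL = NULL").fetchone()[0]
    # COUNT(*) counts rows; COUNT(name) skips the row whose name is NULL.
    counts = conn.execute("SELECT COUNT(*), COUNT(name) FROM people").fetchone()
    conn.close()
    return joined, null_eq, counts


print(null_join_demo())
```

Only the "Alice" row survives the join, which is the same outcome as the s-compatible join in the paper's example.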

It reminds me how much I hate IEEE citations (I think that's the standard) where you use [17] for the reference to Perez in this paper and it's [22] in another. What's wrong with [Perez2006] or something?

Anyway, they also suggest: a set difference or minus operator (which I've suggested SPARQL would be incomplete without, even though you can do it with FILTER and iTQL has had it for a while before that), nested queries using ASK, and using SPARQL as a rules language.

This will probably be my last post on SPARQL for a while - I'm a bit sick of continually, publicly displaying my ignorance (which I guess has occurred over the last year). I could change JRDF to fit any of the proposed implementations (and every other permutation) but I'm not sure it makes sense. I feel very far away from understanding what's going on and I doubt I will (or should) be commenting about it too much in the future.

Update: I seem to have confused s and c semantics. I've updated the paragraph related to relational algebra - it does match s-compatibility.

Good Models - My Super-Turtle is Better than Yours

  • Standards and Pseudo Standards from Celebrating OWL interoperability and spec quality. Can a standard be based on a pseudo standard? Another posting also points to "An Investigation into the Feasibility of the Semantic Web" about the feasibility of integration using ontologies.

  • Beyond Belief 2006 Session 5 includes Paul Davies (asks why the universe should even be understandable to human beings, why he's no longer a Platonist and levitating super-turtles) and Session 9 on why religion may have a place.

  • Agile Atheism "I am not agile because I don't believe in the agile religion and I don't accept its dogma. I like the engineering and planning practices that agile teams use - in the same way that I like people who do nice things (even when they do it because of fear of divine retribution). The difference is I don't want to be constrained by dogma into only doing those sensible things which are prescribed by agile. In the same way I don't like being prevented from doing sensible social things because of religious beliefs."

  • Jane's Rule for Loading Dishwashers "Compare this to test-driven development. There may be a little more effort when writing code, because you are writing programmer-tests to drive writing that code. It's an "unnatural" process, like sorting silverware into sub-bins when loading the dishwasher. But the true benefit comes later in the project (possibly just minutes or hours later), when you can rely on those programmer-tests to make refactoring safer. And since you fixed bugs during TDD, you have much less work fixing bugs later when it comes time to ship your product."

  • How can I get financial market information updated automatically to my spreadsheets?

  • 10 most intelligent / least intelligent dogs My dog is apparently the stupidest breed with regards to obedience. I'm not sure I'd call obedience a sign of intelligence (the complete opposite really).

Saturday, December 02, 2006

Strong Typing without the Typing

Functions, Types, Function Types, and Type Inference "One of the most important things to recognize about Haskell's type system is that it's based on type inference. What that means is that in general, you don't need to provide type declarations. Based on how you use a value, the compiler can usually figure out what type it is. The net effect is that in many Haskell programs, you don't write any type declarations, but your program is still carefully type-checked."

"If we look at that type, and think about what the factorial function actually does, there's a problem. That type isn't correct, because factorial is only defined for integers, and if we pass it a non-integer value as a parameter, it will never terminate! But Haskell can't figure that out for itself - all it knows is that we do three things with the parameter to our function: we compare it to zero, we subtract from it, and we multiply by it. So Haskell's most general type for that is a general numeric type. So since we'd like to prevent anyone from mis-calling factorial by passing it a fraction..."

Currently I'm enjoying the print version of "The Haskell Road to Logic, Math and Programming" (links to the PDF version). I haven't appreciated the writing style used in a math textbook this much since "Introduction to Graph Theory". I'm still waiting for Haskell to reach 0.1% popularity.
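The factorial problem quoted above is not specific to Haskell. Here is a rough Python sketch of it; Python has no static type inference, so this only shows the runtime side of the story, and `checked_factorial` is my own stand-in for what an explicit integer type signature would buy you at compile time.

```python
def factorial(n):
    """Uses only the three operations from the quote: compare to zero,
    subtract, multiply. Nothing here requires n to be an integer."""
    return 1 if n == 0 else n * factorial(n - 1)


def checked_factorial(n):
    """Reject the inputs for which the recursion can never reach zero."""
    if not isinstance(n, int):
        raise TypeError("factorial is only defined for integers")
    if n < 0:
        raise ValueError("factorial is only defined for non-negative integers")
    return factorial(n)


print(checked_factorial(5))

try:
    factorial(0.5)  # 0.5, -0.5, -1.5, ... never equals 0
except RecursionError:
    print("a fraction never hits the base case")
```

Where Haskell would loop forever, Python gives up when it hits its recursion limit, but the cause is the same: the most general "numeric" reading of the function admits arguments it was never meant to take.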

Breaking Project

SPARQL Basic Graph Pattern Matching Andy writes, "Example data:
  :a :p 1 .
  :a :p 2 .
Example query pattern:
  { ?x :p [] }
How many answers? Blank nodes are existential variables in RDF; named variables (the regular ?x ones) are universal variables. Queries don't return the binding of an existential; queries can return the binding of a named variable and the bound value of a named variable can be passed to other parts of the query (FILTERs, OPTIONALs etc) via the algebra.

In the absence of a DISTINCT in the query, are the solutions to this pattern:
  • 1 : ?x = :a
  • 2 : ?x = :a , ?x = :a
  • Either 1 or 2
"
I would say there is a fourth option here, see this example, which would be:
  • { { ?x:subject = :a, p1:predicate1 = :p, o1:object1 = 1 } , { ?x:subject = :a, p1:predicate1 = :p, o1:object1 = 2 } }
The DAWG test case, rdfSemantics-bNode-type-var, redefines the meaning of project.

If you keep track of where the variables were bound you can still count the distinct instances without redefining project.
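A rough sketch of what I mean, using Andy's example data. The representation is made up for illustration (it is neither ARQ's nor JRDF's): each match keeps the blank node's position as a hidden "_:b" column, so projection stays an ordinary projection.

```python
from collections import Counter

# Andy's example data, as (subject, predicate, object) triples.
DATA = [("a", "p", 1), ("a", "p", 2)]


def match_pattern(data, predicate):
    """Match { ?x :p [] }, keeping the blank node's binding as a hidden column."""
    return [{"?x": s, "_:b": o} for (s, p, o) in data if p == predicate]


def project(solutions, variables):
    """Ordinary projection: keep the named columns, one output row per input row."""
    return [tuple((v, sol[v]) for v in variables) for sol in solutions]


full = match_pattern(DATA, "p")   # two rows, which differ only in "_:b"
bag = project(full, ["?x"])       # answer 2: ?x = :a, twice
distinct = set(bag)               # answer 1: ?x = :a, once
per_value = Counter(bag)          # how many matches stood behind each answer

print(len(bag), len(distinct), per_value)
```

The two full rows are already distinct before projection, so the count of 2 falls out of the data rather than out of a special rule for blank nodes.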