Wednesday, January 31, 2007

Semantic Spreadsheet

What happens when you combine a Google spreadsheet of Semantic Web tools with Simile Exhibit? You get Sweet Tools (Sem Web).

The only problem seems to be that it only shows the first 10 or all results - it'd be good to have a previous/next button.

Tuesday, January 30, 2007

SPARQLing AJAX

Dojo Data "As of January 2007, we have five simple datastores, which are included in dojo as example datastore implementations...dojo.data.RdfStore a read-write store that uses SPARQL to talk to RDF data servers including, for example, the Rhizome RDF application server"

From the source of RdfStore: "RdfStore provides a dojo.data Store for querying and updating a server that supports the SPARQL Query Result JSON format. (see http://www.w3.org/TR/rdf-sparql-json-res/). It also maps RDF datatypes to Javascript objects. RdfStore makes following assumptions about the Result JSON: (1) The result always contains 3 bound variables named "s","p", and "o", and each result binding is treated as an RDF statement. (2) When saving changes to the store, the JSON "results" object will also contain a "deleted" key whose value is a list of deleted RDF resources."
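A minimal Java sketch of assumption (1), not RdfStore's actual code: each result binding with the variables "s", "p" and "o" is read as one RDF statement. The Statement record and the SpoBindings.toStatements helper are hypothetical names of my own, and the bindings are modelled as already-parsed maps from variable name to value rather than raw JSON.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical value type for one RDF statement (not part of dojo or any RDF library).
record Statement(String subject, String predicate, String object) {}

class SpoBindings {
    // Treats every binding as one statement and insists that
    // "s", "p" and "o" are all bound, as the RdfStore comment assumes.
    static List<Statement> toStatements(List<Map<String, String>> bindings) {
        List<Statement> statements = new ArrayList<>();
        for (Map<String, String> binding : bindings) {
            String s = binding.get("s");
            String p = binding.get("p");
            String o = binding.get("o");
            if (s == null || p == null || o == null) {
                throw new IllegalArgumentException("binding must bind s, p and o: " + binding);
            }
            statements.add(new Statement(s, p, o));
        }
        return statements;
    }
}
```

A real client would also have to carry the type and datatype of each term, which this sketch drops for brevity.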

Update: OAT: OpenAjax Alliance Compliant Toolkit "Ondrej Zara and his team at Openlink Software have created a Openlink Software JS Toolkit, known as OAT. It is a full-blown JS framework, suitable for developing rich applications with special focus to data access."

"OAT also provides Data Aware controls for the above that include...SVG based RDF Graph Control".

Only works in Opera and Firefox - IE doesn't support SVG.

"OAT is Open Source and GPL’ed over at sourceforge and the team has recently managed to incorporate our OAT data access layer as a module to dojo datastore."

Via Planet RDF.

Wednesday, January 24, 2007

Patterns in Software - Part 2

Patterns of Software: Tales from the Software Community (freely available in PDF). "Quality Without a Name" spends a lot of time taking Alexander's objective "meaning for beauty, for the aliveness that certain buildings, places, and human activities have" and applying it to software. This chapter also has a fairly good definition of patterns:

Patterns certainly have an appeal to people who wish to design and construct systems because they are a means to capture common sense and are a way to capture abstractions that are not easily captured otherwise.


The second thing (and second paragraph) is a re-definition, for most people in the software industry, of who the users of the software are:

...when you read Alexander, it is clear that a “user” is an inhabitant—someone who lives in the thing constructed. The thing constructed is under constant repair by its inhabitants, and end users of software do not constantly repair the software, though some might want to.


So these are not the regular definitions of quality or user. It's certainly not the TQM definition of quality: the fitness to a standard or set of requirements. It's much closer to the principle of quality in XP and other agile methodologies. Meeting requirements is rejected as the definition of quality because requirements are often contradictory. He quotes Alexander on his experience with the Bay Area Rapid Transit (BART) system:

So it became clear that the free functioning of the system did not purely depend on meeting a set of requirements. It had to do, rather, with the system coming to terms with itself and being in balance with the forces that were generated internal to the system, not in accordance with some arbitrary set of requirements we stated...What bothered me was that the correct analysis of the ticket booth could not be based purely on one’s goals, that there were realities emerging from the center of the system itself and that whether you succeeded or not had to do with whether you created a configuration that was stable with respect to these realities.


He also takes some time using other (somewhat confusing) words such as alive, whole, comfortable, free, exact, egoless, and eternal to further refine what "the quality without a name" is. The purpose of this is clear though: it's to rediscover objective quality or the combination of fact and value.

We in software are not so lucky—all of our artifacts were conceived and constructed firmly in the system of fact separated from value. But there are programs we can look at and about which we say, “no way I’m maintaining that kluge.” And there are other programs about which we can say, “Wow, who wrote this!” So the quality without a name for software must exist.


Some of them I'd consider fairly vague but others have some parallel with XP principles and similar ideas. In order to describe how a system is alive Gabriel uses the metaphor of a fire and a fireplace. The structure of the logs, chimneys and so on is a well thought out system which supports a self-sustaining process that once set in motion reaches a predefined end - a small pile of ashes. And "whole" is a property where something is self consistent. This is what occurs when code is reflected upon and refactored. And while Gabriel doesn't spend much time talking about a system being egoless, this concept should be familiar to most agile practitioners. XP and other methodologies encourage the idea of egoless programming and collective ownership.

These concepts and "the quality without a name" all seem to me to be descriptions of development methodologies, practices, principles and values and the particular kind of software that is developed as a result of applying these processes. That is, software with this quality is a result of a software process with the same quality.

He ends this chapter with "some things" (I'd say requirements) for software which possesses the "quality without a name". I'd summarize it as:
  • Not written under unrealistic deadlines.
  • Modules and abstractions not too large and small enough to understand and remember what they do.
  • Code constantly repaired (he'd probably say refactored today).
  • A fractal nature of code - looking at the large or the small the code is coherent.

He ends it with this lament:

I wish we had a common body of programs with the quality, because then we could talk about them and understand. As it is, programs are secret and protected, so we rarely see any but those we write ourselves. Imagine a world in which houses were hidden from view. How would Alexander have found the quality with no name?


Since this was written there's been a lot of code made public but I still can't think of any that I've really looked at and appreciated (and in fact I tend to avoid looking at Java code written under the Apache licence for example). Any suggestions?

Tuesday, January 23, 2007

Good Singletons

Singletons - we’re better off without them

I was first introduced to the singleton pattern as an alternative to global variables (”global variables are bad, use a singleton instead”). But this is one instance where singletons emphatically should not be used. Replacing a global variable with a singleton is just a lazy way of avoiding global variables without avoiding any of the problems inherent in global variables.


He goes on to list the reasons not to use singletons this way, which include (in my order of importance): difficult to test, inhibits code reuse, breaking modularity, inadvertent changes occur between accesses and multiple references.

A singleton is responsible for both its behaviour and for ensuring that only one instance of it exists. In other words, it is responsible for two unconnected activities. This should usually be avoided and the unconnected activities should be implemented in different classes.
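One way to read that advice, as a minimal Java sketch under my own assumptions (Counter, CounterHolder and Invoice are hypothetical names, not from the article): one class carries only the behaviour, a second class decides that a single instance is shared, and collaborators are handed the instance instead of looking up a global.

```java
// Behaviour only: nothing here enforces "one instance".
class Counter {
    private int value;

    int increment() {
        return ++value;
    }
}

// Lifetime only: this class decides that a single Counter is shared.
class CounterHolder {
    private static final Counter SHARED = new Counter();

    static Counter shared() {
        return SHARED;
    }
}

// Collaborators receive a Counter rather than reaching for a global,
// so a test can pass in a fresh instance.
class Invoice {
    private final Counter counter;

    Invoice(Counter counter) {
        this.counter = counter;
    }

    String nextNumber() {
        return "INV-" + counter.increment();
    }
}
```

Production wiring would call new Invoice(CounterHolder.shared()), while a unit test can use new Invoice(new Counter()) and never touch shared state.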

One View

MacFUSE Release Opens Up File Systems on Mac OS X "Some examples of applications using virtual file systems through FUSE include GmailFS, which allows users to set up a Gmail account as a local disk, and SSHFS, which allows users to interact with files on a remote computer via SSH (Secure Shell).

Singh said Mac OS X users could also use MacFUSE to mount Windows hard drives."

The demo video shows using RSSFS (view of RSS feeds), ProcFS (view of OS processes), DocsFS (view of Google Docs and Spreadsheet), SpotlightFS (view of Spotlight results) and PicassaFS. The project web site lists others.

Monday, January 22, 2007

Know Fear

Irene Khan -- 2006 Sydney Peace Prize (mp3)

It’s December 2001. I am in a hot and dusty Afghan refugee camp in Pakistan. The Taleban have been defeated and there is jubilation among the Afghan refugees. Refugee women are climbing into the buses that will drive them home...These women understand only too well the real horror of war but they also know that peace is much more than merely an end to fighting. Sitting on the bus next to Zubaida I ask, “What will you do when you return home?” She does not hesitate for a second. Clutching her baby close, she looks me straight in the eye, and says “I want to go to school. Some day I will be a scientist.”

What an amazing answer! Here is this woman discreetly covered from head to foot in a blue burqah, but there is nothing hidden about her message. She is telling me that peace is not a matter of military victories; it is about equality, justice and freedom for women as well as men. It is about creating the possibility for every human being to reach their full potential. And it is about hope.

Fast forward two years to July 2003. I am in Kabul now but I can’t find Zubaida or, for that matter, any woman studying science. Instead I find a fortress town guarded by American troops: a country caught in the grip of warlords and drug barons, torn by insecurity, afflicted by extreme poverty. I sense the fear in women activists as they tell me of the abduction of young girls from homes and schools, and of rampant sexual violence.

Later I am taken to a prison in Kabul, crowded with women and girls accused of adultery, or of wanting to marry the man of their choice or of running away from brutal husbands.


I am telling you this story about Afghanistan because what I saw in Kabul is, in a microcosm, what I see happening across our world today; a world in which peace is being redefined, in the interests of the powerful and the privileged, at the expense of the poor and the marginalized.

A new agenda is in the making in which the rules are being rewritten for the greater security of a few, while the actual sources of insecurity that affect the lives of many more are ignored. The “war on terror” dominates while sexual terror is ignored, even though it affects millions of women and girls around the world, in bedrooms, on battlefields, and in workplaces.

Persistent JRDF

After thinking about it many times I've started to integrate a persistent store for JRDF. Now I know Kowari/Mulgara would be the obvious choice but I decided to try something different - Apache Derby.

A recent post on the Sesame developers list about persistent blank node maps got me into action as well as all this talk about XA2 of course (JRDF's modified RIO parser in Kowari/Mulgara uses a persistent StringToLongMap).

One of the reasons to use Derby is that it has an XAResourceManager, though I don't expect it to scale or be as fast as most stores (even if the table size is supposed to be unlimited).

I couldn't find an easy way to create a DiskHashtable (although the TestDiskHashtable gave some clues). Here's how I managed to get a persistent DiskHashtable going called derbyDB (it may not be quite right of course):
// Load the embedded driver and open (or create) the derbyDB database.
String driverStr = "org.apache.derby.jdbc.EmbeddedDriver";
EmbeddedDriver driver = (EmbeddedDriver) Class.forName(driverStr).newInstance();
final EmbedConnection30 connection = (EmbedConnection30) DriverManager.getConnection(
    "jdbc:derby:derbyDB;create=true");
// Reach into Derby's internal API to get the transaction controller.
final LanguageConnectionContext languageConnectionContext = connection.getLanguageConnection();
languageConnectionContext.setRunTimeStatisticsMode(true);
TransactionController controller = languageConnectionContext.getTransactionExecute();
// Make the connection's context manager current before creating the table.
ContextService service = ContextService.getFactory();
service.setCurrentContextManager(languageConnectionContext.getContextManager());
// TEMPLATE and INDEXES are constants defined elsewhere (not shown here).
DiskHashtable diskHashtable = new DiskHashtable(controller, TEMPLATE, INDEXES, true, true);


I have some ideas that it might be possible to take the relational RDF operations and put them into Derby (or vice-versa). The datatype support would be nice to leverage (especially the XML datatype for example). It's very preliminary at the moment and I may ditch it in the future.

Friday, January 19, 2007

Patterns in Software - Part 1

Patterns of Software: Tales from the Software Community (freely available in PDF format) is something I came across last time I was talking about design patterns.

First off, I think this book is very well written - style, language, structure - it's all good. I wish I could write as well as this guy. Christopher Alexander (the guy credited with inventing patterns) is in the preface.

Secondly, I'm going to cover the chapters that I've liked as I get to them. The first is "Abstraction Descant".

In this chapter he introduces the concept of compression which is: "...the characteristic of a piece of text that the meaning of any part of it is “larger” than that particular piece has by itself."

Examples are macros, function or class names. There is a danger though that compression and abstraction can be overused.

"The problem is that people are taught to value abstraction above all else, and object-oriented languages and their philosophy of use emphasizes reuse (compression), which is generally good. However, sometimes the passion for abstraction is so strong that it is used inappropriately — it is forced in the same way as it is with larger, more complex, and typically ad hoc abstractions."

"Another problem with complex abstraction arises from the observation that abstractions are about ignorance...Some complex abstractions, however, contain information about the implementation that is legitimately required, such as its performance, the algorithm, coding tricks, and resource usage—keep in mind that almost all interaction issues are about resource conflicts."


The next point is that software evolves through time, in little pieces, by programmers not designers. The danger is that designers use abstractions (usually in ignorance) whereas coders should write code and design it in order to make it more habitable.

"...creating habitable software that can be effectively maintained, recognizing that the reality of software development is piecemeal growth and to plan accordingly, and to understand that the power of object-oriented programming is compression, which carries a terrific requirement for careful use of inheritance—relate to how we use abstraction and how much we use it."

"In programming, if a set of large abstractions does nearly the right thing, it is tempting to use them and to bend the structure of the surrounding program to fit them. This can lead to uninhabitable programs."

"Worse: You can fight this temptation and choose not to use them. This choice also can lead to uninhabitable programs because you will be using parts similar but subtly different from possibly familiar ones. The only way to avoid this is to use small blocks rather than large ones, or to use blocks well-designed and tested by experts."


He also has a comment on teaching programming where we need to "learn from the classics".

"How much time do we spend reading in our ordinary education? And from our reading we gain a foundation for writing...But in programming we just learn the language and solve a bunch of short puzzles. Sort of like writing 50 limericks and then off to write books."


The next point took me a while to digest but I think it's probably the most useful part of the chapter. It is that we rarely spend time creating new control abstractions (like loops) and spend most of our time creating abstractions of data. He argues that the two should go hand in hand.

"Let’s look at another problem with abstractions: Data and control abstractions are generally best when they are codesigned and this is rarely done anymore. Consider, for example, the Fortran abstractions of arrays and iteration. Arrays are abstractions designed to represent vectors and matrices. Iteration is a control abstraction useful for traversing vectors and arrays. Think, for example, of how easy it is to implement summation over the elements of a vector. This is because arrays and DO loops were codesigned."

"But an interesting thing happened to Lisp in the early 1980s: the use of macros to define control structures became forbidden style. Not only did some organizations outlaw such use of macros, but the cognoscenti began sneering at programmers who used them that way. Procedural abstractions are acceptable, but not control abstractions. The only acceptable control abstractions in Lisp today are function invocation, do loops, while loops, go statements (sort of), non-local exits, and a few mapping operations (such as mapcar in Lisp)."

"Regardless of what you make of this view of data versus control abstraction, it is certainly true that because almost every programming language does not allow any sort of meaningful user-defined control abstractions, there is always a mismatch in abstraction levels between control and data. If there is a good reason for allowing data abstractions, why isn’t that a good reason for allowing control abstractions; and if there is a good reason to disallow control abstractions, why isn’t that a good reason to disallow data abstractions?"
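To make the codesign point concrete, here's a small Java sketch (my own example, not Gabriel's): a data abstraction, a hypothetical Grid class, ships with its own traversal, forEachCell, so summation is a one-liner in the same way that a DO loop fits a Fortran array.

```java
// Hypothetical data abstraction: a grid of doubles.
class Grid {
    // Callback type used by the traversal below.
    interface CellVisitor {
        void visit(int row, int col, double value);
    }

    private final double[][] cells;

    Grid(double[][] cells) {
        this.cells = cells;
    }

    // Control abstraction designed together with the data:
    // callers never write the nested index loops themselves.
    void forEachCell(CellVisitor visitor) {
        for (int row = 0; row < cells.length; row++) {
            for (int col = 0; col < cells[row].length; col++) {
                visitor.visit(row, col, cells[row][col]);
            }
        }
    }

    // Summation falls out of the traversal; the one-element array
    // is only there because lambdas cannot assign to a local variable.
    double sum() {
        double[] total = {0.0};
        forEachCell((row, col, value) -> total[0] += value);
        return total[0];
    }
}
```

This is still a procedural abstraction dressed up with a callback rather than a true user-defined control structure of the kind Lisp macros allow, which is roughly the mismatch the quote complains about.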


Finally a word on the general use of patterns.

"Common patterns are similar in nature though not detail to the patterns that Christopher Alexander uses in his so-called pattern languages. A pattern language is a language for generating buildings and towns with organic order. Patterns generally specify the components of a portion of a building or a place and how those components are related to other patterns."


More of this is covered in "The Failure of Pattern Languages" but the next part will be "The Quality Without a Name".

Wednesday, January 17, 2007

Chatting and Cheating Chooks

So I've known for a while after seeing "Cheating Chooks" that: "Chickens are underrated. In fact, they have a complex communication system which includes over 20 different signals, and the males at least are practiced in the art of deceit."

Late last year Chris Evans reported that chooks have representational language, "This shows that the call triggers other chickens to look for specific information – in this case, whether or not they already know there is food about – and to respond appropriately, researchers claim. This is similar to how human language works, they say.

Such “representational” communication has been demonstrated in some primates before, but never in a bird."

Tuesday, January 16, 2007

Creating your own Peep Show

Breaking copy protection for entertainment "I bought my wife Peep Show series 3 on DVD for Christmas (for the non-Brits, this is not what you think). It doesn’t play in our desktop computer or her laptop, and most certainly doesn’t play in our DVD player (which is a first-gen PlayStation 2)."

"Dear Channel4 and Macrovision, you’ve just forced me to rip my own DVD in order to watch it. What’s wrong with this picture?"

Spring's ObjectFactory

More fun with Spring scopes "This time I would like to show another interesting application for the custom bean scopes. In this case conversation is bound to the page in web application (or even to each unique set of request parameters for that page). Practical examples include per-page caching of the static data (i.e. for Ajax use), allow page visitors to interact or edit page content together and many others..."

The Spring Framework reference has another example of using the ObjectFactory, "Knowing who you are". As noted, it doesn't move away from requiring Spring but it is slightly better than being BeanFactoryAware. The limiting factor is that if you need to create two types of objects you need two object factories. A generic object factory could do the trick and generics could be used to remove the casting too.
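A minimal sketch of what such a generic factory might look like in plain Java. TypedObjectFactory, register and create are hypothetical names of my own; this is not Spring's ObjectFactory interface, and it doesn't hook into bean scopes at all.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical generic factory: one instance can create several types.
class TypedObjectFactory {
    private final Map<Class<?>, Supplier<?>> suppliers = new HashMap<>();

    // Associates a type with the code that creates it.
    <T> void register(Class<T> type, Supplier<? extends T> supplier) {
        suppliers.put(type, supplier);
    }

    // Class.cast does the checked conversion, so callers need no cast.
    <T> T create(Class<T> type) {
        Supplier<?> supplier = suppliers.get(type);
        if (supplier == null) {
            throw new IllegalArgumentException("no supplier registered for " + type.getName());
        }
        return type.cast(supplier.get());
    }
}
```

Usage would be something like factory.register(StringBuilder.class, StringBuilder::new) followed by StringBuilder sb = factory.create(StringBuilder.class), with no cast at the call site.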

A recent InfoQ article, "Spring 2.0: What's New and Why it Matters", also has a section about the new bean scopes.

Announcing Andrae

lca: Andrae Muys on RDF "On the first day of linux.conf.au, I ran into Andrae Muys. He hacks Java and RDF for clients who want semantic web hackery done. I have to admit that early Semantic Web hype put me off: it sounded too much like 1970s AI hype. Andrae was interesting, though, and completely free of the wide-eyed uncritical enthusiasm that characterized a lot of my early RDF engagement."

"Andrae runs the Mulgara project, a Java RDF store. His goal is to be able to deal with 1E13 statements (aka tuples, facts, assertions) in three years. It'll do 1E9 right now, next stop is 1E11. He refers to this goal as "3 Ts: three trillion triples". A consortium is forming around Mulgara to make this happen: if it coalesces, Andrae will be the coder to make it happen."

"The next version of Mulgara, 1.3, will ship in February and have this relational mapping in it. A quick Google search shows a lot of RDF-relational mappings going on, but the list of other mappings he had impressed me: Lucene, RSS, mbox, ID3."

"I think it's time I looked again at the world of RDF. They may yet be doing interesting things. I said as much to Andrae and he replied, "I am an engineer. In the early days it was scientists and logicians in RDF. Now the engineers have arrived, and we just want it to work and to scale." Bold claim! If you have a favourite RDF package or practice, let me know in the comments."

Andrae also announced the paper presented at linux.conf.au. Among other things it references David Wood's paper presented in 2004 "Scaling the Kowari Metastore" ("Makepeace" is a good search term).

Monday, January 15, 2007

Quality is Free

The business value of software quality "Organizations that develop low-quality software, whether for internal use or for sale, are always looking backward, spending time and money on fixing defects in "finished" products."

"One common misconception about quality is that it can be traded for improved development speed, reduced development budgets, or added functionality. In practice, however, most organizations find the opposite is true. In the long run, improved quality enables teams to deliver more projects on time, at lower cost, with more features. We should realize that Meskimen's law -- "There's never time to do it right, but always time to do it over" -- is a tongue-in-cheek adage for a reason. A development team that continuously ensures quality does it right the first time. By not introducing defects into the system throughout the entire development process, a team eliminates the time and cost required to find and fix those defects later on."

And why don't customers care about quality: "The simple answer is: “Because defective software works.” The reason it works, however, is because software doesn’t wear out, rot, or otherwise deteriorate. Once it is fixed, it will continue to work as long as it is used in precisely the same way."

Sunday, January 14, 2007

The Software of Star Wars

Code Reads #7: Parnas's "Star Wars" paper links to the well known paper, "Software Aspects of Strategic Defense Systems", where he quotes a description of what a software developer actually does, "How, then, do we end up with any big programs that work at all? "The answer is simple: Programming is a trial and error craft. People write programs without any expectation that they will be right the first time. They spend at least as much time testing and correcting errors as they spent writing the initial program.""

"With these observations in mind, Parnas casts a cold eye on the SDI project, which aimed to produce working systems that could identify and target incoming enemy missiles in a matter of minutes. The system couldn't be tested under real-world conditions; it would be expected to function effectively even when some of its pieces had been disabled by enemy attack; and it was intended to be foolproof (since, with incoming H-bomb-armed ICBMs, 90 percent wasn't good enough). No such system had ever been built; Parnas maintained that no such system could be built within the next 20 years, using either existing methods or those (like AI or "automatic programming") on the horizon."

Logic is highlighted as the mathematics of programming, much like continuous mathematics in electrical and mechanical engineering. A lot of the reason why SDI was infeasible was the requirements gathering - it may seem that the customer is a ruthless dictatorship with its finger on total annihilation - but that's rarely true.

There are a couple of important questions that probably have better answers now. Can we provide proofs for code larger than 500 lines (in whatever language)? Is hardware failure considered in these proofs? Is rule-based programming more efficient?

I'm glad I found this article. I vaguely remember hearing that the SDI project was not technically feasible (games like this demonstrate the error of linear scaling) but I had no idea that one of the problems was the development of the software (obvious in retrospect). This is slightly different to the management lessons learnt about the space shuttle but similarly illuminating.

The author of the initial blog entry also interviewed Joel Spolsky, who mentions the idea of programmer fatigue in the interview. The author's book, "Dreaming in Code", is available on the 16th of this month; it talks about software development and the Chandler project. Maybe setting out to be the OS/360 project of the next generation?

Friday, January 12, 2007

Allchin's Buying a Mac in Context

Allchin's 'Buy a Mac' E-Mail Exposed "Jim Allchin's "I would buy a Mac" statement now has context. The e-mail is publicly available."

Allchin says: "I am not sure how the company lost sight of what matters to our customers (both business and home) the most, but in my view we lost our way. I think our teams lost sight of what bug-free means, what resilience means, what full scenarios mean, what security means, what performance means, how important current applications are, and really understanding what the most important problems are customers face are. I see lots of random features and some great vision, but that doesn't translate into great products.

I would buy a Mac today if I was not working at Microsoft."

Thursday, January 11, 2007

Ob. iPhone Comment

Apple's Son of Newton "Gibson learned that, like iPod, iPhone will have a non-removal battery. Let me repeat: The battery is fixed. Non-removable battery is a shortcoming, if only for a device with only five hours talk time and functions like Web browsing and music listening that sap power. Heck, Cingular sells the BlackJack with a spare battery in the box."

"With Zune, Microsoft adopted a more end-to-end approach of providing all the pieces (albeit, hardware from Toshiba). The Windows Mobile business model is Microsoft software and partner hardware. If Microsoft had considered releasing a Zune branded phone, greater debate inside the company is sure to follow."

So Apple's move seems quite smart compared to Microsoft's Zune and Smartphone combination. A lack of removable battery isn't so bad when you have cars and airlines all providing iPod integration. It also reminds me of Douglas Adams' lament for a standard power adapter. The iPod dock connector provides a variety of 3.3, 5 and 12 volt pins depending on whether it's plugged into FireWire or USB. If only FireWire had been that standard.

Microsoft isn't all bad: it announced, to much fanfare, the next version of Office for the Mac.

Wednesday, January 10, 2007

A Quick Overview of Relational SPARQL

I've put a quick overview of the relational SPARQL operations that I developed in JRDF onto the Google Code Wiki, called Relational SPARQL Operations. This is just a reworking of what I've done in my thesis in, hopefully, a more digestible way. I hope I'll have time to expand it and the JRDF Wiki generally to include more documentation on things like the other features of JRDF.

Tuesday, January 09, 2007

Upgrading the Road

When Product Cycles Collide "The auto makers have looked over the fence at the short product cycles of industries like the cellular phone (typical replacement interval 18 months), and are coveting the opportunity to sell someone a new car every two to four years instead of every five to 15. This would attack a trend that's bound to be concerning the auto makers: In 2005, the median age of cars on the road in the United States was almost 9 years, up from 6.5 years in 1990 and 5.1 years in 1969."

I wonder if taking the entire mobile phone model would be good for car makers. They could provide an electric car on a contract and sell access to the required infrastructure. The product cycle probably wouldn't be 18 months but might be less than 9 years (however long those batteries last).

Monday, January 08, 2007

Three Links Plus One

  • Major revision (1.01) of the Music Ontology Scooping the goodness inside as I'm currently designing an ontology (online instruments that are net ready). I'm after any well designed ontologies at the moment (and their success in production).

  • Unrolling nested queries Ahh the quest for answer closure continues.

  • AllegroGraph and TopBraid 1.5 and Calendar Mashup in TopBraid Composer Currently using SWOOP and Protege but TopBraid is impressive (and expensive). "More recently they have started to use their Lisp platform to develop Semantic Web technology solutions. AllegroGraph is one of their Semantic technology flagship products, and they have done some great progress with it in recent months. From what I have seen, AllegroGraph has really good performance and is now (as far as I know) the best professional RDF triple store on the market. They even have a free entry-level version of AllegroGraph, that scales to up to 50 million triples."