Monday, November 28, 2005

Little Ink

  • Semantic Web as Webized Database "In the absence of the close coupling of designers, developers, users, and applications that is found in successful database implementations, what do the semantic web technologies offer in the way of establishing a shared view of the correspondence between the data and real world?" Links to Adam Bosworth's Learning from THE WEB.

  • "IRIS is a semantic desktop application framework that enables users to create a “personal map” across their office-related information objects. IRIS includes a machine-learning platform to help automate this process. It provides “dashboard” views, contextual navigation, and relationship-based structure across an extensible suite of office applications, including a calendar, web and file browser, e-mail client, and instant messaging client." Screenshots.

  • A relational algebra for SPARQL "Despite being in the Last Call stage of the W3C recommendation track, the SPARQL query language document currently lacks mathematical rigor and fails to accurately define the semantics for some cases...SPARQL doesn’t use a special value to indicate missing information, but simply leaves variables unbound. There’s no explicit heading. The SPARQL model does not, for example, distinguish between an OPTIONAL variable that is unbound in some solutions, and a variable that is not used in the query at all."
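The last point in that quote can be illustrated with a minimal sketch. It assumes solutions are modelled as plain dictionaries from variable name to value; the data and the helper names (`left_join`, `to_relation`) are hypothetical and are not taken from the SPARQL specification or any SPARQL engine.

```python
# Toy model: a solution is a dict from variable name to value.
# An unbound variable is simply a missing key; there is no NULL marker.

def left_join(left, right):
    """OPTIONAL-style join: keep every left solution, extend it when compatible."""
    results = []
    for l in left:
        extended = False
        for r in right:
            # Two solutions are compatible when their shared variables agree.
            if all(l[v] == r[v] for v in l.keys() & r.keys()):
                results.append({**l, **r})
                extended = True
        if not extended:
            results.append(dict(l))
    return results


def to_relation(solutions, heading, null=None):
    """Relational view: an explicit heading plus a special value for missing data."""
    return [tuple(s.get(v, null) for v in heading) for s in solutions]


names = [{"person": "alice", "name": "Alice"}, {"person": "bob", "name": "Bob"}]
emails = [{"person": "alice", "email": "alice@example.org"}]

solutions = left_join(names, emails)

# Bob's solution lacks "email" (OPTIONAL, unbound) and also lacks "phone"
# (never mentioned in the query); the solution itself cannot tell them apart.
print(solutions[1])
print(to_relation(solutions, ["person", "name", "email"]))
```

With an explicit heading, the unbound OPTIONAL variable becomes a visible null column, while a variable that was never part of the query is simply not in the heading.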

Faking It

A RATIONAL DESIGN PROCESS: HOW AND WHY TO FAKE IT "(1) In most cases the people who commission the building of a software system do not know exactly what they want and are unable to tell us all that they know.

(2) Even if we knew the requirements, there are many other facts that we need to know to design the software. Many of the details only become known to us as we progress in the implementation. Some of the things that we learn invalidate our design and we must backtrack. Because we try to minimize lost work, the resulting design may be one that would not result from a rational design process.

(3) Even if we knew all of the relevant facts before we started, experience shows that human beings are unable to comprehend fully the plethora of details that must be taken into account in order to design and build a correct system. The process of designing the software is one in which we attempt to separate concerns so that we are working with a manageable amount of information. However, until we have separated the concerns, we are bound to make errors.

(4) Even if we could master all of the detail needed, all but the most trivial projects are subject to change for external reasons. Some of those changes may invalidate previous design decisions. The resulting design is not one that would have been produced by a rational design process.

(5) Human errors can only be avoided if one can avoid the use of humans. Even after the concerns are separated, errors will be made.

(6) We are often burdened by preconceived design ideas, ideas that we invented, acquired on related projects, or heard about in a class. Sometimes we undertake a project in order to try out or use a favourite idea. Such ideas may not be derived from our requirements by a rational process.

(7) Often we are encouraged, for economic reasons, to use software that was developed for some other project. In other situations, we may be encouraged to share our software with another ongoing project. The resulting software may not be the ideal software for either project, i.e., not the software that we would develop based on its requirements alone, but it is good enough and will save effort."

Does writing software show a lot about human nature? Via Why There Is No Rational Software Design Process.

Saturday, November 26, 2005

JRDF is out

Another bug fix release. This time it was problems in the parser with RDF/XML literals and a fix for EscapeUtil. The first was an interesting problem, not so much in the fix itself as in trying to maneuver the code so that it only uses the standard Java SDK. The second was a simple regex change and finding out how Unicode support has changed from 1.4 to 1.5 (and how it breaks certain regexes from 1.4). It now means that JRDF works under 1.4 and 1.5 as expected.
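To make the escaping problem concrete, here is a minimal sketch of N-Triples style literal escaping. It is not JRDF's EscapeUtil; the function name `escape_literal` is hypothetical. One Unicode change between those Java releases is that 1.5 added support for supplementary characters (code points above U+FFFF), which is why the sketch handles them separately; the post does not say whether that was the actual cause of the bug.

```python
def escape_literal(text):
    """Escape a string for use inside an N-Triples style literal (sketch only)."""
    simple = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}
    out = []
    for ch in text:
        code = ord(ch)
        if ch in simple:
            out.append(simple[ch])
        elif 0x20 <= code <= 0x7E:
            # Printable ASCII passes through unchanged.
            out.append(ch)
        elif code <= 0xFFFF:
            # Basic Multilingual Plane: four hex digits.
            out.append("\\u%04X" % code)
        else:
            # Supplementary characters: eight hex digits.
            out.append("\\U%08X" % code)
    return "".join(out)


print(escape_literal("caf\u00e9"))
print(escape_literal("\U0001D11E"))
```

Iterating by code point rather than by 16-bit unit is the design choice that matters here: a supplementary character is escaped once, not as two surrogate halves.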

Both of these problems reinforce the idea that code without tests to prove it works almost always doesn't.

Microsoft has some interesting and incorrect words on TDD (from here).

UPDATE: Microsoft seems to have pulled the above mentioned article, although it's still available in Google's cache.

UPDATE 2: More comments on the Artima developer forum.

UPDATE 3: Microsoft Gets TDD Completely Wrong "Microsoft has completely missed the point of TDD. They got it wrong. Do not follow their guidelines: they will decrease productivity. You'll find that the process they've described doesn't work. If you stick with it, you'll find yourself writing increasingly bad code to work around its problems."

Doomed to Repeat History

Detecting Semantic Errors in SQL Queries, for example: "SELECT * FROM EMP WHERE JOB = 'CLERK' AND JOB = 'MANAGER'". Obviously, JOB cannot have two values at once, yet you are allowed to express it. This and other examples are queries that are simply wrong and should be detected as such. It seems that this entire area of research exists because it's possible to write semantically incorrect queries in the first place. While the first example above may only require something like better feedback at the command line, other problems, like those with HAVING and DISTINCT, exist because of the language design. It would be neat to have an SQL interpreter that would prevent these incorrect queries from being submitted (similar to the way IntelliJ or Word flag errors as you type). The paper also covers solving problems with subqueries (using Skolemization) and null values.
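As a rough illustration of catching the first example, here is a toy check for contradictory equality conditions. It is only a sketch under strong assumptions: the WHERE clause contains nothing but AND-connected `column = 'constant'` comparisons, and the function name `contradictory_columns` is made up for this post. It is not how SQLLint or the paper's method works.

```python
import re

# Matches conditions of the form  column = 'constant'
EQUALITY = re.compile(r"(\w+)\s*=\s*'([^']*)'")


def contradictory_columns(where_clause):
    """Return columns equated to two different constants in an AND-only clause."""
    if re.search(r"\bOR\b", where_clause, re.IGNORECASE):
        raise ValueError("this sketch only handles AND-connected conditions")
    seen = {}
    for column, value in EQUALITY.findall(where_clause):
        # SQL identifiers are case-insensitive, so normalise the column name.
        seen.setdefault(column.upper(), set()).add(value)
    return sorted(column for column, values in seen.items() if len(values) > 1)


print(contradictory_columns("JOB = 'CLERK' AND JOB = 'MANAGER'"))
print(contradictory_columns("JOB = 'CLERK' AND DEPT = 'SALES'"))
```

A real checker would need a proper SQL parser and reasoning over OR, ranges, joins and subqueries; the point is only that this class of error is mechanically detectable.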

From the SQLLint page.

Military Intelligence and IQL

Smart Searching "Sources of intelligence in the field include feeds from UAVs, intelligence, surveillance and reconnaissance data from a vast array of sensors and overhead platforms, signal intelligence, satellites, film and video, not to mention all the data from the open source world."

"To conduct research and analysis effectively, DIA relies on a broad inventory of technology tools from such companies as Endeca Technologies, Basis Technology, Inxight Software, Insightful, Attensity, Convera, NetOwl and Clearforest."

"Many of the search engines familiar to consumers, such as Google, are based on Boolean logic to perform a search with complex, long queries. But the Boolean language is not as expressive as it could be in the kind of query one can pose. “The InFact Query Language (IQL) can express in three words what would take 20 lines using Boolean language,” said Marchisio.

Although Insightful offers a four-hour course on using IQL, it is currently working to increase usability in order to eliminate the need for the course and to reach a broader audience among those who are not all super users or might not have the time to learn how to use the technology."

Examples of the InFact query language can be tried on the live demo here. For example: "USA > invade > Iraq - returns links to all sentences mentioning USA invading Iraq", "[organization] > win > contract - returns links to sentences mentioning who won contracts" and "* > attack > 1st Infantry Division - returns links to sentences mentioning the division being attacked".
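The shape of those queries (subject > action > object, with a wildcard and an entity-type slot) can be mimicked with a toy matcher. This is purely a sketch of the pattern idea over hand-made triples; the facts, the `ENTITY_TYPES` table and the `search` function are all hypothetical, and the real InFact engine works on parsed text and is certainly far more involved.

```python
# Hypothetical entity typing; a real system would derive this from text analysis.
ENTITY_TYPES = {"Acme Corp": "organization", "USA": "country"}


def term_matches(term, value):
    """'*' matches anything, '[type]' matches by entity type, else exact text."""
    term = term.strip()
    if term == "*":
        return True
    if term.startswith("[") and term.endswith("]"):
        return ENTITY_TYPES.get(value) == term[1:-1]
    return term.lower() == value.lower()


def search(query, facts):
    """Return the (subject, action, object) facts matching 'a > b > c'."""
    terms = query.split(">")
    if len(terms) != 3:
        raise ValueError("expected a query of the form: subject > action > object")
    return [f for f in facts if all(term_matches(t, v) for t, v in zip(terms, f))]


facts = [
    ("USA", "invade", "Iraq"),
    ("Acme Corp", "win", "contract"),
    ("insurgents", "attack", "1st Infantry Division"),
]

print(search("USA > invade > Iraq", facts))
print(search("[organization] > win > contract", facts))
print(search("* > attack > 1st Infantry Division", facts))
```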

Also in this area, and using Semantic Web technologies, is Ontology Works, which offers integration with legacy systems and terabyte-scale knowledge servers.

Friday, November 25, 2005

Google Base data belongs to us

"I’ve read the little background material on Google’s Base and still can’t see whether the material you put there can be found by other search engines. I also cannot find evidence of an API that shares any standards for tags and structure. Is Base open or closed? So far, closed."

"I wish I were hearing more noise from the microformats guys to act as competitors — or at least as pressure on Google for openness and standards."

Google Base seems like a roach motel for your data. It's not Google's data, it's your data. Danny has some thoughts on this.

Apparently, the answer is to put your data on the web. The question remains: what will the response from eBay and the database companies be?

Friday, November 18, 2005

Old Skool

This arrived in my letter box (a graphical view of the included games is here). Another year not dying now has the added benefit of being able to relive part of my childhood in joystick form. For a mere $35, there are plenty of options for hacking: putting it in an old floppy disk drive, putting it in a case as well as hooking up a PS/2 keyboard and monitor. More information on Wikipedia, schematics and on a DTV hacking forum.

Sunday, November 13, 2005

Kiwi Country

As a warning, this is not a usual post about RDF, the semantic web, Java, software development, etc.

This is just a little list I made on holiday based on my limited interactions while touring the south island of New Zealand:

  • Money - Finally a country where the denominations are the correct size - the $2 coin is bigger than the $1 coin. It annoys me when coin sizes don't follow their value - both the US and Australia are guilty of this. Another example of some sensible thinking: the emergency number is 111 rather than 000 (New Zealand was first of course). You're having a heart attack, you've got a rotary phone, which number would you prefer to call?

  • Lack of infrastructure - this is related to population I guess, but the lack of guard rails, petrol stations, dual carriageways, etc., especially around mountainous roads, was very disconcerting.

  • No newsagents - I really can't imagine living in a place that doesn't have newsagents or good book stores. In the US and Australia there's usually a healthy selection of magazines and newspapers - the only magazines that I could find were in a bookstore and there was a Mac one, a PC one and that's about it.

  • Food - was pretty much the same, with Coke tasting the same (although 20 ml less in a can), more pie shops, good white wines (chardonnay that was drinkable), chocolate fish and alcohol in supermarkets.

  • Language - while I started truncating my vowels by the end of the trip I also noticed that most people used "wee" rather than "little" a wee bit more than I was used to.

  • Another country with a long history of stupidly introduced species. Somewhat of a surprise was that most of the really bad ones are from Australia. Possums, magpies, wallabies, stoats, weasels and rabbits seem to be the main culprits for a country whose native fauna consists mainly of birds. The idea of a ferret eating a penguin alive rids you of some of the cuteness associated with mammals. I knew about the possums beforehand (which are pretty awful anyway) but the damage is pretty hideous and it's good to see some productive use is made of them, with possum fur combined with wool in clothing. The bird highlights were the kiwi, kakapo (introduced to me in Last Chance to See and recently highlighted in Kakapo Crisis), kea (also a concept extractor) and fiordland crested penguins (who have lost the fear of land based predation). The future does look good, with many of them being introduced on islands where they are safe.

  • Real estate or how beautiful places are being overrun by rich foreigners. This is not unique to New Zealand or Australia but it was sad to hear that working class locals are quickly being priced out by rich foreigners and may not be able to afford to live in the place where they grew up.

  • It would appear that the Maori culture is vastly better integrated and embraced than native culture in Australia (and many other parts of the world I would guess). There is a dedicated Maori television channel, radio programs, Maori is an official language and there seems to be much greater cultural interaction. It's not without its problems and injustices but it seems New Zealand has a much better history of treating people decently. This includes it being the first country to give everyone over 21 the vote (in 1893) and generally being a progressive socialist state. Another example, universal voting wasn't granted until the 1960s in Australia whereas in New Zealand it was there from the start (it did require voters to have individual title of the land however).

  • Water, water everywhere. I'm fairly convinced that some places, like South Australia, are fairly silly places to live, especially compared with somewhere like New Zealand - which is basically paradise without the snakes. New Zealand is almost arrogantly wet and fertile. The sheer number of animals (mainly sheep) in a paddock was impressive.

  • I don't want to reinforce stereotypes but the most annoying tourists would have to be American. It wasn't the loud complaining per se but more the insistence on understanding everything within their own cultural context. There seemed to be a lack of adapting and accepting things and it's just annoying to always suppose that something is worse just because it's different. I know this isn't an unfamiliar sentiment - the converse is often said to be true - that others don't stand up for themselves enough. I'm trying to be even handed here, maybe everybody does it and sure not all Americans are the same, maybe it's just the ones that go on holidays, maybe I'm a cultural fascist, but it was really, really annoying.

  • And one last thing, the iPod compatible bed.

Monday, November 07, 2005

Lazy Links

* MKSearch Beta 1 Released - includes web crawler, HTML metadata extractor, and RDF storage using Sesame. Also, MG4J (Managing Gigabytes for Java).
* Open Source Java Application Management: BlueGlue and MyJavaPack. A little different to Ivy (also interesting IvyCruise).
* The Concurrency JSR-166 Interest Site includes interesting posts like Java Memory Model versus dotnet Memory Model, which mentions the forthcoming problems with Java code when multi-core systems are rolled out. Also, Concurrent Skip List Map (coming to Java 6).