Tuesday, April 30, 2002

Data mine or yours?

Now, say that one of the analysts in your company combines some of the facts in that commercial database with facts from other sources and develops a new database she calls the "XYZ Combined Database." She distributes this to coworkers. Later, your company decides to make the XYZ Combined Database available to third parties. Is your company's internal use of the XYZ Combined Database illegal? What about its distribution to people outside the company?

According to Feist Publications Inc. vs. Rural Telephone Service Co. Inc. 499 U.S. 340 (1991), a U.S. Supreme Court decision, both the use and the commercial distribution of your XYZ Combined Database would be permitted, as long as you have only "extracted" facts from the commercial database you used to help compile it—and as long as you have not copied an original, copyrightable selection or arrangement of those facts.

In Feist, the Supreme Court held that although copyright protects the original selection and arrangement of a compilation of facts, it does not protect the facts themselves, even though the compiler of the collection may have invested substantial funds, labor or both in collecting and compiling them.

Is Money Relevant to Software?

As a rule, most blogs are boring. However, this guy is from Microsoft and, strangely enough, complains about the GPL (a type of software licence) as being anti-money. He argues that money is the best way for people to tokenize value. While this is true, he states: "All people who seek to add value to the code are expressly legally forbidden from ever releasing that value into society in a way that can be measured objectively." So there's no value in using the software? The one distribution channel where you can charge for software is basically the MS OS tax. Distribution over the Internet is essentially free.


He also wrote an article about the semantic web:

Monday, April 29, 2002


Apparently, blogging is supposed to be for self-promotion. Version 0.4 of my long-suffering Honours project, RCOSjava, was released yesterday. If you're interested in applets and teaching operating systems, it's there now. Apparently, there's another paper on it to be presented at EdMedia in late June.

EdMedia web site:

The RCOSjava paper:
EdMedia Paper

I guess I should link to it too:

Friday, April 26, 2002

Global Morality


This is basically a bunch of (English) economists, journalists and others trying to make sense of September 11 and other recent atrocities (Rwanda, Chechnya, Kosovo, etc). Some of the interesting points were: the different types of wars now waged, how you can have morality without religion, the economic value of people (and how that brings equality) and how to prevent these things.

They talk about how the recent atrocities were not committed by States but by networks/organisations (although this was disputed, given Pakistan's support for Al Qaeda and the Rwandan government's support for the militia).

It's interesting historically that it's largely the English telling the Americans what to do. Rudyard Kipling wrote a poem called "The White Man's Burden". I don't read poems, but I was given a book called "The Boy's Own Annual", which was published in 1917. Apart from jolly good stories like "In the Power of Pygmies" and "A Missionary to the Cannibal Islands", there was an article called "Empire Citizens", which was inspired by that poem.

In that article they talk about how "...the man with the white skin, and more especially the Briton, practically bore the burden of the world." It goes on to say how England gives India, Africa, Canada and New Zealand what they need (what the native inhabitants need, that is). "Believe me, the strongest navy in the world cannot keep an empire secure unless it is built upon the rock of righteousness."

In the end it seems to me that the mistake made was not valuing the local inhabitants as equals. It seems to me (and this is what the radio program was saying) that the same mistake is being made all over again one hundred years later.

First hit from Google on "The White Man's Burden" (it's got some good stuff like critiques made at the time):

Price of Life

This was on the midday news, about the victims (not the allied troops) of the Afghan conflict. The figure was US$1,000 in compensation for every family member that was killed by allied troops. Although if you were a family member of an Afghan fighter who helped the allies, apparently you got slightly more.

"But the CIA is reported to have begun distributing compensation of about $1,000 (£700) to the bereaved relatives, in what appeared to be the clearest admission so far that something had gone badly wrong."


While this web site talks about US$10,000:

The US government gives its citizens a considerably better time:

I think the minimum (before offsets) is about $500,000 up to nearly $5,000,000. As I've noted before some people think women's suffrage, gay rights and civil rights have all spawned from assigning equal value to human beings.

RDF Author and other fancy stuff

There's a Swing version of RDFAuthor, although it appears to be quite a few months older than the most recent version of the OS X one:

I still couldn't find out how this was related, but it basically uses RDF and Flash to build a GUI that shows how different board members and companies in the Fortune 100 are related. There's even the Coke and Pepsi relationship.
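
As a rough illustration of why RDF suits this kind of "who is connected to whom" display, here is a minimal sketch that stores board memberships as subject/predicate/object triples and finds companies linked through a shared director. The names, the `sitsOnBoardOf` predicate and the helper functions are all made up for the example; this is not how the Flash tool itself works.

```python
from collections import defaultdict

# Hypothetical data: (subject, predicate, object) triples, as in RDF.
TRIPLES = [
    ("Director X", "sitsOnBoardOf", "Company A"),
    ("Director X", "sitsOnBoardOf", "Company B"),
    ("Director Y", "sitsOnBoardOf", "Company B"),
    ("Director Y", "sitsOnBoardOf", "Company C"),
    ("Director Z", "sitsOnBoardOf", "Company C"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def linked_companies(triples, company):
    """Map each company sharing a board member with `company` to those directors."""
    links = defaultdict(set)
    # First find everyone on this company's board...
    for director, _, _ in match(triples, p="sitsOnBoardOf", o=company):
        # ...then every other board each of those people sits on.
        for _, _, other in match(triples, s=director, p="sitsOnBoardOf"):
            if other != company:
                links[other].add(director)
    return dict(links)
```

Calling `linked_companies(TRIPLES, "Company B")` gives Company A (via Director X) and Company C (via Director Y), which is the sort of edge list a GUI could then draw.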


I did find though the guy that is the "RDF Advisor to the Creative Commons project led by Larry Lessig" (Larry!!) and co-authored RSS 1.0 (we won't hold that against him). He's been working on a decentralized triple store written in Python (we might hold that against him):


He points to Tristero which again is another distributed triple store:


There's also ERights, which is a very cool distributed security platform. Smart Contracts and interesting links to Austrian economics there!

DARPA did a security review, saying:
"The E capability architecture seems to be a promising way to stretch those borders beyond what was previously achievable, by making it easier to build security boundaries between mutually distrusting software components."

"As a pure-Java library, ELib provides for inter-process capability-secure distributed programming. Its cryptographic capability protocol enables mutually suspicious Java processes to cooperate safely, and its event-loop concurrency and message pipelining enable high performance deadlock free distributed pure-object computing."


Manage Without Them with 7 reasons to like Austrian Economics:


Monday, April 22, 2002


This is a neat little product built on Jena. It allows you not only to view and create RDF visually but also to query it.



Saturday, April 20, 2002

Parka-DB (again?)

This is probably the closest product I've seen to using an Open Source RDF DB.


The MindSwap.org web site has an interesting page listing their current status on using it to store, index and search.


They have a publicly available CVS repository:


Use a KIF/CycL like language to represent shared activities among autonomic agents with a first person MOO (Object Based Mud) system JaMUD to provide debugging and agent programming facilities.

This project delivers Java APIs, Ontologies and other Tools that you will need to use CycL, KIF (Knowledge Interchange Format) and soon enough DAML (DARPA Agent Markup Language) content into an episodic simulation inside a virtual world MOO.



Coming across this link reminded me that I hadn't put it up here. Basically, Guha (Mr Hotsauce, Aurora bar, RDF, RDFDB, etc.) got RDF working so that sites with different schemas can talk about the same object. The idea is to describe the object by its values rather than by its URL.
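
Here is a small sketch of what "describing the object by its values" could look like in practice. It is only my reading of the idea, not Guha's implementation: the two sites, their property names, the hand-written `SCHEMA_MAP` and the choice of identifying properties are all hypothetical.

```python
# Hypothetical records from two sites with different schemas and different URLs.
SITE_A = {
    "http://site-a.example/albums/42": {"artist": "Example Band", "title": "First Album"},
}
SITE_B = {
    "http://site-b.example/item?id=9": {
        "performer": "example band", "name": "First Album", "price": "9.99",
    },
}

# Assumed, hand-written mapping from each site's property names to shared ones.
SCHEMA_MAP = {
    "site_a": {"artist": "creator", "title": "title"},
    "site_b": {"performer": "creator", "name": "title", "price": "price"},
}

# The shared properties whose values, taken together, identify an object.
IDENTIFYING = ("creator", "title")


def normalise(record, mapping):
    """Rename a site's properties to the shared vocabulary, dropping unknown ones."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def description_key(record):
    """Build a key from the identifying values; assumes they are all present."""
    return tuple(record[p].strip().lower() for p in IDENTIFYING)


def merge_by_description(sources):
    """Group records by what they describe rather than by the URL they live at."""
    merged = {}
    for site, records in sources.items():
        for url, record in records.items():
            common = normalise(record, SCHEMA_MAP[site])
            entry = merged.setdefault(
                description_key(common), {"urls": [], "properties": {}}
            )
            entry["urls"].append(url)
            for prop, value in common.items():
                # Keep the first value seen for each shared property.
                entry["properties"].setdefault(prop, value)
    return merged
```

With `merge_by_description({"site_a": SITE_A, "site_b": SITE_B})` both URLs end up under the single key `("example band", "first album")`, and the price from the second site is attached to the same object even though the first site never mentioned it.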


Friday, April 19, 2002

Mozilla RC 1 Released

This is a 2 year old, satirical, pre-Slashdot posted announcement.

At approximately 6.53am QLD Standard Time Mozilla 1.0RC1 was released.

Ignorant Mozilla developers at the time were reported to say, "Ree-lee-se?". Even the Project's most ardent supporters had to admit the possibility that their bouncing baby browser now more resembles a massive, festering cyst.

At the very least, the Mozilla Project has given the world a pretty good picture of what caffeine poisoning looks like.

This release has fixed lots of bugs and is way more stable and has lots of neat features like LDAP using RDF over XUL/SOAP.

Bill Joy, Sun's Chief Scientist, said that most of the bugs that were fixed were bound to have been fixed outside of the organisation, which explained why Sun keeps releasing the same version of Java 1.4 (well, twice) hoping that someone outside Sun will fix the bugs.


They did fix the font corruption:

Some of the original satire:

Thursday, April 18, 2002

JBoss RC 1 Released

About 2-3 days ago the Open Source J2EE-like server, JBoss 3.0, got bumped to RC1.


Tuesday, April 16, 2002

Java 1.5 to be Open Sourced?

But in all likelihood, it will be 2003 before these open source VMs begin to emerge. That's when Java 2 Platform, Standard Edition (J2SE) 1.5, a.k.a "Tiger," is expected; Sun has pledged it will release Tiger under open source-compatible terms. Though IBM and HP have no comment on whether or not they plan to implement open source JVMs, BeUnited, the open source BeOS standards group, has been following the Apache situation and has its eye on J2SE 1.5. BeUnited President Simon Gauvin says he has been in touch with the Sun engineer in charge of the upcoming Tiger release about the changes. He adds that if his group were to develop an open source JVM, then the changes made to J2SE "would matter a great deal to BeUnited."


Monday, April 15, 2002

As part of the J2EE specification, Sun is developing JSR-40, the Java Metadata Interface (JMI). It looks like Oracle and Hyperion (they are apparently a big company in BI (business intelligence) and KM (knowledge management) http://www.kmworld.com/100.cfm) are involved, for OLAP and Data Mining.

The JMI specification defines a dynamic, platform-neutral infrastructure that enables the creation, storage, access, discovery, and exchange of metadata using Java interfaces. JMI is based on the Meta Object Facility (MOF) specification from the Object Management Group (OMG), an industry-endorsed standard for metadata management.


JSR 40 (JMI):

JSR 73 (Data Mining API):

Saturday, April 13, 2002

Three ways that .NET languages are compiled

* The economy JITer represents the bare minimum functionality needed to run a .NET application. It directly replaces each MSIL instruction with equivalent native code, doing no optimization, thereby consuming less overhead. It's meant for use on platforms where memory resources are at a premium.

* On the other hand, the normal JITer, which is the default runtime configuration, can perform quite a few on-the-fly optimizations to the code it produces. This gives .NET an advantage over a traditional precompiled language, which can't make anything but fairly gross assumptions about the platform its emitted code will be run on.

* Microsoft provides what is known by the somewhat redundant name of a Pre-JIT compiler (otherwise known as the Native Image Generator, hence the name Ngen.exe).


Friday, April 12, 2002

Another RDF server and Query Engine

Sesame is a scalable, modular, architecture for persistent storage and querying of RDF and RDF Schema, using an RDF Schema Query Language (RQL).

It also seems to support security as well as ontologies.

And what's more, it's available on SourceForge:

The online demo is here:

Business 2.0 on Knowledge Management

"You'd think that most employees understand how their companies make money. You'd think wrong. Therein lie both an enormous problem and an opportunity -- but let me tell you how I got there."

"A clear business model -- "business model" meaning "how we make money." A strategy to implement that model. And knowledge management supporting it as tightly as a Mafia lawyer."

The article lists general favourites like GE. Knowledge management seems like a self-fulfilling prophecy: "how do you make money?" Sell consulting, software, services, etc. to teach people how to do knowledge management.

More Metadata Than Data?

I was reading this from David's Whitepaper:

"Tucana works by representing large amounts of information with a (typically) smaller amount of metadata. In the case of a word-processing document or an electronic mail message, for example, metadata might include the author, the recipients, the subject, keywords, concepts addressed, people named, dates or places mentioned, etc. Metadata is stored in a lingua franca to enable sharing across applications and geographic boundaries. Metadata is represented in the World Wide Web Consortium's Resource Description Framework (RDF)[6], an international standard for the representation of metadata. RDF is part of the W3C's Semantic Web project[7]."


Now, my feeling has always been that there will be more metadata than data and that trying to store it *all* is going to be very hard or impossible.

Even running just one tool over a document can produce more metadata than data, and then you run the rest of the tools over it. You could even have different runs over the same data, each of which takes up more space.

I even found evidence of this previously:
"Metadata itself isn't new: the Romans had it, and medieval legal manuscripts have more metadata than data."


In fact you can demonstrate this in an HTML page:

Hello world
<H1>Hello world</H1>

The tags actually take up more space than the actual content.
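
To put rough numbers on that, here is a minimal sketch that counts markup characters against text characters using Python's standard `html.parser`. The `PAGE` string is my own assumption of a bare-bones page wrapped around the heading; it is not copied from anywhere.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Counts the characters of text content seen between tags."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0

    def handle_data(self, data):
        self.text_chars += len(data)


def markup_vs_content(html):
    """Return (markup_chars, content_chars) for an HTML string."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return len(html) - parser.text_chars, parser.text_chars


# The heading on its own, and an assumed minimal page around it.
FRAGMENT = "<H1>Hello world</H1>"
PAGE = (
    "<html><head><title>Hello world</title></head>"
    "<body><h1>Hello world</h1></body></html>"
)
```

On its own the heading is close to break-even: `markup_vs_content(FRAGMENT)` gives 9 characters of tags against 11 of content. Once the heading sits inside even this minimal page, `markup_vs_content(PAGE)` gives 63 characters of markup against 22 of content, so the markup is nearly three times the size of the text it describes.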

Inktomi and Interwoven Combine Products

As part of their strategy to deliver a completely integrated content management categorization and retrieval solution to enterprises, Interwoven will integrate search technology from Inktomi into its content management platform and Inktomi will integrate content intelligence technology from Interwoven into its enterprise search platform.

The two companies currently offer an integrated solution that enables joint customers to immediately index and search content published by Interwoven TeamSite and MetaTagger via Inktomi Enterprise Search. As part of the new agreement, Interwoven will integrate XML search technology from Inktomi into its enterprise content management platform to expand search functionality within the TeamSite environment. In addition, Inktomi will integrate new MetaTagger categorization technology from Interwoven into its enterprise search platform for an integrated categorization and search solution that provides users with a complete range of information retrieval tools.

The two companies report that key benefits of the expanded relationship include:

* increased knowledge worker productivity,

* consistent view of content across network, and

* reduced content redundancy.


Interwoven's product is called MetaTagger, which includes these features:

* Taxonomy-Driven Categorization
* Automated Taxonomy Discovery and MetaSource Visual Vocabulary Builder
* Summarization
* Keyword Generation
* Business Rules Engine
* and others


Wednesday, April 10, 2002

KAON - The Karlsruhe Ontology and Semantic Web Tool Suite

Ontomat is a framework which supports plugins for creating metadata tools. The current set of tools supports annotation, ontology and metadata editing, database adapters and others. It uses an idea called CREAM for annotation of data. The REVERSE package has been designed to integrate databases into the Semantic Web. The KAON server runs on the JBoss application server with a Hypersonic or DB2 backend.

CREAM Paper:

The server and other tools are available at:

Sunday, April 07, 2002

Soapy Google and Dream a little Email Client

Google has a SOAP interface:


Dream of an Email Client:

Altivec Difference

An O'Reilly article about how to take advantage of the Altivec extensions in C.


Friday, April 05, 2002

Global Morality by DEMOS

"There are three important characteristics of new wars. One is that new
wars are fought not by States, but by networks...Now a second
characteristic of these new wars which is very important is the fact
that instead of being wars in which you mobilise people to achieve
military objectives like capturing territory, the point of the violence
is political mobilisation...The third characteristic, which I won't go
into, is the link with the criminal economy, I don't really want to talk
about that now. What I want to make clear is that the implications of
these characteristics is that these are wars that are profoundly
difficult to end, because the power of the networks depends both on
sustaining fear and hate, their ideology depends on sustaining fear and
hate, and also their economic sources."

"It's a strictly market view of the world that decides that men and
women are equal. It's a non-market hierarchical, militarily oriented
view of the world that decides that men are better than women, largely
because of their superior capacity for violence."

"On the question of September 11th, I think what's very interesting
about September 11th is the way in which the Americans have managed to
lose the moral high ground actually quite quickly."

"What we've seen since then is essentially unilateralist action returned
to American unilateralist action, the kind that we've seen before with
Tony Blair providing a kind of figleaf of respectability and a few bombs
and soldiers. I think that's fantastically depressing. I was writing
about the Taliban from 1996 onwards at a time when people were not very
interested in them at all, and it was always clear that the Taliban was
not an indigenous organisation or indeed an organisation that could
sustain itself without support from Pakistan."


DEMOS web site:

Thursday, April 04, 2002

The Best Networks are Dumb - Are Private Telecoms Doomed?

" "Of all the winning networked applications of the last decade--e-mail, Web browsing, instant messaging, chat, music sharing, streaming audio, e-commerce, etc.--every one appeared on the Internet. Not one was invented by a telephone company. And not one needed any special mechanism [or intelligence] within the network itself..."

"With "the paradox of the best network," however, Isenberg dashes all these hopes. Although the best network is still a dumb optical network, a dumb network is "the hardest one to make money running." Optics will bankrupt you. The phone companies were right to resist. Not only can they not make money on an open network that does more to empower the users than the owners, but no one can."


The original web site:

Sash and SashXB

Using XML, HTML and Javascript you can create applications on Windows (Sash) or Linux (SashXB). The SashXB project uses Mozilla and Gnome. The checkers demo application is even cross platform. While cross-platform support is not a goal of Sash, if you stick with HTML and other cross-platform extensions (Jabber) you can achieve it.

Wired article:

Sash Web site:

SashXB Web site:

Wednesday, April 03, 2002

Java Creator Uses OS X (I bet he's got a beta of 1.4)

Q: But you said you're shifting to Mac.

A: Yeah. The thing that kind of broke it for me is that I needed a new laptop, and ... Apple switched to OS X ... it's become a really incredible desktop machine.

Binary XML (BiM)

As part of the MPEG 7 standard, a binary XML format has been developed. It provides all of XML's features but is between 10 and 30 times faster (with up to 98% compression) than a C XML parser.


NBIO for Java 1.3

While I read about this a while ago, I wonder if someone will port this to OS X. It provides NIO support for most Unixes and Windows 2000.