Saturday, August 31, 2002

x86 Macs Confirmed

"Sources said more than a dozen software engineers are tasked to Marklar, and the company's mainstream Mac OS X team is regularly asked to modify code to address bugs that crop up when compiling the OS for x86. Build numbers keep pace with those of their pre-release PowerPC counterparts; for example, Apple is internally running a complete, x86-compatible version of Jaguar, a k a Mac OS X 10.2, which shipped last week. "

"The likeliest solution to the Motorola impasse, sources said: A desktop version of the 64-bit Power4 server chip in the works from IBM, which co-developed the PowerPC platform alongside Motorola and Apple and has provided CPUs for a variety of Macs. Sources told eWEEK that Apple and IBM are collaborating closely to equip the Power4 with the Altivec vector-processing capabilities built into the PowerPC G4. IBM is expected to discuss its new CPU at October's Microprocessor Forum. ",3959,496270,00.asp

Friday, August 30, 2002

Do They Know We Know They Know Nothing?

Here's the flip side to all of this knowledge sharing and semantic web stuff.

"There are 5 foundations on which the Knowledge Enhanced Public Sector rests...Knowing what you know...Knowing how you know...Knowing who knows...Knowing where you know...Knowing why you know."

"'Knowing why you know' - clear and visible comprehension and exploitation of the human and structural interactions and cross-over points which underpin the creation, development, use, frustration and interplay of an organisation's information, knowledge and expertise – internally and with stakeholders. "

'Knowing when you know' - The timeliness of process centric knowledge creation and adaptive integration of system infrastructure requires very immediate sensitivity. Or perhaps it is what certain people lack and so end up wasting people's time.

This is merely a collection of meaningless statements and jargon. Someone gets paid to write this rubbish.

KAON 1.2

This was released on the 20th but has been updated today. They implement their RDF server on top of SQL databases and offer features like transactions, views, text mining and ontology maintenance. I think their web site is nicely put together. It uses the mysterious KAON CMS.

They've apparently been working on performance:
"For example, the server has been tested with an ontology of 100,000 concepts, 700,000 instances and 200,000 properties. Operations such as loading and changing the ontology now take only a couple of seconds.

The core of the engineering server forms an optimized database schema, and extensions of KAON API allowing bulk-loading of multiple objects in one call."

Caffeinated and Cancer Free

``Mornings Have Never Been So Invigorating

Tired of waking up and having to wait for your morning java to brew? Are you one of those groggy early morning types that just needs that extra kick? Know any programmers who dont regularly bathe and need some special motivation? Introducing Shower Shock, the caffeinated soap from ThinkGeek... ''

And if you have a mouse at home you can lather him up and he will have less cancer:

""We had between 50% to 70% tumor formation inhibition in the mice that were treated with caffeine or with EGCG (the other chemical compound)," said Conney, senior author of a study appearing this week in the online site of the Proceedings of the National Academy of Sciences."

Wednesday, August 28, 2002

Hyperlinks That Last All Summer Long

How do you stop this from happening?

FTrain had an article about it recently:
"I have a way to solve the problem. Or rather, I will now dictate my ideas to an uncaring world, for the fun of it. Here's what happens: someone sets up a small organization that assigns permanent URLS to every major news event. The URLS look like this:, or Simple stuff. The URLs don't have to be incredibly granular or complicated."

"Then, whenever anyone publishes a story on a topic to the Web, they include a bit of metadata in their web page indicating that this page is covering a Newspurl-identified story. "

This is similar to "Robust Hyperlinks", where you simply add a "...five or so word content-based lexical signature to make a Robust Hyperlink. When the URL's address-based portion breaks, the signature is fed into any web search engine to find the new site of the page." You tried to escape Google, that was your mistake!
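
A rough sketch of how such a lexical signature might be computed, in Python. The scoring (term frequency weighted by how rare a word is in a small reference corpus), the function names and the `lexical-signature` query parameter name are all assumptions made for illustration, not details taken from the Robust Hyperlinks work.

```python
import math
import re
from collections import Counter
from urllib.parse import quote_plus


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(page_text, corpus, size=5):
    """Pick the `size` words that best distinguish the page from the corpus."""
    term_counts = Counter(tokenize(page_text))
    doc_sets = [set(tokenize(doc)) for doc in corpus]

    def score(word):
        # Words that are rare in the corpus get a higher weight (TF-IDF style).
        containing = sum(1 for words in doc_sets if word in words)
        idf = math.log((1 + len(doc_sets)) / (1 + containing))
        return term_counts[word] * idf

    # Sort by score, breaking ties alphabetically so the result is stable.
    ranked = sorted(term_counts, key=lambda w: (-score(w), w))
    return ranked[:size]


def robust_url(url, signature):
    """Append the signature to the URL as a query parameter (hypothetical name)."""
    separator = "&" if "?" in url else "?"
    return url + separator + "lexical-signature=" + quote_plus(" ".join(signature))
```

If the address part of such a link breaks, the words after `lexical-signature=` are what you would hand to a search engine to try to find the page again.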

Something to integrate into Mozilla:

Good to see he liked Flowers for Algernon.

Sun to Integrate Solaris with Linux

"To accomplish that transition, Sun is doing more than acknowledging that MIPS are a commodity. According to Gingell, even the operating system is a commodity. From his point of view, neither SPARC nor Solaris are differentiators. Solaris doesn't have a lot of secret sauce, just good code, Gingell said. Instead, Gingell wants to see the parts of Solaris that can't be found in Linux open-sourced and then merged with Linux. He calls the result "Linux by Solaris.""

More support for the software and services strategy. So maybe JBoss and Sun are competing with each other more than they think.

How Many Switchers?

"But according to an Apr. 22 research note from veteran Apple watcher Charles Wolf of Needham & Co., Apple has been converting only 0.9% of non-Mac visitors into Mac customers. (Wolf owns Apple, and Needham makes a market in its shares.) It would take only a 2% conversion rate to boost Apple's market share in the home market by 50%, Wolf believes."

Lisa: [sigh] Well, I guess you can't beat big business. There's just no room for the little guy.
Lisa: [the doll] Trust in yourself, and you can achieve anything! [another girl plays with Lisa Lionheart and smiles]
Lisa: You know, if we get through to just that one little girl, it'll all be worth it!
Stacy: Yes. Particularly if that little girl happens to pay $46,000 for that doll.
Lisa: What?
Stacy: Oh, nothing.

The Gateway ads sound good too. "Charles R. Wolf, an analyst with Needham & Co. who owns shares in Dell, Gateway, and Apple, called Gateway's campaign "the height of stupidity" for giving free publicity to Apple's product. He also questioned why Gateway would want to defend the cadre of Windows-based PC makers against Apple, instead of attacking Dell." You can't buy style! :-)

How to Make Money as a Product Company?

"McNealy argues that Open Source is threatening licensing revenue needed to finance J2EE's advertising and R&D. Well, with a rumored $42B in the bank, Microsoft will ALWAYS outspend Sun on marketing. Good marketing starts with a good product; don't assume developers are dumb. JBoss has spent $0, I repeat "ZERO" dollars, on marketing and manages to get more downloads than Sun's own J2EE Reference Implementation."

I haven't seen Sun doing much selling of the RI anyway. Why would they? It doesn't make them any money, unlike Sun ONE. With JBoss not spending any money on marketing, it's really only "selling" to a select range of developers. You need the marketing types to convince the manager types that this is a technology that they can feel comfortable with. In my experience, managers were willing to spend money on .Net rather than J2EE (free or otherwise) by sheer weight of marketing.

"Our solution to financing development is to pursue the services route. J2EE is a very services intensive market. Those who know how to take advantage of this are sitting on a moneymaking machine. We certainly aren't the only ones to come to this realization. It is also IBM's take on J2EE, where they often discount software licenses if they can make money on services, or, like Sun, on hardware. At JBoss Group, we bill our services at expert rates. We understand Open Source, we understand remote networking and we understand the code because we wrote it. "

"All Free Software really destroys is licensing revenue. This is economically sound since the cost of production is marginal and the cost of re-production almost nil. So that leaves us with hardware revenue. Since there is no free hardware, it seems that is safe."

Hardware revenue is not safe, especially from the likes of Dell, IBM, HP/Compaq or even Apple. Sun sees its way out with software and JBoss then gives it away for free. At least Apple can still charge US$129.

Tuesday, August 27, 2002

Screensaver as Background

One reason why Apple will continue to make further hardware sales:

/System/Library/Frameworks/ScreenSaver.framework/Resources/\ -background -module "ScreenSaverName"

Monday, August 26, 2002

RDF Storage Survey

This report is part of SWAD-Europe Work package 10: Tools for Scalability and Storage and addresses the scope, features and purpose of developer tools for providing semantic web data storage using existing systems that are licensed as Free Software or Open Source.

Mentions: 4Suite, EOR, Haystack, Inkling, Jena, KAON, Parka DB, rdfDB, Python RDFLib, RDFStore, RDFSuite, Redfoot, Redland, Sesame and Edutella.

Mentions large datasets available and Network APIs.

"How many triples can these systems store?

(Is this even a sensible question to ask?) Yes, although the stores may not necessarily express the data in triples. There have been reports in Survey of RDF/Triple Data Stores[TRIPSURVEY] of several million 1.5M (RDFStore), 6M (RDF Suite), 20M (rdfDB), 1.5M (Redland), 300K+ (Sesame), 800K+ (Jena)."

20 million is not that impressive. Storage of billions of triples is more reasonable. There's also the fact that it is a simpler structure, so storing things like data types definitely requires more space (depending on whether you use statements of course).
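
A back-of-envelope sketch of why 20 million is small. It assumes each node in a triple is stored as a 64-bit identifier and ignores indexes and the string table entirely; that layout is an assumption for the arithmetic, not a description of how any of the surveyed stores work.

```python
BYTES_PER_NODE_ID = 8   # assumption: each node is a 64-bit identifier
NODES_PER_TRIPLE = 3    # subject, predicate, object


def raw_triple_bytes(triple_count):
    """Bytes needed for the bare triples, ignoring indexes and the string table."""
    return triple_count * NODES_PER_TRIPLE * BYTES_PER_NODE_ID


for label, count in [("20 million", 20_000_000), ("1 billion", 1_000_000_000)]:
    megabytes = raw_triple_bytes(count) / 1_000_000
    print(f"{label} triples: about {megabytes:,.0f} MB")
```

Under those assumptions 20 million triples is roughly 480 MB of raw identifiers and a billion is roughly 24 GB, before any indexing overhead.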

Effect of Removing Open Source

"In fact, a July 2002 report for the Department of Defense that really looked at what was going on in DOD found just that. The question the report was commissioned to answer was "What would happen if open source was banned from DOD?" The conclusion was that security would be greatly reduced, costs would rise, and overall efficiency would drop. It said banning open source would be a disaster, so the report ended up recommending that far from banning open source in the DOD, it should be officially recognized, endorsed and encouraged."

Knowledge Navigator

There's a video at UMBC Agents on the Apple Knowledge Navigator. Like the paper below, the most important thing about this movie isn't so much the technology (anymore anyway) but the fact that the professor is interacting with one of the top experts in the field and that they can do it so readily. Apparently, this caused a lot of press at the time (1987) and John Sculley wrote an article about his vision of the future (see page 5 - did he predict OS X?).

Sharing Knowledge

"The macro view: quantify the intangible assets of an organisation by using tools such as the Balanced Scorecard, score boards, indexes and ‘navigators’...The micro view: How can the impact of single knowledge projects be assessed and quantified?"

"It emerged that quantitative measures can be actually very limited in ‘measuring’ knowledge processes. For instance, system usage is very easy to measure but there is no guarantee that this will actually result in individual or business performance. The measure is too indirect."

"As knowledge sharing is a social activity, it crucially relies on how people behave towards their colleagues, bosses, customers and suppliers. In turn, behaviour can be informed either by culture or incentives."

Their current research asks whether senior management is committed to knowledge management.

Saturday, August 24, 2002

Mini ITX

The VIA EPIA motherboard is pretty cool. Measuring only 17cm square, it's got FireWire, USB, video and sound all integrated. It seems to do an alright job playing DVDs, DivX, MP3s and being a file server. The CPU is either fanless at 533MHz ($AU214) or with a quiet fan at 800MHz ($AU245).

A good review by Tech PC:

It looks like the return of the Mac cube in PC form with versions including an old Sun IPX, grey metallic ones and the ice cube. When will someone do the inevitable sphere?

Friday, August 23, 2002

OS X 10.2

Well I've got it, installed it (yay for me!). It's as they all are saying: faster, better, with an easy-to-administer firewall and IPv6. They do internationalization properly, allowing the full range of Unicode characters (up to 21 bits), so finally people can do Egyptian hieroglyphs, Mayan, Linear-B or others. They still haven't got rid of the little flag to indicate the keyboard layout. It's right in the centre of the screen with all of those stupid flag colours. You almost never change it and it's right there front and centre. We're not all patriotic Americans (or Australians) you know!

The World

The data that we used (36 million triples) were converted from:

Semio Bought Out

KMWorld news reports:

"Webversa reports it has purchased Semio in a move said to accelerate expansion of its technology platform and expand its customer base. Webversa sells its voice-to-enterprise (V2E) access and alerting software to private sector and government customers through strategic professional services partners as well as directly. Combined with Webversa’s real-time alerting and multimodal interactive access technology, the new Webversa/Semio suite’s robotic sensing and alerting engine will provide enterprises with the ability to continuously monitor any data stream, revealing patterns of information activity that meet established criteria and notifying the appropriate individuals in real-time. In addition to the suite, Webversa will continue to support and enhance existing Semio and Webversa products in the marketplace."


"The experiments performed at JLAB generate immense quantities of data, which in turn require powerful computational resources. Every day, up to a terabyte of data is generated in experiments; JLAB's 12,000-slot StorageTek tape silo can hold a year's worth of raw, processed and simulation data, and a server farm of 175 dual-processor Linux machines processes the first-pass raw data from experiments."

"Scientists who wish to access the resources at JLAB may do so using a Web portal, called the Hadron Lattice Portal (HLP), a name chosen to invoke the image of a lattice (as opposed to hierarchical structuring) of interconnected computing resources. The HLP provides access to Linux clusters and data storage resources at JLAB, as well as several other institutions participating in the Hadron Lattice Physics Collaboration."

THE WEB SERVICES REPORT: Disseminating Megadata

Thursday, August 22, 2002

New Version of TKS

We released a new version of TKS today. It includes much better documentation on doing searches across metadata and free text indices. There was also some optimization based on our experience with "The World". This is a huge sample of RDF, bigger than dmoz, of all of the places in the world with their details like latitude and longitude. All of this data caused little degradation in its speed. You could also do lots of fun queries like finding all of the places in Queensland or on the same longitude, etc.
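
To give a flavour of those queries, here is a toy sketch over an in-memory list of triples. It is not TKS's query language or API; the predicate names, the helper functions and the handful of places (with longitudes rounded to whole degrees) are made up for illustration.

```python
# A tiny in-memory triple list; values are rounded and purely illustrative.
TRIPLES = [
    ("Brisbane", "locatedIn", "Queensland"),
    ("Brisbane", "longitude", 153),
    ("Townsville", "locatedIn", "Queensland"),
    ("Townsville", "longitude", 147),
    ("Hobart", "locatedIn", "Tasmania"),
    ("Hobart", "longitude", 147),
]


def match(subject=None, predicate=None, obj=None):
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def places_in(region):
    """All subjects that are located in the given region."""
    return sorted(s for (s, _, _) in match(predicate="locatedIn", obj=region))


def same_longitude_as(place):
    """All other places sharing a longitude value with `place`."""
    others = set()
    for (_, _, lon) in match(subject=place, predicate="longitude"):
        for (s, _, _) in match(predicate="longitude", obj=lon):
            if s != place:
                others.add(s)
    return sorted(others)
```

With this sample data, `places_in("Queensland")` gives Brisbane and Townsville, and `same_longitude_as("Townsville")` gives Hobart. A real store would answer the same patterns from indexes instead of scanning a list.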

Open Source Directory Change

The web site has changed its method of listing open source products.

From the email they sent:
"Anyhoo, the new concept is based on hearing from the users of apps. After speaking with many many many folks, the general feeling is that you and us claiming that your products are stable are nice & all but people want to know what the users think, and that people get others excited about trying new things not a list of data. So we're putting that front and center. While we're at it we want to hear why they like them, aside from them being stable."

As long as the developer gets the feedback and a right of response that sounds fine to me.

Shiny Things

Porn, spam, get rich quick, miracle cures - all shiny things.

"When a new text is read for the first time on the Rhizome website, it appears on StarryNight as a dim star.

Each time a text gets read again—by any Internet user around the world—the corresponding star gets a bit brighter. Over time, the page comes to resemble a starry night sky, with bright stars corresponding to the most popular texts in the database, and dim stars corresponding to less-popular ones."

The only problem I had with the user interface was that I was moving the mouse down to scroll the text instead of up.

Wednesday, August 21, 2002

Sen:te Software

I've been using CVL which is a great CVS client for OS X. And a friend of mine has been raving about this GO client. Made by the same company, anyway...

Libraries and the Semantic Web

"The Scorpion database consists of an unordered set of concepts that are useful for classifying documents. Each concept is defined in a database record. Following the familiar vector-space model of information retrieval (Salton and McGill 1988), documents are submitted as queries against a database. The query returns a list of database records that contain terms found in the document, ranked in order of their similarity to the document. The result can be interpreted as a prioritized list of concepts that roughly characterize the content of the document.

In our studies, we have created Scorpion databases from two library classification schemes: the Dewey Decimal Classification (DDC) and the Library of Congress Classification (LCC). We include a test database derived from a portion of the LCC in this installation and refer to it here to illustrate a simple Scorpion database design."

What you wouldn't use is something like WordNet, as the concepts must be distinct. It relies on a Pears database and Gwen, the search engine. Pears is a database designed for storage of hierarchically structured data. I think the Dewey Decimal system is fine when a book has to be put in one physical location, but if you can describe something as being 90% about one area and 50% about another, that would provide a much better search system. Networks can flatten out to hierarchies; you just need to pick a starting point.
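
A minimal sketch of the vector-space idea in the quote: the document is treated as a query against a set of concept records and the concepts come back ranked by similarity. The three concept records are hypothetical, and plain term counts with cosine similarity are an assumption here, not Scorpion's actual weighting.

```python
import math
import re
from collections import Counter

# Hypothetical concept records: a class label plus a few descriptive terms.
CONCEPTS = {
    "Astronomy": "stars planets telescope galaxy orbit",
    "Cookery": "recipe oven bake flour sugar",
    "Computing": "software program database network code",
}


def vector(text):
    """Bag-of-words term-frequency vector for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(document):
    """Rank every matching concept by similarity to the document, best first."""
    doc_vec = vector(document)
    scores = [(cosine(doc_vec, vector(terms)), name) for name, terms in CONCEPTS.items()]
    return [(name, score) for score, name in sorted(scores, reverse=True) if score > 0]
```

The result is a list of (concept, score) pairs, not a single shelf location, which is the "partly about this, partly about that" behaviour that a physical classification can't give you.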

It seems that libraries are crying out for metadata extraction and storage programs. Their current process is strictly done by hand or by poor desktop tools! The National Library of Australia even gets a mention: "The work at NLA developed a practical model for dealing with the immediate threat of disappearing digital objects, and established a workable distributed archive. Similarly, a number of projects and researches - such as OAIS (Open Archival Information System), CEDARS (CURL Exemplars in Digital Archives), NEDLIB (Networked European Deposit Library), and others - have investigated options for dealing with long-term preservation challenges." The New Zealand Digital Library understands how to build a digital library with tools that infer compositional hierarchies or extract the most relevant words and phrases using a modified Bayesian approach. I have both Managing Gigabytes and Data Mining.

Game Over

"Schwartz's deal is to ride the Linux magic bus straight through IBM's DB2 and WebSphere revenue, and take a slice out of Microsoft along the way. "We will go drive Linux like a wedge. We will go hollow out DB2 and WebSphere with either a free database or a free app server," Schwartz recites. "We can give away the Apache Web server, Postgres, a free directory, a free messaging system, a free calendar and free portal, a free infrastructure. All of it for free for 100 users, and we can still make money. Microsoft can't."

Schwartz is on a roll. "So what happens to Dell now? Where are they going to go get [the software] from? They have to pay Oracle, they have to pay Red Hat, they have to pay somebody, so they are going to get squeezed." And here's Johnny with the punch line: "We can make life really hard for Microsoft by giving away the software. Dell has got to find that software from somewhere and, believe me, when they come to me and ask for it, it ain't going to be free."

"But my, isn't it interesting that the largest single open-source activity is StarOffice, followed closely by Mozilla?" Schwartz arches an eyebrow. "Wouldn't it be interesting if something interesting happened with Java, too?"

Open sourcing the Java desktop stack would certainly be a logical next step, and stitching together Jini and Jxta would present the Liberty Alliance with a HailStorm doppelganger."

Bunch of Stuff

Crystal Reports 9 now provides an SDK that fully integrates into JSP, Servlet, and EJB applications running on WebSphere or WebLogic. This is a fairly big deal because that's all some people do with databases during their work day.

Eric Raymond has an implementation of a spam filter using Bayesian techniques - hope it's better than FetchMail ;-).

The Semantic Web is as easy as 1-2-3. Another introduction to RDF - which is not complete or anything but nice from another person's perspective.

There's also an interview with David Ascher who's been working on ActiveState's Komodo. Talking about how great Mozilla and XUL are.

Tuesday, August 20, 2002

Rip, Extract, Query

Good to see another company extracting structured and unstructured metadata and storing it specifically for metadata querying.

"As well as leading our own in-house research, CognIT is a partner in a variety of international co-operation projects, such as the €2.5M EU OnToKnowledge project. CognIT’s exclusive Mímír engine forms the core technology for the OnToKnowledge project, which aims to develop tools and methods for supporting knowledge management relying on shareable and reusable knowledge ontologies. "

Monday, August 19, 2002

Whatever Happened to Jini?

"In the world of XML-based Web services and SOAP, you would process the document upon receipt, possibly solving the equation. Imagine, however, what happens when you receive such a document but do not know how to solve differential equations. Even if I place a special note on the document asking that you, please, solve the formula before displaying it, you can't do that without knowing how. In the mobile code world, I would not only send you the differential equation, but also the instructions needed to process it. You could load those instructions into your mental "virtual machine" and solve the formula. Thus, although you lack prior capabilities to process a piece of information that came from the network, the network provided you with the solution—the instruction codes—needed for that processing."

An interesting comparison on the activity and developer size of Jini vs JXTA and companies using Jini over J2EE:

"In effect, XML is the answer, not Java. At least it is the answer chosen by users. The short history of computer networking tells us that heterogeneity is best served by platform-neutral communication protocols (such as TCP/IP and HTTP). Jini does just the opposite. It embraces a diversity of protocols, allowing each service to have its own system of communication, so long as it can supply a Java proxy object that understands this system to a client. Rather than make the communication system the locus of commonality, Jini makes the computational system, Java, the locus. I will bet that the JXTA peer-to-peer project (, which defines platform-neutral communication protocols, will eclipse Jini as a vehicle for distributed computing, simply on that account, as Web services already have."

An interview with Rob Gingell, Sun Microsystems' chief engineer, fails to say much at all, except that maybe J2EE is the path of least resistance and that Jini is as successful as client-side Java.


From the "I can't believe it's not X-Windows":

"According to usability research firm Nielsen Norman Group, "Billions of dollars are wasted yearly in lost productivity as people wait for Web pages to perform duties that could be better handled by a 1984 Macintosh-style GUI application." Over time, Web pages -- JavaServer PagesTM technology and Active Server Pages -- have become the dominant user interface for Internet-based applications."

There's a fairly interesting world clock demo linked off the article's main page, which in turn links to The Mind Electric web site. I've seen that site before but can't remember the context exactly. It seems to have all of the features you could want in a distributed GUI environment, with low bandwidth usage. Although I'm not sure where they get 26.6Kbps from; I'm sure my modem was 28.8.

Original article from Sun's web site:

"It will also help bridge the gap between conventional GUI toolkit developers and web based GUI developers by providing familiar APIs for GUI components, component state, and for rendering and input processing. Comprehensive support for internationalization and basic input validation will ensure that developers include these features in the first release of their applications."

Available as JSR 127 (an impressive list of companies):

However, they are keeping themselves covered by supporting C++ and, in the future, .Net.

Sunday, August 18, 2002


Review of three products: Applied Semantics Inc.'s Auto-Categorizer 1.1, Interwoven Inc.'s MetaTagger 3.0 and Thunderstone Software LLC's Texis Categorizer 4.1.

Auto-Categorizer is a hierarchical categorizer which requires ontologies for specific content such as health and insurance. MetaTagger is integrated into TeamSite and comes with an extraordinary six-figure price. Texis Categorizer is a much cheaper solution and provides a manual selection of concepts that are relevant to a document. It requires a set of 20 highly relevant documents to use as a training set.

Office as a Web Service

"The ODK is a set of tools, libraries, jar files, header files and idl files which are necessary to develop components for the using the OpenOffice API and the component technology UNO (Universal Network Objects). Furthermore, the tarballs contain all below mentioned examples (C++, Java, and Basic), which demonstrate the UNO technology and the use of the API."

"In order to connect the following client programs to the running office server, before running those programs, you should invoke the office with the following command:
soffice "-accept=socket,host=localhost,port=8100;urp;StarOffice.ServiceManager"

"The following examples demonstrate how to benefit from the included word processor, spreadsheet, presentation software, graphics program, and database."

Documentation for the ODK.

Saturday, August 17, 2002

Hypermedia and the Semantic Web

"On the one hand, the Semantic Web infrastructure should enable several features commonly found in systems developed within the hypermedia community that are currently missing on the Web. On the other hand, the development of the currently emerging Semantic Web infrastructure could directly benefit from the models, systems and lessons learned within the hypermedia community.

Based on the articles mentioned above, we identified around 30 features that have been grouped into the eight categories..."

"Within Open Hypermedia Research, the user interface is part of the application's functionality and is usually more or less ignored. Within other hypermedia application domains, such as temporal hypermedia [36,37], spatial hypermedia [51] and taxonomic hypermedia [57], the presentation and interactive behavior of hypermedia structures is more complex than the typical button-like behavior of navigational links, and is often tightly intertwined with the underlying semantics of these structures."

Down with hierarchies!


"Jelly is a Java and XML based scripting and processing engine. Jelly can be used as a more flexible front end to Ant such as in the Maven project, as a testing framework such as JellyUnit, in an intgration or workflow system such as werkflow or as a page templating system inside engines like Cocoon."

Jelly has the advantage that it isn't tied to JSPs or Servlets. This allows you to use it for unit tests, Swing UIs, scripted build tasks or job scheduling. It also has XPath, Javascript and BSF support.

They use Maven which is a nice replacement for CruiseControl.

Friday, August 16, 2002

Semantic Grid

They plan to combine the grid computing idea with the semantic web: "...we see the emerging semantic web infrastructure as an infrastructure for grid computing applications."

Their homepage:

Their main paper (now 8 months old):

Grid Computing Framework:
"The Open Grid Services Architecture (OGSA) is a proposed evolution of the current Globus Toolkit towards a Grid system architecture based on an integration of Grid and Web services concepts and technologies. Initial proposed technical specifications have been developed by the Globus Project and IBM, and are being put forward at the Global Grid Forum for discussion, refinement, and (we hope) eventual standardization."


From a well respected member of the Semantic Web community.

1. You do not talk about Semantic Web Club
2. You do not talk about Semantic Web Club
3. When someone yells "Stop" or goes limp, or taps out, the fight is over.
4. Only two guys to a fight.
5. One fight at a time.
6. No shirts, no shoes.
7. Fights go on as long as they have to.
8. If this is your first night at Semantic Web Club, you have to fight.

Thursday, August 15, 2002

Goto Considered Cool

I swear, every time I look up something in my C# book it makes me want to throw up. Considering Dijkstra's recent passing I thought this was especially relevant.

"read in a value;
while value != Sentinel do begin
    process the value;
    read in a value
end

Unfortunately, this approach has two serious drawbacks. First, it requires two copies of the statements required to read the input value. Duplication of code presents a serious maintenance problem, because subsequent edits to one set of statements may not always be made to the other. The second drawback is that the order of operations in the loop is not the one that most people would expect."

Generally, the way around this is to have the "read in a value" inside the "while" line or if it's too big put it in a method and call that. And don't get me started on callbacks, no checked exceptions and pointers in C#.
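
A small sketch of that workaround, written in Python rather than C# or Pascal. The sentinel value and the function names are made up for the example, and the assignment expression in the `while` line needs Python 3.8 or later.

```python
SENTINEL = -1  # hypothetical end-of-input marker


def process_all(read_value, process):
    """Call `process` on each value until the sentinel is read."""
    # The read happens inside the loop condition, so it is written only once.
    while (value := read_value()) != SENTINEL:
        process(value)


# Usage: feed values from a list and collect what was processed.
inputs = iter([3, 1, 4, SENTINEL, 9])
seen = []
process_all(lambda: next(inputs), seen.append)
print(seen)  # prints [3, 1, 4]
```

The read appears exactly once and the loop body reads in the expected order, which addresses both drawbacks from the quote without a goto or a mid-loop exit.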

Thinlet and RSS

I was looking at Thinlet last week. I didn't write it up as I was still a little peeved about my experiences with XWT. Thinlets use a nifty scheme of XML to describe the presentation and Java to do any logic. The demo was a little flawed under Linux with some event problems but it was generally very impressive in size (26KB) and features. It is just one giant file with lots of non-OO code in it. It's written for Java 1.1 and is designed to support light Java clients such as PDAs as well as browsers. Something that another toolkit has been aiming at too. One of the things that did annoy me about XWT was the scripting side of things. Being a lazy programmer you sometimes get dependent on the compiler to pick up some errors (like variable misspellings).

AWT is thread safe and there are differences across platforms between Windows and Linux, for example. Debugging threading issues is a problem; debugging OS dependent issues is doubly unfun. In the article about what to throw away in Java, AWT was one of the candidates. And under OS X, AWT objects are not hardware accelerated and they suggest you use Swing.

The use of XML to describe user interfaces is something that I think Java developers deserve. Just not this. For example, an RSS feed viewer in an hour. Then a friend of his wrote a resource editor.

Tuesday, August 13, 2002

Morbid Fascination

Seems that a couple of odd web pages have come my way today. The first one was a list of all of the characters, guest stars, voice actors etc. that have died and appeared on the Simpsons. It includes people like Phil Hartman, Doris Grau, Steve Allen, etc. You can now date the episode based on which characters speak or not or the guest star that appeared.

The next is the Top Earning Dead Stars including Elvis at number 1 and J.R.R. Tolkien and Robert Ludlum. It's cool to think that you can earn money after you're dead - or even earn more money than when alive. Much like a deceased humour/sci-fi author who continued to sell Mac hardware and software (the movie still works BTW) well after he was dead.

Of course, this couldn't be a 2002 post about death without the inevitable September 11th reference. Amazing and ghoulish.

Groove Debate

I think this is just fantastic. Ray Ozzie and Dan Gillmor duking it out over Groove and Windows. Personally, I wasn't that impressed that it was seemingly tied to Windows and to Outlook when I saw it earlier this year.

Dan says:
"Groove, is becoming almost part of Windows -- and will be, at best, an afterthought (my interpretation, not Ray's) on other platforms. There's a disconnect here, I believe"

"Many of Groove's customers like the control-freak stuff. Corporations want to lock down their PCs in some ways, and Microsoft will be helpful to them in this effort. It may also be simpler, and more cost effective for Groove as a company, to turn the product into nothing more than an admittedly dynamite feature of Windows. It's not, I believe, in the ultimate interest of Groove's users -- not all of them, at any rate."
Dan's Column

"I can tell you that I will do what it takes to ensure that Groove has a chance to have a very broad, ubiquitous impact, that there are many potential users and customers out there with varying platform and feature needs, that I'm very pragmatic in achieving desired outcomes, and that the only "grand plan" is to create something of substantive value for customers."

Of course, Ray has fueled the idea of further Groove and Windows/Office integration and Dan is reacting to Microsoft's Palladium plan. Ozzie giveth and Bill taketh away.

Free (as in Freedom) Storage

Hmm, petabytes.

"The cluster file system helps database administrators more easily manage the file system on Oracle 9i RAC clustering technology. It offers, for instance, a graphical tool for managing a complete disk farm as one file system, said Robert Shimp, vice president of Oracle 9i database marketing.

"Oracle is releasing the source code in Linux to help increase the adoption in the high-end enterprise market," Shimp said. "This is part of an ongoing series of efforts by Oracle to help build up Linux in the market place."

Monday, August 12, 2002


"ABYSML is a markup language that puts comprehensibility of the input text above convenience for the language processor. In this respect it is very different from XML, which is both dreadfully difficult to read, and yet a tremendous pain in the ass to parse; XML brings to mind H.L. Mencken's concept of "the libido for the ugly". Well, ABYSML is beautiful and simple and I guarantee you won't give old H.L. a second thought."

Example of the code:
NAME: Dog-Problem
NAME: light-on
OUTCOME: false
PROPERTY: 'position = (73, 165)'

(Available from the RISO site).

Sunday, August 11, 2002

Death Of EMail

"With Notes essentially cloned and the original Ozzie team resurfaced at Groove Networks, Microsoft shifted its attention to XML Web Services. With Microsoft's investment in Groove, collaboration R&D is now focused on the intersection of Groove's decentralized peer-to-peer model and Microsoft's centralized STS (SharePoint Team Services)."

InfoWorld article

Ray Ozzie on collaboration:

"We spent years and years at Lotus trying to convince people of the "higher order" value of collaborative processes, sharing, and KM. And I learned the hard way that fighting what appear to be natural organizational and social dynamics is very tough. Which is why eMail is the most popular collaboration tool on the planet: it works the way that people naturally want to work. And which is why Groove is built upon a client-side, personally empowering "email model" than an "app server" model. Mobile, instant, ad hoc, private. Effective collaboration tools strike a balance between personal need/behavior and collective/organizational need."

Saturday, August 10, 2002


"RISO: distributed, heterogeneous belief networks. A belief network is a probability model defined on an acyclic directed graph; distributed means nodes can be on different hosts, and heterogeneous means allowing different conditional distributions."

From the thesis:
"RISO is a system to aid reasoning in spatially and temporally extended problems, which implements a class of graphical probability models called distributed belief networks."

"As the number of sites is increasing 400% per year, we can expect RISO installations to soon outnumber elementary particles, not to mention available IP

"It is natural to represent each geographical unit with a belief network, and to represent the flow of information from one locale to another by edges connecting variables in separate belief networks; different kinds of messages travel with the arrows and against the arrows."

"The RISO inference algorithm is based on the polytree algorithm for belief network inference, in which “messages” (predictive distributions called 1/-messages and likelihood functions called ¸-messages) are computed."

When we create this distributed belief system we can finally find Elvis (25 years dead this month).

I've been able to get the Java code going with a little juggling. Wrote an Ant script which helps. It still needs some work and the initial instructions are wrong. Cutting and pasting from PDF is fun too: I think "inference" came out with a registered symbol (®), 1/- is pi and ,- is rho (or something).
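
To get a feel for the two kinds of messages, here's a minimal sketch in Python (not RISO's Java code). It assumes the smallest possible belief network, a two-node chain Rain -> WetGrass, with made-up probabilities. The predictive message travels with the arrow and the likelihood message travels against it; all names are hypothetical.

```python
# Two-node chain Rain -> WetGrass. All numbers are invented for illustration.
prior_rain = {True: 0.2, False: 0.8}

# P(wet | rain): outer key is the parent value, inner key is the child value.
p_wet_given_rain = {
    True: {True: 0.9, False: 0.1},
    False: {True: 0.1, False: 0.9},
}


def predictive_message(prior, cpt):
    """Message sent with the arrow: the child's distribution before evidence."""
    child_values = next(iter(cpt.values())).keys()
    return {c: sum(prior[p] * cpt[p][c] for p in prior) for c in child_values}


def likelihood_message(cpt, observed_child):
    """Message sent against the arrow: P(observation | each parent value)."""
    return {p: cpt[p][observed_child] for p in cpt}


def posterior(prior, likelihood):
    """Combine the parent's prior with the likelihood message and normalise."""
    unnormalised = {v: prior[v] * likelihood[v] for v in prior}
    total = sum(unnormalised.values())
    return {v: weight / total for v, weight in unnormalised.items()}


wet_prediction = predictive_message(prior_rain, p_wet_given_rain)
rain_given_wet = posterior(prior_rain, likelihood_message(p_wet_given_rain, True))
```

In a distributed setup the two dictionaries would live on different hosts and only the messages would cross the wire, but that part is left out here.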

Live Free or Die - California Pulls a Peru

"Named the "Digital Software Security Act," the proposal essentially would make California the "Live Free or Die" state when it comes to software. If enacted as written, state agencies would be able to buy software only from companies that do not place restrictions on use or access to source code. The agencies would also be given the freedom to "make and distribute copies of the software."

"The point of the proposal isn't to punish developers of proprietary software. Instead, advocates point out that "closed" software adds costs and creates security risks, two problems the state needs to reduce."

Looks like California is cruising for a "donation" from Mr Gates.

Mozilla Monetizing

"So here you are, a multi-billion dollar corporation with a sagging stock price that has spent hundreds of man-years and millions of dollars on a layout engine and a Web browser. You've thrown all this money into the support of this cool "standards-compliant" layout engine, and you don't even know what that means! You're out all this money, and you have to find some way to make it back.

Well, do I have the plan for you. Enroll in David Hyatt's "Monetize that Browser!" seminar today, and you can learn how to recoup your losses. I'll teach you time-honored methods for making that money back. Yes, in just 30 short days, and for the low price of $599.95, I'll whip your money pit into a cash-generating machine!"

With "New Furniture from IKEA" under "File" and under "View" "Movies @", "Tools" has "Buy More Tools from Home Depot". It's lucky they went with "Bookmarks" and not "Favourites".

David's post

Two More Semantic Web Articles

This is a fairly interesting article about the maturing use of ontologies.

"Large ontologies are essential components in many online applications including search (such as Yahoo and Lycos), e-commerce (such as Amazon and eBay), configuration (such as Dell and PC-Order), etc."

"One of the simplest notions of a possible ontology may be a controlled vocabulary – i.e., a finite list of terms. Catalogs are an example of this category. Catalogs can provide an unambiguous interpretation of terms – for example, every use of a term, say car – will denote exactly the same identifier – say 25. "

"The next point includes frames[4]. Here classes include property information. For example, the “Apparel” class may include properties of “price” and “isMadeFrom”. My specific dress may have a price of $100 and may be made from cotton."

Also lists 7 uses of simple ontologies/taxonomies and 8 uses of structured ontologies. Things like consistency checking, disambiguation, site organisation, comparative searching, etc.
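
The step from a controlled vocabulary to frames is easy to sketch in Python. This is only an illustration of the two notions quoted above: the identifier 25 and the Apparel properties come from the article, everything else (the Frame class, the Dress subclass, the "length" property) is a made-up assumption.

```python
from dataclasses import dataclass, field
from typing import Optional

# Controlled vocabulary: every term maps to exactly one identifier.
vocabulary = {"car": 25, "dress": 26}  # 25 is from the quote; 26 is invented


@dataclass
class Frame:
    """A class that carries property information and an optional parent class."""
    name: str
    parent: Optional["Frame"] = None
    properties: set = field(default_factory=set)

    def all_properties(self):
        """Own properties plus everything inherited from parent frames."""
        inherited = self.parent.all_properties() if self.parent else set()
        return inherited | self.properties


apparel = Frame("Apparel", properties={"price", "isMadeFrom"})
dress = Frame("Dress", parent=apparel, properties={"length"})


def make_instance(frame, **values):
    """Create an instance, rejecting properties the frame does not declare."""
    unknown = set(values) - frame.all_properties()
    if unknown:
        raise ValueError(f"{frame.name} has no properties {sorted(unknown)}")
    return {"type": frame.name, **values}


my_dress = make_instance(dress, price=100, isMadeFrom="cotton")
```

The rejection in make_instance is the simplest form of the consistency checking the article lists as a use of structured ontologies.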

This is yet another primer for the semantic web and ontologies.

"As with a conventional CMS, we tag the content elements with metadata that describes the element type, like headline, body text, and publication date. We also need to tag the content with metadata that describes what the content is, like product description, white paper, support document, or retailer. In addition, the system must know how these components relate to each other."

"You can see we begin to describe not just content but concepts, ideas about what is in an organization and how those concepts relate to each other. The goal behind this kind of model is to represent our understanding of the organization and to document it in a way that will eventually become readable by the CMS."

"The model described above - the concepts, relationships, plus some additional information - is called an ontology."

Friday, August 09, 2002

dmoz RDF

Much in the same vein as RooDolF this is an RDF wrapper around dmoz. It uses the Dublin Core namespace which is good.

Unfortunately, it's the first 1.5 GB of dmoz so much of it seems to be porn or "adult" content:

IBM to Supply Next CPU for Mac

"IBM is to release a version of the dual-core Power4 processor aimed at the desktop, and will disclosed details at Microprocessor Forum in October.

The new chip designed for "desktops and entry level servers", and will be an 8-way superscalar, SMP-ready design capable of 6.4GB/s throughput. Tantalisingly, the processor has it own "vector processing until implementing over 160 specialized vector instructions."

The "over 160" number is quite significant: the AltiVec vector unit for the PowerPC G4 has… 162 instructions.

Now why would IBM want to do create a desktop RISC processor? It needs to remain competitive with entry-level workstations against the likes of Sun and HP's Alpha, where the size and heat dissipation of the mighty POWER4 have kept it out of systems below $12,000. IBM's desktop workstations still run POWER3 (but then you can find UltraSPARC Iis in Sun's bargain basement)."

Thursday, August 08, 2002

It's Knowledge Sharing

"Yes, I still hold the view that the term 'knowledge management' is a misnomer...In other words, manage the explicit knowledge - the stuff that is written down - of the organisation and we will have success. We found that this was not sufficient to achieve success because it dealt with just a small percentage of the knowledge in the company."

"We then worked on re-defining the educational opportunities that were available to our associates by changing the pedagogical approach to education. We shifted from sending people to class to one of delivering the class to the student anytime/anywhere. This allowed us to change the cost equation of education sufficiently that we could offer most of the courses from the universities free of charge to our associates provided they obtained a passing grade. We are now moving to just-in-time learning. (This is about 5 per cent of the effort.) We are now spending a lot of time and effort to improve our ability to function as global teams. "

While I don't think Office is seamlessly integrated or that the productivity gains found by using products like Office were not forecast 10 years ago (I would say 40 years ago), he says that future gains in knowledge management will include:
1. Seamless translation (universal translator),
2. Seamless interface (removing the keyboard),
3. Internet connectivity (further growth).

Interview with Eric Miller

Answers the important questions like what the semantic web is, how it will function, what technologies make it up, etc.

"The Semantic Web is simply an extension of the current Web that allows for more effective sharing and combining of information. To an information professional, this translates into greater access, more accurate and timely information, and reduced costs." magazine has an article based on the interview.


This is a simple idea but it converts the Google results into RDF. May he get all the riches and fame he deserves. Does the way you combine Google's results mean that it is producing more metadata than data?
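
I haven't looked at how the tool does it, so the following is only a guess at the general shape: a Python sketch that turns search results into N-Triples lines using Dublin Core properties. The result dictionaries, their field names and the function names are assumptions; only the Dublin Core property URIs are real.

```python
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_DESCRIPTION = "http://purl.org/dc/elements/1.1/description"


def escape_literal(text):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def results_to_ntriples(results):
    """Emit one title triple and one description triple per search result."""
    lines = []
    for result in results:
        subject = f"<{result['url']}>"
        title = escape_literal(result["title"])
        snippet = escape_literal(result["snippet"])
        lines.append(f'{subject} <{DC_TITLE}> "{title}" .')
        lines.append(f'{subject} <{DC_DESCRIPTION}> "{snippet}" .')
    return lines


# Hypothetical search result, standing in for whatever the search API returns.
sample_results = [
    {"url": "http://example.org/", "title": 'Say "hi"', "snippet": "An example page"},
]
ntriples = results_to_ntriples(sample_results)
```

Each result becomes two statements about the same URL.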

Wednesday, August 07, 2002

OWL and Lattices

I've started reading the OWL draft specification. Having the sets all and nothing (owl:Thing and owl:Nothing) reminded me a lot of lattices. I wonder if they will do things like using lattice multiplication across ontologies.
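
What I mean by the lattice resemblance, as a rough Python sketch: treat each class as the set of its individuals, so owl:Thing is the top element and owl:Nothing the bottom. The individuals and classes are invented, and the componentwise product order at the end is just one standard way to combine two lattices, not anything the OWL draft specifies.

```python
THING = frozenset({"fido", "rex", "tweety", "nemo"})  # stands in for owl:Thing
NOTHING = frozenset()                                 # stands in for owl:Nothing

dog = frozenset({"fido", "rex"})
pet = frozenset({"fido", "tweety", "nemo"})


def meet(a, b):
    """Greatest lower bound: the individuals that belong to both classes."""
    return a & b


def join(a, b):
    """Least upper bound: the individuals that belong to either class."""
    return a | b


def leq(a, b):
    """Subclass ordering: a is below b when every member of a is in b."""
    return a <= b


def product_leq(pair_a, pair_b):
    """Order on the direct product of two lattices: compare componentwise."""
    return leq(pair_a[0], pair_b[0]) and leq(pair_a[1], pair_b[1])
```

For example, meet(dog, pet) is the class containing only fido, and every class sits between NOTHING and THING under leq.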

Formal Concept Analysis: Mathematical Foundations:

OWL Web Ontology Language 1.0 Reference:

HyperCuP paper

A very short one page description of the HyperCuP semantic routing:

"It would be desirable to restrict the broadcast of a query message to peers that can potentially provide information related to the concepts asked for in the query. We address this problem by constructing more than one hypercube in the network: One hypercube is created for each ontology concept, creating a cluster of peers which carry information related to the concept."

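A rough Python sketch of the idea, with heavy assumptions: peers in a cluster are numbered so that neighbours differ in one address bit, and a broadcast is forwarded only on higher dimensions than the one it arrived on. That forwarding rule is the usual hypercube broadcast as I understand it, not something taken from the quoted text, and the concept names and cube sizes are invented.

```python
def hypercube_broadcast(origin, dimensions):
    """Return the list of (sender, receiver) hops for one broadcast."""
    hops = []
    # Each queue entry: (node, lowest dimension it may still forward on).
    queue = [(origin, 0)]
    while queue:
        node, first_dim = queue.pop(0)
        for dim in range(first_dim, dimensions):
            neighbour = node ^ (1 << dim)  # flip one bit of the address
            hops.append((node, neighbour))
            queue.append((neighbour, dim + 1))
    return hops


# One hypercube per ontology concept: hypothetical concepts -> cube dimensions.
concept_cubes = {"music": 3, "film": 2}


def route_query(concept, origin=0):
    """Broadcast a query only inside the cluster for the requested concept."""
    return hypercube_broadcast(origin, concept_cubes[concept])


music_hops = route_query("music")
film_hops = route_query("film")
```

With these rules every peer in the concept's cluster receives the query once, and peers in other clusters never see it.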
Skin your Swing

Allows you to skin any Swing based application. Has XP, Aqua, or any themepack (it uses a GTK and Gnome theme and an XML descriptor).

The Coolest Button

"PowerMate is the coolest volume knob your computer has ever seen – and so much more. Use it to edit home movies or scroll through long documents and web pages. Best of all, PowerMate is an assignable controller. Program it to do anything you want in any application. Customize it to your own needs and get wild."

Tuesday, August 06, 2002

Decentralized Meta-Data Strategies

" Recent developments in peer to peer networks * have centered around the concept of distributed hashtables (DHT) or content routing [Ratnasamy et al., 2001][Stoica et al., 2001]. These approaches assume possession of a hash or other identifier that precisely specifies the document the user wishes to retrieve. Naturally there are some situations in which a user only has more general information about their needs, e.g. keywords or other meta-data. A number of different strategies for handling this kind of search in peer to peer networks have recently come to light. Several are summarized below along with an attempt to identify some common themes."

Description of Anthill:
" The crucial difference in their scheme (apart from calling messages ants) is that each nest (node) stores a routing table associating keyword hashes with sets of other nests. Different nests become associated with different parts of the hashed keyword space thus avoiding the problem that keyword space itself is highly clustered (presumably around various spellings of "Britney Spears")."

Covers Edutella, FASD, Anthill, Routing Indices, Alpine, Associative P2P Networks, PlanetP, Query Routing, SIONet, JXTASearch, NeuroGrid, Reptile, Semplesh, and HyperCuP.

I'll have to check these out at a later date especially NeuroGrid, SIONet, PlanetP and HyperCuP.
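
The Anthill routing table described above is simple enough to sketch in Python. This is only my reading of the description: the Nest class, the SHA-1 choice, the normalisation and the hop limit are all assumptions rather than Anthill's actual design.

```python
import hashlib


def keyword_hash(keyword):
    """Stable hash of a normalised keyword (SHA-1 is an arbitrary choice)."""
    return hashlib.sha1(keyword.strip().lower().encode("utf-8")).hexdigest()


class Nest:
    """A node that stores documents and a routing table keyed by keyword hash."""

    def __init__(self, name):
        self.name = name
        self.documents = {}  # keyword hash -> set of document names held here
        self.routes = {}     # keyword hash -> set of other nests worth asking

    def store(self, keyword, document):
        self.documents.setdefault(keyword_hash(keyword), set()).add(document)

    def learn_route(self, keyword, nest):
        self.routes.setdefault(keyword_hash(keyword), set()).add(nest)

    def search(self, keyword, ttl=3, seen=None):
        """Collect local matches, then follow routes for this keyword hash."""
        seen = set() if seen is None else seen
        if self in seen or ttl < 0:
            return set()
        seen.add(self)
        key = keyword_hash(keyword)
        found = set(self.documents.get(key, ()))
        for nest in self.routes.get(key, ()):
            found |= nest.search(keyword, ttl - 1, seen)
        return found


nest_a, nest_b = Nest("a"), Nest("b")
nest_b.store("Britney Spears", "song.mp3")
nest_a.learn_route("britney spears", nest_b)
hits_found = nest_a.search(" Britney Spears ")
```

Hashing the keyword spreads entries evenly over the key space, which is the point the description makes about clustered keywords.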

10 Things I Hate About Java

Ditch AWT, ditch legacy code, redesign i/o, get rid of primitives: all good stuff.

This scared me. Open source implementations of J2EE make Bill Gates laugh mockingly at Scott McNealy. I think someone has a complex. Although, he might be right about the advertising. I hate the .Net advertising being everywhere. Sun needs to be at least that annoying.

Meanwhile, JBoss continues to be one of the more innovative J2EE servers:

There's also JOnAS which uses JORM (Java Object Repository Mapping):

Monday, August 05, 2002

So You Want to Write an Operating System

Yeah, so apparently if you're smart you don't write your own operating system. Although, if you do, you do simple things like pick your audience, your goals and the architecture you want, and be God. I liked the idea of Freedows and it was started by another young hacker (it seemed like it died because of personality clashes rather than technical problems, although those probably helped).

Yeah, yeah I'll finish it this month:

Of course, using Java to teach operating systems is dumb, especially after you've done it already:

Anyway, real operating systems are written in assembly:

Spinning the Semantic Web

MIT Press is going to publish in November a book written by some of the people responsible for OntoWeb. Foreword by Tim Berners-Lee.

"Spinning the Semantic Web describes an exciting new type of hierarchy and standardization that will replace the current "web of links" with a "web of meaning." Using a flexible set of languages and tools, the Semantic Web will make all available information--display elements, metadata, services, images, and especially content--accessible. The result will be an immense repository of information accessible for a wide range of new applications."

MIT catalogue entry for Spinning the Semantic Web

A few of the authors:

Sunday, August 04, 2002

Threads, Threads, Threads

A topic I never get sick of: threads. Javaworld series:

OptimizeIt Suite (Thread Debugger):

My favourite author on the subject (Doug Lea):

Apple to go to Intel

"Neff, for instance, predicted Apple, which uses chips from Motorola and IBM that currently top out at 1GHz, will switch to Intel, whose chips run at 2.5GHz, to get a performance boost and gain more customers. There's a better than 80 percent chance Apple will make the jump in two to four years, he said."

I guess that's Neff said about that.

Saturday, August 03, 2002

Cave or Community

"Of the top 100, 70 were individuals or very small groups (typically pairs). These individuals accounted for 46.1 percent of the code and 50.4 percent of projects. One individual had contributed to 267 projects."

"Similarly, previous authors have identified the strong hand of the leader of an OSS program. Moon and Sproull refer to Linus Torvalds as a "great man". Others have pointed out that Torvalds essentially did not have a life and spent considerable number of hours rewriting code submissions by others."

My most recent experience with an Open Source project is having my patches ignored by the author, only to see them reimplemented by him. With my own small project, it has basically been me as the only contributor. However, I've always offered help to understand and to contribute to the project. I can't imagine ignoring a patch (especially if it was for the Disk Scheduler or something that someone is working on). I've had patches accepted by the Jena group. They weren't earth shattering (total of up to 5 lines of code). But they were functional and met features that I needed. The people at Jena could've rewritten it but it would have been a waste of time. The ramp up to get someone productive on a project is considerable even if the goal (as in my project) is to make the code as easy to understand as possible. The more people that are familiar with the code the more likely the code will grow and get better.

I'm not the best coder - my patches could've been lame or crap. I can't be objective about my own code, but what I've seen happen to others in other projects is that good patches can go to waste. Most of the time this is just from people not understanding the original code or the patch.

The reasons why people contribute to OS, according to the article, are:
1. To take part in an intellectually stimulating project.
2. To improve their skill.
3. To take the opportunity to work with open-source code.
4. Non-work functionality.
5. Work-related functionality.

On taking up OS projects (especially for work) there's considerable risk involved when it's just one contributor. There may be a very good reason why there is only one.


"The HITS algorithm is a two-step process, applying a sampling step that identifies a focused collection of several thousand Web pages likely to contain numerous pages considered authoritative on the topic, and a weight-propagation step that uses an iterative procedure to assign scores to pages regarding the quality of their hubness and authoritativeness."

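The weight-propagation step is the part that fits in a few lines. Here's a minimal Python sketch of it, assuming the sampling step has already produced a small link graph; the graph, the iteration count and the function name are made up for illustration.

```python
import math


def hits(links, iterations=50):
    """Iteratively score pages as hubs and authorities from a link graph.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        auth = {p: sum(hub[q] for q in pages if p in links.get(q, ())) for p in pages}
        # A page's hub score is the sum of the authority scores of pages it links to.
        hub = {p: sum(auth[t] for t in links.get(p, ())) for p in pages}
        # Normalise both score vectors so they do not grow without bound.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical focused collection: two linking pages and two linked-to pages.
sample_links = {"portal": ["a", "b"], "blog": ["a"], "a": [], "b": []}
hub_scores, auth_scores = hits(sample_links)
```

In this toy graph "a" ends up the strongest authority because both linking pages point at it, and "portal" the strongest hub because it points at both authorities.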
Improving HITS:

Metadata Registry for the Semantic Web:

Friday, August 02, 2002

KVM for ARM Processors

"The company has released CLDC HotSpot, an implementation of its J2ME Virtual Machine for the Connected Limited Device Configuration that is optimized for processors using ARM’s 32-bit cores. Sun claims that it outperforms its predecessor by a factor of 10, and should fill developers’ needs for the next half-decade."

"Lorain said that due to time constraints, the CLDC HotSpot implementation does not take advantage of Jazelle, ARM’s on-chip JVM that Sun helped develop. “It uses more footprint than Jazelle and won’t do as good a job at preserving battery life, but it will still do a better job than the current CLDC implementation,” he said, adding that it requires about 225KB of device memory, while the previous version used 90KB. Both require an additional 70KB for J2ME libraries. Sun and ARM still are working on “a joint solution that will leverage CLDC HotSpot and Jazelle,” Lorain said, but he did not give a timeline."

Thursday, August 01, 2002

Episode 3 in Lego Form

Winner of the Audience Choice Award at the Brickfest 2002 Animation Competition. The aim was to create a trailer for a Star Wars movie, either existing or made up.

AAAI-2002 Workshop on Ontologies and the Semantic Web

There are ten papers available from the web site. The ones I found interesting were: "Haystack: A Platform for Personalized Information Management Built on RDF", "Learning Environments Using Reusable Knowledge Units" and "LCW-Based Agent Planning for the Semantic Web". But there are a few others that I wouldn't mind reading if I get the chance.


"The Ozone user interface is the sole client at this time. The Haystack server hosts a number of information stores, including an RDF store. Other stores are simply wrappers around other sources of information such as LDAP servers, IMAP servers, and Microsoft Outlook stores. All stores are federated by a single Federation Service, which is responsible for dispatching queries to and combining query results from several stores."

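The federation idea boils down to dispatching one query to every store and merging the answers. A toy Python sketch, assuming each store exposes a query method over (subject, predicate, object) triples; the class names, method names and data are hypothetical and not Haystack's API.

```python
class ListStore:
    """A toy information store holding (subject, predicate, object) triples."""

    def __init__(self, name, triples):
        self.name = name
        self.triples = list(triples)

    def query(self, predicate):
        """Return every triple in this store with the given predicate."""
        return [t for t in self.triples if t[1] == predicate]


class FederationService:
    """Dispatches a query to every store and combines the results."""

    def __init__(self, stores):
        self.stores = list(stores)

    def query(self, predicate):
        combined = []
        for store in self.stores:
            for triple in store.query(predicate):
                if triple not in combined:  # drop duplicates across stores
                    combined.append(triple)
        return combined


rdf_store = ListStore("rdf", [("doc1", "author", "alice"), ("doc1", "title", "Notes")])
mail_store = ListStore("imap", [("msg7", "author", "bob"), ("doc1", "author", "alice")])
federation = FederationService([rdf_store, mail_store])
authors = federation.query("author")
```

A wrapper around an LDAP or IMAP server would only need to offer the same query method to join the federation.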
I'd like to download or at least somehow look at the Adenine language which sounds interesting:
"Adenine takes the p-code concept one step further by making the ontology explicit and extensible and by replacing byte codes with RDF. Instead of dealing with the syntactic issue of introducing byte codes for new instructions and semantics, Adenine takes advantage of RDF’s ability to extend the directed “object code” graph with new predicate types...Values in Adenine are represented as Java objects in the underlying system."

This was presented at the 2002 W3C conference: