More News: 09/01/2002

Monday, September 30, 2002

UML 2.0

"UML 2.0 is starting to include BPM. In [UML 1.4], the activity graphs were a bit restricted for BPM. Even in 2.0, they could still be better. But the trend is in the right direction."

"Many consider real-time design and code execution issues odd men out in the first UML, but that should change with UML 2.0.

Part of the push toward better real-time support rides on a submission by a group including famed methodologist Stephen Mellor. Mellor -- co-author of a new book from Addison-Wesley called Executable UML: A Foundation for Model Driven Architecture -- was not one of the cheering throng when UML 1.0 happened, but he has more recently joined efforts to produce a UML that leads more directly to executable systems. Such an approach could become an alternative to Java."

http://www.adtmag.com/article.asp?id=6671

Spring now for Jaguar

http://www.usercreations.com/spring/. I'm playing around with it at the moment and it seems as interesting as what I've read. I'm having fun searching for leeks and Dell stocks in Daypop.

Sunday, September 29, 2002

Pingback vs Trackback

Pingback 1.0 - "It basically boils down to telling a Web log when you've linked to it, by fetching the page, looking for a header advertising a pingback server, and then invoking an XML-RPC call on that server. The best thing about this idea is that unlike similar schemes like TrackBack, it is totally transparent to both users. It is also software-agnostic, so any Web logging system can implement pingback and interoperate with all other pingback-enabled Web logs." Maybe by changing the use of "documents" to resources we might have something even better, bi-directional links.
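The three steps in that quote (fetch the page, find the advertised server, make the XML-RPC call) can be sketched roughly as below. This is a minimal sketch, not a client: the `X-Pingback` header name, the `<link rel="pingback">` fallback and the `pingback.ping(sourceURI, targetURI)` method are my reading of the Pingback 1.0 spec, the function names are made up for illustration, and nothing here touches the network.

```python
import re
import xmlrpc.client


def discover_pingback_server(headers, html):
    """Return the pingback server URL a fetched page advertises, or None."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "x-pingback":
            return value.strip()
    # Assumed fallback: a <link rel="pingback" href="..."> element in the page.
    match = re.search(r'<link\s+rel="pingback"\s+href="([^"]+)"', html, re.IGNORECASE)
    return match.group(1) if match else None


def build_ping(source_uri, target_uri):
    """Serialise the XML-RPC request body for pingback.ping(source, target)."""
    return xmlrpc.client.dumps((source_uri, target_uri), methodname="pingback.ping")


# Hypothetical page on the linked-to weblog, already fetched.
headers = {"Content-Type": "text/html", "X-Pingback": "http://b.example/xmlrpc"}
server = discover_pingback_server(headers, "<html></html>")
payload = build_ping("http://a.example/post", "http://b.example/entry")
```

Posting `payload` to `server` is the only part left out, which is why the scheme can stay invisible to both the author and the reader.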

A list of feedback on pingback one week on. The Trackback documentation.

XML Namespaces

It would seem to me that Dave implemented something without understanding it (he says XML is flawed, right). I still reckon some sort of levy on this type of mistake would be in order. The more I look at the process of RSS 2.0 the more I appreciate the W3C.

Here are a couple of blogs that I agree with:
"[on Dave saying XML is broken] is about the same as me saying that MySQL and PHP aren't working correctly on my system because the theories behind relational databases and web application script engines are fundamentally flawed." - RSS, XML, Namespaces, oh my.

"It's funny how he sees RSS 2.0 as a community effort when it's an advantage to him, and as his own playing field when it isn't." - RSS 2.0 busted

"Throwing code at a problem doesn't solve it. Mindlessly following the trends, solves it even slower. Start trying to combine mindless coding with sheep-like trend following and I start to wonder why I bother implementing anything." - Mindless Coding.

A truly exhausting thread.

Luckily, I haven't spent too much time on RSS 2.0. Mainly because it's not that exciting a technology. Something to be integrated with when there's some content. Code to standards or at least code to something that has had some decent thought put into it.

Friday, September 27, 2002

Shades of Grey is All I See

"As with many so-called illusions, this effect really demonstrates the success rather than the failure of the visual system. The visual system is not very good at being a physical light meter, but that is not its purpose. The important task is to break the image information down into meaningful components, and thereby perceive the nature of the objects in view."

http://www-bcs.mit.edu/people/adelson/checkershadow_illusion.html

Watson - Another Plugin Platform

An interesting interview with the creator of Watson.

"DS: Do you think Watson is going to help change our idea of the Web? We're still very browser-centric in our thinking. Watson represents a shift away from the browser. More like a Web services thing. What's your take on that?

DW: I get a lot of comments from users that Watson opened their eyes that the Internet doesn't have to be just a Web browser. It's very easy for people to get used to one paradigm and get stuck in it. A year ago, there was nothing like Watson to quickly access the most useful services. And now we're starting to see a few other applications that break the boundaries of the Web browser and use the Internet in completely different ways. Take Spring from UserCreations. It puts a canvas of interconnected icons representing real-world objects like people, books, places, and foods on your desktop, and each object is connected to the net. NetNewsWire from Ranchero gathers up news feeds and presents them in a simple UI. WeatherPop from Glucose puts a quick weather display on your menu bar. I don't think that the Web as viewed through your browser will go away anytime soon, but people are starting to realize that it may not be the best way to view structured information. "

According to Tim O'Reilly, Watson will be successful over Sherlock because "a platform strategy beats an application strategy every time." Ray Ozzie tends to agree.

Trillian Pro

So, Trillian Pro has a plugin system to allow you to add things like RSS feeds. No RDF though, hah!

Also, saw Messenger which is just a facade over JMS.

I should read Rebelutionary more often.

Beyond Google

"I wonder about how it lets us search. Google allows us to look for things we never would have been able to, or even thought of, before. I don’t even think twice now about searching for phrases buried deep within documents, knowing I’m searching over 2 billion Web pages, using strategies that would have been punishingly expensive and time-consuming on Dialog. That’s potent stuff; but at the same time, it doesn’t allow anything approaching proximity searching and actively discourages Boolean searching—techniques we know are very powerful but which must be used with care and take time to learn and that we consider to be professional signatures. Will this have an impact on our searching skills and performance?"

"Will that also mean that the kinds of professional-level resources we rely on—ProQuest, Lexis/Nexis, EbscoHost, and their kind—will then be less used, even though we pay big bucks and make them available for our communities?"

Links to an interesting paper from the founders of Google, presented at the 7th International World Wide Web Conference (WWW7).

http://www.ala.org/alonline/netlib/il1002.html

" The truth is, nobody knows how wide the Web is. Some say 5 billion pages, some 8 billion, some even more. Anyway, what's definite is that the major search engines (SEs) index only a fraction of the "publicly indexable Web". Moreover, every SE indexes different Web pages, which means if you use only one SE you will miss relevant results that can be found in other search engines." Lists about 20 "very good" search engines and about twice as many others.

http://www.llrx.com/features/metasearch.htm

Thursday, September 26, 2002

Another Visualizer

UML2DAML and RDF Grapher are two rather professional looking tools. RDF Grapher will convert RDF into SVG, GIF or JPEG and is only 290k (requires Jena 1.3.2 and Batik though). Source code and ant build script all under GPL. The UML2DAML tool is about 6MB but converts XMI (which I got interested in after JSR 40) to the same sort of view using DAML+OIL. It compares well with other tools such as RDFSViz, RDFViz, RDF Author, and Ideagraph (or the older Svolgo). This is interesting because you could use any UML 1.4 compliant tool to produce configuration or business flow changes, convert it to RDF and then use that to change the workings of your currently running code.

Wednesday, September 25, 2002

Getting Backup up

Steps to get the OS X Backup utility to work without a .Mac account.

Tuesday, September 24, 2002

Jackpot

Not having looked at either of these tools deeply (mainly because they're not available) it's easy for me to imagine that using the "annotated parse tree" and expressing Java objects in RDF like Adenine or SPRUT are similar. Also talks about how the GPL defies physics, "It takes an awful lot of energy to produce software. Somehow or other that has to be matched. The energy that comes out as software, something's got to go in—if only to pay the salaries of the people who are working on the software." Okay, the IBM IDE and IDEA have been doing this for a while but adding different language support would be easier.

http://www.eweek.com/article2/0,3959,545638,00.asp

Monday, September 23, 2002

Free For All

With Parka DB or Jena, MDF and MLJ you have all of the other tools that you need for the demo application that I talk about below (without using TKS or TMex which are better and all Java, please buy now!).

Economics Driven P2P

"The Compute Power Market (CPM) Project uses economics approach in managing computational resource consumers and provides across the world in peer-to-peer computing style. It allows application users to access computing power with ease and simplicity of accessing electric power form a wall socket. It even allows to choose computing power/resource providers that offer cost-effective service on demand. Thus creating a competitive market approach to service oriented P2P computing."

Solving problems like "compute this for me within 2 hours for $5". Sounds a lot like smart contracts. The economy grid page has some of the latest snippets of news about grid computing and a thesis on using economics for resource management in computing. See my previous postings too.

http://compute-power-market.jxta.org/

More of these projects from GridBus:
http://www.gridbus.org/
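The "within 2 hours for $5" idea boils down to filtering offers by deadline and budget, then taking the cheapest. A toy sketch of that selection follows; the `Offer` class, the provider names and the numbers are all invented for illustration and have nothing to do with the actual CPM or GridBus code.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    provider: str         # hypothetical peer offering compute power
    price_dollars: float  # quoted price for the whole job
    hours: float          # quoted completion time


def cheapest_offer(offers, max_hours, budget_dollars):
    """Pick the cheapest offer that meets both the deadline and the budget."""
    feasible = [
        o for o in offers
        if o.hours <= max_hours and o.price_dollars <= budget_dollars
    ]
    if not feasible:
        return None
    # Break price ties in favour of the faster provider.
    return min(feasible, key=lambda o: (o.price_dollars, o.hours))


offers = [
    Offer("peer-a", 6.0, 1.0),  # fast but over budget
    Offer("peer-b", 4.5, 1.5),  # meets both constraints
    Offer("peer-c", 3.0, 3.0),  # cheap but too slow
]
best = cheapest_offer(offers, max_hours=2, budget_dollars=5)
```

A real market would also need bidding, trust and payment, which is where the smart contract comparison comes in.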

Sunday, September 22, 2002

Demo Application

In answer to "What would you do with RSS?" I'd say integrate, store everything as RDF and make it peer-to-peer. I hope that the Google future is not one that's so centralised. If you combined Limewire, the NetNewsWire interface and restricted versions of TKS and TMex you'd have a fairly neat application. Or maybe a Universal Inbox is a better idea (with calendar sharing and file browsing). There already are plenty of implementations of email integration. The Gnutella network turns everyone into a web server/client which could make it a distributed daypop. I think the only difference between smaller, enterprise, central deployments and the large Internet one is the enabled tools. The cheap or free client would have a hard coded set of tools. You don't want people downloading a frontend that is effectively an application server. A smaller version handling RSS, RDF, Gnutella and HTML should be enough. Something like this combined search. You could add a lite concept extraction tool and store all the files' metadata in TKS (would speed up Limewire just as it is). Just use the query interface across Gnutella, calendars, RSS feeds, HTML/XML/Web, mail (SMTP, IMAP, POP) and local hard drives.

Graph Tutorial

"Gato - the Graph Animation Toolbox - is a software which visualizes algorithms on graphs. Graphs are mathematical objects consisting of vertices and edges connecting pairs of vertices: think of cities as vertices and interstates as edges connecting two cities. Algorithms might find a shortest path - the fastest route -- or a minimal spanning tree or solve one of other interesting problems on graphs: maximal-flow, weighted and non-weighted matching and min-cost flow."

A nice example of an educational tool.

http://www.zpr.uni-koeln.de/~gato/
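For a feel of what such a tool animates, here is a small shortest-path sketch using Dijkstra's algorithm. It is not taken from Gato; the graph, the vertex names and the weights are made up, and it assumes non-negative edge weights.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm over a dict-of-dicts graph with non-negative weights.

    Returns (total_distance, [vertices on the path]) or None if unreachable.
    """
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        distance, vertex, path = heapq.heappop(queue)
        if vertex == goal:
            return distance, path
        if vertex in visited:
            continue  # a shorter route to this vertex was already expanded
        visited.add(vertex)
        for neighbour, weight in graph.get(vertex, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (distance + weight, neighbour, path + [neighbour]))
    return None


# Vertices are "cities", edge weights are "interstate" lengths (illustrative only).
roads = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 3},
    "C": {"B": 2, "D": 7},
    "D": {},
}
result = shortest_path(roads, "A", "D")
```

Here the direct-looking route A, B, D costs 7, while the detour through C costs 6, which is the kind of thing that is much easier to see animated.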

Saturday, September 21, 2002

Wired on Wireless

"Nicholas Negroponte explains why Wi-Fi "lily pads and frogs" will transform the future of telecom." While this is fine for some ("I pay a fixed fee and am happy to share"), all the telecom companies have to do to kill Wi-Fi is start charging for your broadband service based on usage. He says, "it can cover more than 20 kilometers" - yes, cover Africa or Australia or Asia 20 kilometers at a time. Anyway, there are still these two big ponds you have to cross called the Pacific and Atlantic - telecoms still have the power.

http://www.wired.com/wired/archive/10.10/wireless.html

Legal XML and RDF

MetaLex is an open XML standard for the markup of legal sources. Version 1.0 of this standard was presented to the general public at the EGOV conference...in-depth information and downloadable files: a manual, examples in XML, examples in RDF, XML Schema documents, XSL Stylesheets etc.

This includes RDF dictionaries:
http://www.lexml.de/rdf.htm
http://rdfdictionary.sourceforge.net/

Stylesheet to convert the XML vocabulary to RDF (they're using bags):
http://www.metalex.nl/examples/rome-rdf-ex.xml

Guidelines:
http://www.metalex.nl/guidelines.html

Parka DB Released

Mindswap have released ParkaDB under MIT license. So far it can only store 2.5 million triples.

Friday, September 20, 2002

Lost Data

While this is an interesting article about how quickly digital information is being created and lost, it did highlight one unintended consequence: "Software patents that can be infringed freely because the original software no longer works, preventing the patent holders from proving prior art." I guess all this can be done now by simulation of the hardware that it used to run on. Is it a great loss if we don't capture every email ever sent? They cover emulation, encapsulation and migration (print everything out and scan it back in?). Instead of using formats that are not human understandable (like Word 95), why not use XML and the like (which is hopefully forward compatible)? Being able to play old games 20 years later is cool. I recently played both the Atari 2600 version of H.E.R.O. and Day of the Tentacle; very neat.

Doing Your Taxes with N3

Tim Berners-Lee did a presentation on the Semantic Web (with a TiBook), which is available in high and low bandwidth formats. He discussed Algae, cwm, Blindfold, doing his taxes using RDF in N3 format, Dan Connolly's travel tools, the various ways in which patents might slow down the adoption of the Semantic Web, whether a Wonder GUI (generic UI) will be used to produce and query RDF or whether it will be integrated with existing applications, the fact that no one is working on a better database for RDF (darn) and that there are no shrink-wrapped applications. The slides are here.

JXTA for Wireless

"JXTA defines several types of resources; for example, network nodes (peers), peer groups, communication channels, pieces of data, and so forth.

Every node in a JXTA network is a peer. Every peer connected to JXTA network should have a unique identity, called a Peer ID. The peer ID will be dynamically (late) bound to its IP or TCP address by the JXTA network.

A peer group is a logical rather than a physical entity; it is formed by the grouping of peers sharing common interests. For example, there can be groups, one each for all the different types of music, so that music lovers can join groups according to their taste to discuss and exchange songs.

Peers are identified by their IDs in a JXTA network. Therefore, a peer group has no concern over the IP address currently used by any of its group members. This effectively hides the unreliability associated with the dynamic behavior and changing topology of interconnected networks."

This is one of the things that I thought was so good about JXTA at the time. They also mention JXTA4J2ME or JXTA for J2ME.

http://softwaredev.earthweb.com/java/article/0,,12082_1464091,00.html
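The late binding of peer IDs to addresses described in the quote is easy to picture with a toy directory. This is only an illustration of the idea: `PeerDirectory` and its methods are invented names, not the JXTA API, and real JXTA resolves IDs through the network rather than a local dictionary.

```python
import uuid


class PeerDirectory:
    """Toy stand-in for binding stable peer IDs to changeable addresses."""

    def __init__(self):
        self._addresses = {}

    def register(self, address):
        """Create a new peer ID and bind it to its current address."""
        peer_id = uuid.uuid4().hex
        self._addresses[peer_id] = address
        return peer_id

    def rebind(self, peer_id, new_address):
        """Point an existing peer ID at a new address (e.g. after reconnecting)."""
        if peer_id not in self._addresses:
            raise KeyError(peer_id)
        self._addresses[peer_id] = new_address

    def resolve(self, peer_id):
        """Look up the address currently bound to a peer ID."""
        return self._addresses[peer_id]


directory = PeerDirectory()
alice = directory.register("10.0.0.5:9701")
jazz_group = {alice}  # the group only ever stores IDs, never addresses

# The peer moves to another network; the group membership is untouched.
directory.rebind(alice, "192.168.1.20:9701")
```

Because the group holds IDs only, nothing in it has to change when a member's address does.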

Tuesday, September 17, 2002

ClearForest

There's an upcoming knowledge management conference. In the list of exhibitors which included Convera, Inktomi, Intelliseek, TheBrain, Vivisimo was ClearForest. Their products seem directed specifically at the financial market (mergers, acquisitions, executives, stocks, etc):

http://www.clearforest.com/products/products.asp

However, it seems suited to general tasks, as they won the KDD Cup, which is over fruit fly material. The ClearCharts (once you get past the spiel) and ClearResearch have fairly neat UIs. The ClearTags demo doesn't work. Their downloads include more marketing material.

Bring Back the Happy Mac

You can change the boot image in OS X 10.2.

http://www.resexcellence.com/bootimage/index.shtml

Monday, September 16, 2002

A Learning Search Engine

"The system uses technology that literally helps the engine automatically learn the difference between "good" and "bad" results, over time...the search engine crawls both science and medical web pages and news sources, offering an interesting mix of information. Unlike nearly every other search engine, it also allows you to choose the query processing algorithm used to determine results. You can select a "conventional" algorithm (which the site says is like Google's), or three different "vectorspace" models, typically used by more traditional search systems."

If there wasn't the requirement to hit the "proceed" button, it might actually be useful.

http://phibot.org/new/Search/CTEvaluation

The main page allows you to search by channels and the research data is available for viewing.
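The "vectorspace" models in the quote rank documents by how closely their term vectors line up with the query's. The site doesn't say which variants it uses, so this is just the textbook version with raw term counts and cosine similarity; the documents and function names are made up.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return document names ordered by similarity to the query, best first."""
    query_vec = Counter(query.lower().split())
    scored = [
        (cosine_similarity(query_vec, Counter(text.lower().split())), name)
        for name, text in documents.items()
    ]
    # Documents sharing no terms with the query are dropped entirely.
    return [name for score, name in sorted(scored, reverse=True) if score > 0]


documents = {
    "d1": "fruit fly gene expression",
    "d2": "gene therapy trial results",
    "d3": "stock market news",
}
ranking = rank("fruit fly gene", documents)
```

A learning engine would presumably adjust the term weights from feedback rather than use raw counts, but the ranking step stays much the same.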

Make Money Extracting Carbon

" But if Australia does ratify the agreement it could benefit by providing emission trading credits in the fledgling international carbon trading market. This allows polluting countries to meet their greenhouse gas emission targets by investing in "carbon sinks", such as forests or clean energy projects.

Australia could sell these carbon credits to countries such as Japan. But if it does not ratify the protocol, it could miss out on the emerging market."

After spending 200 years making money by pulling out trees Australia gets to make money putting them back in.

http://www.smh.com.au/articles/2002/09/16/1032054714933.html

Jena 1.6

Looks like there may possibly be a Jena 1.6 before 2.0. Mainly bug fixes but also:

Jena-Sesame integration (http://sesame.aidministrator.nl/ - Sesame is an Open Source RDF Schema-based Repository and Querying facility)

N3 Reader - It runs at about 19K statements/s (it is I/O bound in the lexer - optimization makes no difference) parsing, 9K triples/s generating RDF.

Original 1.6 email

One OSX FAQ

How to enable WebDAV on OSX

Friday, September 13, 2002

Interesting JXTA Applications

Edutella, Reptile (and Blognet), jnushare, JXTASpaces, Distributed Indexing and JXTA Search.

TAPache

Guha, I think, has come up with not only a set of compelling applications/use cases, but an easy to deploy tool and simple API (apparently trying to be the GET of HTTP). Simply install the Apache module and place RDF into a directory and you're done. The query API is quite nice too, being able to query by property or by URI if you have one.

http://tap.stanford.edu/tap/docs/client.html
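To show how small a "query by property or by URI" interface can be, here is a sketch over an in-memory list of triples. It is not the TAP client API: `TripleStore`, `get_data` and `by_property` are names I've made up, and the example.org URIs and property names are placeholders.

```python
class TripleStore:
    """Minimal in-memory store of (subject, property, value) triples."""

    def __init__(self, triples):
        self._triples = list(triples)

    def get_data(self, resource, prop):
        """Values of one property for a resource you already have the URI of."""
        return [o for s, p, o in self._triples if s == resource and p == prop]

    def by_property(self, prop):
        """Every (resource, value) pair that uses the given property."""
        return [(s, o) for s, p, o in self._triples if p == prop]


store = TripleStore([
    ("http://example.org/people/alice", "worksFor", "http://example.org/orgs/acme"),
    ("http://example.org/people/alice", "name", "Alice"),
    ("http://example.org/people/bob", "name", "Bob"),
])
names = store.by_property("name")
employer = store.get_data("http://example.org/people/alice", "worksFor")
```

The appeal is that a client only needs these two kinds of lookup to start pulling useful data out of someone else's RDF.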

In his PowerPoint slides he talks about the TAP vocabulary, using it as a bootstrap to turn the existing shared meaning into more shared meaning. The applications discussed are a sidebar for news articles, PeopleNet and Internet WetLab (distributed database with experimental results). That's neat. 5 years away from making money? I don't think so.

I think there needs to be a version created so that you can deploy it in Java application servers or as stand alone. This reminds me of the web before search engines. Some sort of automatic discovery system will probably be needed.

IBM Pressures Open Source Java

"In my book, a technology that has not received the imprimatur of an independent standards setting consortia like the W3C or ISO--especially a technology that's not royalty-free--is no more a standard than other technologies where prevalence gets confused with the term "standard" and royalties are charged for deployment (i.e.: Intel's x86 instruction set or Microsoft's Windows)."

"Why was Sutor so adamant about the all-encompassing nature of this neutral body? What he said next made it clear: 'If Java was an open standard, technologies like [Microsoft's] C# and the technologies it works with [like .Net] might not exist today.'"

"Along those lines, Gingell says Sun intends to open-source Java, but that it's not a simple process because Sun doesn't own all the intellectual property in all the JSRs. For the same reasons it can't open source all of Solaris, Sun apparently can't legally open source all of Java either. The company is working on clearing the legal hurdles."

ZDNet article

Fastest JVM

BEA Systems Inc. on Monday will release JRockit 7.0 for Intel Corp. hardware, which one company official termed the "world's fastest" Java Virtual Machine.

"We've been working with Intel closely for the past year on optimizing this JVM for the Intel architecture, so what you'll see with JRockit 7.0 is the world's fastest JVM on the Intel architecture. It's actually the world's fastest JVM on any architecture, period," Griswold said.

JRockit 7.0 will be available free of charge, but the company hopes to leverage it to boost sales of its WebLogic Server application server platform, although the JVM will function with other application servers as well, Griswold said.

Platforms include:
* Microsoft Windows 2000 SP2
    o JDK 1.3.1 & 1.4
    o Intel Pentium / Xeon (IA32) servers

* Red Hat Linux Advanced Server 2.1
    o JDK 1.3.1 & 1.4
    o Intel Pentium / Xeon (IA32) servers

* Microsoft .NET ES RC1
    o JDK 1.4
    o Intel Itanium 2 (IA64) servers
    o Note: WebLogic JRockit 7.0 is being released as Technology Preview only on this platform

http://www.bea.com/products/weblogic/jrockit/index.shtml
http://www.computerworld.com.au/IDG2.NSF/a/0007CE3A?OpenDocument&n=e&c=CS

UIs

I'm really tired of all the various web and Swing frameworks for producing user interfaces, especially when they don't have separate views and controllers. Perhaps the thing to do is not only open up development of Swing and the new Java Server Faces but also combine them into one framework. It would then scale up or down depending on where it's rendered (events could live on the client in a Javascript or Java client or on the server in a thin web client) via a Java Server Faces RenderKit (even extending it from the default HTML 4 one). Another idea, stolen from another framework, is to use XSLT to generate the Java code so that it can be used in Java IDEs.

Documentum moves to J2EE

"The first change is a move to Java-based application programming interfaces so that enterprises will no longer need Documentum's proprietary application server.

"With BEA's Weblogic and IBM's Websphere the world does not need another application server as most of our clients use one or both of those products," said David Gingell, European marketing director at Documentum. "
http://www.vnunet.com/News/1134974

"The company says that new usability features include a unified framework for all Web-based user interfaces, providing consistency and easy, intuitive point-and-click commands. The framework is based on its Web Development Kit (WDK 5) and provides a common J2EE architecture for Web-based user interfaces as well as easy customization."

http://www.kmworld.com/news/index.cfm#2696

Thursday, September 12, 2002

P2P and Java

" And P2P is going mainstream too, as manufacturers begin to use it for on-line data storage and distributed, or grid, computing. So far, Kazaa and Gnutella account for 90 per cent of all P2P traffic, but soon business applications will start swapping data files instead of MP3s — already, chip giant Intel Corp. uses P2P technology internally on its intranet, and claims to be saving millions of dollars with this tool by pooling the processing power of idle desktops and servers."

P2P applications are bandwidth hogs.

"The world produces between one and two exabytes of unique information per year, which is roughly 250 megabytes for every man, woman, and child on earth...This democratization of data is quite remarkable. A century ago the average person could create and access only a small amount of information. Now, ordinary people not only have access to huge amounts of data, but are also able to create gigabytes of data themselves and, potentially, publish it to the world via the Internet, if they choose to do so." Time to get started with JXTA.

http://www.press.umich.edu/jep/06-02/lyman.html
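The 250 megabytes figure is easy to check. Assuming a world population of roughly six billion (my round number, not one from the article) and decimal units, the midpoint of "between one and two exabytes" works out as follows:

```python
EXABYTE = 10**18   # bytes, decimal units
MEGABYTE = 10**6   # bytes, decimal units
WORLD_POPULATION = 6 * 10**9  # assumption: roughly six billion people in 2002


def megabytes_per_person(exabytes_per_year):
    """Share of a yearly data volume per person, in megabytes."""
    return exabytes_per_year * EXABYTE / WORLD_POPULATION / MEGABYTE


low = megabytes_per_person(1.0)   # about 167 MB per person
mid = megabytes_per_person(1.5)   # 250 MB per person
high = megabytes_per_person(2.0)  # about 333 MB per person
```

So "roughly 250 megabytes" matches the 1.5 exabyte midpoint, with the full range running from about 167 to 333 megabytes each.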

Wednesday, September 11, 2002

Scale This

DirectConnect is claiming that they have reached one petabyte of data (in March this year) online, although their current stats show more than three. This is apparently larger than other P2P networks combined. Lucky for DirectConnect they may be brought down. The architecture has a central hub, surrounded by hubs and clients. The schema and search protocol look fairly boring: no neat schemas or namespaces, just hardcoded file extensions.

Three of the better pieces of software for it:
http://dcplusplus.sourceforge.net/
http://www.lwave.ca/shasta/
http://javadc.sourceforge.net/

The deep web (as of 2001) was estimated to be roughly 7.5 petabytes in size, 400 to 550 times larger than the surface web, and to be growing faster than it:
http://www.brightplanet.com/deepcontent/tutorials/DeepWeb/index.asp

The article excludes the whole P2P thing in its calculations, the shallow/deep web being separate. For some perspective, Internet traffic in the US for the month of May reached 100 petabytes. Some surface web sizes include 100 terabytes for the Wayback machine and 1 petabyte at Google.

To some extent, all this excludes high bandwidth communities like Mildura in Victoria:
http://www.neighborhoodcable.com.au/pages/aboutus/company.html

Tuesday, September 10, 2002

TKS with Datatyping

Released yesterday was TKS build 366. It now supports basic RDF datatyping (doubles and dates) and improvements in its raw I/O speed. We have also made an academic license available.

Sunday, September 08, 2002

Has RDF Failed?

RDF has two problems:
* The first one is that there is no added value in using RDF over plain XML...So these applications don't need a general semantic format, because they have the RSS semantics built-in.
* It's a bit ugly and the purpose of the extra syntax is far from self-explanatory, which quickly leads to mistakes in implementations.

"Note that these 2 problems are hardly RSS specific. They will also arise in other attempts to deploy RDF in a community. RDF seems doomed. But I think RDF still has the future, or at least the model of RDF has the future. The problems are in the serialization of RDF."

"So if you design a general format, it is always going to be inferior to a specificly designed format. With RDF this is even worse because RDF can make absolutely no concessions towards semantics...Given all this it is unlikely that RDF as serialized in XML is going to be widely adopted."

http://w3future.com/weblog/2002/09/07.html

I hope RDF is not a failure. His talk of semantic schemas reminds me of the way we use schemas to create and render RDF. By creating schemas that describe how they are to be rendered you get the ability to transform RDF into any bracketed language (HTML, XML, etc). If RDF becomes nothing more than a transport language from one proprietary system to another I think it will have succeeded somewhat. Not using RDF as the primary format of RSS is something that I had expected. It is not necessarily relevant to RDF's long term success. Without it, though, RDF still has to find its first killer application - I expect that's why people are fighting so hard. I would think something in the knowledge management or search engine space (like TAP) but it would take someone like Google to pull it off.

Reading the thread it does have a lot of similarities to the XML Schema debate which began a rebellion. The power of something like Relax NG and HTML is the moments of understanding - of it making sense. With RDF serialized in XML you don't always get that. RDF is flexible enough to get confusing. With HTML, primary schoolers could understand it and use it. Once tools were made, people used them but it was still possible for you to understand and change it when the tools failed. RDF needs at least that ability. Sequences and bags don't help.

It's interesting in that thread that Netscape came close to writing their own RDF database.

I like this quote: "The growling seems to have started with a piece by James “Markup Deity” Clark on an IETF mailing list, in which he states flatly that the IETF shouldn’t settle on XML Schema because (wildly inaccurate paraphrase here) it’s a piece of slack, incomprehensible garbage compared with RELAX NG...Next stop: Clark demystifies RDF serialization. (Hey, I can dream.)" from this blog.

"The claim has been made, offlist, that there was a community consensus to move to namespaces and RDF and modules. If there was such a consensus, now is the time to show where the record of that is. Ken provided a pointer, but it’s not what I asked for, because no one asked "Is it OK if we call this RSS 1.0?"" - said Dave Winer.

There is a semantic web club and by the looks of things an RSS club - everyone gets upset when they're not included. Could you really abandon XML (cause it sucks)? Like web services, the reason is not so much the technology but the acceptance in the developer community - content (and metadata from that content) is still king. Or to put it another way, "Can the common man produce RDF? If the answer is NO which it is, then HOW IN THE FUCK IS RDF EVER GOING TO REACH CRITICAL MASS?".

Most users won't want to read RDF. However, if you're a developer or empowered user, you do. The explosion of HTML tools was due to it being easy for humans to understand and debug (oh and the promise of big dollars). You could look at Frontpage and Dreamweaver's output and decide which was the better tool. Being human readable or decipherable meant that tool writers could debug the programs more readily. This is good stuff; don't throw it away. So the goal is to make it as simple and as human readable as possible, given the constraints of XML. It's a cop out to say that you should only use tools; until the machines write the tools that is.

The Burningbird weblog has some counter arguments about using RDF in RSS.

Saturday, September 07, 2002

The Next Three Big Things

"Memo from Forrester Research to venture capitalists: Stop scratching your heads and start investing in certain technology subsectors such as security patches, search technology and ultra-wideband silicon."

"In another memo, Forrester software analyst Joshua Walker points to a surge of corporate interest in search technology. Buyers include multinationals that can't find critical intellectual property on their intranets and online sellers with buried merchandise. Walker contends that no single type of search engine meets all large enterprise needs so that there's room for several winners. He also notes that the rewards can be large, citing two startups in the sector, Verity Inc. and Endeca Technologies Inc., which have average-size annual contracts of $385,000 and $450,000, respectively."

Article on TheDeal.Com

Meanwhile, The Register has an article about how little America understands mobile phones and how little venture capital has learnt:
"We can confidently, if not happily, predict that the next tech crash of 2010 is going to look remarkably similar to the one we've just endured. "

"The cream of American Venture Capital - once thought of as the wellspring of the nation's economic well being - isn't just unrepentant, it seems determined to remake the next bubble with the same muddle-headed assumptions as it spun the last one."

http://www.theregister.co.uk/content/7/26891.html

RSS 2.0

The plan is to get around the complexity of RSS 1.0 by using XML name spaces?

RSS 3.0 with its even simpler syntax is available too. I still like absmyl if only for the way he describes XML.

I had this really good piece on how adding XML Schema was a bad thing to do to RSS and it turned out to be name spaces (or namespaces). The worst thing is that I not only wasted my time but other people's (if they read it). If there was some way you could tax or inflict pain on people who waste your time it would get rid of dumb user interfaces, spam, bureaucracy, boring blogs etc. Unlike email, blogging allows you to update your stories and fix formatting, spelling and grammar later. Or remove any mention of political enemies - Stalin style.

Weaving A Web of Ideas

"Autonomy Corp. (Cambridge, UK) and the Palo Alto (Calif.) Research Center [recently spun off from Xerox (73)], each, in different ways, use mathematical models of how long-term memory works in the brain to create concept maps out of the words on Web pages. At Verity Inc. (Sunnyvale, Calif.), researchers add things like organization charts and address books to infuse amorphous corporate documents with additional structure."

http://www.spectrum.ieee.org/WEBONLY/publicfeature/sep02/sem.html

Friday, September 06, 2002

Tiredly Linking

Ray Ozzie says we need a rich client and Swing ML are looking for help. I evaluated it during my XWT follies - I thought it wasn't quite as good as XWT but then in the end XWT sucked.

"Storyspace links connect specific sections of text, images, or entire writing spaces. Links adapt instantly whenever you move or edit a writing space. Storyspace links are never broken." This one isn't just for the Mac. The main page includes use cases in film criticism, replacing a database with it and documenting government processes.

The NLA digitises documents on demand from its collection. I've never seen a collection described in kilometers before. "As far as we know, we are the only archives that allow the public to choose which records will be digitised, and we provide this service at no cost. Even though there has been little publicity the demand was instantaneous and it has shown no sign of abating."

"The service is currently free. Our view is why should someone have to pay for a digital copy that is then loaded onto our Web site for the entire world to see for free? Furthermore, the service we are now providing is intended to promote equal opportunity for those researchers who cannot visit our reading rooms, where they could access the records at no charge. We could introduce a fee in return for a fast tracking service, but we believe that this would only create a disparate level of service whereby those who can pay receive one standard of service, and those who cannot pay receive a lesser standard."

http://www.rlg.org/preserv/diginews/diginews6-4.html#feature1

While this isn't a brilliant article I thought this was interesting:
"A company called Collexis, based in the Netherlands, uses a system not dependent on keyword searching, but on concept searching. Barend Mons, co-owner of Collexis and assistant professor, University of Rotterdam, says Collexis uses concept numbers to find information." Collexis has Elsevier Science as one of its customers.

http://www.the-scientist.com/yr2002/aug/bahls_p16_020819.html

Thursday, September 05, 2002

Animals are People too

"Wise's accounts of animals' mental abilities are fascinating and thought-provoking. But in the end, it wasn't their relative autonomy scores that swung my sympathies. It was the description of Koko making a joke; of a mother elephant involving her youngster in a game so she could complete a task. It was the account of Alex, the parrot, left at the vet for surgery, calling after her keeper, "Come here. I love you. I'm sorry. I want to go back.""

http://salon.com/books/review/2002/09/04/wise/print.html

No Imagination

That's the problem with Open Source. I hope Sony and Apple lock you all up.

Wednesday, September 04, 2002

Sesame 0.6

New features in this release include the Ontology Middleware Module by OntoText (see http://www.ontotext.com/omm/), which allows advanced features such as change tracking, security and partial DAML+OIL reasoning.

Their public server is available via SOAP and RMI. The client is a 24K jar file.

http://sourceforge.net/projects/sesame/

Stellent Offers Portlet and J2EE Integration

"The Stellent Content Integration (CI) Kit provides standard integration methods, such as SOAP, XML, EJB, COM, a Java API, or easy portal integration options using Stellent Portlets...Stellent offers a Java 2 Enterprise Edition (J2EE)-compliant Content Server that operates entirely within an application server."

"With Stellent Content Management, enterprise content is managed with robust library services – including version control, check-in and check-out functionality, indexing, an extended metadata model, security, and workflow capabilities. Stellent also extends content management to the workgroup level, empowering project teams to manage ad-hoc, team-based content creation and collaboration. Customers, partners and employees can view more than 225 native formats without the native application using Stellent."

The formats are converted to XML and can then be stored in the Tamino XML server.

Their other products include content categorization, content management and an audio/video indexer.

http://www.stellent.com/ibm

Tuesday, September 03, 2002

Lime to Grapefruit

This is another interesting OS X "feature":

http://www.macslash.org/articles/02/08/27/1116221.shtml

It turns the display white on black with other weird colour changes - "Jaguar" mode (blue goes to orange). It turned my Limewire icon pink. The screenshot looks normal when you're in Jaguar mode.

Rete, Wasp and jxtaSpaces

"jxtaSpaces is an experimental project to design and implement a distributed shared memory service on the JXTA peer-to-peer computing platform."
http://jxtaspaces.jxta.org/

Implementations of the Rete Algorithm:
http://drools.org/index.html - Supports Jelly, Java and Python. Also, using Jira for bug tracking and Maven.
http://www.haley.com/CafeRete.html - Supports EJB, Servlets and Applets.

Wasp is a code analysis tool for Java:
http://www.waspsoft.com/ - Wasp also produces a detailed and precise method call graph. The method call graph of a program helps to know for each method what actual methods are called in its body.
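
To make the call graph idea concrete, here is a rough sketch. It is not how Wasp works, and it analyses Python source with Python's standard ast module rather than Java. The SOURCE snippet and the call_graph helper are invented for illustration, and it only records calls written as a plain name, so it is much cruder than resolving the "actual methods" called.

```python
import ast

# Hypothetical program to analyse.
SOURCE = '''
def load():
    return parse(read())

def read():
    return "data"

def parse(text):
    return text.upper()
'''


def call_graph(source):
    """Map each top-level function name to the sorted names it calls directly."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            called = set()
            for inner in ast.walk(node):
                # Only plain-name calls like f() are recorded;
                # attribute calls such as text.upper() are ignored.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    called.add(inner.func.id)
            graph[node.name] = sorted(called)
    return graph


print(call_graph(SOURCE))
```

Here load maps to parse and read, while read and parse map to empty lists because neither calls another function by plain name.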

Monday, September 02, 2002

Spring has Sprung

"The rationale behind Spring is to provide an environment where information meets behavior. Publish Spring Objects representing books and add behavior allowing users to purchase the book with a single click! Publish Spring Objects representing users and add behavior allowing people to email, AIM (SM), get directions to the party they're hosting, etc. in a single click!"

http://www.usercreations.com/spring/

"Rather than proposing "yet-another-standard" to achieve this, "yet-another-closed protocol" to get this done, we're proposing a completely open, standards-based way to define Spring Objects. We want developers to take this ball and run with it, leverage what we've got, and take users in directions never imagined."

These standards include XML, HTTP URIs, XML-RPC/SOAP (via scripting runtimes), XSL, CSS.

The screenshots are highly impressive and if I hadn't upgraded to Jaguar I might try it:
http://www.usercreations.com/spring/screenshots.html

It also goes to show you that Aaron Swartz is everywhere.

September 11, the .COM Crash and Knowledge Management

This article titled "Why Knowledge Management Systems Fail?" starts by outlining the similarities between the failure to detect the terrorists involved in September 11 and the failure of the "new economy". The author's thesis is that both occurred because of incorrect assumptions in knowledge processes.

This article contrasts the traditional model (Model 1) of "the information systems themselves -- not the people -- can become the stable structure of the organization" with an adaptive model (Model 2) taking into consideration human decision making and processing.

"Developing an information-sharing technological infrastructure is an exercise in engineering design, whereas enabling use of that infrastructure for sharing high quality information and generating new knowledge is an exercise in emergence."

"Next generation KMS will need to accommodate the managers need for ongoing questioning of the programmed logic and very high level of adaptability to incorporate dynamic changes in business models and information architectures. Designers of information architectures will need to ensure that they deliver upon the need for efficiency and optimization for knowledge harvesting while providing for flexibility for facilitating innovative business models and value propositions."

In "Weaving the Web", Tim Berners-Lee gives the example of a tax program to describe a use for RDF - being able to ask why and to correct faulty assumptions.

http://www.kmbook.com/

Sunday, September 01, 2002

Looks Like a Job for XML and RDF

"The National Archives and Records Administration is searching for some workable way to save electronic records for decades or even centuries. But the agency faces at least two daunting problems: Fast-changing technology means that electronic files created just a few years ago are already in obsolete formats and may no longer be retrievable. And the sheer volume of e-records — 36.5 billion a year in e-mail messages alone — is overwhelming."

http://www.fcw.com/fcw/articles/2002/0819/news-nara-08-19-02.asp

"For instance, a conservative 1999 estimate indicates that the yearly volume of email traffic in the Federal Government is approaching 36.5 billion messages per year. Although only a percentage of those messages may be permanently valuable, the volume is still orders of magnitude larger than what NARA has had to manage in the past. Moreover, the Census Bureau will be transferring images of up to 600 million pages of information, comprising tens of terabytes of data, from the 2000 Census. In comparison, in its thirty-year history, NARA has captured and fully-processed only between one and two terabytes of data."

RFI synopsis

Platform or Application?

Joel's argument is that Groove is a closed platform: you have to pay to be part of it, and if they don't give it away for free no one will write applications for it (this works for Jena). Dave's argument is that most platforms fail so you shouldn't bet that your application will become one. His solution seems to be to get some money first and keep users coming back.

Joel: "...there is no free Groove runtime...nobody has Groove yet...That doomed the Groove idea from the start. I talked to some of the Groove "partners" who allegedly are developing software for Groove. "Was the Groove relationship worth anything?" I asked. "HA!" they said. "We paid $1500 and in exchange we get less than 10 clickthroughs a month from their web page. A waste of money. We couldn't even get Groove to share their customer lists with us.""

Dave: "Gassee had a sense of keeping the users entertained. He got it. Keep them coming back for new goodies. And maybe a breakthru. For that you need lots of risk-taking developers. That's how you get winners, not by hiring researchers to invent miracles for you...Apple did the opposite, invading competitive markets with void-creating white papers."

Both Dave and Joel seem to care about the respective platforms in their articles and both felt the same way. Meanwhile, Ray just comments about how people are using Groove and marketing.

If anything, this is a lesson for anyone trying to write a piece of architecture. Listen to your developers and decide if you are a platform (building infrastructure) or an application. If you are a platform be open and free and hopefully charge for it later. If you're building an application, charge for it now and get any cent you can, because one day the big boys (like Microsoft or Apple or whoever) will come and give it away or integrate it into another product.