Thursday, May 13, 2004
Pure Abstraction
This is the true potential of RDF. It's not a substitute for databases or XML. RDF is a directed labelled graph. It hides information and provides a data interface which can be used to aggregate multiple datasources that can be themselves RDF datasources. This is the purest data abstraction. Theoretically, if you weren't worried about performance you could access an RDF datasource and not know or care whether you're accessing a database, a web service, or both. The RDF datasource might even optimize its data structures based on what your queries or iteration patterns have been, in much the same way as hotspot compilation optimizes algorithms in the JVM.
A good question at this point is whether RDF is as powerful an API as JDBC. The answer is yes, or not quite. RDF is only a data structure, and there are several APIs for manipulating that data structure. The most well known implementation is Jena. Jena's API for manipulating RDF graphs is very powerful, and encompasses most things you would find in JDBC, such as transactions, a query language, and prepared statements."
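To make the "data interface" idea a bit more concrete, here is a minimal Java sketch of a graph of triples sitting behind a single lookup method, where an aggregate of datasources is itself a datasource. All names (`GraphSketch`, `Datasource`, `MemorySource`, `Aggregate`) are made up for illustration; this is not the Jena or JRDF API, and a real store would add transactions and a query language on top.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; this is not the Jena or JRDF API.
final class GraphSketch {

    /** One edge of the directed labelled graph: subject --predicate--> object. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return "(" + subject + " " + predicate + " " + object + ")";
        }
    }

    /** The only thing a caller sees. A null argument acts as a wildcard. */
    interface Datasource {
        List<Triple> find(String subject, String predicate, String object);
    }

    /** Backed by a list here; it could equally wrap a database or a web service. */
    static final class MemorySource implements Datasource {
        private final List<Triple> triples;

        MemorySource(Triple... triples) {
            this.triples = Arrays.asList(triples);
        }

        @Override
        public List<Triple> find(String s, String p, String o) {
            List<Triple> out = new ArrayList<>();
            for (Triple t : triples) {
                if (matches(s, t.subject) && matches(p, t.predicate) && matches(o, t.object)) {
                    out.add(t);
                }
            }
            return out;
        }

        private static boolean matches(String pattern, String value) {
            return pattern == null || pattern.equals(value);
        }
    }

    /** An aggregate is itself a Datasource, so aggregates can nest. */
    static final class Aggregate implements Datasource {
        private final List<Datasource> sources;

        Aggregate(Datasource... sources) {
            this.sources = Arrays.asList(sources);
        }

        @Override
        public List<Triple> find(String s, String p, String o) {
            List<Triple> out = new ArrayList<>();
            for (Datasource d : sources) {
                out.addAll(d.find(s, p, o));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        Datasource people = new MemorySource(
                new Triple("ex:paul", "ex:worksOn", "ex:kowari"));
        Datasource projects = new MemorySource(
                new Triple("ex:kowari", "ex:writtenIn", "ex:java"),
                new Triple("ex:jrdf", "ex:writtenIn", "ex:java"));
        Datasource all = new Aggregate(people, projects);
        // The caller neither knows nor cares which source answered.
        System.out.println(all.find(null, "ex:writtenIn", null));
    }
}
```

The caller only ever holds a `Datasource`, which is the point of the abstraction: what sits behind it can change without the calling code noticing.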
| 0 comments | Link me |
The Blank Node Release
The next release should have some inferencing and resolver work done to it which will be a major change to the architecture - unless we find some critical bug.
Paul has also updated the plans for a new triple store for Kowari/JRDF. I'm favouring the second approach but they both have positives and negatives. The ring structure has some advantages over the current structure (and the first approach) as it greatly reduces the amount of disk usage. I think bloom filters have been put on the back burner.
| 0 comments | Link me |
Wednesday, May 12, 2004
Rewerse
1. networking and structuring a scientific community that needs it, and by
2. providing tangible technological bases that do not exist today for an industrial software development of advanced Web systems and applications."
| 0 comments | Link me |
Tuesday, May 11, 2004
Bloom Filter Implementation
I'll have to get JRDF up to scratch. Some recent comments from both Paul and others make me acutely aware that it's definitely not even close to being the last word in Java RDF APIs.
| 0 comments | Link me |
Danny Hillis on the Knowledge Web
"In the long run, the Internet will arrive at a much richer infrastructure, in which ideas can potentially evolve outside of human minds. You can imagine something happening on the Internet along evolutionary lines, as in the simulations I run on my parallel computers. It already happens in trivial ways, with viruses, but that's just the beginning. I can imagine nontrivial forms of organization evolving on the Internet. Ideas could evolve on the Internet that are much too complicated to hold in any human mind."
"In this regard, thanks to funding from the Markle Foundation, Danny has been able to assemble a group of people to begin to discuss the implementation of a medical application based on his ideas."
A medical application? Hasn't this been done before? Only Jaron Lanier mentions the Semantic Web.
| 0 comments | Link me |
Euroweb
| 0 comments | Link me |
Securing the Semantic Web
* This document describes the ACL storage and query mechanisms used by W3C, as well as the availability and use of this data on the semantic web.
* Semantic Web Trust and Security Resource Guide and more specifically KAoS Policy and Domain Services: Toward a Description-Logic Approach to Policy Representation, Deconfliction, and Enforcement.
| 0 comments | Link me |
Monday, May 10, 2004
kSpaces.net
Metadata associated with a file can be viewed and edited through the kSpaces Node application, supported by editor plugins. Five editor plugins have been included in the proof-of-concept, four of which are read only. These plugins allow the management of a subset of Dublin Core metadata, EXIF metadata, ID3 metadata and kSpaces-specific metadata. The Raw RDF plugin shows the raw RDF metadata associated with a knowledge asset."
Cool, an extensible metadata extractor.
| 0 comments | Link me |
Saturday, May 08, 2004
Broken Windows
Don't Live with Broken Windows "You don't want to let technical debt get out of hand. You want to stop the small problems before they grow into big problems. Mayor Giuliani used this approach very successfully in New York City. By being very tough on minor quality of life infractions like jaywalking, graffiti, panhandling—crimes you wouldn't think mattered—he cut the major crime rates of murder, burglary, and robbery by about half over four or five years.
In the realm of psychology, this actually works. If you do something to keep on top of the small problems, they don't grow and become big problems. They don't inflict collateral damage. Bad code can cause a tremendous amount of collateral damage unrelated to its own function. It will start hurting other things in the system, if you're not on top of it. So you don't want to allow broken windows on your project.
As soon as something is broken—whether it is a bug in the code, a problem with your process, a bad requirement, bad documentation—something you know is just wrong, you really have to stop and address it right then and there. Just fix it. And if you just can't fix it, put up police tape around it. Nail plywood over it. Make sure everybody knows it is broken, that they shouldn't trust it, shouldn't go near it. It is as important to show you are on top of the situation as it is to actually fix the problem. As soon as something is broken and not fixed, it starts spreading a malaise across the team. "Well, that's broken. Oh I just broke that. Oh well." "
| 0 comments | Link me |
Thursday, May 06, 2004
Eh
Edd Dumbill won't be attending WWW2004 and that's a shame. Especially because I'm going to be there - hopefully to put faces to names. I think the developer's day looks good - "Doug Cutting, the leader of the Nutch open-source search engine project, and an interactive luncheon Q&A with Tim Berners-Lee." TKS will be there too. TKS is Kowari with other features such as security, related to queries, support and GUI management.
| 0 comments | Link me |
Sunday, May 02, 2004
Kowari linkers
DeliverableS4Simile "Naive use of persistent store in Jena decreases performance by 100x:
* Part of the problem is limited expressivity in RDQL
* Look at performance tuning on databases
* Ryan: Look at the performance of more specialized RDF databases, cf. Kowari"
Open Source Projects That Use Java NIO "The storage engine of Kowari is a transactional triplestore known as the XA Triplestore. All relevant fields of in-memory and on-disk data structures are 64 bits wide..."
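As a rough illustration of what "64 bits wide" fields can look like with NIO, here is a small Java sketch that stores a triple as three 64-bit node ids in a fixed-width record and reads it back through a `FileChannel`. The record layout and every name (`TripleRecordSketch`, `RECORD_BYTES`, `roundTrip`) are assumptions made up for this example; it is not the XA Triplestore's actual on-disk format.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: NOT Kowari's real on-disk format.
final class TripleRecordSketch {

    // Three 64-bit node ids (subject, predicate, object) per fixed-width record.
    static final int RECORD_BYTES = 3 * Long.BYTES;

    /** Writes one record at the given record index. */
    static void write(FileChannel channel, long index, long s, long p, long o) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_BYTES);
        buf.putLong(s).putLong(p).putLong(o);
        buf.flip();
        long pos = index * RECORD_BYTES;
        while (buf.hasRemaining()) {
            pos += channel.write(buf, pos);
        }
    }

    /** Reads the record at the given record index as {subject, predicate, object}. */
    static long[] read(FileChannel channel, long index) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_BYTES);
        long pos = index * RECORD_BYTES;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) {
                throw new IOException("record " + index + " is past the end of the file");
            }
            pos += n;
        }
        buf.flip();
        return new long[] { buf.getLong(), buf.getLong(), buf.getLong() };
    }

    /** Writes one record into a temporary file and reads it back. */
    static long[] roundTrip(long index, long s, long p, long o) {
        try {
            Path file = Files.createTempFile("triples", ".dat");
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                write(channel, index, s, p, o);
                return read(channel, index);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Node ids beyond the 32-bit range survive the round trip.
        long[] t = roundTrip(2, 1L, 2L, 1L << 40);
        System.out.println(t[0] + " " + t[1] + " " + t[2]);
    }
}
```

Fixed-width records make the offset of record n a simple multiplication, which is one plausible reason to keep every field the same size.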
Kowari for hundreds of millions of triples "Jim Hendler emailed me in response to my having mentioned on www-rdf-interest@w3c.org that I was surveying triple stores for use in data mining and machine learning. He mentioned a Java-based, non-relational, triple store called Kowari that is available in open source form..."
RDQLPlus "I discovered Kowari last week. The iTQL language is very similar to what I've come up with for RDQLPlus... having equally been inspired by SQL/DDL. Kowari looks like a nice database. I've downloaded it but haven't had a chance to play with it yet..."
Some Tools "Kowari is a layer on top of Jena with OWL reasoning, too"
I would say that Kowari is a layer beneath Jena - one that provides persistence. The OWL reasoning is really only offered at the moment through Jena. However, we will be getting some basic inferencing, at our own query layer, in there soon too.
[protege-discussion] Re: large data sets, bulk data acquisition "I had a really bad time with Kowari earlier this year, it wouldn't compile and then pass its own self-tests...."
This is basically a problem with Windows. We develop on Linux and OS X and only do QA on Windows. Our initial release had known problems under Windows, which led to failing unit tests but were not fatal for data storage. Anyway, it is fixed now, although Windows does have some drawbacks when it comes to using NIO.
| 0 comments | Link me |
Ontological Software Development
Considering that statement, it's also clear that application independence of ontological models makes these applications candidates for reference models. We do this by stripping the applications of the semantic divergences that were introduced to satisfy their requirements, thus creating a common application integration foundation for use as the basis for an application integration project."
"Once we define the ontologies, we must account for the semantic mismatches that occur during translations between the various terminologies. Therefore, we have the need for mapping.
Creating maps is significant work that leverages a great deal of reuse. The use of mapping requires the "ontology engineer" to modify and reuse mapping. Such mapping necessitates a mediator system that can interpret the mappings in order to translate between the different ontologies that exist in the problem domain. It is also logical to include a library of mapping and conversion functions, as there are many standards transformations employable from mapping to mapping."
| 0 comments | Link me |
Friday, April 30, 2004
Limiting Complexity
I've spoken to people that spent several years of their lives coming up with an ontology and their perception is that the complexity over time of these models to cover a particular domain saturates, does not continue to grow.
This is a basic but vital assumption for this entire approach to work: if the ontologies grow linearly with the amount of information they can describe, the ontology creation/maintenance process simply won't scale globally."
| 0 comments | Link me |
Intellidimension's SWS
From the FAQ:
"...using our standard search engine interface you can just type one or more keywords describing the information you are trying to locate. This is no more complicated than a traditional Web search engine. However, like a traditional Web search engine, this can lead to a large number of irrelevant results. To narrow your search you can restrict it to the specific type of resource that you are trying to locate, such as a person (FOAF Person) or news article (RSS Item). If your search is still producing a large number of irrelevant results then you can refine it further by specifying one or more specific property values that the resource must have."
| 0 comments | Link me |
Browse the Semantic Web
Some thoughts on RDF rendering had some ideas on visualizing the Semantic Web too.
| 0 comments | Link me |
The Passion of RDF
See also: The Vision of a Semantic New Testament: "Just as important as avoiding commercial barriers to sharing is the requirement that SemANT support existing and emerging standards that enable use across the Internet. To this end, SemANT will build on the Semantic Web Activity of the World Wide Web Consortium (W3C), including XML as a syntactic standard for data interchange, and RDF for ontology-based representation, and DAML/OWL for additional semantic expressiveness.".
Another sign that a technology has matured, as with PCs, CD-ROMs, hypertext, the Web, etc.
| 0 comments | Link me |
XUL vs XAML
XAML, Microsoft warned, is more potent than XUL in its ability to reflect exactly what's in the operating system.
"XUL is not the multipurpose declarative language that Gnome probably wants," said Ed Kaim, product manager for the Windows developer platform. "People say that when all you've got is a hammer, everything looks like a nail. In the same way, people are trying to figure out how to crush XUL into an OS it really wasn't designed for. The browser is great for a lot of things, but when it comes to robust client side applications, it's not the best."
Another trick will be in reconciling XUL with Gnome's existing user interface technology.
"There are ways to marry them," said Bruce Perens, an open-source consultant who serves as executive director of the Desktop Linux Consortium, a marketing organization. "But it's very difficult to get the two teams working in the same direction. They both went on a several-year tour of technical creation where they sat down and created everything they needed to do GUI [graphical user interface] applications — and they didn't create the same thing. Now to get them together it would take some number of years to resolve the technical diversions.""
| 0 comments | Link me |
Query Use Cases
Although a little self advertisement and some missing languages, it's a good thing to read. If you need info about RDF Query languages, read it.
My previous demand about "optional joins in queries" is answered by SeRQL."
The report is an excellent example of the current features required from a query language. From what I can tell iTQL implements 11 of the 14. Some of the others are fairly trivial to add support for (like the data type support).
| 0 comments | Link me |
Wednesday, April 28, 2004
Will the Semantic Web Scale?
Lately, several researchers doubted whether the Semantic Web idea will ever scale for numerous reasons technological [But03], theoretical [van02] and practical [MS03,Sow]. Dedicated workshops on that topic [CKDE03,VDC03] have been organized recently to promote research to improve scalability. We will pick up these three categories of doubt by organizing the panel in three parts discussing each aspect: theory, technology/implementation, and practise."
I'm not sure I agree. There are very few Semantic Web systems that don't reuse existing SQL databases - they just suck at storing triples. With Kowari, and I'm sure with other native stores, the data structures and techniques used are taken directly from databases. They mention "Is the semantic web hype?" (which I responded to) and a few others, although there are no links to syllogism, metacrap or gnomes. BTW, I'm still not sure why you'd want an XML version of OWL.
"Network round-trips are often considerably less costly than the time taken for a transactional database operation due to the need to forcibly log transactional operations which is very costly in terms of disk performance. i.e. network round-trips aren't always the performance bottleneck." From Martin Fowler's First Law of Distribution.
As long as you keep the Semantic Web like the Web there's no real reason why it shouldn't scale.
| 0 comments | Link me |
Google Watching
Google Goes Public? The Rich Get Richer "People speculate. People dream. And if the numbers are to be believed, people will drool. The current prediction is that Google, if it decides to sell shares to investors this year, would probably end up with a market value of $20 billion to $25 billion by the end of its first day as a publicly traded company."
Google's Brin Talks on Gmail Future "It was interesting to me that you did finally hit on the word conversation. It seems to me that there's a synergy between the elements of the conversation in the RSS space and what you're doing in the e-mail space.
I think that's very true. Part of the things we've seen why blogs and RSS feeds are such a success is that you can actually read it—you don't have to stop, click back and forth, collect bits and pieces here and there—but it is all presented to you as one. "
| 0 comments | Link me |
Mozilla to Upgrade RDF
| 0 comments | Link me |
Tuesday, April 27, 2004
Ant is now more useful
In a nutshell, the
Also new is macrodef: "Macrodef is a way to define a new Ant task in an Ant build itself. Macrodef allows you to define standard tasks that have attributes and elements given to them when they are called."
| 0 comments | Link me |
Paul's Blog
| 0 comments | Link me |
Friday, April 23, 2004
Free Bits of Description Logic
| 0 comments | Link me |
Metaweb Graph Updated
| 0 comments | Link me |
Semaview Interview
"My company, Semaview has developed an application called eventSherpa. eventSherpa is making it simple to create and organize schedules and share them over the Internet. Our application automatically creates Semantic Web content transparently without the end user knowing it...Aside from reducing the complexity issue...I believe the largest challenge is convincing application developers to make their data available in semantic format. However it is "a chicken and egg problem" -- the more content available in a semantic format, the more applications that will be developed to take advantage of it; and vice versa."
| 0 comments | Link me |
Knobot
From Danny Ayers: Knobot PlanetRDF Demo.
| 0 comments | Link me |
XML 2004
"Consequently, even at the low level of operating systems vendors are seeing the need and advantages of implementing metadata storage and manipulation.
This is good. We have the tools to support this, whichever way you swing on the technology issues. RDF & OWL, Topic Maps, W3C XML Schema: all have the right machinery. Unfortunately that's not the biggest issue. The main problem is which terms, schemas, and ontologies to use. That's just not clear right now for most if not all metadata applications. At best, we'll get inconsistently classified information, which defeats the promise of interoperability. More typically, we'll end up with little tagged metadata and islands of de facto proprietary information."
"As an RDF fan, the realization of this truth causes me some pain. The way out is to stop thinking of RDF as an XML application, and look to easier syntaxes such as Turtle and N3."
| 0 comments | Link me |
Wednesday, April 21, 2004
Unix Job Ad
"But the point of all that is, Unix is basically a sort of secret society where you either know it, or you don’t. And since most people just really can’t be bothered going through the agonies of learning it, it’s why we have jobs like this: “Unix Specialist”. Of course that means nothing, or at least it means about as much as “Car Specialist” or “Bread Specialist”. Bread Specialist? What the hell is that? What kind of bread? White, multigrain, mixed grain, wholemeal, sourdough? Sliced or unsliced? If sliced, sliced for sandwiches or for toast? Crusty or soft? No matter! Just eat your bread!"
| 0 comments | Link me |
RDF Engine
| 0 comments | Link me |
Tuesday, April 20, 2004
80/20 REST to SOAP
| 0 comments | Link me |
Monday, April 19, 2004
Phew
BTW, you can now do things like: "select $s $p $o from <http://www.w3c.org/2000/08/w3c-synd/home.rss> where $s $p $o ;". You can combine local and remote (via file or http) models by using "and" and "or" in the FROM clause.
| 0 comments | Link me |
Bloom Filters in Social Networks
"If any one of the filters is intercepted, it will register the full 50% false-positive rate. So I am able to hedge my privacy risk across several interactions, and have some control over how accurately other people can see my network. My friends can be sure with a high degree of certainty whether someone is on my contact list, but someone who manages to snag just one or two of my filters will learn almost nothing about me."
"Additionally, you can combine two Bloom filters that have the same length and hash functions with the bitwise OR operator to create a composite filter."
| 0 comments | Link me |
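The composite-filter trick quoted in the Bloom filter post above is easy to try out. Below is a minimal Java sketch of a Bloom filter whose `union` is a bitwise OR of two filters with the same length and hash functions. The class name `BloomSketch`, the double-hashing scheme (`String.hashCode` combined with FNV-1a), and the sizes used in `main` are assumptions chosen for illustration; they are not what the quoted article uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical sketch; hash choice and sizes are illustrative assumptions.
final class BloomSketch {
    private final BitSet bits;
    private final int size;    // number of bits in the filter
    private final int hashes;  // number of hash functions

    BloomSketch(int size, int hashes) {
        this(size, hashes, new BitSet(size));
    }

    private BloomSketch(int size, int hashes, BitSet bits) {
        if (size <= 0 || hashes <= 0) {
            throw new IllegalArgumentException("size and hashes must be positive");
        }
        this.size = size;
        this.hashes = hashes;
        this.bits = bits;
    }

    /** i-th bit index for an item, derived from two base hashes (double hashing). */
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = fnv1a(item);
        return Math.floorMod(h1 + i * h2, size);
    }

    /** 32-bit FNV-1a over the UTF-8 bytes of the item. */
    private static int fnv1a(String item) {
        int h = 0x811c9dc5;
        for (byte b : item.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }

    void add(String item) {
        for (int i = 0; i < hashes; i++) {
            bits.set(index(item, i));
        }
    }

    /** False means definitely absent; true means present or a false positive. */
    boolean mightContain(String item) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    /** Composite filter: bitwise OR of two filters with identical parameters. */
    BloomSketch union(BloomSketch other) {
        if (size != other.size || hashes != other.hashes) {
            throw new IllegalArgumentException("filters must share length and hash functions");
        }
        BitSet merged = (BitSet) bits.clone();
        merged.or(other.bits);
        return new BloomSketch(size, hashes, merged);
    }

    public static void main(String[] args) {
        BloomSketch mine = new BloomSketch(256, 3);
        BloomSketch yours = new BloomSketch(256, 3);
        mine.add("alice@example.org");
        yours.add("bob@example.org");
        BloomSketch both = mine.union(yours);
        System.out.println(both.mightContain("alice@example.org")
                && both.mightContain("bob@example.org"));
    }
}
```

The OR only makes sense when both filters map items to the same bit positions, hence the parameter check in `union`. The composite never loses a member, but its false-positive rate is at least as high as that of either input.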
Sunday, April 18, 2004
Save Our Software
"First, the FCC is conducting a proceeding to set the guidelines for what it calls 'software defined radio...in the near future, it will be possible to use spectrum more efficiently and to increase competition in a space that's now the sole domain of incumbent operators. But only if we keep the FCC from regulating these technologies."
"Second...the FCC generally approved a mandate called the 'broadcast flag,' which would require that digital TV broadcasts include an anti-theft code to prevent consumers from recording programs off the air."
| 0 comments | Link me |