BNodes Out! discusses how any usefully scalable system doesn't use blank nodes. What is interesting is the comment on YADS (Yet Another DOI Service). The best reference is Tony's presentation although it is mentioned in Jane's as well. "YADS implements a simple, safe and predictable recursive data model for describing resource collections. The aim is to assist in programming complex resource descriptions across multiple applications and to foster interoperability between them...So, the YADS model makes extensive use of bNodes to manage hierarchies of “fat” resources - i.e. resource islands, a resource decorated with properties. The bNodes are only used as a mechanism for managing containment."
This sounds a lot like RDF molecules and supports visualization (apparently). This seems like a good use of molecules that I hadn't previously thought of (Tony's talk gives an example of the London underground). The main homepage of YADS isn't around anymore - it'll be interesting to see if it's still being used/worked on.
Update: Tony has fixed up the YADS home page (there's also an older version).
Tuesday, July 29, 2008
Monday, July 28, 2008
Hadoop and Microsoft
Pluggable Hadoop lists some extensions to Hadoop in the pipeline: job scheduling (including one based on Linux's completely fair scheduler), block placement, instrumentation, serialization, component lifecycle, and code cleanup (the analysis used Structure101).
I found the reason why HQL was removed from HBase (to be replaced by a Ruby DSL and to ensure that HBase wasn't confused with an SQL database) and moved to HRdfStore.
There are also rumours that Microsoft's recent investment in Apache may lead to them working on Hadoop too.
Tuesday, July 22, 2008
Save us China
I was in Victoria when the ETS for Australia was announced (well, the discussion papers). It's fairly funny that replacing the world's worst plants (Hazelwood is the world's worst) with other coal plants using Chinese brown coal technology would reduce emissions by 30% to 40%, just by drying out the brown coal. It's still very polluting, but it shows how far behind Australia is. This has led to greater compensation for Victorian polluters (which is just mad). At the same time, Queensland is creating another coal port because we can't export the carbon fast enough.
The exclusions were annoying (aluminium, cement and some types of steel). Cement is especially annoying (5% of all CO2, apparently) as green alternative technologies exist. Now is the time to invest, not compensate.
Square brackets are scary
In what may be an increasing trend of surfing the Web at 320x480, I noticed Cydia has a number of applications for jailbroken iPhones (Java, Python and Ruby mainly). The mailing list on iPhone/Java doesn't have much on it except some interesting uses of JocStrap and UICaboodle (available from SVN by Jay Freeman). There's also the Sun blog, which has some interesting sample applications using different Java implementations on the iPhone.
Friday, July 04, 2008
JRDF 0.5.5.1 Released
Just a quick note about a new version of JRDF. It's been a short time between releases, but it still contains one significant advance over the previous one: persistent graphs. They're still in the early stages but sufficient for simple use cases. The release also contains text serialization (based on NTriples) that is useful for moving RDF molecules around nodes in a cluster (for example). A lot of this code is pretty much "spike" code and I expect that another release will follow after we exercise these new features more (and write some tests/rewrite the code).
Update: 0.5.5.2 is now available fixing many bugs and introducing FILTER support.
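To give a rough idea of what an NTriples-style text serialization of a molecule looks like, here is a minimal sketch in Python. It is not JRDF's code or its exact format; the `term` and `to_ntriples` helpers, the `(kind, value)` tuple representation and the example.org URIs are all made up for illustration.

```python
# Minimal sketch of N-Triples-style serialization (hypothetical helpers,
# not JRDF's API). A term is a (kind, value) pair.

def term(t):
    """Render one term as N-Triples text."""
    kind, value = t
    if kind == "iri":
        return f"<{value}>"
    if kind == "bnode":
        return f"_:{value}"
    if kind == "literal":
        # Escape backslash first, then quote, newline and carriage return.
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'
    raise ValueError(f"unknown term kind: {kind}")


def to_ntriples(triples):
    """One triple per line, each terminated by ' .'"""
    return "".join(f"{term(s)} {term(p)} {term(o)} .\n" for s, p, o in triples)


# A tiny "molecule": two triples that share a blank node (made-up data).
molecule = [
    (("iri", "http://example.org/station/1"),
     ("iri", "http://example.org/onLine"),
     ("bnode", "b1")),
    (("bnode", "b1"),
     ("iri", "http://example.org/name"),
     ("literal", "Central")),
]

text = to_ntriples(molecule)
print(text, end="")
```

Because every triple is a self-contained line, a molecule can be shipped to another node as plain text, as long as the triples sharing a blank node travel together.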
Of Mats and Cats
The thread "No universal things Re: comparing XML and RDF data models" was started by Bernard Vatant. It goes to the heart of whether people can know reality (well, that's how I'd summarize the idea of universals; see Beyond Concepts).
There were a few quotes that I found interesting:
It's been counter-productive in science for centuries. Physics had to go over the notion of universal thing to understand that light is neither a wave, nor a particle. Biology to go over the notion of taxa as rigid concepts based on phenotypes to understand genetics etc. Many examples can be found in all science domains. My day-to-day experience in ontology building, listening to domain experts, is indeed not that 'there are things that people are trying to describe', but that 'there are descriptions people take for granted they represent things before you ask, but really don't know exactly what those things are when you make them look closely'.
Bijan wrote:
I do think that the family of views in computational ontologies generally called "realist" is indeed naive and fundamentally wrong headed. Whether it's a "useful fiction" that helps people write better or more compatible ontologies is an open empirical question.
But I, for one, wouldn't bet on it.
I remember also a project where we were trying to get people to write simple triples. They got that they needed triples. But what they ended up putting into the tool was things like
S P O
"The cat is" "on the" "mat".
"Mary eats" "pudding" "on toast"
They just split up the sentences into somewhat equal parts!
I really feel like an interested amateur and my view is probably influenced by databases in computer science, where you are taking the non-realist approach. I say this because there are usually properties in databases that are not really based on reality but are a result of other requirements (like a column like "isDeleted" rather than actually deleting the statement).
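As a small illustration of the "isDeleted" point, here is a sketch using Python's built-in sqlite3 module. The `statement` table and its column names are invented for the example; the rows are what properly split triples for the sentences above might look like.

```python
import sqlite3

# In-memory database with a hypothetical "statement" table. The is_deleted
# column says nothing about cats or mats; it exists only for the application.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE statement (
           id INTEGER PRIMARY KEY,
           subject TEXT, predicate TEXT, object TEXT,
           is_deleted INTEGER NOT NULL DEFAULT 0)"""
)
conn.executemany(
    "INSERT INTO statement (subject, predicate, object) VALUES (?, ?, ?)",
    [("cat", "sitsOn", "mat"), ("Mary", "eats", "pudding")],
)

# "Delete" a statement by flagging it rather than removing the row.
conn.execute("UPDATE statement SET is_deleted = 1 WHERE subject = ?", ("cat",))

# Queries have to remember to filter on the flag.
visible = conn.execute(
    "SELECT subject, predicate, object FROM statement WHERE is_deleted = 0"
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM statement").fetchone()[0]

print(visible)  # only the unflagged statement
print(total)    # the flagged row is still stored
```

The flagged row is still in the table, so the stored data reflects an auditing or undo requirement rather than anything about the world being described.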
Wednesday, July 02, 2008
Round of Links
- Apache Hadoop Wins Terabyte Sort Benchmark: "One of Yahoo's Hadoop clusters sorted 1 terabyte of data in 209 seconds...This is the first time that either a Java or an open source program has won." There were just under 1000 nodes; the benchmark results are hosted by HP (a tad more detail here).
- Microsoft buys Powerset. One of the interesting things is that they use Hadoop (see their blog). It's hard to tell whether this is bad or good for Hadoop.
- Google vs Microsoft - oh for structure.
- Tom talking about GridGain from his presentation in February. C++ isn't as productive as Java?
- Applets are back (according to Sun).
- Why commenting is for n00bs. "And Haskell, OCaml and their ilk are part of a 45-year-old static-typing movement within academia to try to force people to model everything. Programmers hate that. These languages will never, ever enjoy any substantial commercial success, for the exact same reason the Semantic Web is a failure. You can't force people to provide metadata for everything they do. They'll hate you."
- Some interesting discussion on Web 2.0 and the future of the web.
- Rich text editor for browsers. Not free though.
- Linked data and what it is.
- ThoughtWorks Podcasts (the REST talk was what drew me to it).
- Turtle specification. I've been looking at this for serialization of RDF molecules but it seems that you can't have blank nodes as objects using the nested syntax.
- Semantic Web for bioinformatics.
- Data structure stuff: Linear Bloom Filters, Bloom filters for Spell Checking, Optimal Bloom Filter replacements and scalable btree and B-tries for Disk-based String Management.
Tuesday, July 01, 2008
Ob. iPhone 2
Good to see carriers actually putting up a bit of a fight for iPhone business. Telstra announces iPhone 3G details: $279, plus $30 a month on a 24-month plan, with free access to WiFi hotspots. This better be true.
Update: Optus releases pricing