Many things have changed since I started JRDF in 2003. It feels like JRDF has come to a natural conclusion.
Some of them are things I've failed to do very well: get contributors, implement different file format parsers, find enough time to refactor existing bits, etc. I'm also not that interested in Java anymore (which was becoming increasingly obvious, as JRDF had Scala in there at one point and still has some Groovy DSL code in there).
The most recent change I've seen is that JSON has achieved some of what RDF was trying to do and I see it more and more in the way people use it to expose their data in a RESTful way. The tooling is less onerous and the ease of use is higher even if what you get is much less.
There are also external factors: W3C's official RDF API (for Java and JavaScript) is largely the same thing but with official backing.
I've enjoyed developing it and meeting and talking to other people in other groups (especially Jena and Sesame). And of course, none of this would've happened if it wasn't for a lot of other people: Paul Gearon, Simon Raboczi, David Wood, David Makepeace, Tom Adams, Yuan-Fang Li, Robert Turner, Brad Clow, Guido Governatori, Jane Hunter, Imran Khan and Abdul Alabri and the other guys and girls at Tucana/Plugged In Software/UQ.
Showing posts with label java. Show all posts
Sunday, May 08, 2011
Monday, June 28, 2010
Tuesday, December 29, 2009
Syntax Hell
As anyone who has applied a more functional style to Java will realize, the Java syntax really gets in the way. I was trying out a "with" method on my usual guinea pig, JRDF, so that you apply a function instead of iterating and don't have to close a ClosableIterator yourself. The typical code to print out the graph is:
ClosableIterable<Triple> triples = graph.find(ANY_TRIPLE);
try {
    for (Triple triple : triples) {
        System.out.println("Graph: " + triple);
    }
} finally {
    triples.iterator().close();
}
Using "ClosableIterators.with" this becomes:
with(graph.find(ANY_TRIPLE), new Function<Void, ClosableIterable<Triple>>() {
    public Void apply(ClosableIterable<Triple> object) {
        for (Triple triple : object) {
            System.out.println("Graph: " + triple);
        }
        return null;
    }
});
It's typically one line less but that's not that much of an improvement.
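For anyone wondering what a "with" helper like this might look like underneath, here is a minimal sketch. It is not JRDF's actual code: Closable, Fn, With and FakeTriples are hypothetical stand-ins made up for illustration. The only idea it shows is applying a function to a resource and closing that resource in a finally block.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not JRDF's actual types.
interface Closable {
    void close();
}

// Result type first, argument type second, mirroring the Function<Void, ...> usage above.
interface Fn<R, A> {
    R apply(A argument);
}

final class With {
    private With() {
    }

    // Applies fn to the resource and closes the resource afterwards,
    // even when fn throws.
    static <R, A extends Closable> R with(A resource, Fn<R, A> fn) {
        try {
            return fn.apply(resource);
        } finally {
            resource.close();
        }
    }
}

// A fake result set that records whether it was closed.
final class FakeTriples implements Closable {
    final List<String> triples = new ArrayList<String>();
    boolean closed = false;

    public void close() {
        closed = true;
    }
}

final class WithDemo {
    // Builds the "Graph: ..." lines without the caller ever touching close().
    static List<String> describe(FakeTriples source) {
        return With.with(source, new Fn<List<String>, FakeTriples>() {
            public List<String> apply(FakeTriples open) {
                List<String> lines = new ArrayList<String>();
                for (String triple : open.triples) {
                    lines.add("Graph: " + triple);
                }
                return lines;
            }
        });
    }
}
```

The caller no longer has a finally block to forget, but the anonymous inner class costs about as many lines as it saves.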
Saturday, November 21, 2009
JRDF 0.5.6 and testing
JRDF 0.5.6 has JSON SPARQL support and many things got a good refactoring including the MergeJoin (which now looks more like typical text book examples - for better or worse). The upgrade to Java BDB version 4 has improved disk usage and performance mostly around temporary results from finds and the like. The next version will include named graph support.
I'm considering this being the last version to support Java 5.
This release was also about learning things like Hamcrest, JUnit 4.7 and various mock extensions (Unitils, Powermock and Spring automocking, which didn't make it due to not being able to mix and match runners). This seems to be a design flaw that I keep encountering with JUnit - you can't mix and match features from various runners. Even combining Powermock and JUnit's Rules (for exceptions anyway) was problematic. The answer was to go back to the inner class block version.
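As a rough illustration of what an "inner class block" style of exception test can look like, here is a small sketch that needs no particular runner. Block and AssertThrows are hypothetical names made up for this example. They are not part of JUnit, Powermock or JRDF, and this is only one reading of the approach.

```java
// Hypothetical helper for illustration; not part of JUnit, Powermock or JRDF.
interface Block {
    void run() throws Exception;
}

final class AssertThrows {
    private AssertThrows() {
    }

    // Runs the block and returns the thrown exception if it has the expected type.
    // Fails with an AssertionError if nothing, or something else, is thrown.
    static <T extends Throwable> T assertThrows(Class<T> expected, Block block) {
        try {
            block.run();
        } catch (Throwable actual) {
            if (expected.isInstance(actual)) {
                return expected.cast(actual);
            }
            throw new AssertionError("Expected " + expected.getName() + " but got " + actual);
        }
        throw new AssertionError("Expected " + expected.getName() + " but nothing was thrown");
    }
}

final class AssertThrowsDemo {
    // The code under test sits in an anonymous inner class, so it works with any runner.
    static String failureMessage() {
        IllegalStateException e = AssertThrows.assertThrows(IllegalStateException.class, new Block() {
            public void run() {
                throw new IllegalStateException("graph is closed");
            }
        });
        return e.getMessage();
    }
}
```

Because the check is an ordinary method call, it doesn't compete with whatever runner the test class already uses.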
Sunday, July 12, 2009
JRDF 0.5.5.5
It's been a long while between updates but it's finally here. Some general concessions have been made on long-standing "features" in JRDF, namely relational semantics and checked exceptions - both are gone. Yuan-Fang added merge-join support, which improved join performance (by up to 8 times). There's Groovy support and a nasty memory leak has been fixed. It's in the usual place. The next version won't be so far away, with some further SPARQL query improvements including perhaps some of the newly proposed features.
Thursday, February 12, 2009
LOL (List of Links)
- Hamcrest integrated in JUnit 4.4 and above (JMock like expectations). Also allows Popper theories (the link in the release notes is available through archive.org), which are similar to TestNG's data points. There are also some examples: here, here, here and here (using Groovy).
- Window Licker an interesting approach to Swing testing. See the Calculator example and XP Day 2008 presentation.
- The bushfires are retribution for the damned. Apparently, God waited 5 months (or maybe he was too busy appearing in toast) to smite those Victorians down. The most annoying part of this is the fuzzy thinking. Instead of considering that it might be poor fuel management practices, bad advice on whether to stay or leave or global warming it's actually about abortion. If you can speak of a real Christian attitude, the people of North Queensland donating their flood relief money straight to the Victorians seems more along the right lines.
- Haskell book to read and buy. Luckily in that order.
- SemWeb project VoiD a simple vocabulary for linking and describing different data sets.
- A couple of ways of fixing the billion dollar mistake using Java Rebel or maybe Maybe Monad (using Java 5 for loops even). Speaking of which: Monoid fingertree (trees in Haskell) and JQuery is a Monad (apparently that makes it slow). I also realized that Void is a valid type in Java Generics which is helpful in the dreaded Visitor pattern.
- Bill Moyers on Gaza and Dr Izzeldine Abuelaish sharing his grief on Israeli television. More information is available from ABC's Foreign Correspondent story, "The Doctor and his Daughters" because the wikipedia page has been deleted.
- The Poppendiecks on Lean development, to balance Martin Fowler on Scrum.
- Straight Skeleton as a way to generate a skeleton from a polygon.
- RESTEasy JBoss' REST framework.
- Microsoft's terrible vengeance, Songsmith. If you love music don't click here.
- And What's up for Java in 2009?
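The point in the list above about Void being a valid type argument can be sketched with a small visitor. The node types here (Node, UriNode, LiteralNode) are hypothetical and are not JRDF's API. The sketch shows one generic visitor interface serving both a value-returning visitor and a side-effect-only visitor that uses Void and returns null.

```java
// Hypothetical node types for illustration; not JRDF's API.
interface NodeVisitor<R> {
    R visitUri(UriNode node);

    R visitLiteral(LiteralNode node);
}

interface Node {
    <R> R accept(NodeVisitor<R> visitor);
}

final class UriNode implements Node {
    final String uri;

    UriNode(String uri) {
        this.uri = uri;
    }

    public <R> R accept(NodeVisitor<R> visitor) {
        return visitor.visitUri(this);
    }
}

final class LiteralNode implements Node {
    final String value;

    LiteralNode(String value) {
        this.value = value;
    }

    public <R> R accept(NodeVisitor<R> visitor) {
        return visitor.visitLiteral(this);
    }
}

// A visitor that only has side effects: Void is the type argument and null the only value.
final class Printer implements NodeVisitor<Void> {
    final StringBuilder out = new StringBuilder();

    public Void visitUri(UriNode node) {
        out.append('<').append(node.uri).append('>');
        return null;
    }

    public Void visitLiteral(LiteralNode node) {
        out.append('"').append(node.value).append('"');
        return null;
    }
}

// A visitor that returns a value, using the same interface.
final class Length implements NodeVisitor<Integer> {
    public Integer visitUri(UriNode node) {
        return node.uri.length();
    }

    public Integer visitLiteral(LiteralNode node) {
        return node.value.length();
    }
}
```

Without Void you end up with two visitor interfaces, one returning a value and one returning nothing, which is part of what makes the pattern so dreaded.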
Tuesday, December 09, 2008
Restlet Talk
I spoke last night at the Java Users Group about Restlet. It was a basic introduction to both Restlet and trying to link data across web sites. I wasn't very happy with the example - it was basically stolen from a Rails introduction. At least I could answer the question about why you would allow your data to be searched (to sell adverts on your recipe web site). I think it went down okay for a few reasons. Most Java developers are used to large frameworks and complicated APIs in order to do what Restlet does (so it's impressive). The Rails developers knew some of the concepts already. And while most are wary of RDF, SPARQL, OWL and the Semantic Web stack, it was a fairly incremental addition to achieve something reasonably powerful.
Thursday, December 04, 2008
Getting Groovy with JRDF
In an effort to speed up and improve the test coverage in JRDF I've started writing some of the tests in Groovy. It's been a good experience so far - so much so that I'm probably not going back to writing tests in Java in the future.
One of the things I wanted to try was an RdfBuilder, which is similar to Groovy's NodeBuilder.
There are a couple of things that make it a bit tricky. When parsing or debugging builders I haven't yet found a way to find the methods/properties available, even using MetaClass. And of course, when the magic goes wrong it's a bit harder to debug Groovy than Java.
It certainly smartens up the creation of triples, for example (bits from the NTriples test case):
def rdf = new RdfBuilder(graph)
rdf.with {
    namespace("eg", "http://example.org/")
    namespace("rdfs", "http://www.w3.org/2000/01/rdf-schema#")
    "eg:resource1" "eg:property":"eg:resource2"
    "_:anon" "eg:property":"eg:resource2"
    "eg:resource1" "eg:property":"_:anon"
    (3..6).each {
        "eg:resource$it" "eg:property":"eg:resource2"
    }
    "eg:resource7" "eg:property":'"simple literal"'
    "eg:resource17" ("eg:property":['"\\u20AC"',
        '"\\uD800\\uDC00"', '"\\uD84C\\uDFB4"', '"\\uDBFF\\uDFFF"'])
    "eg:resource24" "eg:property":'"<a></a>"^^rdfs:XMLLiteral'
    "eg:resource31" "eg:property": '"chat"@en'
}
The first two lines define the two namespaces used. The third line shows the general use of RDF and Groovy. It works out well: an RDF predicate and object map to an attribute and value in Groovy. The next two lines show how you refer to the same blank node across two statements. The following lines show using ranges and creating different types of literals. The third last line creates 4 triples with the same subject and predicate but with different objects.
Using the builder results in a file that's smaller than the test case file. You could remove some duplication by creating a method that takes in a number and the object and generates "eg:resource$number" "eg:property" "$object" but doing that may actually make it harder to read.
If you stick to only using URIs you can do things like:
rdf.with {
    urn.foo6 {
        urn.bar {
            urn.baz1
            urn.baz2
        }
    }
}
Which produces two triples: "urn:foo6, urn:bar, urn:baz1" and "urn:foo6, urn:bar, urn:baz2".
I expect that JRDF will only be more Groovy friendly in the future.
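To make the mapping in the builder examples above a bit more concrete, here is a rough plain-Java sketch of two of the ideas: expanding namespace prefixes, and turning one subject and predicate with several objects into several triples. TinyRdfBuilder is a hypothetical class written for this illustration. It is not how JRDF's RdfBuilder is implemented, and it leaves blank nodes and quoted literals untouched (including any datatype suffix).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch for illustration; not JRDF's RdfBuilder.
final class TinyRdfBuilder {
    private final Map<String, String> namespaces = new HashMap<String, String>();
    private final List<String> triples = new ArrayList<String>();

    // Registers a prefix such as "eg" for a namespace URI.
    TinyRdfBuilder namespace(String prefix, String uri) {
        namespaces.put(prefix, uri);
        return this;
    }

    // Adds one triple per object, all sharing the same subject and predicate.
    TinyRdfBuilder add(String subject, String predicate, String... objects) {
        for (String object : objects) {
            triples.add(expand(subject) + " " + expand(predicate) + " " + expand(object) + " .");
        }
        return this;
    }

    // Expands "prefix:local" into "<namespace + local>" when the prefix is registered.
    // Blank nodes ("_:x"), quoted literals and unknown prefixes are returned as-is.
    String expand(String term) {
        if (term.startsWith("_:") || term.startsWith("\"")) {
            return term;
        }
        int colon = term.indexOf(':');
        if (colon > 0) {
            String namespace = namespaces.get(term.substring(0, colon));
            if (namespace != null) {
                return "<" + namespace + term.substring(colon + 1) + ">";
            }
        }
        return term;
    }

    List<String> triples() {
        return Collections.unmodifiableList(triples);
    }
}
```

Calling add("eg:resource17", "eg:property", ...) with four literals gives four lines, the same shape as the list-valued attribute in the Groovy version, just with a lot more punctuation.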
Tuesday, November 11, 2008
While you were away...
Now that I'm looking around for jobs, I came across a presentation on some of the work the easyDoc project did at Suncorp, "Technical Lessons Learned Turning the Agile Dials to Eleven". It includes automating getter/setter testing, Hibernate, and immutability. It's good to see that the sophistication continued to increase after I left, reaching quite a high level (like automatic triangulation and doing molecule level testing).
Tuesday, August 12, 2008
Real Developers don't use Ruby
Hadoop: When grownups do open source. It's quite an amusing read - especially the part about the word count example on 9,000 blogs, the dig at Twitter, Starfish being practically useless (using MySQL and no Reduce phase) and the bit about understanding something being harder than writing a Ruby version of it.
Ahh the joys of installing Visual Studio - enough time to install IntelliJ, run it up, and catch up on news.
"Twitter decided they would be cute and trendy. They wrote their code in Ruby: the official state language of the hipster-developer nation. Doug Cutting, on the other hand, decided he would get xxxx done, and wrote Hadoop in Java. Starling was hidden away in some corner and forgotten (it's hosted at RubyForge...). Hadoop lives prominently at the Apache Software Foundation. Starling is a re-hash of an existing Java Enterprise API called JMS that has several open source implementations. Hadoop is an implementation of Google's MapReduce, a system that publicly only existed on paper. Hadoop has the added benefit of actually working."
Tuesday, July 22, 2008
Square brackets are scary
For what may be an increasing trend of surfing the Web at 320x480 I noticed Cydia has a number of applications for Jailbroken iPhones (Java, Python and Ruby mainly). The mailing list on iPhone/Java doesn't have much on it except some interesting uses of JocStrap and UICaboodle (available from SVN by Jay Freeman). There's also the Sun blog that has some interesting sample applications using different Java implementations on the iPhone.
Wednesday, July 02, 2008
Round of Links
- Apache Hadoop Wins Terabyte Sort Benchmark "One of Yahoo's Hadoop clusters sorted 1 terabyte of data in 209 seconds...This is the first time that either a Java or an open source program has won." There were just under 1000 nodes, the benchmark results are hosted by HP (a tad more detail here).
- Microsoft buys Powerset one of the interesting things is that they use Hadoop (see their blog). It's hard to tell whether this is bad or good for Hadoop.
- Google vs Microsoft - oh for structure.
- Tom talking about GridGain from his presentation in February. C++ isn't as productive as Java?
- Applets are back (according to Sun).
- Why commenting is for n00bs. "And Haskell, OCaml and their ilk are part of a 45-year-old static-typing movement within academia to try to force people to model everything. Programmers hate that. These languages will never, ever enjoy any substantial commercial success, for the exact same reason the Semantic Web is a failure. You can't force people to provide metadata for everything they do. They'll hate you."
- Some interesting discussion on Web 2.0 and the future of the web.
- Rich text editor for browsers. Not free though.
- Linked data and what it is.
- ThoughtWorks Podcasts (the REST talk was what drew me to it).
- Turtle specification. I've been looking at this for serialization of RDF molecules but it seems that you can't have blank nodes as objects using the nested syntax.
- Semantic Web for bioinformatics.
- Data structure stuff: Linear Bloom Filters, Bloom filters for Spell Checking, Optimal Bloom Filter replacements and scalable btree and B-tries for Disk-based String Management.
Monday, June 30, 2008
ScalaCC
Formal Language Processing in Scala which links to External DSLs made easy with Scala Parser Combinators that I'd read from here.
Although, just to keep it balanced I have noticed Steve Yegge's comments, under "Static Typing's Paper Tigers", on the complexity of Scala's typing (it does have a lot) and it has been pointed out that this does lead to problems with writing IDEs to support it.
Thursday, November 15, 2007
Sesame Native Store
I'm very impressed at the moment with OpenRDF's native store as others have been in the past. One of the best things is how easy it was to work into the existing JRDF code.
As I've said before I've been searching for an on disk solution for loading and simple processing of RDF/XML. In the experiments I've been doing OpenRDF's btree index is much faster than any other solution (again not unexpected based on previous tests). The nodepool/string pool or ValueStore though is a bit slower than both Bdb and Db4o.
Loading 100,000 triples on my MacBook Pro 2GHz takes 37 secs with pure Sesame, 27 with the Sesame index and Db4o value store, 35 with Bdb value store and ehCache is still going (> 5 minutes). A million takes around 5 minutes with Sesame index and Db4o nodepool (about 3,400 triples/second) and 3 minutes with a Sesame index and memory nodepool (about 5,500 triples/second).
There's lots of cleanup to go and there's no caching or anything clever going on at the moment, as I'm trying to hit deadlines. 0.5.2 is going to be a lot faster than 0.5.1 for this stuff.
Update: I've done some testing on some fairly low-end servers (PowerEdge SC440, Xeon 1.86GHz, 2GB RAM) and the results are quite impressive: loading 100,000 triples averages around 11,000 triples/second and loading 10 million averages 9,451 triples/second.
Update 2: JRDF 0.5.2 is out. This is a fairly minor release for end user functionality but meets the desired goal of creating, reading and writing lots of RDF/XML quickly. Just to give some more figures: Bdb/Sesame/db4o (SortedDiskJRDFFactory) is 30% faster for adds and 10% slower for writing out RDF/XML than Bdb/Sesame (SortedBdbJRDFFactory). Both have roughly the same performance for finds. I removed ehcache as it was too slow compared to the other approaches.
Friday, November 09, 2007
JRDF 0.5.1 Released
This release is mainly a bug-fix release. There are improvements and fixes to the Resource API, datatype support and persistence. Another persistence library has been added, db4o, which has some different characteristics compared to the BDB implementation. However, it's generally a little slower than BDB. The persistence offered is currently only useful for processing large RDF files in environments with low memory requirements.
Also, the bug fixes made to One JAR have been integrated, so JRDF no longer has its own version.
Available here.
Saturday, October 27, 2007
No Java 6 for You!
Just in case you're like me and you upgraded to Leopard only to find Java 6 no longer works and Java 5 unstable, here's a fix:
-- First, delete ~/Library/Java/Caches/deployment.properties
-- Move aside your Java 1.6 directory. The 1.6 preview on Tiger does not work on Leopard.
% cd /System/Library/Frameworks/JavaVM.framework/Versions
% sudo mv 1.6.0 Tiger_1.6
% sudo rm 1.6
At least it comes with better Ruby support and a RubyCocoa bridge.
Update: I don't think I like Leopard. I don't like: the 3D or the 2D look of the dock, the semi-transparent menu bar, Spaces behaviour - you can't have multiple windows spanning desktops from the same application (like two browser windows in separate desktops), the removal of text lists (why not have Fan, Grid and List?), in Quick Look you can go to a page in a PDF or Word file but when you click on it it goes to the first page, and Java support (IntelliJ and others seem to have weird refresh issues and you can't seem to allocate it to a virtual desktop).
There's of course a lot to like though too (tab terminals, RSS reader, better Spotlight, Cover flow, Safari, etc).
Update 2: So the Spaces thing. If you minimise a window to the dock, change to another virtual desktop and then expand it, it goes to the desktop that the window was originally in. As noted in the comments though (and it is in the guided tour), if you activate Spaces (the default is F8) and then move the window to a new virtual desktop, it is tied to that desktop. There are two other ways: click and hold on the window and switch to another desktop, or click and hold on the window, move it to the edge of the screen, wait, and it will go to the next desktop.
To me, some of the behavior breaks the illusion/metaphor of the virtual desktop and seems unnecessarily difficult. I'd prefer non-click and hold options supported as well (pretty much like other features like copying files, etc).
The networking improvements (non-blocking, built in VNC) overcome what was probably one of the worst things about OS X.
The IntelliJ issue that I mentioned is logged as issue 16084.
Saturday, September 22, 2007
Migration
The announcement of JRuby in Glassfish (or the end of Mongrel) seems to offer Sun the hope of capturing more developers, not just from Ruby but from .NET. Many people from both the Java and C# worlds jumped to Ruby, and it occurred to me that .NET developers using NetBeans for Ruby development may notice that there's a surprisingly good C# clone under the covers.
Wednesday, September 19, 2007
Beautiful or Otherwise
Two of the chapters from Beautiful Code, Alberto Savoia's chapter Beautiful Tests and Simon Peyton Jones on concurrency, are available in PDF form. Beautiful Tests covers most of the different kinds of tests and how to make changes to code to make it more testable and starts to cover creating and validating theories.
Speaking of tests, Scala, DSLs, Behavior Driven Development?, talks about how Java is poor at creating DSLs specifically compared to Scala. And what's the application? Behavior driven development in a Java project called beanSpec (which superficially looks similar to Instinct probably as they're both based on RSpec and using the same stack example). So you have this neat sort of convergence where people are looking at testing Java better in ways that are more declarative (functional even).
Making Java more functional is on the cards for Java 7. Will Java 7 be Beautiful? links to point free programming in Java and Haskell and to how the new language proposals (closures) make Java 7 look a lot like Haskell, with the suggestion that all Java 7 functions should be curried.
While the syntax is potentially getting more beautiful, the user interfaces are traditionally pretty poor in Java (even with OS X support) but even that is changing. I've been following Chet Haase's blog and recently Filthy Rich Clients was made available to purchase. It's all about improving the look and feel of Java to finally approach the richness of native OS X and Windows applications. He even has some links about language proposals too, including bringing back line numbers. Some examples of their work are on the web site and on Romain Guy's blog, which includes an entry called Beautiful Swing.
Romain also linked to a movie of *7, the prototype handheld device running Green (Java). Who knew Duke had a house?
Thursday, September 13, 2007
A Real LINQ Clone for Java
Introducing Quaere - Language integrated queries for Java
The Quaere DSL is very flexible and it lets you perform a wide range of queries against any data structure that is an array, or implements the java.lang.Iterable or the org.quaere.Queryable interface. Below is an overview of the querying and other features available through the DSL interface, the underlying query expression model and query engine. See the examples section to gain an understanding of how these features are used.
* Ability to perform queries against arrays or data structure implementing the Iterable interface.
* An internal DSL (based on static imports and fluent interfaces) that lets you integrate the query language with regular Java code. No preprocessing or code generation steps are required to use the DSL, simply add a reference to the quaere.jar file (and its dependencies).
* A large number of querying operators including restriction, selection, projection, set, partitioning, grouping, ordering, quantification, aggregation and conversion operators.
* Support for lambda expressions.
* The ability to dynamically define and instantiate anonymous classes.
* Many new “keywords” for Java 1.5 and later.
Via, Quaere: LINQ for Java.
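To give a feel for the "internal DSL over anything Iterable" idea in the feature list above, here is a tiny fluent query sketch. It is not Quaere's API: Query, Condition, from, where and select are names invented for this example, and it only does eager filtering.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names for illustration; this is not Quaere's API.
interface Condition<T> {
    boolean test(T value);
}

final class Query<T> {
    private final List<T> items;

    private Query(List<T> items) {
        this.items = items;
    }

    // Starts a query over any Iterable by copying its elements.
    static <T> Query<T> from(Iterable<T> source) {
        List<T> copy = new ArrayList<T>();
        for (T item : source) {
            copy.add(item);
        }
        return new Query<T>(copy);
    }

    // Keeps only the elements matching the condition; returns a new Query so calls chain.
    Query<T> where(Condition<T> condition) {
        List<T> kept = new ArrayList<T>();
        for (T item : items) {
            if (condition.test(item)) {
                kept.add(item);
            }
        }
        return new Query<T>(kept);
    }

    // Ends the chain and returns the results in their original order.
    List<T> select() {
        return new ArrayList<T>(items);
    }
}
```

With a static import of from, a call chain reads something like from(numbers).where(isEven).select(), which is about as close to query syntax as plain Java gets without preprocessing.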
Saturday, August 18, 2007
Off to See the Wizard
Microsoft comes up with a quite elegant and well integrated language extension for relational and XML data. IBM fires back 2 years later with a wizard that auto-generates beans and Java code with embedded SQL. The article also makes assertions like:
The most popular way to objectize to programmatically access and manipulate relational data has been through special APIs and wrappers that provide one or more SQL statements written as text strings.
It does seem to give you some benefits when writing SQL, but it's simply not LINQ.