Saturday, April 29, 2006

RDF on Rails

While searching for Ruby APIs for RDF I came across: ActiveRDF: object-oriented RDF in Ruby "Although most developers are object-oriented, programming RDF is triple-oriented. Bridging this gap, by developing a truly object-oriented API that uses domain terminology, is not straightforward, because of the dynamic and semi-structured nature of RDF and the open-world semantics of RDF Schema.

We present ActiveRDF, our object-oriented library for accessing RDF data. ActiveRDF is completely dynamic, offers full manipulation and querying of RDF data, does not rely on a schema and can be used against different data-stores. In addition, the integration with the popular Rails framework enables very easy development of Semantic Web applications."

"The development of such APIs has been attempted using a statically-typed language (Java) in RdfReactor, Elmo and Jastor. These approaches ignore the flexible and semi-structured nature of RDF data and instead:
1. assume the existence of a schema, because they rely on the RDF Schema to generate corresponding classes,
2. assume the stability of the schema, because they require manual regeneration and recompilation if the schema changes and
3. assume the conformance of RDF data to such a schema, because they do not allow objects with different structure than their class definition.

Unfortunately, these three assumptions are generally wrong, and severely restrict the usage of RDF. A dynamic scripting language on the other hand is very well suited for exposing RDF data and allows us to address the above issues."

The homepage: ActiveRDF. Uses YARS and Redland.

Recent related W3C note: "A Semantic Web Primer for Object-Oriented Software Developers".

Related previous posting: "Scripting the Semantic Web".

Update: It looks like Henry Story noticed ActiveRDF as a good idea too.

Update 2: Brian Gilman's recent posting to the Life Science list links to the BioRuby project.

Wednesday, April 26, 2006

Links


  • On setters, constructors and modelling reality "In my experience there are almost no situations where a setter method is a good idea...I often see setters used where information relating to an object is not known at construction time, and needs to be added later. This is usually a symptom of choosing the wrong class to store that data...Note that just because all the fields are final and there are no setters, this doesn't mean that the object is immutable. It just means that its behaviour more closely models reality, and your interactions with it will be far more meaningful."

  • ABC Video On Demand including all your Chaser needs.

  • Agile Development: Schema Evolution "One of the characteristics of agile development is that designs are often refactored. As result, the data model may evolve during the lifetime of a project. What happens if the data model of the db4o databases is changed?"

  • Smalltalk to Java - the Good, the Bad, and the Unbelievably Ugly "The Smalltalk code was "too Smalltalky" for the translation (again, design is language specific IMHO). They had DNU handlers, descendants from nil, #perform, etc. Then there's the whole utility class issue - final classes in Java (String, Date) that would simply be subclassed in Smalltalk. Joshua Bloch, call your office :/"

  • Running Code Doesn't Lie "This statement pretty much sums up why agile software development practices like eXtreme Programming and Getting Real are so powerful. Instead of business analysts or programmers writing pages of documentation for what they think the system does, the system tells you what it does. I like to call this a Self Describing System."

  • TestDox "TestDox creates simple documentation from the method names in JUnit test cases."

  • Multi-core Processors: transputers reborn "In the Eighties, there were only a small number of people who were interested in fast drawing of Mandelbrot sets. The same is true today. But today, everyone is interested in having their web sites scale well in terms of performance and in terms of price. The killer application of parallel computing is all around us. It is called the Web."

Monday, April 24, 2006

Festive Flesh

Reading this about Capybara "The popularity of capybara meat in Venezuela is attributed to a 16th century theological decision by the Roman Catholic Church. Responding to queries by Venezuelan Catholics, the Church declared the capybara meat to be equivalent to fish meat, and thus allowed its consumption during Lent [1]. The decision may have been taken on the basis of incomplete or inaccurate descriptions of the capybara available to the Church authorities in Rome; but it was never reversed, and to this day the capybara is the only warm-blooded animal with that status. (This story should be treated with caution, however, since similar claims have been circulated concerning other semi-aquatic mammals, such as beavers and muskrats[2].)"

Sunday, April 23, 2006

Raving about Maven

For every pro-Maven story I've seen, I get one like For F___ Sake, why does maven suck so much "On a more specific note of why I hate maven. The first one is the whole idea of maven repository. I agree that maven repo works if things are simple and the jars aren't changed that often. For all other real world cases where jars or versions change rapidly, maven often fails to update the repo. When that happens maven just craps itself. What's worse if one's network connection is buggy and unreliable. having a remote repo is worth crap when the network blows. Sure, one can setup a local repo, but what if the LAN is unreliable."

"Overriding build.properties is a bitch if one has lots and lots of properties. Say I have an EJB that is built nightly, but new branches are created every month. What if I need to test different branches on an on going basis? Well with ANT I can pass it a different build.xml file. With maven I haven't been able to find a way to do, I don't believe it can be done. It would be great if i could start maven and point it to say branch2build.properties instead of build.properites."

From what I can tell this is Maven 1, whereas Maven 2 is apparently a lot better.

Saturday, April 22, 2006

Triple Fest '06

Aperture "...is a Java framework for extracting and querying full-text content and metadata from various information systems (e.g. file systems, web sites, mail boxes) and the file formats (e.g. documents, images) occurring in these systems."

Supports: Plain text, HTML, XHTML, XML, PDF (Portable Document Format), RTF (Rich Text Format), Microsoft Office: Word, Excel, Powerpoint, Visio, Publisher, OpenOffice, OpenDocument, Corel WordPerfect, Quattro, Presentations and Emails (.eml files). Check out the Extractor API and associated interfaces.

Put all that together with stuff like Wikipedia3 and others.

The BBC's open programme information project... including Jon Pertwee in FOAF.

Friday, April 21, 2006

I Object

A response to some points in The perils of avoiding heresy (or "What are Design Patterns") and Visitor Pattern and Trees Considered Harmful.

""21 reasons C++ sucks; 1 embarassment; and an Abstract Syntax Tree"...So if the book is predominately a catalogue of unfortunately necessary kludges around the semantic weaknesses of mainstream OO-languages, why do I still highly recommend it to new programmers?"

Design patterns are not reliant on OO, Java, C++ or any particular language or programming paradigm. They're not even reliant on Computer Science. Design patterns originated in architecture (Christopher Alexander) and are used in areas as diverse as dating. Grady Booch's Handbook lists 1000s of patterns; by the article's logic, that is 1000s of reasons OO-languages suck.

"Singleton. This is a global variable."

Variables and objects are not the same thing - variables lack behaviour (amongst other things). IoC containers like Spring, Yan and PicoContainer all treat Singleton as configuration on a plain Java object - so again it's not necessarily a concept in code at all.

"In a language with first-class constructors, the Factory Method Pattern consists of factory = MyClass.constructor"

This just shows a misunderstanding of the Factory pattern - its intention is to decouple object instantiation, especially at runtime. In languages that have first-class constructors, like Ruby, you still use the Factory pattern.

An interesting discussion: Factory and Singleton are false Design Patterns?

So the old chestnut, the visitor. "The visitor pattern is kludge used by programmers in a language that doesn't support multiple-dispatch to provide themselves with a static version of double-dispatch."

This I agree with. But the rest seems simply a rant against OO.

So Wikipedia says, "...visitor design pattern is a way of separating an algorithm from an object structure. A practical result of this separation is the ability to add new operations to existing object structures without modifying those structures."

So there's no need for multiple if-then-elses or case statements, which is what the Python version ended up with, as Python has no switch statement.
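To make the Wikipedia description concrete, here is a minimal sketch of the pattern in Java. All of the names (Node, Num, Add, Evaluator, Printer) are made up for illustration and don't come from either article. The tree classes only know how to accept a visitor, so a new operation is a new visitor class rather than another branch in an if-then-else chain.

```java
// Hypothetical sketch: all names here are illustrative, not from any library.
final class VisitorSketch {

    // One visit method per node type; R is the result type of the operation.
    interface Visitor<R> {
        R visitNum(Num n);
        R visitAdd(Add a);
    }

    // The object structure: nodes only know how to accept a visitor.
    interface Node {
        <R> R accept(Visitor<R> v);
    }

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public <R> R accept(Visitor<R> v) { return v.visitNum(this); }
    }

    static final class Add implements Node {
        final Node left;
        final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        public <R> R accept(Visitor<R> v) { return v.visitAdd(this); }
    }

    // First operation: evaluate the tree.
    static final class Evaluator implements Visitor<Integer> {
        public Integer visitNum(Num n) { return n.value; }
        public Integer visitAdd(Add a) {
            return a.left.accept(this) + a.right.accept(this);
        }
    }

    // Second operation, added without touching Num or Add.
    static final class Printer implements Visitor<String> {
        public String visitNum(Num n) { return Integer.toString(n.value); }
        public String visitAdd(Add a) {
            return "(" + a.left.accept(this) + " + " + a.right.accept(this) + ")";
        }
    }

    public static void main(String[] args) {
        Node tree = new Add(new Num(1), new Add(new Num(2), new Num(3)));
        // Prints: (1 + (2 + 3)) = 6
        System.out.println(tree.accept(new Printer()) + " = " + tree.accept(new Evaluator()));
    }
}
```

The trade-off is the usual one: adding an operation is cheap, but adding a new node type means touching every visitor.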

The Visitor pattern is not the ultimate answer to writing a compiler. It's very straightforward for small languages, simple languages, languages where you might want to change the code and the syntax independently, etc. Having written a C compiler using SableCC I have seen the issue of breaking encapsulation - in fact most compiler implementations I've seen tend to create global state of one sort or another anyway.

The main issue with compiler compilers that I tried before SableCC (which uses the Visitor pattern) is the combination of grammar and Java code. I have similar problems with PL/SQL, JSPs, etc. This combination is also detrimental to tools like debuggers, IDEs etc. (the work that goes into JSP support in IDEs is a good example of how hard it is).

I can't think of a successful combination of the two styles, declarative and imperative. To me it seems to always end in very poor, unmaintainable code. And through the eyes of a lemming it's much poorer than sticking within the OO paradigm.

A better article is Translators Should Use Tree Grammars. At the end of the article the author writes, "...aspect-oriented specifications as long as we are thinking about putting actions outside of the grammar. Action execution is just an aspect and each phase would be an aspect." which seems very similar to the way things like transactions and context have been implemented in Spring. It might well be that combining AOP and OOP will provide a clean solution to writing compilers.

Sharing Instances in Spring

I had a fairly basic problem that I have probably solved before, but I couldn't remember the solution and it took some time to work out.

I had three objects A, B and C. Both B and C need to share the same instance of A. But I want to create multiple Bs and Cs – so they are not singletons. I also did not want other Bs and Cs to share the one instance of A so making A a singleton was also out of the question.

I came up with several ways, none of which I was particularly happy with:
1/ Flatten out the dependencies – make C depend on B which depends on A for instance.
2/ Make A a singleton and create bunches of Bs and Cs by using multiple BeanFactories.
3/ I could throw away dependency injection and let the objects using B and C construct/set-up B and C using A.

The solution that I finally settled on is creating a BCFactory (interface) that requires A, constructs B and C (in the implementation) and has getB() and getC() methods.
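A rough sketch of that idea in plain Java, leaving the Spring wiring out. Only BCFactory, getB() and getC() come from the description above; DefaultBCFactory, the getA() accessors and the choice to build exactly one B and one C per factory are assumptions made for the example.

```java
// Sketch only: DefaultBCFactory and the getA() accessors are hypothetical names.
final class SharedInstanceSketch {

    static final class A { }

    static final class B {
        private final A a;
        B(A a) { this.a = a; }
        A getA() { return a; }
    }

    static final class C {
        private final A a;
        C(A a) { this.a = a; }
        A getA() { return a; }
    }

    // The factory interface: hands out a B and a C that share one A.
    interface BCFactory {
        B getB();
        C getC();
    }

    // Assumption: one B and one C are built per factory, both around the same A.
    static final class DefaultBCFactory implements BCFactory {
        private final B b;
        private final C c;

        DefaultBCFactory(A a) {
            this.b = new B(a);
            this.c = new C(a);
        }

        public B getB() { return b; }
        public C getC() { return c; }
    }

    public static void main(String[] args) {
        // Each factory gets its own A, so A is not a singleton.
        BCFactory first = new DefaultBCFactory(new A());
        BCFactory second = new DefaultBCFactory(new A());

        System.out.println("B and C share an A: "
                + (first.getB().getA() == first.getC().getA()));
        System.out.println("Different factories share an A: "
                + (first.getB().getA() == second.getB().getA()));
    }
}
```

In a container, the factory (and the A passed to it) would presumably be configured as non-singleton beans, so that each new factory brings its own A with it.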

Wednesday, April 19, 2006

Link Parking

Monday, April 17, 2006

Spring Aspects - A Transactional Thing

I've recently been looking at transactions in Spring 2.0 - it certainly has more things to make your code sing.

The Spring 2.0 Transaction document has some useful information on using a TransactionTemplate to wrap transactional operations. Of course, the best way is Spring AOP for declarative transaction management and using Java 5.0's annotations.

There's a previous discussion of applying transactions in Spring 1.x with Hibernate in Wire Hibernate Transactions in Spring.

Javapolis: Spring Update and Maven 2.0 (see also this Maven 2.0 article). Most of the talks are well worth a listen.

Articles on using the new AOP features in Spring 2.0: Typed Advice in Spring 2.0 (M2) (about finding methods in Spring AOP) and POJO Aspects in Spring 2.0: A Simple Example.

An article on JPA (Java Persistence API): Using the Java Persistence API with Spring 2.0.

Tuesday, April 11, 2006

Optimal Pair Swapping: 90 minutes

Promiscuous Pairing and Beginner’s Mind: Embrace Inexperience "It often takes days for a given pair to be comfortable enough with each other to be able to achieve Pair Flow at all. This means that pairings tend to be long. The longer the mean time between pair swaps, the less effectively pair net distributes information through the team."

"While two people are paired, they share knowledge. When the pair splits for a pair swap, the knowledge then spreads to all four participants. In this way, knowledge will slowly but automatically spread around the group."

And the surprising result:
"This makes it easy to see the shape of the curve near the 90-minute optimal point. However, we did note that longer pair times had slightly higher mean velocities."

So you think you're doing extreme XP and someone always has to crank it up one more level. 90 minutes seems pretty close to me to be the minimum period of time to actually get a chunk of work done - considering that getting work done by yourself usually requires an hour.

I was actually looking for evidence that pair programming's flow state is better/worse than normal. Specifically,
  • How much more or less is "pair flow" susceptible to interruptions,
  • Whether pairs are able to resume, after interruption, more quickly into a shared flow state,
  • Whether it's faster or slower to get into a "pair flow" state, and
  • The effects experience has on the above.

Even Flow

"Flow is a mental state of operation in which the person is fully immersed in what he or she is doing, characterized by a feeling of energized focus, full involvement, and success in the process of the activity. Proposed by psychologist Mihaly Csikszentmihalyi, the concept has been widely referenced across a variety of fields."

Alistair Cockburn's Team Per Task, "It takes about 20 minutes to reach this state of flow, and only a minute to lose it. Our designers found that it took about an hour to get into flow and make progress after having been stopped. If a meeting or other task arrived during this hour, the entire period was essentially lost. As it also took energy to get into the flow, a distraction cost but energy as well as time."

Pairing to the rescue (again). When a pair gets into a flow it's more resilient to distractions - it seems to allow one person in the pair to get back into it more quickly after being distracted. Maybe it is related to mirror neurons (video).

A Point of Difference

"Organizational Patterns of Agile Software Development" suggests code ownership because "Something that is everybody's responsibility is really no one's responsibility". It does note some concerns with having a single owner of the code, including "...tunnel vision, the implied risk of having only a single individual who understands a piece of code in depth, and a breakdown of global knowledge." as well as the bus factor and the introduction of bottlenecks.

This disagreement between XP and Coplien's agile software development was noted by Kent Beck in a 1999 interview: "KB: I was talking about it with Cope [Jim Coplien] yesterday; this is our favorite fighting topic. The rule is, if you see a problem with code anywhere in the system, you fix it. So we’re sitting there with our DB40 transactions, and we say, “Y’know, if this export object over here was just structured this way, it would be really easy for us to do our stuff.” If it would clean things up else-where, we just do it."

"JV: So anybody can change anything, anywhere…

KB: Yup, absolutely. If you see it, and you got your partner there, and you’re going to run the tests within a few minutes—so it’s not like you’re going to break something—you just do it. And the system’s going to get better.

Now, I came up with this because I was working in strict individual code ownership shops, and we’d say, “Gee, we keep calling these same three methods in this object. Why don’t we just make a method in the object that calls the three methods for us?” “We don’t own that.” “But the guy’s just across the hall—” “Naah, don’t wanna bother him.”

The pace of evolution in projects that use individual code ownership, in my experience, is glacial compared to what is possible to do, in a controlled way, if everybody takes responsibility for making all the code as good as they can make it."

"JV: Has this been without pitfall?

KB: Yes. It’s just not a problem. You’ve got to have collective ego instead of individual ego. That’s the hardest problem, so that no one comes up and says, “Hey man, you changed my class.” "

See also: Ron Jeffries' Code Ownership and a list of different kinds of code ownership.

Until Now


  • It's like, how much more black could this be? And the answer is none. None more black.

  • Managers smell funny or the meaninglessness of managers as class names.

  • Twin Prime Conjecture film clip.

  • Live Clipboard - Metadata Quality, Events Databases and Live Clipboard "This is the big problem with data mapping. In Jon's example, the location is called Colonial Theater in Upcoming and Colonial Theater (New Hampshire) in Eventful. In Eventful it has a street address while in Upcoming only the street name is provided. Little differences like these are what makes data mapping a hard problem. Jon's solution is for the community to come up with global identifiers for venues as tags (e.g. Colonial_Theater_NH_03431) instead of waiting for technologists to come up with a solution."

  • Exploring Live Clipboard "Like David Janes, Danny Ayers prefers URIs. Of the five listed above, the Wikipedia URL would clearly be Danny's first choice. "If there are fairly solid reference services like Wikipedia or IMDB," he writes, "then use their URIs." I'll go along with that, so long as the URIs are easy for people to invent, to read, and to write. And so long as they can function as tags in social classification systems -- which for now, it seems, they cannot. "

  • It's people committing piracy vs corporate piracy, in "Who Owns Culture?". Lawrence Lessig takes up many of the themes from the book "Free Culture".

  • SPARQL now a W3C Candidate Recommendation, Sparql Calendar Demo and SPARQL2SQL Rewriter "There are rewriter-specific limitations as well, e.g. multiple/nested UNIONs, combined expressions, some of the built-ins (e.g. lang, langMatches), custom functions, and several other things are not supported yet."

Thursday, April 06, 2006

Kicking Native OSX Games in the Happy Sack

Boot Camp Public Beta As Apple now says, "Macs do Windows, too". The download page says this is a feature coming in Leopard. Boot Camp Beta: Requirements, installation, and frequently asked questions (FAQ) "Even after installling the Macintosh Drivers CD, the Apple Remote Control (IR), Apple Wireless (Bluetooth) keyboard or mouse, Apple USB Modem, MacBook Pro's sudden motion sensor, MacBook Pro's ambient light sensor, and built-in iSight camera will not function correctly when running Windows."

For some reason I haven't been able to bring myself to install it.

I found this a good summary of the issues with dual booting: "We've found that dual-boot scenarios tend to leave users spending the bulk of their time booted into one of the two OSes—typically the one that hosts the most narrowly compatible software (that is, many Windows applications).

Virtualization or terminal services work much better for enabling, for instance, a Mac OS X or Linux user to run Windows-only software.

When users lack full access to files on either operating system's partition, as is the case with Boot Camp, users will find it that much tougher for the OSes to coexist.

We'd love to see VMware cook up a version of its VMware Player for the Mac. "

Update: Easy DOS It "So Apple will at least offer the option for users to run a virtualized version of Windows Vista atop OS X, which brings with it two HUGE advantages. First, the bad guys and script kiddies will have to get through OS X security before they even have a chance at cracking Vista security. Second, by running a virtual version of Windows Vista loaded from a read-only partition, Microsoft's recommended method of dealing with malware (periodically wipe the OS and application from your disk and load them anew) can be done in seconds instead of hours and can be done daily instead of monthly or quarterly or yearly."

Tuesday, April 04, 2006

For new MacBook Pro Owners

Apple Addresses MacBook Pro Issues "According to Apple, it has begun replacing the mainboard inside its MacBook Pros with a new revision. It calls the udpated product "revision D", which is indentifiable by product serial number.

* Serial numbers starting with W8611: revision D
* Serial numbers starting with W8610: revision C

Apple said that revision D MacBook Pros have many issues addressed and improvements made, including fixes to the above mentioned issues. We were also able to get a hold of a MacBook Pro that just arrived during the week with a serial number starting with W8612, which did not exhibit any of the above issues."

10.4.6 is out (start your downloads) and has fixes for Spotlight and the MacBook Pro.

There's also a new article which seems self explanatory: How to use your PowerBook G4 or MacBook Pro with the display closed.

DDD (Dog Driven Design)

From, "On the Edge": "Miner was an interesting sight at the Cyan labs. People who saw him in the halls often had to take a second look because a tiny black shadow seemed to follow his every move. The shadow was actually a little black Cockapoo named Mitchy that followed Miner everywhere."

"The dog became a fixture at Atari. Miner had a brass nameplate on his door that read, ‘J.G. Miner’, and just below it was a smaller nameplate, ‘Mitchy’. Mitchy even had her own tiny photo-ID badge clipped to her collar as she happily trotted through the halls. While Miner worked on his groundbreaking systems, Mitchy sat on a couch watching with puzzlement as her master slaved over diagrams and schematics."

"...Mitchy did most of the design on the system; much more than Jay did...Jay would draw gates, and he would look down at Mitchy and Mitchy would shake her head. Jay would erase it and draw it upside down, and try it a different way and look down and Mitchy would pant. He did design by dog."

Monday, April 03, 2006

JRDF 0.4 Released

JRDF 0.4.0 is now out. The main difference between it and previous versions is that it is for Java 1.5 and above only. Also, Tom's SPARQL query engine is in. There are also bug fixes for RDF/XML parsing (mainly from feedback from Kowari developers) and initial interfaces for graph and relational operations (which may change). Work has begun on an index interface (using the idea of perfect indexes) in the org.jrdf.graph.index package. An NTriple grammar has also been added, but there's no parser as yet.

Blowing Your Mind

In pursuit of code quality: Monitoring cyclomatic complexity "This report's section labeled Top 30 functions containing the most NCSS details the largest methods in the code base, which incidentally almost always correlate to methods containing the highest cyclomatic complexity. For instance, the report lists the class DBInsertQueue's updatePCensus() method as having a noncommenting line count of 283 and a cyclomatic complexity (labeled as CCN) of 114.

As demonstrated above, cyclomatic complexity is a good indicator of code complexity; moreover, it's an excellent barometer for developer testing. A good rule of thumb is to create a number of test cases equal to the cyclomatic complexity value of the code being tested. In the case of the updatePCensus() method seen in Figure 2, you would need 114 test cases to achieve full coverage."

"Because cyclomatic complexity is such a good indicator of code complexity, there is a strong relationship between test-driven development and low CC values. When tests are written often (note, I'm not implying first), developers have the tendency to write uncomplicated code because complicated code is hard to test. If you find that you're having difficulty writing a test, it's a red flag that the code under test may be complex. The short "code, test, code, test" cycle of TDD invites refactoring in these cases, which continually drives the development of uncomplex code."
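As a small illustration of the rule of thumb quoted above (this example is mine, not from the article): the hypothetical classify method below has two decision points, so its cyclomatic complexity is 2 + 1 = 3, and three test inputs are enough to exercise every path through it.

```java
// Hypothetical example: classify() is made up to illustrate the rule of thumb.
final class ComplexitySketch {

    // Two decision points (the two ifs), so cyclomatic complexity = 2 + 1 = 3.
    static String classify(int n) {
        if (n < 0) {
            return "negative";
        }
        if (n == 0) {
            return "zero";
        }
        return "positive";
    }

    public static void main(String[] args) {
        // One input per path: three cases for a complexity of three.
        int[] inputs = { -5, 0, 7 };
        for (int input : inputs) {
            System.out.println(input + " is " + classify(input));
        }
    }
}
```

By the same counting, a method with a complexity of 114 would want over a hundred such cases, which is the article's point about updatePCensus().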

Sunday, April 02, 2006

Many Mock Frameworks

The two main mock frameworks for Java are EasyMock and JMock. My current preference is EasyMock, largely because it is not tied to extending a test case. This allows the shortcuts to move into test utilities - a common one is a controller factory that lets you roll up the replay/verify methods into single calls instead of calling each controller separately.

Paul King has written up a little example of Mock Alternatives describing using EasyMock, JMock, RMock and Groovy. Tom has also made his slides available from a recent talk he gave about what I'll call Mock driven design (MDD - see "Why and When to Use Mock Objects"). Take that, TDD (Test Driven Development/Test Driven Design) or BDD (behavior driven design). Much of the interesting stuff was in the discussions and shared frustrations - especially about what level of testing various developers and organizations had reached. I think it's also important to devise strategies for working with different levels of tests and developers.