Thursday, March 30, 2006

Sudoku SQL

Solving Sudoku with SQL "To make it even more fun for myself, I embarked on an exercise to write a program that solves Sudoku puzzles. And to make it even more challenging, I decided not to write the program in the popular object-oriented fashion (Java, C++, C#, etc.) or in any of the old-fashioned procedural programming languages (Pascal, C, Basic, etc.), but in Transact-SQL, within SQL Server 2000. Basically, I wanted to see how the features of T-SQL can be used to develop something like a Sudoku puzzle solution. I have learnt some useful things from the exercise, which I’m eager to pass on to my fellow programmers.

T-SQL is rich in in-built programming functions and features. Far from being just for holding and manipulating data, T-SQL is a programming language in its own right. Many algorithm-based problems that used to be solved with mainstream procedural or object-oriented languages can now be dealt with completely within SQL Server using T-SQL, because not only does it have the usual programming constructs such as ‘While…End’; ‘Case’ and ‘Ifs’; it also, of course, has SQL."

Related to the previous solution using OWL: Now for my Tax Return.

Vista Chicken - System/360 all Over Again

Discontent at Microsoft. "What I saw in MS was PMs pushing hard for features:
* even if it meant that the test combinations would be very large, so the product couldn't be tested properly.
* even if it couldn't be done properly in the time allocated. After all, an estimate of time was made, so now all of those features must go in the product even if things are taking longer than expected.
* even if the product was falling apart at the seams because every other PM was doing the same thing.

In fact, people often played schedule chicken. It didn't matter if you were running late by the metric of the day as long as another group was running later."

"WinFS is a great example of a file system designed by lunatic engineers and inbred GPM teams (led by a totally lunatic DirPM) without a clue as to what a real customer even looks like. Complexity in the design for complexity's sake is the kiss of death. Complexity without a clear, or even muddy, picture of the problem you are actually trying to solve for the actual customer is the kiss of death. Not having customers involved at every step of the design and development process is just arrogance. Believing you know better than the customer is just stupid."

Also, Exceptions to Brooks’ Law "Brooks’s Law: adding manpower to a late software project makes it later...It depends who the manpower is...Some teams can absorb more change than others...There are worse things than being later...There are different ways to add manpower...It depends on why the project was late to begin with...Adding people can be combined with other management action."

Fortune interview with Fred Brooks: "One is to officially slip the schedule. And officially doing it has many benefits over unofficially letting it slip...That is, if you're going to take a slip, get everybody onboard, get organized, and take a six-month slip, even though you may at the moment feel as if you're only four months late."

Wednesday, March 29, 2006

RDF Beanz

Robert Turner has been doing something interesting with JRDF and mapping RDF and Java together to get dynamic objects using RDF.

RDFBeans hot off the source control.

Tuesday, March 28, 2006

To be Web 2.0...

You must integrate with Google maps. Like FOAF Map.

Monday, March 27, 2006

All Code is Agile Code

Refactoring Test Code
Includes a list of code smells specifically for test code and this reason as to why test code is important: "The downside of having many tests, however, is that changes in functionality will typically involve changes in the test code as well. The more test code we get, the more important it becomes that this test code is as easily modifiable as the production code."

"The most common case for test code will be duplication of code in the same test class. This can be removed using Extract Method (F:110). For duplication across test classes, it may prove helpful to mirror the class hierarchy of the production code into the test class hierarchy. A word of caution however: moving duplicated code from two separate classes to a common class can introduce (unwanted) dependencies between tests."
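The Extract Method move on duplicated test code can be sketched like this, with a hypothetical test class and plain boolean checks standing in for a test framework:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical test class: the fixture creation that used to be
// copy-pasted into every test method has been pulled into a single
// helper via Extract Method.
class GraphTest {
    private Set<String> graph;

    // The extracted method; each test calls it instead of repeating the set-up.
    private void givenGraphWithTwoTriples() {
        graph = new HashSet<>();
        graph.add("s1 #sp p1");
        graph.add("s1 #sp p2");
    }

    boolean sizeIsTwo() {
        givenGraphWithTwoTriples();
        return graph.size() == 2;
    }

    boolean containsFirstTriple() {
        givenGraphWithTwoTriples();
        return graph.contains("s1 #sp p1");
    }
}
```

If the same helper is needed across several test classes, it can move up into a shared superclass, at the cost of the inter-test coupling the quote warns about.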

Friday, March 24, 2006

Making Things Easy

Sometimes you just need a little refactoring and time to solve an issue. In JRDF you used to have to do the following to remove triples as a result of find:
ClosableIterator iter = graph.find(ANY_SUBJECT_NODE, ANY_PREDICATE_NODE, ANY_OBJECT_NODE);
while (iter.hasNext()) {
    iter.next();
    iter.remove();
}

What users really wanted to do but couldn't because it would throw a ConcurrentModificationException is:
ClosableIterator iterator = graph.find(ANY_SUBJECT_NODE, ANY_PREDICATE_NODE, ANY_OBJECT_NODE);
graph.remove(iterator);

By putting this first bit of code inside the remove(Iterator) method, the operation can be performed much more simply by the user. The only implementation issue was determining whether the iterator came from the source graph or not.
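Pushed down into the graph, the old client-side loop might look like this. SimpleGraph and its String triples are a stand-in for the real JRDF Graph and ClosableIterator, which this sketch simplifies:

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// A minimal sketch, not the real JRDF classes: the graph exposes
// remove(Iterator), which walks the iterator and removes each triple,
// so callers no longer need the explicit while loop.
class SimpleGraph {
    private final Set<String> triples = new HashSet<>();

    void add(String triple) { triples.add(triple); }

    int size() { return triples.size(); }

    // JRDF's find(...) returns a ClosableIterator; a plain Iterator over
    // a copy stands in for it here. Iterating a copy also sidesteps the
    // ConcurrentModificationException the post mentions.
    Iterator<String> find() {
        return new HashSet<>(triples).iterator();
    }

    // The old client-side loop, now inside the graph itself.
    void remove(Iterator<String> iterator) {
        while (iterator.hasNext()) {
            triples.remove(iterator.next());
        }
    }
}
```

A caller then just writes `graph.remove(graph.find())`.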

For some reason, this solution did not present itself before - I'm putting it down to the code being more consistent (mainly the implementation of the iterators). The main reason, though, has to be that someone asked the question again: why did it need to be so complicated?

It seems very similar to the situations I increasingly find when combining TDD, 100% code coverage, zero duplication and reflecting on the design. The solutions just seem to fall out a lot more easily.

Thursday, March 23, 2006

Test Drive Parameter and Member Variables

Reflective parameter names in Java 6 points to Parameter names for Java 6 (Mustang) question "My colleague Paul Hammant is asking about how we would like reflective access to parameter names to work (if it is indeed implemented) in Mustang...I wonder whether this feature, in conjunction with an AOP framework, might allow a way to introduce keyword arguments with defaults, in a similar fashion to Python and Ruby?"

The example of enhancing the IDE code completion without source is a good example too.

My view is: it's not an option, it's extra metadata on classes that should always be available.

Tuesday, March 21, 2006

Words that aren't

Here are a few words that technical people tend to use that aren't in the dictionary or aren't used correctly:
* Architected - as in, "that's a well architected design" (2 million hits on Google - architecting has 3.3 million).
* Performant - as in, "that code is so fast, it's highly performant" (over 9 million hits on Google).
* Conformant - as in, "that code passes checkstyle, it's conformant code" (nearly 8 million hits on Google).

Monday, March 20, 2006

To the Source

Derivability, Redundancy and Consistency of Relations Stored in Large Data Banks "The first part of this paper is concerned with an explanation of a relational view of data. This view (or model) of data appears to be superior in several respects to the graph or network model [1, 2] presently in vogue. It provides a means of describing data with its natural structure only: that is, without superimposing any additional structure for machine representation purposes."

And the easier to find "A Relational Model of Data for Large Shared Data Banks". From, E. F. Codd.

Better Printer Software

HP gets 3.4x productivity gain from Agile Management techniques "These guys are awesome - Bret Dodd and Sterling Mortensen. Last year they attended Lean Design and Development and watched my presentation. They were so impressed and felt it was such a good fit for their process for development of printer firmware that they went back to HP and plotted the historical cumulative flow diagram."

"A 10x reduction in inventory in the system. A 5x reduction in WIP. A 3.4x increase in productivity with no new money, resources, people or any change in the way software engineering (development and test) were conducted. These figures are even better than my Microsoft XIT Sustained Engineering project results. Now here is the real kicker - a reduction in lead (cycle) time from 9 months to only 2 months. Printer firmware development at HP was never this good. Imagine what this means for the people. They now go home early on Friday afternoons, they don't work overtime, they have rediscovered their social lives, their families and their passions."

WIP = work in progress.

Also, Agile Practices that Scale links to "Seven Agile Team Practices That Scale (Part I of II)". Includes: Iteration foundation (time-boxed working code), The define/build/test component team, Smaller and more frequent releases, Two-level planning (small and large), Concurrent testing (all code is tested code), Continuous integration, and Regular reflection and adaptation.

Agile Journal looks interesting including "Agile Processes: Making Metrics Simple". This hits several nails on the head with definitions of code toxicity (like code duplication), hygienity (OO), and quality (bugs released).

Tuesday, March 14, 2006

XML Configuration - Howzat?

Does Wicket Suit Your Web Framework Style? "One Web framework style tends to favor external configuration over explicit Java code. Struts, for instance, relies on one or more XML configuration files to specify the flow of a Web application. While that style works well for some developers, XML files irritate just as many, who prefer to specify Web application logic in Java code instead.

According to a recent introductory article by Guillermo Castro about the Wicket framework:

[In Wicket] all the application logic falls inside the Java classes, instead of mixing it with the pages, like JSP (true separation of concerns). The Java code is glued to the HTML page by using a special wicket:id attribute that can be assigned to almost any HTML tag, and that tells Wicket where you want to render a component. Wicket comes with several components like Labels, Links, Lists, etc., which are uniquely defined on a webpage by setting an Id on the component, and the content which is represented by a Model.

If you're using Wicket, there's only one XML you really need to modify, web.xml, and this isn't even a Wicket requirement, but rather a servlet specification requirement (i.e. you can't make a servlet work if you don't define it in the xml)."

Copious Quality Content from Copia

An assessment of RDF/OWL modelling "We conclude that RDF/OWL is particularly suited to modelling applications which involve distributed information problems such as integration of data from multiple sources, publication of shared vocabularies to enable interoperability and development of resilient networks of systems which can cope with changes to the data models. It has less to offer in closed world or point-to-point processing problems where the data models are stable and the data is not to be made available to other clients."

"This same ability to handle irregular and optional data without losing all typing and structure information is also relevant to handling change over time.

A common requirement in many system designs is to allow loose coupling between clients and providers so that the providers can evolve over time without breaking existing clients (backward compatibility) and older providers can successfully respond to updated clients (forward compatibility).

Achieving this resilience to change is simpler using RDF/OWL than using a strict schema-validation approach, particularly due to the open world assumption."

Via, del.icio.us bookmarks for 2006-03-10.

Semantic hairball, y'all "if you had any idea how deadly seriously Big Business is taking this stuff: it's popular in terms of dollars and cents, even if it's not the gleam in your favorite blogger's eye). On one hand we have the Daedalos committee fastening labyrinth to labyrinth. On the other hand we have the tower of Web 2.0 Babel. We need a mob in the middle to burn 80% of the AI-one-more-time-for-your-mind-magic off of RDF, 80% of the chicago-cluster-consultant-diesel off of MDA, 80% of the toolkit-vendor-flypaper off of Web services. Once the ashes clear, we need folks to build lightweight tools that actually would help with extracting value from distributed information systems without scaring off the non-Ph.D.s."

Thursday, March 09, 2006

No Snappy Title

I posted this to the JRDF list but it hasn't come up on the archives - so it's here as well. I'm using this for a current project I'm doing at Uni - I don't know if it's a good topic but at least it's one I can complete in a sane amount of time and isn't completely reliant on working code.

Here's a brief description of the changes that I've made to the relational layer of JRDF. Basically, it now tries to closely follow the concepts of relations and tuples. So far, I think it's closer than previous attempts such as "A relational algebra for SPARQL". One of the ideas that I keep coming back to is that having duplicates is indicative of using bags/multisets, not sets - and RDF is all about sets. There's a whole stream of research on the power of bag languages ("Query Languages for Bags", which claims bag-oriented languages can't do transitive closure, for example) that I've yet to look at more fully.

Relational Tuples

Components:
  • Type name - integer, char, sno, name.
  • Attribute Name - status, city, sno, sname.
  • Attributes - status:integer, city:char, sno:sno, sname:name
  • Attribute:Value - sno sno('s1'), sname name('smith'), status 20, city 'london'
  • Heading - sno sno, sname name, status integer, city char.

Proposed Types for RDF

Basic interface:
  • isAssignableFrom - return true if the object is a super-type of the given type. Similar to Java's and Rel's.
  • getName - the name of the type.

Type hierarchy:
  • Object -> Subject -> Predicate. Meaning that Object is a super-type of Subject, which is a super-type of Predicate. This allows joining columns of different but compatible types.
  • URI Reference, Literal and BNode are all incompatible with each other - you won't be able to join these.

They are all nodes. In the future this will allow selecting certain types or certain operations to be performed only on certain types.
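A minimal sketch of how such a type interface and hierarchy might look. The enum encoding is an assumption of this sketch, not JRDF's actual implementation:

```java
// The basic interface described above: a name and an
// isAssignableFrom check, similar to Java's and Rel's.
interface NodeType {
    boolean isAssignableFrom(NodeType other);
    String getName();
}

// Object -> Subject -> Predicate: the ordinal encodes the hierarchy,
// so a type is assignable from any type at its level or below.
enum PositionType implements NodeType {
    OBJECT, SUBJECT, PREDICATE;

    public boolean isAssignableFrom(NodeType other) {
        return other instanceof PositionType
            && ((PositionType) other).ordinal() >= ordinal();
    }

    public String getName() { return name(); }
}

// URI Reference, Literal and BNode: mutually incompatible,
// so columns of these types never join with each other.
enum NodeKind implements NodeType {
    URI_REFERENCE, LITERAL, BNODE;

    public boolean isAssignableFrom(NodeType other) { return other == this; }

    public String getName() { return name(); }
}
```

With this encoding, an object column can join a predicate column (Object is assignable from Predicate), but a Literal column can never join a BNode column.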

Proposed JRDF Tuples

Components:
  • Types - subject, predicate, object, uri, literal, bnode. As defined above.
  • Attribute name - variable name or default name.
  • Attribute - s?:subject, P1:predicate, O1:object, P2:predicate, ?p:object, P3:predicate, ?city:object
  • Attribute:Value - s?:subject(#s1), P1:predicate(#name), O1:object('smith'), p?:predicate(#p1)
  • Heading - s? subject, P1 predicate, O1 object.


Proposed JRDF Relation

Components:
  • Heading/Attributes - set of attributes.
  • Body/Tuples - set of tuples


An aspect of this is that the heading of the relation doesn't modify the type in the attribute of the tuple. This means you always know the position of where in the graph the value came from. This used to bug me in Kowari/TKS that the underlying layers didn't know this information.

Example

Graph:
S1:subject | P1:predicate | O1:object
s1         | #sno         | s1
s1         | #sp          | p1
s1         | #sp          | p2
s2         | #sp          | p1
s2         | #sp          | p2
p1         | #city        | 'London'
p2         | #city        | 'Paris'

Query:

select ?sno ?pno ?city
...
where ?sno #sno s1
      ?sno #sp ?pno
      ?pno #city ?city

First Relation:

?sno:subject | P1:predicate | O1:object
s1           | #sno         | s1

Second Relation:

?sno:subject | P2:predicate | ?pno:object
s1           | #sp          | p1
s1           | #sp          | p2
s2           | #sp          | p1
s2           | #sp          | p2

Third Relation:

?pno:subject | P3:predicate | ?city:object
p1           | #city        | 'London'
p2           | #city        | 'Paris'

First and Second:

?sno:subject | P1:predicate | O1:object | P2:predicate | ?pno:object
s1           | #sno         | s1        | #sp          | p1
s1           | #sno         | s1        | #sp          | p2

First and Second and Third:

?sno:subject | P1:predicate | O1:object | P2:predicate | ?pno:object | P3:predicate | ?city:object
s1           | #sno         | s1        | #sp          | p1          | #city        | 'London'
s1           | #sno         | s1        | #sp          | p2          | #city        | 'Paris'

After Project:

?sno:subject | ?pno:object | ?city:object
s1           | p1          | 'London'
s1           | p2          | 'Paris'
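The join-then-project evaluation above can be sketched in Java. Tuples here are maps from attribute name to value; the class and method names are illustrative, not JRDF's actual API:

```java
import java.util.*;

// Sketch of natural join and projection over relations whose tuples
// are maps from attribute name to value.
class RelationOps {
    // Natural join: pair up tuples that agree on every shared attribute.
    static List<Map<String, String>> join(List<Map<String, String>> left,
                                          List<Map<String, String>> right) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> l : left) {
            for (Map<String, String> r : right) {
                if (compatible(l, r)) {
                    Map<String, String> merged = new LinkedHashMap<>(l);
                    merged.putAll(r);
                    result.add(merged);
                }
            }
        }
        return result;
    }

    // Tuples are compatible when every attribute they share has the same value.
    private static boolean compatible(Map<String, String> l, Map<String, String> r) {
        for (String key : l.keySet()) {
            if (r.containsKey(key) && !r.get(key).equals(l.get(key))) {
                return false;
            }
        }
        return true;
    }

    // Projection: keep only the named attributes, dropping duplicate tuples
    // because relations are sets.
    static List<Map<String, String>> project(List<Map<String, String>> rel,
                                             Set<String> attrs) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> tuple : rel) {
            Map<String, String> projected = new LinkedHashMap<>(tuple);
            projected.keySet().retainAll(attrs);
            if (!result.contains(projected)) {
                result.add(projected);
            }
        }
        return result;
    }
}
```

Joining the second relation (?sno, ?pno) with the third (?pno, ?city) on the shared ?pno attribute, then projecting ?sno and ?city, reproduces the tables above.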

Overcoming the problem with static methods in Java

There are a number of possible solutions that I've recently had the opportunity to see other people try when test-driving static methods. Generally, when I've come across it, it's been a refactoring job.

The general process is: create an interface and change the statics to be normal instance methods on the class.

The problem is when it's something you can't change, for whatever reason.

If you've been left the chance to extend it, you can simply add methods that wrap the statics and base these new methods on an interface.

If you're unable to extend it then you create a wrapper (or boundary) class and a matching interface.
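A sketch of that wrapper (or boundary) approach; LegacyClock, Clock and Stamper are hypothetical names, not from any real library:

```java
// An unchangeable class exposing only a static method.
final class LegacyClock {
    static long now() { return System.currentTimeMillis(); }
}

// The matching interface that test code can substitute.
interface Clock {
    long now();
}

// The wrapper (boundary) class: delegates to the static method.
class LegacyClockWrapper implements Clock {
    public long now() { return LegacyClock.now(); }
}

// Production code depends on the interface, so tests can inject
// a fixed clock instead of hitting the static call.
class Stamper {
    private final Clock clock;

    Stamper(Clock clock) { this.clock = clock; }

    long stamp() { return clock.now(); }
}
```

In production you pass `new LegacyClockWrapper()`; in a test you pass a stub (Clock has a single method, so a lambda will do).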

Graphing Gaffs

Five Signs of Trouble in an Iteration "During the course of an iteration, an agile team is able to track its own progress through the use of burndown charts. The team and the process facilitator can use the burndown chart to watch for signs of trouble. As a coach, I find the following five burndown shapes are common indicators of trouble."

A sixth shape I would put down to doing work - like refactoring or building test utilities - that then makes the expected remaining work smaller.

Data and Metadata

Hysteresis, History and empty metadata fields "Hysteresis occurs whenever the effect that accompanies some cause is delayed for some reason. The term is most often associated with processes in the physical world. The movement of interest rates, the growth of insect populations, the rise and fall of magnetic fields, that sort of thing."

"There is a hysteresis-based relationship between content and non-trivial metadata about the content."

"Writers write and categorizers categorize. There is an unavoidable delay between the two activities. The writers and the categorizers can be the same people but the activities are very different and cannot be done at the same time. Build this hysteresis into your workflows rather than fight against it. The alternative is blank or dummy metadata fields."

Riki

KaukoluWiki "KaukoluWiki is a Java-JSP-based Semantic Wiki that manages its data by using Semantic Web tools.

The main reason for developing such a Semantic Wiki was the fact that most Web pages lack machine-readable semantics, i.e. means to include the meaning of a certain piece of data in a formalized representation. This is the reason why automated integration of knowledge and reasoning over this knowledge is not possible yet."

Via, gnowsis 0.9 technology preview.

Wednesday, March 08, 2006

Still Evolving

Still Evolving, Human Genes Tell New Story "Some are genes involved in digesting particular foods like the lactose-digesting gene common in Europeans. Some are genes that mediate taste and smell as well as detoxify plant poisons, perhaps signaling a shift in diet from wild foods to domesticated plants and animals."

"Dr. Pritchard's test for selection rests on the fact that an advantageous mutation is inherited along with its gene and a large block of DNA in which the gene sits. If the improved gene spreads quickly, the DNA region that includes it will become less diverse across a population because so many people now carry the same sequence of DNA units at that location."

Via Slashdot.

More on Benefits of Pair Programming

A slightly dated bibliography on Pair Programming.

A Pair Programming Experience "The error analysis showed the project had achieved an error rate that was three orders of magnitude less than normal for the organization. Integration of the first two components (approximately 10,000 source lines) was completed with only two coding errors and one design error. The third component was integrated with no errors. The remaining three components had more errors, but the number of errors for these components was significantly less than normal."

Also, linked via Pair Programming.com.

Wednesday, March 01, 2006

XP Sunscreen

The New XP "...nature continuously uses fractal structures, which are similar to themselves but at various scales. The same principle should be applied to software development: we should be able to reuse similar solutions, in different contexts."

Or as Greg says, "Only by pursuing code reuse in the small will you ever achieve code reuse in the large."

"Software defects must be looked for, found and fixed in many ways (pair programming, automated testing, sit together, real customer involvement, etc.). This is redundant, because many defects will be found many times. However, quality is priceless."

"...quality must always be at its maximum. Accepting lower quality yields neither savings nor faster development. On the contrary, improving quality necessarily improves other system features, like productivity and efficiency. Moreover, quality is not only an economic factor. Team members must be proud of their work because it improves team self-esteem and effectiveness."

"It is easy to order developers to “Do this” or “Do that”, but it does not work. Unavoidably, you ask for less than what could be achieved or, more likely, for more than can be accomplished."

Natural Enemies of XP

The corner desk...

Meta-system

The Most Important Idea in Computer Science links to two Alan Kay articles. The most interesting, "A Conversation with Alan Kay", "So the problem is—I’ve said this about both Smalltalk and Lisp—they tend to eat their young. What I mean is that both Lisp and Smalltalk are really fabulous vehicles, because they have a meta-system. They have so many ways of dealing with problems that the early-binding languages don’t have, that it’s very, very difficult for people who like Lisp or Smalltalk to imagine anything else."

"I feel like my answers are quite trivial since nobody really knows how to design a good language, including me."

Punny

The semantics of BYO… "As more wine makers switch away from the venerable cork stopper to the more pragmatic yet unromantic screw-top cap, I wonder if BYO restaurants will start to charge ‘torque-age’ instead of ‘corkage’?"

Fully Covered for Quality

In pursuit of code quality: Don't be fooled by the coverage report "I'll say it one more time: you can (and should) use test coverage tools as part of your testing process, but don't be fooled by the coverage report. The main thing to understand about coverage reports is that they're best used to expose code that hasn't been adequately tested. When you examine a coverage report, seek out the low values and understand why that particular code hasn't been tested fully. Knowing this, developers, managers, and QA professionals can use test coverage tools where they really count -- namely for three common scenarios:

* Estimating the time to modify existing code
* Evaluating code quality
* Assessing functional testing"

Also a previous article, Measure test coverage with Cobertura: "The general philosophy is this: if it can't break on its own, it's too simple to break. First example is the getX() method. Suppose the getX() method only answers the value of an instance variable. In that case, getX() cannot break unless either the compiler or the interpreter is also broken. For that reason, don't test getX(); there is no benefit. The same is true of the setX() method, although if your setX() method does any parameter validation or has any side effects, you likely need to test it."

I don't agree. I've lost count of the number of bugs I've found in code that was "too simple to break." It's true that some getters and setters are so trivial that there's no way they can fail."
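A minimal illustration, with a hypothetical Percentage class, of why a setter that validates is no longer "too simple to break": the guard condition can be wrong in ways only a test will catch.

```java
// Hypothetical example: the getter really is too simple to break,
// but the setter carries validation logic worth a test.
class Percentage {
    private int value;

    // Only answers an instance variable: cannot break on its own.
    int getValue() { return value; }

    // Parameter validation: an off-by-one in either bound is a real,
    // findable bug, so this deserves coverage.
    void setValue(int value) {
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("out of range: " + value);
        }
        this.value = value;
    }
}
```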

"In theory, there's no guarantee that writing tests for uncovered code will reveal bugs. In practice, I've never seen it fail to find them. Untested code is full of bugs. The fewer tests you have, the more undiscovered bugs lurk in your code."

Both talk about a free code coverage tool, Cobertura.

A recent thread on the AJUG QLD list.