Thursday, September 06, 2007

Perfect Storm (of Erlang)

It would be remiss of me not to keep track one of the other contenders for scalability. The first hint was a few days ago with, "CouchDB: Thinking beyond the RDBMS":
CouchDB on first look seems like the future of database without the weight that is SQL and write consistency.

It stores document in a flat space.

There are no schemas. But you do store (and retrieve) JSON objects. Cool kids rejoice.

And all this happens using Real REST (you know, the one with PUT, DELETE and no envelopes to hide stuff), so it doesn’t matter that CouchDB is implemented in Erlang. (In fact, Erlang is a feature)

CouchDB and for even cooler kids, there's a Ruby API too.

The other bit that adds to this is the RDF JSON specification or RDFON. The first specification doesn't support nesting which means only named blank nodes are supported - which I think is a bug - naming things, especially things that don't naturally have names, can be a pain. The second was criticized as being a diversion away from N3.

I'm for a way to make it easier for programmers and web browsers to have better access to RDF data. Whether there needs to be another format to describe RDF is a bit unclear though - although being able to store it in a scalable way, perform queries and update it through REST seems like a positive. JSON is in that fuzzy area between code and data with its own drawbacks (like security).

Just to keep the pro scale-out/Hadoop/MapReduce etc. architecture going here's a recent posting on the more general issue:
Functional programming paradigms, the map/reduce pattern, and to a lesser extent distributed and parallel processing in general are subjects not widely understood by most quasi-technical management. Further, the notion of commodity machines with guarenteed lack of reliability as a means of achieving high performance and high scalability is essentially counterintuitive. Even referring newcomers to what I regard as the seminal papers on these topics (the papers written by Ghemewat, Dean et al at Google, and yes I know it all started with LISP but my management wasn't even alive in the 1970s although I was :-)), people steeped in a long tradition of "shared everything" database architectures still don't quite get it. I spend considerable amounts of time in what amounts to management de-programming: No MySQL can't do this, and Oracle can't either, except with Oracle it will cost you a lot more to find that out.


Update: Dare corrects some mistakes made in the above posting about CouchDB. The way I read the original was it wasn't saying it was replacing relational databases just offering a better solution for certain kinds of problems. Where, "thinking beyond" is not being constrained by the expected features of a database.

No comments: