Monday, September 20, 2004

Scaling Redland

Impact of storing RDF triples with Redland: "In prototyping Archipel, my pet software configuration management system, I started to use Redland to store the version information as RDF triples. I quickly realised that the RDF storage (stored by Redland using Berkeley DB) was using a lot of space, compared to what I was doing. For instance, storing 143 files generated a database of 624 KB, plus a directory containing the actual file content (the RDF storage contained only versioning information). This is something like 5 to 6 times the size I was expecting."

"It seems that, unless I am using the Redland Python API improperly, Redland has a significant overhead when storing triples. I hoped to use it as a storage backend for Archipel, because I liked the idea of managing version information in RDF, but the overhead is disappointing, if not discouraging.

However, Redland scales really well, and obviously will not grow in an unexpected manner when you reach millions of triples, which makes it really robust."
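
To put the quoted numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only reworks the figures from the quote (143 files, a 624 KB database, "5 to 6 times" the expected size); the function names are mine, and I am assuming "Kb" means kilobytes of 1024 bytes. It does not touch Redland at all.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# Assumption: "624Kb" means 624 kilobytes of 1024 bytes each.

def bytes_per_file(db_size_kb: float, n_files: int) -> float:
    """Average database bytes spent per versioned file."""
    return db_size_kb * 1024 / n_files

def implied_expected_kb(db_size_kb: float, factor: float) -> float:
    """Database size the author was expecting, given the overshoot factor."""
    return db_size_kb / factor

per_file = bytes_per_file(624, 143)
expected_low = implied_expected_kb(624, 6)   # if the overshoot was 6x
expected_high = implied_expected_kb(624, 5)  # if the overshoot was 5x

print(f"about {per_file:.0f} bytes of versioning metadata per file")
print(f"implied expected size: {expected_low:.1f} to {expected_high:.1f} KB")
```

So the store costs roughly 4.4 KB of metadata per file, where the author seems to have been expecting a database of about 104 to 125 KB in total.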
