Wednesday, January 23, 2008

Solid State Drives

I was wondering what the effect would be on data structures and algorithms given the new features of SSD (solid state drives) - that is, drives with ten times the seek performance and improved read speeds. Desktop hard drives are about the same (18% slower) or beat SSDs in a straight line, sequential access, especially writes. In the short term, it looks like solid state drives biggest impact is likely to be in providing laptop drives the same performance as desktop ones.

A secondary short term impact may well be in providing another level of storage between spinning hard disks and other caches. This is mentioned in, "The five-minute rule twenty years later, and how flash memory changes the rules".

The name of their rule refers to the break-even interval between accesses. If a record (or page) is accessed more often, it should be kept in memory; otherwise, it should remain on disk and read when needed.

Not surprisingly, the optimal page size for B-tree indexes on modern high-bandwidth disks is much larger than traditional database systems have employed. The access time dominates for all small page sizes, such that additional byte transfer and thus additional utility are almost free. B-tree nodes of 256 KB are very near optimal...a traditional rotating hard disk, Table 3 indicates 337 seconds or just over 5 minutes.

Due to the lack of mechanical seeking and rotation, the transfer time dominates even for small pages. The optimal page size for B-trees on flash memory is 2 KB, much smaller than for traditional disk drives. In Table 3, the break-even interval for pages of 4 KB is 351 seconds.

Using O’Neil’s SB-trees, extents of 256 KB are the units of transfer between flash memory and disk, whereas pages of 4 KB are the unit of transfer between RAM and flash memory.


Mentions this paper, SB-Tree : An Index-Sequential Structure for High-Performance Sequential Access.

Via, Flash Memory and Databases.

No comments: