The ideas in this paper will be incorporated into the Vertica database product. Unfortunately, it won’t be open source, at least according to a comment one company employee left on Slashdot.
Just as RAID levels (e.g. 1, 5 and 10) use redundancy to survive drive failures, the Vertica system will distribute the same slice of the database to several servers. A grid of commodity hardware can then act as a high-availability system, and Vertica’s shared-nothing architecture enables this without complex design or execution.
We call a system that tolerates K failures K-safe. C-Store will be configurable to support a range of values of K.
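To make K-safety concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme, where each slice (segment) of the database is copied to K + 1 distinct servers in round-robin order. The function names and the placement policy are hypothetical illustrations, not how C-Store or Vertica actually assign data.

```python
def place_segments(num_segments, nodes, k):
    """Assign each segment to k + 1 distinct nodes, round-robin.

    Hypothetical placement policy for illustration only.
    """
    if k + 1 > len(nodes):
        raise ValueError("need at least k + 1 nodes to be k-safe")
    return {
        seg: [nodes[(seg + i) % len(nodes)] for i in range(k + 1)]
        for seg in range(num_segments)
    }


def is_available(placement, failed):
    """Return True if every segment still has at least one live copy."""
    failed = set(failed)
    return all(
        any(node not in failed for node in replicas)
        for replicas in placement.values()
    )


nodes = ["a", "b", "c", "d"]
placement = place_segments(4, nodes, k=1)

# With k = 1, losing any single node leaves every segment readable.
single_failures_ok = all(is_available(placement, [n]) for n in nodes)

# Losing two nodes that hold the same segment is beyond what k = 1 covers.
double_failure_ok = is_available(placement, ["a", "b"])
```

With K = 1 every segment has two copies, so any one server can fail. Raising K buys tolerance for more failures at the cost of more storage.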
Inserts and updates are performed on a separate write store and merged into the read-only store in batches. Deletes are marked with bitmasks. Rather than building a complex locking scheme for grid members, data in the read-only and write stores is stamped with a time “epoch”. Each query specifies an epoch and reads the data as of that epoch. It’s an elegant implementation that is very well suited to a data warehouse.
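The epoch idea can be sketched in a few lines of Python. This is a toy model under loud assumptions: the class and method names are hypothetical, rows live in plain lists, and a delete is recorded as a deletion epoch on the row instead of in a bitmask. It only illustrates how a query pinned to an epoch can read consistently without locks.

```python
class EpochStore:
    """Toy write store + read store with epoch-stamped rows (illustrative only)."""

    def __init__(self):
        self.current_epoch = 0
        # Each row is [key, value, insert_epoch, delete_epoch or None].
        self.read_store = []
        self.write_store = []

    def insert(self, key, value):
        # New rows always land in the write store, stamped with the current epoch.
        self.write_store.append([key, value, self.current_epoch, None])

    def delete(self, key):
        # Mark live rows as deleted instead of removing them.
        for row in self.read_store + self.write_store:
            if row[0] == key and row[3] is None:
                row[3] = self.current_epoch

    def advance_epoch(self):
        self.current_epoch += 1

    def merge(self):
        # Batch-move everything from the write store into the read store.
        self.read_store.extend(self.write_store)
        self.write_store = []

    def query(self, epoch):
        # A row is visible if it was inserted at or before the epoch
        # and not deleted at or before it.
        return {
            row[0]: row[1]
            for row in self.read_store + self.write_store
            if row[2] <= epoch and (row[3] is None or row[3] > epoch)
        }


store = EpochStore()
store.insert("a", 1)      # epoch 0
store.advance_epoch()
store.insert("b", 2)      # epoch 1
store.merge()             # batch merge into the read store
store.advance_epoch()
store.delete("a")         # marked deleted as of epoch 2

at_epoch_0 = store.query(0)
at_epoch_1 = store.query(1)
at_epoch_2 = store.query(2)
```

A query pinned to epoch 1 still sees row "a" even after the later delete, which is the property that lets readers proceed without waiting on writers.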