Dynalite
Awesome new mock DynamoDB implementation:
An implementation of Amazon's DynamoDB, focussed on correctness and performance, and built on LevelDB (well, @rvagg's awesome LevelUP to be precise). This project aims to match the live DynamoDB instances as closely as possible (and is tested against them in various regions), including all limits and error messages.
Why not Amazon's DynamoDB Local? Because it's too buggy! And it differs too much from the live instances in a number of key areas.
We use DynamoDBLocal in our tests -- the availability of that tool is one of the key reasons we have adopted Dynamo so heavily, since we can safely test our code properly with it. This looks even better.
dynamodb testing unit-tests integration-testing tests ops dynalite aws leveldb
november 2015 by jm
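For tests like the ones described above, the client just needs to be pointed at the local endpoint instead of live AWS. A minimal sketch using boto3, assuming a dynalite instance is already running on localhost:4567 -- the port, table name, and dummy credentials are all illustrative:

    # Sketch: pointing boto3 at a locally running dynalite for tests.
    # Assumes something like `dynalite --port 4567` is already running.
    import boto3

    dynamodb = boto3.client(
        "dynamodb",
        endpoint_url="http://localhost:4567",  # local dynalite, not live AWS
        region_name="us-east-1",               # any region string works locally
        aws_access_key_id="fake",              # dummy credentials are fine
        aws_secret_access_key="fake",
    )

    dynamodb.create_table(
        TableName="test-table",
        KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
        ProvisionedThroughput={"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
    )
    # dynalite simulates table-creation delay, so wait as you would on live AWS
    dynamodb.get_waiter("table_exists").wait(TableName="test-table")

    dynamodb.put_item(TableName="test-table",
                      Item={"id": {"S": "42"}, "value": {"S": "hello"}})
    resp = dynamodb.get_item(TableName="test-table", Key={"id": {"S": "42"}})
    assert resp["Item"]["value"]["S"] == "hello"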
The New InfluxDB Storage Engine: A Time Structured Merge Tree
The new engine has similarities with LSM Trees (like LevelDB and Cassandra’s underlying storage). It has a write ahead log, index files that are read-only, and it occasionally performs compactions to combine index files. We’re calling it a Time Structured Merge Tree because the index files keep contiguous blocks of time and the compactions merge those blocks into larger blocks of time. Compression of the data improves as the index files are compacted. Once a shard becomes cold for writes it will be compacted into as few files as possible, which yields the best compression.
influxdb storage lsm-trees leveldb tsm-trees data-structures algorithms time-series tsd compression
october 2015 by jm
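To make the description above concrete, here's a toy of the same shape: buffered writes flushed into immutable per-time-block files, plus a compaction step that merges contiguous blocks into bigger ones. Purely illustrative -- this is not InfluxDB's code, and all names are made up:

    # Toy sketch of the TSM idea: points land in files that each cover a
    # contiguous block of time; "compaction" merges adjacent blocks into
    # one larger block (larger runs of similar data compress better).
    class ToyTSM:
        def __init__(self, block_seconds=3600):
            self.block_seconds = block_seconds
            self.memtable = []       # in-memory buffer (WAL stand-in)
            self.blocks = {}         # block start time -> sorted [(ts, value)]

        def write(self, ts, value):
            self.memtable.append((ts, value))
            if len(self.memtable) >= 1000:
                self.flush()

        def flush(self):
            # move buffered points into the block covering their time range
            for ts, value in sorted(self.memtable):
                start = ts - ts % self.block_seconds
                self.blocks.setdefault(start, []).append((ts, value))
            self.memtable.clear()

        def compact(self, start_a, start_b):
            # merge two contiguous blocks into one larger block of time
            merged = sorted(self.blocks.pop(start_a) + self.blocks.pop(start_b))
            self.blocks[min(start_a, start_b)] = merged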
Benchmarking LevelDB vs. RocksDB vs. HyperLevelDB vs. LMDB Performance for InfluxDB
A few interesting things come out of these results. LevelDB is the winner on disk space utilization, RocksDB is the winner on reads and deletes, and HyperLevelDB is the winner on writes. On smaller runs (30M or less), LMDB came out on top on most of the metrics except for disk size. This is actually what we’d expect for B-trees: they’re faster the fewer keys you have in them.
Mind you, I'd prefer if this had tunable read/write/delete ratios, as YCSB does. Take with a pinch of salt, as with all benchmarks!
benchmarks leveldb datastores storage hyperleveldb rocksdb ycsb lmdb influxdb
june 2014 by jm
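The tunable mix the comment above wishes for is only a few lines in a YCSB-style driver. A sketch against any dict-like key-value store; the operation weights, value size, and keyspace are arbitrary:

    # Sketch of a mixed workload with tunable read/write/delete ratios.
    # `store` is anything dict-like; swap in a real KV-store wrapper.
    import random

    def run_workload(store, ops=100_000, read=0.8, write=0.15, delete=0.05,
                     keyspace=1_000_000, seed=42):
        rng = random.Random(seed)
        for _ in range(ops):
            key = str(rng.randrange(keyspace))
            op = rng.choices(("read", "write", "delete"),
                             weights=(read, write, delete))[0]
            if op == "read":
                store.get(key)
            elif op == "write":
                store[key] = "x" * 100      # 100-byte values, arbitrary
            else:
                store.pop(key, None)

    run_workload({}, ops=10_000)  # smoke test against a plain dict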
Concurrency Improvements in HyperLevelDB
Good-looking benchmark results here from HyperDex.
hyperdex hyperleveldb leveldb rocksdb concurrency lock-free storage persistence
june 2014 by jm
Faster BAM Sorting with SAMtools and RocksDB
Now this is really really clever. Heap-merging a heavyweight genomics format, using RocksDB to speed it up.
There’s a problem with the single-pass merge described above when the number of intermediate files, N/R, is large. Merging the sorted intermediate files in limited memory requires constantly reading little bits from all those files, incurring a lot of disk seeks on rotating drives. In fact, at some point, samtools sort performance becomes effectively bound to disk seeking. [...] In this scenario, samtools rocksort can sort the same data in much less time, using no more memory, by invoking RocksDB’s background compaction capabilities. With a few extra lines of code we configure RocksDB so that, while we’re still in the process of loading the BAM data, it runs additional background threads to merge batches of existing sorted temporary files into fewer, larger, sorted files. Just like the final merge, each background compaction requires only a modest amount of working memory.
(via the RocksDB facebook group)
rocksdb algorithms sorting leveldb bam samtools merging heaps compaction
may 2014 by jm
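The underlying trick is easy to demonstrate: instead of one k-way merge over all N/R runs, first merge small batches of runs into fewer, larger runs, so the final pass only holds a handful of inputs open at once. A sketch with heapq, where in-memory lists stand in for the sorted temporary files -- this is the strategy, not samtools rocksort itself:

    # Two-level merge: batch-merge sorted runs into larger runs (what
    # RocksDB's background compaction does here), then do the final merge
    # over far fewer inputs.
    import heapq
    import random

    def two_level_merge(runs, batch=8):
        # first level: merge each batch of sorted runs into one larger run
        while len(runs) > batch:
            runs = [list(heapq.merge(*runs[i:i + batch]))
                    for i in range(0, len(runs), batch)]
        # final pass: only `batch` or fewer inputs remain open at once
        return heapq.merge(*runs)

    runs = [sorted(random.sample(range(1000), 50)) for _ in range(64)]
    assert list(two_level_merge(runs)) == sorted(sum(runs, []))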
Basho LevelDB supports tiered storage
Tiered storage is turning out to be a pretty practical trick to take advantage of SSDs:
The justification for two types/speeds of storage arrays is simple. leveldb is extremely write intensive in its lower levels. The write intensity drops off as the level number increases. Similarly, current and frequently updated data tends to be in lower levels while archival data tends to be in higher levels. These leveldb characteristics create a desire to have faster, more expensive storage arrays for the high intensity lower levels. This branch allows the high intensity lower levels to be on expensive storage arrays while slower, less expensive storage arrays hold the higher level data to reduce costs.
caching tiered-storage storage ssds ebs leveldb basho patches riak iops
april 2014 by jm
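The branch's idea reduces to a level-to-storage-array mapping along these lines; the paths and tier threshold below are hypothetical:

    # Hot, write-intensive low levels on the fast array; cold high levels
    # on the cheap array. Values here are made up for illustration.
    FAST_ARRAY = "/mnt/ssd/leveldb"    # e.g. SSD or provisioned-IOPS EBS
    SLOW_ARRAY = "/mnt/hdd/leveldb"    # e.g. cheap magnetic storage
    TIER_THRESHOLD = 2                 # levels 0..2 hot, 3+ cold

    def path_for_level(level):
        array = FAST_ARRAY if level <= TIER_THRESHOLD else SLOW_ARRAY
        return "%s/level-%d" % (array, level)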
Extending graphite’s mileage
Ad company InMobi are using graphite heavily (albeit not as heavily as $work are); they ran into the usual scaling issues, and chose to fix them in code by switching from a filesystem full of whisper files to a LevelDB per carbon-cache:
The carbon server is now able to run without breaking a sweat even when 500K metrics per minute is being pumped into it. This has been in production since late August 2013 in every datacenter that we operate from.
Very nice. I hope this gets merged/supported.
graphite scalability metrics leveldb storage inmobi whisper carbon open-source
january 2014 by jm
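The post doesn't give InMobi's key layout, but the general trick for time series in an ordered key-value store looks like this: encode keys so they sort by (metric, timestamp), making a time-range read one sequential scan. A sketch using the plyvel LevelDB binding; the key scheme is an assumption, not necessarily theirs:

    # One point per key; big-endian timestamps keep keys in chronological
    # order per metric, so range reads are sequential.
    import struct
    import plyvel

    db = plyvel.DB("/tmp/carbon-cache", create_if_missing=True)

    def key(metric, ts):
        return metric.encode() + b"\x00" + struct.pack(">Q", ts)

    def write_point(metric, ts, value):
        db.put(key(metric, ts), struct.pack(">d", value))

    def read_range(metric, start_ts, end_ts):
        # one sequential scan; `stop` is exclusive
        it = db.iterator(start=key(metric, start_ts), stop=key(metric, end_ts))
        for k, v in it:
            yield struct.unpack(">Q", k[-8:])[0], struct.unpack(">d", v)[0]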
RocksDB
'A persistent key-value store for fast storage environments', i.e. a BerkeleyDB/LevelDB competitor, from Facebook.
RocksDB builds on LevelDB to be scalable to run on servers with many CPU cores, to efficiently use fast storage, to support IO-bound, in-memory and write-once workloads, and to be flexible to allow for innovation.
We benchmarked LevelDB and found that it was unsuitable for our server workloads. The benchmark results look awesome at first sight, but we quickly realized that those results were for a database whose size was smaller than the size of RAM on the test machine -- where the entire database could fit in the OS page cache. When we performed the same benchmarks on a database that was at least 5 times larger than main memory, the performance results were dismal.
By contrast, we've published the RocksDB benchmark results for server side workloads on Flash. We also measured the performance of LevelDB on these server-workload benchmarks and found that RocksDB solidly outperforms LevelDB for these IO-bound workloads. We found that LevelDB's single-threaded compaction process was insufficient to drive server workloads. We saw frequent write-stalls with LevelDB that caused 99th-percentile latency to be tremendously large. We found that mmap-ing a file into the OS cache introduced performance bottlenecks for reads. We could not make LevelDB consume all the IOs offered by the underlying Flash storage.
Lots of good discussion at https://news.ycombinator.com/item?id=6736900 too.
flash ssd rocksdb databases storage nosql facebook bdb disk key-value-stores lsm leveldb
november 2013 by jm
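The headline fix called out above -- multithreaded background compaction where LevelDB has a single compaction thread -- shows up as a tuning knob in the bindings. A sketch using the python-rocksdb binding; the option names and values here are assumptions for illustration, not Facebook's benchmark configuration:

    # Open RocksDB with several background compaction threads, the core
    # difference from LevelDB's single-threaded compaction.
    import rocksdb

    opts = rocksdb.Options(
        create_if_missing=True,
        max_background_compactions=4,        # parallel compaction threads
        write_buffer_size=64 * 1024 * 1024,  # 64 MB memtable, arbitrary
    )
    db = rocksdb.DB("/tmp/test.db", opts)
    db.put(b"key", b"value")
    assert db.get(b"key") == b"value"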
LMDB response to a LevelDB-comparison blog post
This seems like a good point to note about LMDB in general:
We state quite clearly that LMDB is read-optimized, not write-optimized. I wrote this for the OpenLDAP Project; LDAP workloads are traditionally 80-90% reads. Write performance was not the goal of this design, read performance is. We make no claims that LMDB is a silver bullet, good for every situation. It’s not meant to be – but it is still far better at many things than all of the other DBs out there that *do* claim to be good for everything.
lmdb leveldb databases openldap storage persistent
august 2013 by jm
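That read-optimized design is visible in the API: read transactions are cheap MVCC snapshots that run concurrently with the single writer. A minimal sketch with the lmdb Python binding; the path, map size, and LDAP-flavoured keys are illustrative:

    # LMDB: one writer at a time, many concurrent readers.
    import lmdb

    env = lmdb.open("/tmp/lmdb-demo", map_size=1 << 30)  # 1 GiB map

    with env.begin(write=True) as txn:       # the single write transaction
        txn.put(b"uid=jm", b"cn=Justin Mason")

    with env.begin() as txn:                 # read txn: an MVCC snapshot,
        assert txn.get(b"uid=jm") == b"cn=Justin Mason"  # never blocks writers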
HyperLevelDB: A High-Performance LevelDB Fork
'HyperLevelDB improves on LevelDB in two key ways:
Improved parallelism: HyperLevelDB uses more fine-grained locking internally to provide higher throughput for multiple writer threads.
Improved compaction: HyperLevelDB uses a different method of compaction that achieves higher throughput for write-heavy workloads, even as the database grows.'
leveldb storage key-value-stores persistence unix libraries open-source
june 2013 by jm
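A toy illustration of what finer-grained locking buys multiple writers: stripe the keyspace across several locks so threads only contend when they hit the same stripe. This sketches the concept only -- it is not HyperLevelDB's internals:

    # Striped locking: N locks instead of one global mutex.
    import threading

    class StripedDict:
        def __init__(self, stripes=16):
            self.locks = [threading.Lock() for _ in range(stripes)]
            self.shards = [{} for _ in range(stripes)]

        def _stripe(self, key):
            return hash(key) % len(self.locks)

        def put(self, key, value):
            i = self._stripe(key)
            with self.locks[i]:      # writers contend only within one stripe
                self.shards[i][key] = value

        def get(self, key):
            i = self._stripe(key)
            with self.locks[i]:
                return self.shards[i].get(key)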
SSTable and Log Structured Storage: LevelDB
Good writeup of LevelDB's native storage formats: the Sorted String Table (SSTable), Log Structured Merge Trees, and Snappy compression.
leveldb nosql data storage disk persistence google
july 2012 by jm
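The SSTable idea itself fits in a few lines: write records in key order, keep an in-memory index of key-to-offset, and a lookup becomes a binary search plus one seek. An illustrative sketch, not LevelDB's on-disk format (no blocks, no Snappy, and a full rather than sparse index, for simplicity):

    # Minimal SSTable: sorted records on disk, key->offset index in memory.
    import bisect, json, struct

    def write_sstable(path, items):
        index = []                            # (key, offset) pairs
        with open(path, "wb") as f:
            for key, value in sorted(items.items()):
                index.append((key, f.tell()))
                rec = json.dumps([key, value]).encode()
                f.write(struct.pack(">I", len(rec)) + rec)
        return index

    def sstable_get(path, index, key):
        keys = [k for k, _ in index]
        i = bisect.bisect_left(keys, key)     # binary search over the index
        if i == len(keys) or keys[i] != key:
            return None
        with open(path, "rb") as f:
            f.seek(index[i][1])               # one seek, one read
            (length,) = struct.unpack(">I", f.read(4))
            return json.loads(f.read(length))[1]

    idx = write_sstable("/tmp/demo.sst", {"b": 2, "a": 1, "c": 3})
    assert sstable_get("/tmp/demo.sst", idx, "b") == 2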