jm + leveldb   13

Dynalite
Awesome new mock DynamoDB implementation:
An implementation of Amazon's DynamoDB, focussed on correctness and performance, and built on LevelDB (well, @rvagg's awesome LevelUP to be precise). This project aims to match the live DynamoDB instances as closely as possible (and is tested against them in various regions), including all limits and error messages.

Why not Amazon's DynamoDB Local? Because it's too buggy! And it differs too much from the live instances in a number of key areas.


We use DynamoDBLocal in our tests -- the availability of that tool is one of the key reasons we have adopted Dynamo so heavily, since we can safely test our code properly with it. This looks even better.
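
A rough sketch of what testing against it looks like from Python, assuming boto3 as the client and dynalite already listening locally (the port, table name and key schema below are illustrative, not taken from the project docs):

    # Point an AWS SDK client at a locally-running dynalite process instead
    # of live DynamoDB. Assumes dynalite is already listening on
    # localhost:4567; port, table name and key schema are made up.
    import boto3

    dynamodb = boto3.resource(
        "dynamodb",
        endpoint_url="http://localhost:4567",   # local mock, not AWS
        region_name="us-east-1",
        aws_access_key_id="fake",
        aws_secret_access_key="fake",
    )

    table = dynamodb.create_table(
        TableName="test-table",
        KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
        ProvisionedThroughput={"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
    )
    table.wait_until_exists()
    table.put_item(Item={"id": "42", "payload": "hello"})
    assert table.get_item(Key={"id": "42"})["Item"]["payload"] == "hello"
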
dynamodb  testing  unit-tests  integration-testing  tests  ops  dynalite  aws  leveldb 
november 2015 by jm
The New InfluxDB Storage Engine: A Time Structured Merge Tree
The new engine has similarities with LSM Trees (like LevelDB and Cassandra’s underlying storage). It has a write ahead log, index files that are read only, and it occasionally performs compactions to combine index files. We’re calling it a Time Structured Merge Tree because the index files keep contiguous blocks of time and the compactions merge those blocks into larger blocks of time. Compression of the data improves as the index files are compacted. Once a shard becomes cold for writes it will be compacted into as few files as possible, which yield the best compression.
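
As a toy illustration of the shape of the thing (not InfluxDB's actual code or on-disk format), here's how merging immutable, time-contiguous blocks into larger blocks might look; the names and structures are assumptions for the example:

    # Toy sketch of time-structured compaction: points land in immutable
    # blocks covering a contiguous time range, and compaction merges
    # adjacent blocks into fewer, larger ones. Larger contiguous blocks
    # compress better because neighbouring timestamps/values are similar.
    # Purely illustrative -- not InfluxDB's TSM format.
    class TimeBlock:
        def __init__(self, points):
            # points: (timestamp, value) pairs; sorted and immutable once built
            self.points = sorted(points)
            self.start = self.points[0][0]
            self.end = self.points[-1][0]

    def compact(blocks):
        """Merge adjacent time blocks into one larger contiguous block."""
        merged = []
        for b in sorted(blocks, key=lambda blk: blk.start):
            merged.extend(b.points)
        return TimeBlock(merged)

    # once a shard goes cold for writes, everything collapses into one block
    b1 = TimeBlock([(1, 10.0), (2, 10.1)])
    b2 = TimeBlock([(3, 10.2), (4, 10.3)])
    cold = compact([b1, b2])
    print(cold.start, cold.end, len(cold.points))
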
influxdb  storage  lsm-trees  leveldb  tsm-trees  data-structures  algorithms  time-series  tsd  compression 
october 2015 by jm
Benchmarking LevelDB vs. RocksDB vs. HyperLevelDB vs. LMDB Performance for InfluxDB
A few interesting things come out of these results. LevelDB is the winner on disk space utilization, RocksDB is the winner on reads and deletes, and HyperLevelDB is the winner on writes. On smaller runs (30M or less), LMDB came out on top on most of the metrics except for disk size. This is actually what we’d expect for B-trees: they’re faster the fewer keys you have in them.


Mind you, I'd prefer if this had tunable read/write/delete ratios, as YCSB does. Take with a pinch of salt, as with all benchmarks!
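
For what it's worth, here's a sketch of what a harness with tunable read/write/delete ratios might look like, in the spirit of YCSB's configurable workloads -- a toy over any dict-like store, not the InfluxDB benchmark code or YCSB itself:

    # Toy benchmark with tunable read/write/delete ratios. The store is
    # anything dict-like; wrap a real engine binding to make the numbers
    # meaningful. Illustrative only.
    import random
    import time

    def run_workload(store, n_ops=100_000, read=0.5, write=0.4, delete=0.1, seed=1):
        assert abs(read + write + delete - 1.0) < 1e-9
        rng = random.Random(seed)
        keys = []
        start = time.perf_counter()
        for _ in range(n_ops):
            r = rng.random()
            if r < write or not keys:
                k = f"key-{rng.randrange(10_000_000)}"
                store[k] = b"x" * 100
                keys.append(k)
            elif r < write + read:
                store.get(rng.choice(keys))
            else:
                store.pop(rng.choice(keys), None)
        return n_ops / (time.perf_counter() - start)

    print(f"{run_workload({}):.0f} ops/sec against a plain dict")
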
benchmarks  leveldb  datastores  storage  hyperleveldb  rocksdb  ycsb  lmdb  influxdb 
june 2014 by jm
Faster BAM Sorting with SAMtools and RocksDB
Now this is really really clever. Heap-merging a heavyweight genomics format, using RocksDB to speed it up.
There’s a problem with the single-pass merge described above when the number of intermediate files, N/R, is large. Merging the sorted intermediate files in limited memory requires constantly reading little bits from all those files, incurring a lot of disk seeks on rotating drives. In fact, at some point, samtools sort performance becomes effectively bound to disk seeking. [...] In this scenario, samtools rocksort can sort the same data in much less time, using no more memory, by invoking RocksDB’s background compaction capabilities. With a few extra lines of code we configure RocksDB so that, while we’re still in the process of loading the BAM data, it runs additional background threads to merge batches of existing sorted temporary files into fewer, larger, sorted files. Just like the final merge, each background compaction requires only a modest amount of working memory.


(via the RocksDB facebook group)
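
The single-pass merge being improved on is the classic k-way heap merge from external sorting; a minimal sketch of that step (over plain sorted text runs, not BAM records):

    # k-way heap merge: each intermediate run is already sorted, and
    # heapq.merge streams the globally-sorted output while holding only one
    # record per run in memory. With many runs this degenerates into
    # constant small reads and disk seeks, which is exactly what merging
    # runs into fewer, larger runs ahead of time (a la RocksDB background
    # compaction) avoids.
    import heapq
    from contextlib import ExitStack

    def merge_sorted_runs(run_paths, out_path):
        with ExitStack() as stack, open(out_path, "w") as out:
            runs = [stack.enter_context(open(p)) for p in run_paths]
            for line in heapq.merge(*runs):
                out.write(line)
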
rocksdb  algorithms  sorting  leveldb  bam  samtools  merging  heaps  compaction 
may 2014 by jm
Basho LevelDB supports tiered storage
Tiered storage is turning out to be a pretty practical trick to take advantage of SSDs:
The justification for two types/speeds of storage arrays is simple. leveldb is extremely write intensive in its lower levels. The write intensity drops off as the level number increases. Similarly, current and frequently updated data tends to be in lower levels while archival data tends to be in higher levels. These leveldb characteristics create a desire to have faster, more expensive storage arrays for the high intensity lower levels. This branch allows the high intensity lower levels to sit on expensive storage arrays, while slower, less expensive storage arrays hold the higher level data to reduce costs.
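
The mechanism boils down to a routing decision keyed on level number; a toy sketch (the threshold and mount points are made up, and this is not Basho's actual configuration interface):

    # Toy tiered-storage routing: write-hot low levels go to fast, expensive
    # media; cold, archival high levels go to cheap media. Threshold and
    # paths are illustrative assumptions.
    FAST_PATH = "/mnt/ssd/leveldb"   # e.g. SSD or provisioned-IOPS EBS
    SLOW_PATH = "/mnt/hdd/leveldb"   # e.g. cheap magnetic storage
    TIER_SWITCH_LEVEL = 3            # levels 0..2 stay on fast storage

    def sst_directory(level):
        """Pick the storage tier for an SSTable file based on its level."""
        return FAST_PATH if level < TIER_SWITCH_LEVEL else SLOW_PATH

    for level in range(7):
        print(level, sst_directory(level))
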
caching  tiered-storage  storage  ssds  ebs  leveldb  basho  patches  riak  iops 
april 2014 by jm
Extending graphite’s mileage
Ad company InMobi use graphite heavily (albeit not as heavily as $work does); they ran into the usual scaling issues and chose to fix them in code, by switching from a filesystem full of whisper files to a LevelDB database per carbon-cache:
The carbon server is now able to run without breaking a sweat even when 500K metrics per minute is being pumped into it. This has been in production since late August 2013 in every datacenter that we operate from.


Very nice. I hope this gets merged/supported.
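
A rough sketch of the storage pattern, using the plyvel LevelDB binding; the key layout (metric name, NUL separator, big-endian timestamp, so a prefix scan returns one metric's points in time order) is an assumption for illustration, not InMobi's actual schema:

    # Metrics in LevelDB instead of one whisper file per metric. Keys are
    # "<metric>\x00<big-endian timestamp>"; values are packed doubles.
    # Illustrative layout only; requires the plyvel binding.
    import struct
    import plyvel

    db = plyvel.DB("/tmp/carbon-cache-0", create_if_missing=True)

    def write_point(metric, ts, value):
        key = metric.encode() + b"\x00" + struct.pack(">I", ts)
        db.put(key, struct.pack(">d", value))

    def read_range(metric):
        prefix = metric.encode() + b"\x00"
        for key, raw in db.iterator(prefix=prefix):
            ts = struct.unpack(">I", key[len(prefix):])[0]
            yield ts, struct.unpack(">d", raw)[0]

    write_point("servers.web01.cpu", 1388534400, 42.5)
    print(list(read_range("servers.web01.cpu")))
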
graphite  scalability  metrics  leveldb  storage  inmobi  whisper  carbon  open-source 
january 2014 by jm
RocksDB
'A persistent key-value store for fast storage environments', i.e. a BerkeleyDB/LevelDB competitor, from Facebook.
RocksDB builds on LevelDB to be scalable to run on servers with many CPU cores, to efficiently use fast storage, to support IO-bound, in-memory and write-once workloads, and to be flexible to allow for innovation.

We benchmarked LevelDB and found that it was unsuitable for our server workloads. The benchmark results look awesome at first sight, but we quickly realized that those results were for a database whose size was smaller than the size of RAM on the test machine - where the entire database could fit in the OS page cache. When we performed the same benchmarks on a database that was at least 5 times larger than main memory, the performance results were dismal.

By contrast, we've published the RocksDB benchmark results for server side workloads on Flash. We also measured the performance of LevelDB on these server-workload benchmarks and found that RocksDB solidly outperforms LevelDB for these IO bound workloads. We found that LevelDB's single-threaded compaction process was insufficient to drive server workloads. We saw frequent write-stalls with LevelDB that caused 99-percentile latency to be tremendously large. We found that mmap-ing a file into the OS cache introduced performance bottlenecks for reads. We could not make LevelDB consume all the IOs offered by the underlying Flash storage.


Lots of good discussion at https://news.ycombinator.com/item?id=6736900 too.
flash  ssd  rocksdb  databases  storage  nosql  facebook  bdb  disk  key-value-stores  lsm  leveldb 
november 2013 by jm
LMDB response to a LevelDB-comparison blog post
This seems like a good point to note about LMDB in general:

We state quite clearly that LMDB is read-optimized, not write-optimized. I wrote this for the OpenLDAP Project; LDAP workloads are traditionally 80-90% reads. Write performance was not the goal of this design, read performance is. We make no claims that LMDB is a silver bullet, good for every situation. It’s not meant to be – but it is still far better at many things than all of the other DBs out there that *do* claim to be good for everything.
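
For a concrete feel of that read-optimized design, a minimal read via the py-lmdb binding (path and key are placeholders): reads run inside cheap read-only transactions directly against the memory-mapped B-tree, and readers never block writers.

    # Minimal LMDB read path with the py-lmdb binding. Path/key placeholders.
    import lmdb

    env = lmdb.open("/tmp/example-lmdb", map_size=1 << 30)

    with env.begin(write=True) as txn:   # one write txn to seed some data
        txn.put(b"uid=jm,dc=example,dc=com", b"some LDAP-ish entry")

    with env.begin() as txn:             # read-only txn: the common case
        print(txn.get(b"uid=jm,dc=example,dc=com"))
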
lmdb  leveldb  databases  openldap  storage  persistent 
august 2013 by jm
HyperLevelDB: A High-Performance LevelDB Fork
'HyperLevelDB improves on LevelDB in two key ways:
Improved parallelism: HyperLevelDB uses more fine-grained locking internally to provide higher throughput for multiple writer threads.
Improved compaction: HyperLevelDB uses a different method of compaction that achieves higher throughput for write-heavy workloads, even as the database grows.'
leveldb  storage  key-value-stores  persistence  unix  libraries  open-source 
june 2013 by jm
SSTable and Log Structured Storage: LevelDB
good writeup of LevelDB's native storage formats: the Sorted String Table (SSTable), Log-Structured Merge Trees, and Snappy compression
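
A toy version of the SSTable idea (sorted keys, an index of byte offsets, Snappy-compressed values); the file layout here is invented for illustration and is not LevelDB's real format:

    # Toy SSTable: records written in sorted key order, values compressed
    # with Snappy, plus an index of offsets so a read is one seek and one
    # decompress. Requires the python-snappy package; format is made up.
    import snappy

    def write_sstable(path, items):
        index = {}
        with open(path, "wb") as f:
            for key, value in sorted(items.items()):
                index[key] = f.tell()
                blob = snappy.compress(value)
                f.write(len(key).to_bytes(4, "big") + key)
                f.write(len(blob).to_bytes(4, "big") + blob)
        return index   # a real table stores this index in the file itself

    def read_value(path, index, key):
        with open(path, "rb") as f:
            f.seek(index[key])
            klen = int.from_bytes(f.read(4), "big")
            f.seek(klen, 1)                          # skip the key bytes
            vlen = int.from_bytes(f.read(4), "big")
            return snappy.decompress(f.read(vlen))

    idx = write_sstable("/tmp/toy.sst", {b"apple": b"1", b"banana": b"2"})
    print(read_value("/tmp/toy.sst", idx, b"banana"))
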
leveldb  nosql  data  storage  disk  persistence  google 
july 2012 by jm
LevelDB Benchmarks
nice results, particularly for sequential ops. LevelDB will become a Riak backend, as an alternative to InnoDB
leveldb  riak  databases  files  disk  google  storage  benchmarks 
july 2011 by jm
