jm + rocksdb   8

[LEGAL-303] ASF, RocksDB, and Facebook's BSD+patent grant licensing
Facebook's licensing includes a "nuclear option" if a user acts in a way interpreted by Facebook as competing with them; the ASF has marked the license as "Category-X", and may not be included in Apache projects as a result. Looks like RocksDB are going to relicense as dual GPLv2/ASL2 to clear this up, but React.js has not shown any plans to do so yet
react  rocksdb  licensing  asl2  apache  asf  facebook  open-source  patents 
4 weeks ago by jm
Cherami: Uber Engineering’s Durable and Scalable Task Queue in Go - Uber Engineering Blog

a competing-consumer messaging queue that is durable, fault-tolerant, highly available and scalable. We achieve durability and fault-tolerance by replicating messages across storage hosts, and high availability by leveraging the append-only property of messaging queues and choosing eventual consistency as our basic model. Cherami is also scalable, as the design does not have single bottleneck. [...]
Cherami is completely written in Go, a language that makes building highly performant and concurrent system software a lot of fun. Additionally, Cherami uses several libraries that Uber has already open sourced: TChannel for RPC and Ringpop for health checking and group membership. Cherami depends on several third-party open source technologies: Cassandra for metadata storage, RocksDB for message storage, and many other third-party Go packages that are available on GitHub. We plan to open source Cherami in the near future.
cherami  uber  queueing  tasks  queues  architecture  scalability  go  cassandra  rocksdb 
december 2016 by jm
Counting with domain specific databases — The Smyte Blog — Medium
whoa, pretty heavily engineered scalable counting system with Kafka, RocksDB and Kubernetes
kafka  rocksdb  kubernetes  counting  databases  storage  ops 
april 2016 by jm
Benchmarking LevelDB vs. RocksDB vs. HyperLevelDB vs. LMDB Performance for InfluxDB
A few interesting things come out of these results. LevelDB is the winner on disk space utilization, RocksDB is the winner on reads and deletes, and HyperLevelDB is the winner on writes. On smaller runs (30M or less), LMDB came out on top on most of the metrics except for disk size. This is actually what we’d expect for B-trees: they’re faster the fewer keys you have in them.

Mind you, I'd prefer if this had tunable read/write/delete ratios, as YCSB does. Take with a pinch of salt, as with all benchmarks!
benchmarks  leveldb  datastores  storage  hyperleveldb  rocksdb  ycsb  lmdb  influxdb 
june 2014 by jm
a distributed key/value datastore which supports ACID transactional semantics and versioned values as first-class features. The primary design goal is global consistency and survivability, hence the name. Cockroach aims to tolerate disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention. Cockroach nodes are symmetric; a design goal is one binary with minimal configuration and no required auxiliary services.

Cockroach implements a single, monolithic sorted map from key to value where both keys and values are byte strings (not unicode). Cockroach scales linearly (theoretically up to 4 exabytes (4E) of logical data). The map is composed of one or more ranges and each range is backed by data stored in RocksDB (a variant of LevelDB), and is replicated to a total of three or more cockroach servers. Ranges are defined by start and end keys. Ranges are merged and split to maintain total byte size within a globally configurable min/max size interval. Range sizes default to target 64M in order to facilitate quick splits and merges and to distribute load at hotspots within a key range. Range replicas are intended to be located in disparate datacenters for survivability (e.g. { US-East, US-West, Japan }, { Ireland, US-East, US-West}, { Ireland, US-East, US-West, Japan, Australia }).

Single mutations to ranges are mediated via an instance of a distributed consensus algorithm to ensure consistency. We’ve chosen to use the Raft consensus algorithm. All consensus state is stored in RocksDB.

A single logical mutation may affect multiple key/value pairs. Logical mutations have ACID transactional semantics. If all keys affected by a logical mutation fall within the same range, atomicity and consistency are guaranteed by Raft; this is the fast commit path. Otherwise, a non-locking distributed commit protocol is employed between affected ranges.

Cockroach provides snapshot isolation (SI) and serializable snapshot isolation (SSI) semantics, allowing externally consistent, lock-free reads and writes--both from an historical snapshot timestamp and from the current wall clock time. SI provides lock-free reads and writes but still allows write skew. SSI eliminates write skew, but introduces a performance hit in the case of a contentious system. SSI is the default isolation; clients must consciously decide to trade correctness for performance. Cockroach implements a limited form of linearalizability, providing ordering for any observer or chain of observers.

This looks nifty. One to watch.
cockroachdb  databases  storage  georeplication  raft  consensus  acid  go  key-value-stores  rocksdb 
may 2014 by jm
Faster BAM Sorting with SAMtools and RocksDB
Now this is really really clever. Heap-merging a heavyweight genomics format, using RocksDB to speed it up.
There’s a problem with the single-pass merge described above when the number of intermediate files, N/R, is large. Merging the sorted intermediate files in limited memory requires constantly reading little bits from all those files, incurring a lot of disk seeks on rotating drives. In fact, at some point, samtools sort performance becomes effectively bound to disk seeking. [...] In this scenario, samtools rocksort can sort the same data in much less time, using no more memory, by invoking RocksDB’s background compaction capabilities. With a few extra lines of code we configure RocksDB so that, while we’re still in the process of loading the BAM data, it runs additional background threads to merge batches of existing sorted temporary files into fewer, larger, sorted files. Just like the final merge, each background compaction requires only a modest amount of working memory.

(via the RocksDB facebook group)
rocksdb  algorithms  sorting  leveldb  bam  samtools  merging  heaps  compaction 
may 2014 by jm
' A persistent key-value store for fast storage environments', ie. BerkeleyDB/LevelDB competitor, from Facebook.
RocksDB builds on LevelDB to be scalable to run on servers with many CPU cores, to efficiently use fast storage, to support IO-bound, in-memory and write-once workloads, and to be flexible to allow for innovation.

We benchmarked LevelDB and found that it was unsuitable for our server workloads. Thebenchmark results look awesome at first sight, but we quickly realized that those results were for a database whose size was smaller than the size of RAM on the test machine - where the entire database could fit in the OS page cache. When we performed the same benchmarks on a database that was at least 5 times larger than main memory, the performance results were dismal.

By contrast, we've published the RocksDB benchmark results for server side workloads on Flash. We also measured the performance of LevelDB on these server-workload benchmarks and found that RocksDB solidly outperforms LevelDB for these IO bound workloads. We found that LevelDB's single-threaded compaction process was insufficient to drive server workloads. We saw frequent write-stalls with LevelDB that caused 99-percentile latency to be tremendously large. We found that mmap-ing a file into the OS cache introduced performance bottlenecks for reads. We could not make LevelDB consume all the IOs offered by the underlying Flash storage.

Lots of good discussion at too.
flash  ssd  rocksdb  databases  storage  nosql  facebook  bdb  disk  key-value-stores  lsm  leveldb 
november 2013 by jm

Copy this bookmark: