jm + ssd   13

Evolution of Application Data Caching: From RAM to SSD
Memcached provides an external storage shim called extstore that supports storing data on SSD (i2) and NVMe (i3). extstore is efficient in cost and storage-device utilization without compromising speed or throughput. All metadata (keys and other per-item metadata) is stored in RAM, whereas the actual data is stored on flash.
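The split described above -- index in RAM, values on flash -- can be sketched in a few lines. This is a hypothetical toy, not the real extstore code; the class and file layout are my own illustration:

```python
import os
import tempfile

class FlashBackedCache:
    """Toy sketch of the extstore idea (hypothetical, not memcached's
    implementation): keys and per-item metadata stay in a RAM index,
    while values are appended to a file standing in for the SSD."""

    def __init__(self, path):
        self.index = {}                  # RAM: key -> (offset, length)
        self.data = open(path, "wb+")    # "flash": append-only value log

    def set(self, key, value: bytes):
        self.data.seek(0, os.SEEK_END)
        offset = self.data.tell()
        self.data.write(value)
        self.index[key] = (offset, len(value))   # only metadata kept in RAM

    def get(self, key):
        if key not in self.index:
            return None
        offset, length = self.index[key]
        self.data.seek(offset)
        return self.data.read(length)    # one flash read per lookup

fd, path = tempfile.mkstemp()
os.close(fd)
cache = FlashBackedCache(path)
cache.set("user:1", b"alice")
cache.set("user:2", b"bob")
print(cache.get("user:2"))  # b'bob'
```

The point of the design is that RAM usage grows with the number of keys, not the size of the values.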
memcached  netflix  services  storage  memory  ssd  nvme  extstore  caching 
4 weeks ago by jm
RIPQ: Advanced photo caching on flash for Facebook
Interesting priority-queue algorithm optimised for caching data on SSD
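For context, the abstraction RIPQ approximates is an eviction-by-priority cache; the paper's contribution is restricting where insertions can land so flash writes stay sequential. Here is an exact, in-RAM sketch of that underlying abstraction (my own illustration, not the RIPQ algorithm itself):

```python
import heapq

class PriorityCache:
    """Sketch of a priority-queue cache: the caching policy assigns each
    item a priority, and eviction removes the lowest-priority item.
    (RIPQ approximates this on flash with a few restricted insertion
    points; this toy version is exact and in-memory.)"""

    def __init__(self, capacity):
        self.capacity = capacity
        self.heap = []     # (priority, seq, key) min-heap
        self.items = {}    # key -> (priority, value)
        self.seq = 0

    def put(self, key, value, priority):
        self.items[key] = (priority, value)
        heapq.heappush(self.heap, (priority, self.seq, key))
        self.seq += 1
        while len(self.items) > self.capacity:
            p, _, k = heapq.heappop(self.heap)
            # skip stale heap entries whose priority has since changed
            if k in self.items and self.items[k][0] == p:
                del self.items[k]

    def get(self, key):
        entry = self.items.get(key)
        return entry[1] if entry else None

cache = PriorityCache(2)
cache.put("a", 1, priority=0.9)
cache.put("b", 2, priority=0.5)
cache.put("c", 3, priority=0.7)   # evicts "b", the lowest priority
print(cache.get("b"), cache.get("a"))  # None 1
```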
priority-queue  algorithms  facebook  ssd  flash  caching  ripq  papers 
february 2015 by jm
$0.99 VPS hosting
this is nuts. 99 cents per month for a super-cheap host -- I'm sure there's a use case for this (via Elliot)
via:elliot  cheap  hosting  ssd  vps  linux  atlantic  1-dollar 
october 2014 by jm
Two traps in iostat: %util and svctm
Marc Brooker:
As a measure of general IO busyness %util is fairly handy, but as an indication of how much the system is doing compared to what it can do, it's terrible. Iostat's svctm has even fewer redeeming strengths. It's just extremely misleading for most modern storage systems and workloads. Both of these fields are likely to mislead more than inform on modern SSD-based storage systems, and their use should be treated with extreme care.
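The arithmetic behind the warning: %util is the fraction of wall-clock time with at least one request in flight, which says little about headroom on a device that serves requests concurrently. A worked illustration with made-up numbers:

```python
# Illustrative numbers only -- not measurements.
# %util = fraction of the interval with >= 1 request in flight.
io_time_ms = 950           # ms (out of 1000) with at least one request queued
interval_ms = 1000
util = 100.0 * io_time_ms / interval_ms
print(f"%util = {util:.0f}%")    # looks nearly saturated

# But suppose the SSD can serve 32 requests in parallel, each in 0.1 ms:
concurrency, svc_ms = 32, 0.1
max_iops = concurrency * (1000 / svc_ms)   # 320,000 IOPS ceiling
observed_iops = 9500                       # what the workload actually pushed
print(f"using {observed_iops / max_iops:.1%} of capacity at ~{util:.0f}% util")
```

A device can sit at ~100% util while doing a few percent of what it is capable of, which is exactly why the field misleads on SSDs.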
ioutil  iostat  svctm  ops  ssd  disks  hardware  metrics  stats  linux 
july 2014 by jm
SSD shadiness: Kingston and PNY caught bait-and-switching cheaper components after good reviews | ExtremeTech
Imagine buying a high-end Core i7 or AMD CPU, opening the box, and finding a midrange part sitting there with an asterisk and the label “Performs Just Like Our High End CPU In Single-Threaded SuperPi!”
ssd  storage  hardware  sketchy  kingston  pny  bait-and-switch  components  vendors  via:hn 
june 2014 by jm
Linode announces new instance specs
'TL;DR: SSDs + Insane network + Faster processors + Double the RAM + Hourly Billing'
hosting  linode  ssd  performance  linux  ops  datacenters 
april 2014 by jm
"Understanding the Robustness of SSDs under Power Fault", FAST '13 [paper]
Horrific. SSDs (including "enterprise-class storage") storing sync'd writes in volatile RAM while claiming they were synced; one device losing 72.6GB, 30% of its data, after 8 injected power faults; and all SSDs tested displayed serious errors including random bit errors, metadata corruption, serialization errors and shorn writes. Don't trust lone unreplicated, unbacked-up SSDs!
pdf  papers  ssd  storage  reliability  safety  hardware  ops  usenix  serialization  shorn-writes  bit-errors  corruption  fsync 
january 2014 by jm
RocksDB: 'A persistent key-value store for fast storage environments', i.e. a BerkeleyDB/LevelDB competitor, from Facebook.
RocksDB builds on LevelDB to be scalable to run on servers with many CPU cores, to efficiently use fast storage, to support IO-bound, in-memory and write-once workloads, and to be flexible to allow for innovation.

We benchmarked LevelDB and found that it was unsuitable for our server workloads. The benchmark results look awesome at first sight, but we quickly realized that those results were for a database whose size was smaller than the size of RAM on the test machine - where the entire database could fit in the OS page cache. When we performed the same benchmarks on a database that was at least 5 times larger than main memory, the performance results were dismal.

By contrast, we've published the RocksDB benchmark results for server side workloads on Flash. We also measured the performance of LevelDB on these server-workload benchmarks and found that RocksDB solidly outperforms LevelDB for these IO bound workloads. We found that LevelDB's single-threaded compaction process was insufficient to drive server workloads. We saw frequent write-stalls with LevelDB that caused 99-percentile latency to be tremendously large. We found that mmap-ing a file into the OS cache introduced performance bottlenecks for reads. We could not make LevelDB consume all the IOs offered by the underlying Flash storage.
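The sizing pitfall in the quote is worth making concrete. A sketch of the rule of thumb, with illustrative numbers of my own:

```python
# Illustrative numbers: if the dataset fits in RAM, a "storage" benchmark
# is really benchmarking the OS page cache.
ram_gb = 64
db_gb = 40
if db_gb <= ram_gb:
    print("entire DB fits in the page cache -- results measure RAM, not storage")

# The passage's threshold: a dataset at least 5x main memory.
min_db_gb = 5 * ram_gb
print(f"benchmark with at least {min_db_gb} GB on a {ram_gb} GB machine")
```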

Lots of good discussion, too.
flash  ssd  rocksdb  databases  storage  nosql  facebook  bdb  disk  key-value-stores  lsm  leveldb 
november 2013 by jm
Voldemort on Solid State Drives [paper]
'This paper and talk was given by the LinkedIn Voldemort Team at the Workshop on Big Data Benchmarking (WBDB May 2012).'

With SSD, we find that garbage collection will become a very significant bottleneck, especially for systems which have little control over the storage layer and rely on Java memory management. Big heap sizes make the cost of garbage collection expensive, especially the single-threaded CMS initial mark. We believe that data systems must revisit their caching strategies with SSDs. In this regard, SSD has provided an efficient solution for handling fragmentation and moving towards predictable multitenancy.
voldemort  storage  ssd  disk  linkedin  big-data  jvm  tuning  ops  gc 
september 2013 by jm
Instagram: Making the Switch to Cassandra from Redis, a 75% 'Insta' Savings
shifting data out of RAM and onto SSDs -- unsurprisingly, big savings.
a 12 node cluster of EC2 hi1.4xlarge instances; we store around 1.2TB of data across this cluster. At peak, we're doing around 20,000 writes per second to that specific cluster and around 15,000 reads per second. We've been really impressed with how well Cassandra has been able to drop into that role.
ram  ssd  cassandra  databases  nosql  redis  instagram  storage  ec2 
june 2013 by jm
'Mythbusting Modern Hardware to gain "Mechanical Sympathy"' [slides]
Martin Thompson's latest talk -- taking a few common concepts about modern hardware performance and debunking/confirming them, mythbusters-style
mythbusters  hardware  mechanical-sympathy  martin-thompson  java  performance  cpu  disks  ssd 
may 2013 by jm
fatcache
from Twitter -- 'a cache for your big data. Even though memory is thousand times faster than SSD, network connected SSD-backed memory makes sense, if we design the system in a way that network latencies dominate over the SSD latencies by a large factor.

To understand why network connected SSD makes sense, it is important to understand the role distributed memory plays in large-scale web architecture. In recent years, terabyte-scale, distributed, in-memory caches have become a fundamental building block of any web architecture. In-memory indexes, hash tables, key-value stores and caches are increasingly incorporated for scaling throughput and reducing latency of persistent storage systems. However, power consumption, operational complexity and single node DRAM cost make horizontally scaling this architecture challenging. The current cost of DRAM per server increases dramatically beyond approximately 150 GB, and power cost scales similarly as DRAM density increases.

Fatcache extends a volatile, in-memory cache by incorporating SSD-backed storage.'
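The "network latencies dominate" argument reduces to simple arithmetic. Latency figures below are illustrative assumptions of mine, not measurements:

```python
# If the network round trip dominates, swapping DRAM for SSD behind the
# network adds little to what the client observes. Numbers are assumptions.
network_rtt_us = 500       # client <-> cache-server round trip
dram_read_us = 0.1
ssd_read_us = 100          # SSD ~1000x slower than DRAM, per the quote

dram_total = network_rtt_us + dram_read_us
ssd_total = network_rtt_us + ssd_read_us
print(f"DRAM-backed: {dram_total:.1f} us, SSD-backed: {ssd_total:.1f} us")
print(f"SSD-backed is only {ssd_total / dram_total:.2f}x slower end-to-end")
```

A 1000x slower medium costs the client roughly 20% end-to-end under these assumptions, while cutting the DRAM bill dramatically.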
twitter  ssd  cache  caching  memcached  memcache  memory  network  storage 
february 2013 by jm
AnandTech - The Intel SSD DC S3700: Intel's 3rd Generation Controller Analyzed
Interesting trend; Intel moved from a btree to an array-based data structure for their logical-block address indirection map, in order to reduce worst-case latencies (via Martin Thompson)
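The trade-off is easy to see in miniature. This is my own sketch of the data-structure change described, not Intel's firmware:

```python
# An SSD controller maps logical block addresses (LBAs) to physical flash
# pages. A b-tree gives O(log n) lookups with variable-depth walks; a flat
# array indexed directly by LBA gives constant-time, constant-cost lookups,
# flattening worst-case (p99) latency at the cost of a fixed-size map.

NUM_LBAS = 1_000_000
lba_map = [None] * NUM_LBAS   # index = LBA, value = physical page number

def write(lba, physical_page):
    lba_map[lba] = physical_page   # O(1), no tree rebalancing ever

def read(lba):
    return lba_map[lba]            # one array index: predictable latency

write(42, 9001)
print(read(42))   # 9001
```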
latency  intel  via:martin-thompson  optimization  speed  p99  data-structures  arrays  btrees  ssd  hardware 
november 2012 by jm
