jm + caches   8

_Optimal Probabilistic Cache Stampede Prevention_ [pdf]
'When a frequently-accessed cache item expires, multiple requests
to that item can trigger a cache miss and start regenerating
that same item at the same time. This phenomenon,
known as cache stampede, severely limits the performance
of databases and web servers. A natural countermeasure to
this issue is to let the processes that perform such requests
to randomly ask for a regeneration before the expiration
time of the item. In this paper we give optimal algorithms
for performing such probabilistic early expirations. Our algorithms
are theoretically optimal and have much better
performances than other solutions used in real-world applications.'

(via Marc Brooker)
via:marcbrooker  caching  caches  algorithm  probabilistic  expiration  vldb  papers  expiry  cache-miss  stampedes 
16 days ago by jm
Cache-friendly binary search
by reordering items to optimize locality. Via aphyr's dad!
caches  cache-friendly  optimization  data-locality  performance  coding  algorithms 
november 2015 by jm
Caffeine cache adopts Window TinyLfu eviction policy
'Caffeine is a Java 8 rewrite of Guava's cache. In this version we focused on improving the hit rate by evaluating alternatives to the classic least-recenty-used (LRU) eviction policy. In collaboration with researchers at Israel's Technion, we developed a new algorithm that matches or exceeds the hit rate of the best alternatives (ARC, LIRS). A paper of our work is being prepared for publication.'

W-TinyLfu uses a small admission LRU that evicts to a large Segmented LRU if accepted by the TinyLfu admission policy. TinyLfu relies on a frequency sketch to probabilistically estimate the historic usage of an entry. The window allows the policy to have a high hit rate when entries exhibit a high temporal / low frequency access pattern which would otherwise be rejected. The configuration enables the cache to estimate the frequency and recency of an entry with low overhead. This implementation uses a 4-bit CountMinSketch, growing at 8 bytes per cache entry to be accurate. Unlike ARC and LIRS, this policy does not retain non-resident keys.
tinylfu  caches  caching  cache-eviction  java8  guava  caffeine  lru  count-min  sketching  algorithms 
november 2015 by jm
You're probably wrong about caching
Excellent cut-out-and-keep guide to why you should add a caching layer. I've been following this practice for the past few years, after I realised that #6 (recovering from a failed cache is hard) is a killer -- I've seen a few large-scale outages where a production system had gained enough scale that it required a cache to operate, and once that cache was damaged, bringing the system back online required a painful rewarming protocol. Better to design for the non-cached case if possible.
architecture  caching  coding  design  caches  ops  production  scalability 
september 2015 by jm
Memory Layouts for Binary Search
Key takeaway:
Nearly uni­ver­sally, B-trees win when the data gets big enough.
caches  cpu  performance  optimization  memory  binary-search  b-trees  algorithms  search  memory-layout 
may 2015 by jm
How to avoid crappy ISP caches when viewing YouTube video
Must give this a try when I get home -- I frequently have latency problems watching YT on my UPC connection, and I bet they have a crappily-managed, overloaded cache box on their network.
streaming  youtube  caching  isps  caches  firewalls  iptables  hacks  video  networking 
august 2013 by jm
Jeff Dean's list of "Numbers Everyone Should Know"
from a 2007 Google all-hands, the list of typical latency timings from ranging from an L1 cache reference (0.5 nanoseconds) to a CA->NL->CA IP round trip (150 milliseconds).
performance  latencies  google  jeff-dean  timing  caches  speed  network  zippy  disks  via:kellabyte 
march 2013 by jm
_Fast Cache for Your Text: Accelerating Exact Pattern Matching with Feed-Forward Bloom Filters_ [PDF]
intriguing application of a Bloom Filter optimised for modern CPUs (2-level, with a cache-partitioned first level), providing massive speedups vs GNU grep or trie-based approaches like Aho-Corasick -- or possibly re2c, as used in "sa-compile". On the other hand, a perl implementation of Rabin-Karp, which is similar, didn't perform as well. Still, may be worth investigating
bloom-filters  grep  filtering  spamassassin  sa-compile  text-matching  caches  aho-corasick  from delicious
september 2010 by jm

Copy this bookmark: