jm + reasoning   1

3 Rules of thumb for Bloom Filters
I often need to do rough back-of-the-envelope reasoning about things, and I find that doing a bit of work to develop an intuition for how a new technique performs is usually worthwhile. So, here are three broad rules of thumb to remember when discussing Bloom filters down the pub:

One byte per item in the input set gives about a 2% false positive rate.

The optimal number of hash functions is about 0.7 times the number of bits per item.

3 - The number of hashes dominates performance.

But see also http://stackoverflow.com/a/9554448 , http://www.eecs.harvard.edu/~kirsch/pubs/bbbf/esa06.pdf (thanks Tony Finch!)
bloom-filters  algorithm  probabilistic  rules  reasoning  via:norman-maurer  false-positives  hashing  coding 
august 2014 by jm

Copy this bookmark:



description:


tags: