**jm + hyperloglog**
12

HyperBitBit

5 weeks ago by jm

jomsdev notes:

'Last year, in the AofA’16 conference Robert Sedgewick proposed a new algorithm for cardinality estimation. Robert Sedgwick is a professor at Princeton with a long track of publications on combinatorial/randomized algorithms. He was a good friend of Philippe Flajolet (creator of Hyperloglog) and HyperBitBit it's based on the same ideas. However, it uses less memory than Hyperloglog and can provide the same results. On practical data, HyperBitBit, for N < 2^64 estimates cardinality within 10% using only 128 + 6 bits.'

algorithms
programming
cs
hyperloglog
estimation
cardinality
counting
hyperbitbit
'Last year, in the AofA’16 conference Robert Sedgewick proposed a new algorithm for cardinality estimation. Robert Sedgwick is a professor at Princeton with a long track of publications on combinatorial/randomized algorithms. He was a good friend of Philippe Flajolet (creator of Hyperloglog) and HyperBitBit it's based on the same ideas. However, it uses less memory than Hyperloglog and can provide the same results. On practical data, HyperBitBit, for N < 2^64 estimates cardinality within 10% using only 128 + 6 bits.'

5 weeks ago by jm

hyperlogsandwich

(via Patrick McFadin)
via:patrickmcfadin
hyperloglog
cardinality
data-structures
algorithms
hyperlogsandwich
counting
estimation
lossy
multisets

may 2015 by jm

A probabilistic data structure for frequency/k-occurrence cardinality estimation of multisets. Sample implementation

(via Patrick McFadin)

may 2015 by jm

HyperLogLog - Intersection Arithmetic

hyperloglog
hll
hyperloglogplus
streamlib
intersections
sets
estimation
algorithms

april 2014 by jm

'In general HLL intersection in StreamLib works. |A INTERSECT B|

= |A| + |B| - |A UNION B|. Timon's article on intersection is

important to read though. The usefulness of HLL intersection depends

on the features of the HLLs you are intersecting.'

april 2014 by jm

Druid | How We Scaled HyperLogLog: Three Real-World Optimizations

april 2014 by jm

3 optimizations Druid.io have made to the HLL algorithm to scale it up for production use in Metamarkets: compacting registers (fixes a bug with unions of multiple HLLs); a sparse storage format (to optimize space); faster lookups using a lookup table.

druid.io
metamarkets
scaling
hyperloglog
hll
algorithms
performance
optimization
counting
estimation
april 2014 by jm

Redis adds support for HyperLogLog

april 2014 by jm

good comment thread on HN, discussing hlld and bloomd as well

hll
bloom-filters
hyperloglog
redis
data-structures
estimation
cardinality
probabilistic
probability
hashing
random
april 2014 by jm

Philippe Flajolet’s contribution to streaming algorithms [preso]

november 2013 by jm

Nice deck covering HyperLogLog and its origins, plus a slide at the end covering the Flajolet/Wegman Adaptive Sampling algorithm ("how do you count the number of elements which appear only once in stream using constant size memory?")

algorithms
sketching
hyperloglog
flajolet
wegman
adaptive-sampling
sampling
presentations
slides
november 2013 by jm

js-hll

june 2013 by jm

Good UI for exploration of HyperLogLog set intersections and unions.

javascript
ui
hll
hyperloglog
algorithms
sketching
js
sets
intersection
union
apache
open-source
One of the first things that we wanted to do with HyperLogLog when we first started playing with it was to support and expose it natively in the browser. The thought of allowing users to directly interact with these structures -- perform arbitrary unions and intersections on effectively unbounded sets all on the client -- was exhilarating to us. [...] we are pleased to announce the open-source release of AK’s HyperLogLog implementation for JavaScript, js-hll. We are releasing this code under the Apache License, Version 2.0.

We knew that we couldn’t just release a bunch of JavaScript code without allowing you to see it in action — that would be a crime. We passed a few ideas around and the one that kept bubbling to the top was a way to kill two birds with one stone. We wanted something that would showcase what you can do with HLL in the browser and give us a tool for explaining HLLs. It is typical for us to explain how HLL intersections work using a Venn diagram. You draw some overlapping circles with a border that represents the error and you talk about how if that border is close to or larger than the intersection then you can’t say much about the size of that intersection. This works just ok on a whiteboard but what you really want is to just build a visualization that allows you to select from some sets and see the overlap. Maybe even play with the precision a little bit to see how that changes the result. Well, we did just that!

june 2013 by jm

Sketch of the Day: HyperLogLog — Cornerstone of a Big Data Infrastructure

february 2013 by jm

includes a nice javascript demo of HLL

hyperloglog
loglog
algorithms
stream-processing
streams
estimation
demos
javascript
february 2013 by jm

HyperLogLog++: Google’s Take On Engineering HLL

february 2013 by jm

Google and AggregateKnowledge's improvements to the HyperLogLog cardinality estimation algorithm

hyperloglog
cardinality
estimation
streaming
stream-processing
cep
february 2013 by jm

clearspring / stream-lib

february 2013 by jm

ASL-licensed open source library of stream-processing/approximation algorithms: count-min sketch, space-saving top-k, cardinality estimation, LogLog, HyperLogLog, MurmurHash, lookup3 hash, Bloom filters, q-digest, stochastic top-k

algorithms
coding
streams
cep
stream-processing
approximation
probabilistic
space-saving
top-k
cardinality
estimation
bloom-filters
q-digest
loglog
hyperloglog
murmurhash
lookup3
february 2013 by jm

aaw/hyperloglog-redis - GitHub

january 2013 by jm

'This gem is a pure Ruby implementation of the HyperLogLog algorithm for estimating cardinalities of sets observed via a stream of events. A Redis instance is used for storing the counters.'

cardinality
sets
redis
algorithms
ruby
gems
hyperloglog
january 2013 by jm

**related tags**

Copy this bookmark: