A comprehensive 10-page probability cheatsheet that covers a semester's worth of introduction to probability.
math  probability 
9 weeks ago by mpm
Probabilistic Data Structure Showdown: Cuckoo Filters vs. Bloom Filters
This post provides an update by exploring Cuckoo filters, a new probabilistic data structure that improves upon the standard Bloom filter
algorithm  bloomfilter  probability 
december 2016 by mpm
Probability and Statistics Cookbook
The cookbook contains a succinct representation of various topics in probability theory and statistics. It provides a comprehensive reference reduced to the mathematical essence, rather than aiming for elaborate explanations
math  statistics  probability  book 
september 2012 by mpm
HyperLogLog: The analysis of a near-optimal cardinality estimation algorithm
This extended abstract describes and analyses a near-optimal probabilistic algorithm, HyperLogLog, dedicated to estimating the number of distinct elements (the cardinality) of very large data ensembles
algorithm  probability 
september 2012 by mpm
Chart of distribution relationships
Probability distributions have a surprising number of interconnections
statistics  probability 
july 2012 by mpm
Count-Min Sketch
The Count-Min sketch is a simple technique to summarize large amounts of frequency data.  It was introduced in 2003, and since then has inspired many applications, extensions and variations.  This sitelet collects and explains this work on the Count-Min, or CM, sketch
probability  datastructure 
june 2012 by mpm
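As a rough illustration of the idea behind the entry above, here is a minimal Count-Min sketch in Python. It is a sketch under assumptions, not code from the linked site: the class name, the `width` and `depth` defaults, and the use of salted SHA-256 as the per-row hash are all hypothetical choices made for this example.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter; estimates never fall below the true count."""

    def __init__(self, width=272, depth=5):
        # width and depth are illustrative defaults, not recommended values.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash per row by prefixing the item with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best guess.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )
```

Memory use is fixed at `width * depth` counters regardless of how many distinct items are added; the trade-off is that `estimate` may overcount when items collide in every row.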
Multi-armed bandit problem
The multi-armed bandit problem takes its terminology from a casino. You are faced with a wall of slot machines, each with its own lever. You suspect that some slot machines pay out more frequently than others. How can you learn which machine is the best, and get the most coins in the fewest trials?
algorithm  probability 
may 2012 by mpm
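One simple way to approach the question posed in the entry above is an epsilon-greedy strategy: mostly pull the machine that looks best so far, and occasionally pull a random one. The Python sketch below is an illustration only and is not taken from the linked article; the function name, the `epsilon` and `trials` defaults, and the simulated payout probabilities are assumptions.

```python
import random


def epsilon_greedy(payout_probs, trials=10_000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy play; returns (pulls per arm, wins per arm)."""
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    pulls = [0] * n_arms
    wins = [0] * n_arms

    def estimated_rate(arm):
        # Untried arms are treated optimistically so each gets pulled at least once.
        return wins[arm] / pulls[arm] if pulls[arm] else float("inf")

    for _ in range(trials):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick any machine
        else:
            arm = max(range(n_arms), key=estimated_rate)  # exploit: best so far
        pulls[arm] += 1
        # payout_probs stands in for the real machines, which the player cannot see.
        if rng.random() < payout_probs[arm]:
            wins[arm] += 1
    return pulls, wins
```

Calling `epsilon_greedy([0.2, 0.5, 0.8])` returns how often each machine was pulled and how often it paid out. A larger `epsilon` learns the payout rates faster but spends more pulls on machines already known to be poor.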
Think Stats
Think Stats: Probability and Statistics for Programmers is a textbook for a new kind of introductory prob-stat class. It emphasizes the use of statistics to explore large datasets. It takes a computational approach
book  statistics  math  probability 
january 2012 by mpm
Darts, Dice, and Coins
You are given an n-sided die where side i has probability p_i of being rolled. What is the most efficient data structure for simulating rolls of the die?
probability  algorithm  datastructure 
december 2011 by mpm
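For reference, a straightforward baseline for the loaded-die question above is a cumulative-sum table searched with binary search, which costs O(n) to build and O(log n) per roll. This is a sketch of that baseline only and makes no claim about which structure the linked article settles on; `make_die` and its `seed` parameter are hypothetical names for this example.

```python
import bisect
import itertools
import random


def make_die(probabilities, seed=None):
    """Return a roll() function that yields side i with weight probabilities[i]."""
    # Running totals: cumulative[i] is the summed weight of sides 0..i.
    cumulative = list(itertools.accumulate(probabilities))
    total = cumulative[-1]
    last = len(probabilities) - 1
    rng = random.Random(seed)

    def roll():
        # Pick a point in [0, total) and find the first side whose running
        # total exceeds it; min() guards against floating-point edge cases.
        point = rng.random() * total
        return min(bisect.bisect_right(cumulative, point), last)

    return roll
```

Usage: `roll = make_die([0.1, 0.2, 0.7])`, then call `roll()` repeatedly to get side indices starting from 0. Because the draw is scaled by `total`, the weights do not need to sum exactly to 1.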

