nhaliday + online-learning   39

Predicting with confidence: the best machine learning idea you never heard of | Locklin on science
The advantages of conformal prediction are manifold. These ideas assume very little about the thing you are trying to forecast, the tool you’re using to forecast, or how the world works, and they still produce a pretty good confidence interval. Even if you’re an unrepentant Bayesian, using some of the machinery of conformal prediction, you can tell when things have gone wrong with your prior. The learners work online, and with some modifications and considerations, with batch learning. One of the nice things about calculating confidence intervals as part of your learning process is that they can actually lower error rates, or be used in semi-supervised learning as well. Honestly, I think this is the best bag of tricks since boosting; everyone should know about and use these ideas.

The essential idea is that a “conformity function” exists. Effectively you are constructing a sort of multivariate cumulative distribution function for your machine learning gizmo using the conformity function. Such CDFs exist for classical stuff like ARIMA and linear regression under the correct circumstances; CP brings the idea to machine learning in general, and to models like ARIMA when the standard parametric confidence intervals won’t work. Within the framework, the conformity function, whatever it may be, when used correctly can be guaranteed to give confidence intervals to within a probabilistic tolerance. The original proofs and treatments of conformal prediction, defined for sequences, are extremely computationally inefficient. The conditions can be relaxed in many cases, and the conformity function is in principle arbitrary, though good ones will produce narrower confidence regions. Somewhat confusingly, these good conformity functions are referred to as “efficient,” though they may not be computationally efficient.
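A minimal sketch of the split-conformal recipe the excerpt describes: fit any model on one half of the data, compute conformity scores (here, absolute residuals) on the other half, and use their empirical quantile as a symmetric interval half-width. The toy linear data, the residual score, and the 90% coverage level are all illustrative assumptions, not from the post; the model could be any black-box learner.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): y = 2x + Gaussian noise.
x = rng.uniform(0, 10, size=400)
y = 2 * x + rng.normal(0, 1, size=400)

# Split: fit a model on one half, calibrate conformity scores on the other.
x_fit, y_fit = x[:200], y[:200]
x_cal, y_cal = x[200:], y[200:]

# Least-squares line stands in for an arbitrary machine-learning gizmo.
A = np.vstack([x_fit, np.ones_like(x_fit)]).T
slope, intercept = np.linalg.lstsq(A, y_fit, rcond=None)[0]
predict = lambda t: slope * t + intercept

# Conformity scores on the calibration set: absolute residuals.
scores = np.abs(y_cal - predict(x_cal))

# For coverage 1 - alpha, take the (ceil((n+1)(1-alpha))/n)-quantile.
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Prediction interval for a new point: prediction plus/minus q.
x_new = 5.0
lo, hi = predict(x_new) - q, predict(x_new) + q
```

Under exchangeability of calibration and test points, the interval covers the true value with probability at least 1 - alpha, regardless of how good (or bad) the underlying model is; a poor model just yields a wider interval.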
techtariat  acmtariat  acm  machine-learning  bayesian  stats  exposition  research  online-learning  probability  decision-theory  frontier  unsupervised  confidence 
february 2017 by nhaliday
Intelligent Agent Foundations Forum | Online Learning 1: Bias-detecting online learners
apparently can maybe be used to shave an exponent from Christiano's manipulation-resistant reputation system paper
ratty  clever-rats  online-learning  acm  research  ai-control  miri-cfar 
november 2016 by nhaliday
A Fervent Defense of Frequentist Statistics - Less Wrong
Short summary. This essay makes many points, each of which I think is worth reading, but if you are only going to understand one point I think it should be “Myth 5″ below, which describes the online learning framework as a response to the claim that frequentist methods need to make strong modeling assumptions. Among other things, online learning allows me to perform the following remarkable feat: if I’m betting on horses, and I get to place bets after watching other people bet but before seeing which horse wins the race, then I can guarantee that after a relatively small number of races, I will do almost as well overall as the best other person, even if the number of other people is very large (say, 1 billion), and their performance is correlated in complicated ways.
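The horse-betting guarantee is the regret bound of the Hedge (multiplicative weights) algorithm from the online learning framework: bet proportionally to expert weights, then exponentially downweight each expert by its loss. The synthetic losses and the choice of one slightly better expert below are assumptions for illustration; the bound itself holds for arbitrary, even adversarially correlated, loss sequences.

```python
import numpy as np

rng = np.random.default_rng(1)

n_experts, T = 1000, 2000
eta = np.sqrt(2 * np.log(n_experts) / T)  # standard learning rate

# Synthetic losses in [0, 1]; expert 0 is slightly better (assumption).
losses = rng.uniform(0, 1, size=(T, n_experts))
losses[:, 0] = rng.uniform(0, 0.4, size=T)

weights = np.ones(n_experts)
learner_loss = 0.0
for t in range(T):
    p = weights / weights.sum()          # bet in proportion to weights
    learner_loss += p @ losses[t]        # expected loss this round
    weights *= np.exp(-eta * losses[t])  # downweight poorly performing experts

best = losses.sum(axis=0).min()          # best single expert in hindsight
regret = learner_loss - best             # O(sqrt(T log N)), no assumptions on losses
```

Note the guarantee is relative: after T rounds the learner's total loss exceeds the best expert's by at most about sqrt(2 T log N), which is vanishing per round even with a billion experts, and without any modeling assumptions on how the losses are generated.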

If you’re only going to understand two points, then also read about the frequentist version of Solomonoff induction, which is described in “Myth 6″.


If you are like me from, say, two years ago, you are firmly convinced that Bayesian methods are superior and that you have knockdown arguments in favor of this. If this is the case, then I hope this essay will give you an experience that I myself found life-altering: the experience of having a way of thinking that seemed unquestionably true slowly dissolve into just one of many imperfect models of reality. This experience helped me gain more explicit appreciation for the skill of viewing the world from many different angles, and of distinguishing between a very successful paradigm and reality.

If you are not like me, then you may have had the experience of bringing up one of many reasonable objections to normative Bayesian epistemology, and having it shot down by one of many “standard” arguments that seem wrong but not for easy-to-articulate reasons. I hope to lend some reprieve to those of you in this camp, by providing a collection of “standard” replies to these standard arguments.
bayesian  philosophy  stats  rhetoric  advice  debate  critique  expert  lesswrong  commentary  discussion  regularizer  essay  exposition  🤖  aphorism  spock  synthesis  clever-rats  ratty  hi-order-bits  top-n  2014  acmtariat  big-picture  acm  iidness  online-learning  lens  clarity  unit  nibble  frequentist  s:**  expert-experience  subjective-objective 
september 2016 by nhaliday
Caltech CS 101.2
by Andreas Krause so probably some detail on submodular functions
course  caltech  online-learning  reinforcement  machine-learning  submodular  lecture-notes  unit 
june 2016 by nhaliday
CS229T/STATS231: Statistical Learning Theory
Course by Percy Liang covering a mix of statistics, computational learning theory, and some online learning. Also surveys the state of the art in theoretical understanding of deep learning (not much to cover, unfortunately).
yoga  stanford  course  machine-learning  stats  👳  lecture-notes  acm  kernels  learning-theory  deep-learning  frontier  init  ground-up  unit  dimensionality  vc-dimension  entropy-like  extrema  moments  online-learning  bandits  p:***  explore-exploit 
june 2016 by nhaliday

