nhaliday + acm + expert-experience   24

New Theory Cracks Open the Black Box of Deep Learning | Quanta Magazine
A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

sounds like he's just talking about autoencoders?
news  org:mag  org:sci  popsci  announcement  research  deep-learning  machine-learning  acm  information-theory  bits  neuro  model-class  big-surf  frontier  nibble  hmm  signal-noise  deepgoog  expert  ideas  wild-ideas  summary  talks  video  israel  roots  physics  interdisciplinary  ai  intelligence  shannon  giants  arrows  preimage  lifts-projections  composition-decomposition  characterization  markov  gradient-descent  papers  liner-notes  experiment  hi-order-bits  generalization  expert-experience  explanans  org:inst  speedometer 
september 2017 by nhaliday
Stat 260/CS 294: Bayesian Modeling and Inference
- Priors (conjugate, noninformative, reference)
- Hierarchical models, spatial models, longitudinal models, dynamic models, survival models
- Testing
- Model choice
- Inference (importance sampling, MCMC, sequential Monte Carlo)
- Nonparametric models (Dirichlet processes, Gaussian processes, neutral-to-the-right processes, completely random measures)
- Decision theory and frequentist perspectives (complete class theorems, consistency, empirical Bayes)
- Experimental design
unit  course  berkeley  expert  michael-jordan  machine-learning  acm  bayesian  probability  stats  lecture-notes  priors-posteriors  markov  monte-carlo  frequentist  latent-variables  decision-theory  expert-experience  confidence  sampling 
july 2017 by nhaliday
bounds - What is the variance of the maximum of a sample? - Cross Validated
- sum of variances is always a bound
- can't do better even for iid Bernoulli
- looks like nice argument from well-known probabilist (using E[(X-Y)^2] = 2Var X), but not clear to me how he gets to sum_i instead of sum_{i,j} in the union bound?
edit: argument is that, for j = argmax_k Y_k, we have r < X_i - Y_j <= X_i - Y_i for all i, including i = argmax_k X_k
- different proof here (later pages): http://www.ism.ac.jp/editsec/aism/pdf/047_1_0185.pdf
Var(X_n:n) <= sum Var(X_k:n) + 2 sum_{i < j} Cov(X_i:n, X_j:n) = Var(sum X_k:n) = Var(sum X_k) = nσ^2
why are the covariances nonnegative? (are they?). intuitively seems true.
- for that, see https://pinboard.in/u:nhaliday/b:ed4466204bb1
- note that this proof shows more generally that sum Var(X_k:n) <= sum Var(X_k)
- apparently that holds for dependent X_k too? http://mathoverflow.net/a/96943/20644
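A quick sanity check of the two bullets at the top of these notes (sum of variances is a bound; iid Bernoulli shows it can't be improved in general). This is only a sketch: the function names are made up for illustration, and it uses the fact that the max of n iid Bernoulli(p) draws is itself Bernoulli with success probability q = 1 - (1-p)^n, so its variance is exact and no simulation is needed.

```python
import math


def var_max_bernoulli(n, p):
    """Exact variance of the maximum of n iid Bernoulli(p) draws."""
    # The max is 0 only if every draw is 0, so it is Bernoulli(q)
    # with q = 1 - (1 - p)^n. expm1/log1p keep precision for tiny p.
    q = -math.expm1(n * math.log1p(-p))
    return q * (1.0 - q)


def sum_of_variances(n, p):
    """The bound from the notes: sum of the n individual variances."""
    return n * p * (1.0 - p)


def tightness_ratio(n, p):
    """Var(max) divided by the sum-of-variances bound (should be <= 1)."""
    return var_max_bernoulli(n, p) / sum_of_variances(n, p)


if __name__ == "__main__":
    # As p shrinks with n fixed, the ratio should climb toward 1.
    for p in (0.1, 0.01, 1e-4, 1e-6):
        print(p, tightness_ratio(10, p))
```

The ratio stays below 1 but approaches it as p -> 0 with n fixed, which is the sense in which the bound is tight for Bernoulli.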
q-n-a  overflow  stats  acm  distribution  tails  bias-variance  moments  estimate  magnitude  probability  iidness  tidbits  concentration-of-measure  multi  orders  levers  extrema  nibble  bonferroni  coarse-fine  expert  symmetry  s:*  expert-experience  proofs 
february 2017 by nhaliday
A Fervent Defense of Frequentist Statistics - Less Wrong
Short summary. This essay makes many points, each of which I think is worth reading, but if you are only going to understand one point I think it should be “Myth 5” below, which describes the online learning framework as a response to the claim that frequentist methods need to make strong modeling assumptions. Among other things, online learning allows me to perform the following remarkable feat: if I’m betting on horses, and I get to place bets after watching other people bet but before seeing which horse wins the race, then I can guarantee that after a relatively small number of races, I will do almost as well overall as the best other person, even if the number of other people is very large (say, 1 billion), and their performance is correlated in complicated ways.

If you’re only going to understand two points, then also read about the frequentist version of Solomonoff induction, which is described in “Myth 6”.


If you are like me from, say, two years ago, you are firmly convinced that Bayesian methods are superior and that you have knockdown arguments in favor of this. If this is the case, then I hope this essay will give you an experience that I myself found life-altering: the experience of having a way of thinking that seemed unquestionably true slowly dissolve into just one of many imperfect models of reality. This experience helped me gain more explicit appreciation for the skill of viewing the world from many different angles, and of distinguishing between a very successful paradigm and reality.

If you are not like me, then you may have had the experience of bringing up one of many reasonable objections to normative Bayesian epistemology, and having it shot down by one of many “standard” arguments that seem wrong but not for easy-to-articulate reasons. I hope to lend some reprieve to those of you in this camp, by providing a collection of “standard” replies to these standard arguments.
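The betting guarantee in the summary above is the kind of statement that "prediction with expert advice" algorithms give. Below is a minimal sketch of one standard algorithm of that type, Hedge (exponential weights). Whether this is the exact algorithm the essay has in mind is an assumption; the loss model (each expert suffers a loss in [0, 1] per round), the function names, and the random demo data are all hypothetical. With learning rate eta = sqrt(8 ln N / T), the textbook regret bound against the best of N experts over T rounds is sqrt(T ln N / 2), which grows only logarithmically in N.

```python
import math
import random


def hedge(loss_rounds, eta):
    """Run Hedge on a list of per-round loss vectors (each loss in [0, 1]).

    Returns (learner's cumulative expected loss, best expert's cumulative loss).
    """
    n = len(loss_rounds[0])
    log_w = [0.0] * n          # log-weights, kept in log space for stability
    totals = [0.0] * n         # cumulative loss of each expert
    learner = 0.0
    for losses in loss_rounds:
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        z = sum(w)
        # Expected loss when following expert i with probability w[i] / z.
        learner += sum(wi / z * li for wi, li in zip(w, losses))
        # Multiplicative update: penalize each expert by its own loss.
        for i, li in enumerate(losses):
            log_w[i] -= eta * li
            totals[i] += li
    return learner, min(totals)


def demo(n_experts=1000, n_rounds=500, seed=0):
    """Hypothetical demo with uniformly random losses; returns (regret, bound)."""
    rng = random.Random(seed)
    rounds = [[rng.random() for _ in range(n_experts)] for _ in range(n_rounds)]
    eta = math.sqrt(8 * math.log(n_experts) / n_rounds)
    learner, best = hedge(rounds, eta)
    bound = math.sqrt(n_rounds * math.log(n_experts) / 2)
    return learner - best, bound


if __name__ == "__main__":
    regret, bound = demo()
    print("regret:", regret, "bound:", bound)
```

The bound holds for any loss sequence, including adversarially correlated ones, which matches the "correlated in complicated ways" remark; the random data here is only for illustration.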
bayesian  philosophy  stats  rhetoric  advice  debate  critique  expert  lesswrong  commentary  discussion  regularizer  essay  exposition  🤖  aphorism  spock  synthesis  clever-rats  ratty  hi-order-bits  top-n  2014  acmtariat  big-picture  acm  iidness  online-learning  lens  clarity  unit  nibble  frequentist  s:**  expert-experience  subjective-objective 
september 2016 by nhaliday
