nhaliday + acm + research   46

Sequence Modeling with CTC
A visual guide to Connectionist Temporal Classification, an algorithm used to train deep neural networks in speech recognition, handwriting recognition and other sequence problems.
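The core of CTC is its many-to-one collapsing map B: merge adjacent repeated labels, then delete blanks, so many per-frame alignments map to one output string. A minimal sketch of that map (my illustration, not code from the guide):

```python
def ctc_collapse(path, blank="-"):
    """Apply CTC's collapsing map B: merge adjacent repeats, then drop blanks."""
    out = []
    prev = None
    for symbol in path:
        if symbol != prev:  # keep only the first of each run of repeats
            out.append(symbol)
        prev = symbol
    return "".join(s for s in out if s != blank)  # remove blank tokens

print(ctc_collapse("hh-e-ll-ll-oo"))  # the blank lets "ll" survive as two l's
```

Note the role of the blank: without it, the collapsed repeats in "hello" could never produce a double letter.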
acmtariat  techtariat  org:bleg  nibble  better-explained  machine-learning  deep-learning  visual-understanding  visualization  analysis  let-me-see  research  sequential  audio  classification  model-class  exposition  language  acm  approximation  comparison  markov  iteration-recursion  concept  atoms  distribution  orders  DP  heuristic  optimization  trees  greedy  matching  gradient-descent 
december 2017 by nhaliday
New Theory Cracks Open the Black Box of Deep Learning | Quanta Magazine
A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

sounds like he's just talking about autoencoders?
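For reference, the information bottleneck objective trades compression against prediction: minimize I(X;T) − β·I(T;Y) over representations T of the input X. Both terms are plain discrete mutual informations; a hedged sketch of computing one from a joint table (my illustration, not from the article):

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in bits, given a joint probability table p(a, b)."""
    joint = np.asarray(joint, dtype=float)
    pa = joint.sum(axis=1, keepdims=True)   # marginal p(a), column vector
    pb = joint.sum(axis=0, keepdims=True)   # marginal p(b), row vector
    mask = joint > 0                        # skip zero-probability cells
    return float((joint[mask] * np.log2(joint[mask] / (pa @ pb)[mask])).sum())

# perfectly correlated binary variables carry 1 bit; independent ones carry 0
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))
```

In the IB picture, training drives I(T;Y) up while (per the theory) a later compression phase drives I(X;T) down.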
news  org:mag  org:sci  popsci  announcement  research  deep-learning  machine-learning  acm  information-theory  bits  neuro  model-class  big-surf  frontier  nibble  hmm  signal-noise  deepgoog  expert  ideas  wild-ideas  summary  talks  video  israel  roots  physics  interdisciplinary  ai  intelligence  shannon  giants  arrows  preimage  lifts-projections  composition-decomposition  characterization  markov  gradient-descent  papers  liner-notes  experiment  hi-order-bits  generalization  expert-experience  explanans  org:inst  speedometer 
september 2017 by nhaliday
Correlated Equilibria in Game Theory | Azimuth
Given this, it’s not surprising that Nash equilibria can be hard to find. Last September a paper came out making this precise, in a strong way:

• Yakov Babichenko and Aviad Rubinstein, Communication complexity of approximate Nash equilibria.

The authors show there’s no guaranteed method for players to find even an approximate Nash equilibrium unless they tell each other almost everything about their preferences. This makes the Nash equilibrium prohibitively difficult to find when there are lots of players… in general. There are particular games where it’s not difficult, and that makes these games important: for example, if you’re trying to run a government well. (A laughable notion these days, but still one can hope.)

Klarreich’s article in Quanta gives a nice readable account of this work and also a more practical alternative to the concept of Nash equilibrium. It’s called a ‘correlated equilibrium’, and it was invented by the mathematician Robert Aumann in 1974. You can see an attempt to define it here:
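In a correlated equilibrium, a mediator draws an action profile from a public distribution and privately recommends each player their component; the incentive constraint is that, conditional on any recommendation, no unilateral deviation helps. A sketch of that check on Aumann's classic Chicken example (my code, assuming the standard Chicken payoffs, not anything from the post):

```python
import numpy as np

def is_correlated_equilibrium(dist, payoffs, tol=1e-9):
    """Check Aumann's incentive constraints for a two-player game.
    dist[a1][a2]: probability the mediator recommends profile (a1, a2);
    payoffs[i][a1][a2]: payoff to player i under profile (a1, a2)."""
    dist = np.asarray(dist, float)
    p0, p1 = (np.asarray(p, float) for p in payoffs)
    n1, n2 = dist.shape
    for r in range(n1):          # player 0: recommended row r vs deviation d
        for d in range(n1):
            gain = sum(dist[r, a2] * (p0[d, a2] - p0[r, a2]) for a2 in range(n2))
            if gain > tol:
                return False
    for r in range(n2):          # player 1: recommended column r vs deviation d
        for d in range(n2):
            gain = sum(dist[a1, r] * (p1[a1, d] - p1[a1, r]) for a1 in range(n1))
            if gain > tol:
                return False
    return True

# Chicken: actions Dare (0) and Chicken (1)
p0 = [[0, 7], [2, 6]]                        # row player's payoffs
p1 = [[0, 2], [7, 6]]                        # column player's payoffs
third = 1.0 / 3.0
mediator = [[0, third], [third, third]]      # recommend (D,C), (C,D), (C,C) w.p. 1/3 each
print(is_correlated_equilibrium(mediator, [p0, p1]))
```

This mediated distribution gives each player expected payoff 5, better than the symmetric mixed Nash equilibrium, which is exactly why correlated equilibria are the more practical solution concept.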
baez  org:bleg  nibble  mathtariat  commentary  summary  news  org:mag  org:sci  popsci  equilibrium  GT-101  game-theory  acm  conceptual-vocab  concept  definition  thinking  signaling  coordination  tcs  complexity  communication-complexity  lower-bounds  no-go  liner-notes  big-surf  papers  research  algorithmic-econ  volo-avolo 
july 2017 by nhaliday
How to Escape Saddle Points Efficiently – Off the convex path
A core, emerging problem in nonconvex optimization involves the escape of saddle points. While recent research has shown that gradient descent (GD) generically escapes saddle points asymptotically (see Rong Ge’s and Ben Recht’s blog posts), the critical open problem is one of efficiency — is GD able to move past saddle points quickly, or can it be slowed down significantly? How does the rate of escape scale with the ambient dimensionality? In this post, we describe our recent work with Rong Ge, Praneeth Netrapalli and Sham Kakade, which provides the first provable positive answer to the efficiency question, showing that, rather surprisingly, GD augmented with suitable perturbations escapes saddle points efficiently; indeed, in terms of rate and dimension dependence it is almost as if the saddle points aren’t there!
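A toy illustration of why the perturbation matters (my sketch, not the paper's algorithm): on f(x, y) = x² − y², plain gradient descent started exactly at the saddle (0, 0) has zero gradient and never moves, while one tiny random perturbation puts mass on the negative-curvature direction, which GD then amplifies every step.

```python
import numpy as np

def gd(x0, steps=200, lr=0.1, rng=None, perturb=0.0):
    """Gradient descent on f(x, y) = x**2 - y**2, which has a saddle at the origin.
    If rng is given, add one small Gaussian perturbation before descending."""
    x = np.array(x0, dtype=float)
    if rng is not None:
        x += rng.normal(scale=perturb, size=2)
    for _ in range(steps):
        grad = np.array([2 * x[0], -2 * x[1]])  # gradient of x^2 - y^2
        x -= lr * grad
    return x

f = lambda p: p[0] ** 2 - p[1] ** 2

plain = gd([0.0, 0.0])                                            # stuck at the saddle
noisy = gd([0.0, 0.0], rng=np.random.default_rng(0), perturb=1e-3)  # escapes along y
```

Here the y-coordinate grows by a factor (1 + 2·lr) per step, so even a 10⁻³ perturbation escapes quickly — the geometric amplification is the informal content of the "almost as if the saddle points aren't there" claim.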
acmtariat  org:bleg  nibble  liner-notes  machine-learning  acm  optimization  gradient-descent  local-global  off-convex  time-complexity  random  perturbation  michael-jordan  iterative-methods  research  learning-theory  math.DS  iteration-recursion 
july 2017 by nhaliday
Predicting with confidence: the best machine learning idea you never heard of | Locklin on science
The advantages of conformal prediction are manifold. These ideas assume very little about the thing you are trying to forecast, the tool you’re using to forecast or how the world works, and they still produce a pretty good confidence interval. Even if you’re an unrepentant Bayesian, using some of the machinery of conformal prediction, you can tell when things have gone wrong with your prior. The learners work online, and with some modifications and considerations, with batch learning. One of the nice things about calculating confidence intervals as part of your learning process is that they can actually lower error rates or be used in semi-supervised learning as well. Honestly, I think this is the best bag of tricks since boosting; everyone should know about and use these ideas.

The essential idea is that a “conformity function” exists. Effectively you are constructing a sort of multivariate cumulative distribution function for your machine learning gizmo using the conformity function. Such CDFs exist for classical stuff like ARIMA and linear regression under the correct circumstances; CP brings the idea to machine learning in general, and to models like ARIMA when the standard parametric confidence intervals won’t work. Within the framework, the conformity function, whatever it may be, when used correctly can be guaranteed to give confidence intervals to within a probabilistic tolerance. The original proofs and treatments of conformal prediction, defined for sequences, are extremely computationally inefficient. The conditions can be relaxed in many cases, and the conformity function is in principle arbitrary, though good ones will produce narrower confidence regions. Somewhat confusingly, these good conformity functions are referred to as “efficient”, though they may not be computationally efficient.
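The cheapest practical instantiation is split conformal prediction: fit on a training split, use absolute residuals on a held-out calibration split as conformity scores, and take the ⌈(n+1)(1−α)⌉-th smallest score as the interval half-width. A hedged sketch for 1-D least-squares regression (my code, assuming only exchangeable data; not from the post):

```python
import numpy as np

def split_conformal_interval(x_train, y_train, x_cal, y_cal, x_new, alpha=0.1):
    """Split conformal prediction interval for a new point x_new.
    Gives ~(1 - alpha) marginal coverage under exchangeability."""
    # fit any point predictor on the training split; here, a line
    a, b = np.polyfit(x_train, y_train, 1)
    predict = lambda x: a * x + b
    # conformity scores: absolute residuals on the calibration split
    scores = np.abs(y_cal - predict(x_cal))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))   # conformal quantile index
    q = np.sort(scores)[min(k, n) - 1]        # interval half-width
    y_hat = predict(x_new)
    return y_hat - q, y_hat + q

x_train = np.arange(10.0); y_train = 2 * x_train + 1
x_cal = np.arange(10.0, 20.0); y_cal = 2 * x_cal + 1
lo, hi = split_conformal_interval(x_train, y_train, x_cal, y_cal, 25.0)
```

The coverage guarantee is distribution-free and holds for any point predictor plugged in; a sharper (more "efficient", in the conformal sense) predictor just yields narrower intervals.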
techtariat  acmtariat  acm  machine-learning  bayesian  stats  exposition  research  online-learning  probability  decision-theory  frontier  unsupervised  confidence 
february 2017 by nhaliday
Intelligent Agent Foundations Forum | Online Learning 1: Bias-detecting online learners
apparently can maybe be used to shave an exponent from Christiano's manipulation-resistant reputation system paper
ratty  clever-rats  online-learning  acm  research  ai-control  miri-cfar 
november 2016 by nhaliday


