nhaliday + acm + frontier   10

[1709.06560] Deep Reinforcement Learning that Matters
https://twitter.com/WAWilsonIV/status/912505885565452288
I’ve been experimenting w/ various kinds of value function approaches to RL lately, and it’s striking how primitive and bad things seem to be
At first I thought it was just that my code sucks, but then I played with the OpenAI baselines and nope, it’s the children that are wrong.
And now, what comes across my desk but this fantastic paper: https://arxiv.org/abs/1709.06560 How long until the replication crisis hits AI?

https://twitter.com/WAWilsonIV/status/911318326504153088
Seriously I’m not blown away by the PhDs’ records over the last 30 years. I bet you’d get better payoff funding eccentrics and amateurs.
There are essentially zero fundamentally new ideas in AI, the papers are all grotesquely hyperparameter tuned, nobody knows why it works.

Deep Reinforcement Learning Doesn't Work Yet: https://www.alexirpan.com/2018/02/14/rl-hard.html
Once, on Facebook, I made the following claim.

Whenever someone asks me if reinforcement learning can solve their problem, I tell them it can’t. I think this is right at least 70% of the time.
papers  preprint  machine-learning  acm  frontier  speedometer  deep-learning  realness  replication  state-of-art  survey  reinforcement  multi  twitter  social  discussion  techtariat  ai  nibble  org:mat  unaffiliated  ratty  acmtariat  liner-notes  critique  sample-complexity  cost-benefit  todo 
september 2017 by nhaliday
New Theory Cracks Open the Black Box of Deep Learning | Quanta Magazine
A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

sounds like he's just talking about autoencoders?
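
For reference, the objective the article is describing (my gloss of the Tishby formulation, not taken from the piece): find a compressed representation T of the input X that stays predictive of the label Y,

    \min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)

which differs from a plain autoencoder in that T is only asked to retain what predicts Y, not to reconstruct X.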
news  org:mag  org:sci  popsci  announcement  research  deep-learning  machine-learning  acm  information-theory  bits  neuro  model-class  big-surf  frontier  nibble  hmm  signal-noise  deepgoog  expert  ideas  wild-ideas  summary  talks  video  israel  roots  physics  interdisciplinary  ai  intelligence  shannon  giants  arrows  preimage  lifts-projections  composition-decomposition  characterization  markov  gradient-descent  papers  liner-notes  experiment  hi-order-bits  generalization  expert-experience  explanans  org:inst  speedometer 
september 2017 by nhaliday
Predicting with confidence: the best machine learning idea you never heard of | Locklin on science
The advantages of conformal prediction are manifold. These ideas assume very little about the thing you are trying to forecast, the tool you’re using to forecast, or how the world works, and they still produce a pretty good confidence interval. Even if you’re an unrepentant Bayesian, using some of the machinery of conformal prediction, you can tell when things have gone wrong with your prior. The learners work online, and with some modifications and considerations, with batch learning. One of the nice things about calculating confidence intervals as a part of your learning process is that they can actually lower error rates or be used in semi-supervised learning as well. Honestly, I think this is the best bag of tricks since boosting; everyone should know about and use these ideas.

The essential idea is that a “conformity function” exists. Effectively you are constructing a sort of multivariate cumulative distribution function for your machine learning gizmo using the conformity function. Such CDFs exist for classical stuff like ARIMA and linear regression under the correct circumstances; CP brings the idea to machine learning in general, and to models like ARIMA when the standard parametric confidence intervals won’t work. Within the framework, the conformity function, whatever it may be, when used correctly can be guaranteed to give confidence intervals to within a probabilistic tolerance. The original proofs and treatments of conformal prediction, defined for sequences, are extremely computationally inefficient. The conditions can be relaxed in many cases, and the conformity function is in principle arbitrary, though good ones will produce narrower confidence regions. Somewhat confusingly, these good conformity functions are referred to as “efficient”, though they may not be computationally efficient.
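
A minimal sketch of the split ("inductive") variant of the scheme described above, assuming a scikit-learn regressor and absolute residuals as the conformity function; the function name and model choice here are illustrative, not from the post:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def split_conformal_intervals(X_train, y_train, X_cal, y_cal, X_new, alpha=0.1):
        # Fit any point predictor on the training split.
        model = RandomForestRegressor().fit(X_train, y_train)
        # Conformity scores on a held-out calibration split: absolute residuals.
        scores = np.abs(y_cal - model.predict(X_cal))
        # Finite-sample-corrected quantile; gives ~(1 - alpha) coverage
        # whenever calibration and test points are exchangeable.
        n = len(scores)
        q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
        preds = model.predict(X_new)
        return preds - q, preds + q

The coverage guarantee needs only exchangeability, not a correct model, which is why so little has to be assumed about the forecaster or about how the world works.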
techtariat  acmtariat  acm  machine-learning  bayesian  stats  exposition  research  online-learning  probability  decision-theory  frontier  unsupervised  confidence 
february 2017 by nhaliday
CS229T/STATS231: Statistical Learning Theory
Course by Percy Liang covers a mix of statistics, computational learning theory, and some online learning. Also surveys the state-of-the-art in theoretical understanding of deep learning (not much to cover unfortunately).
yoga  stanford  course  machine-learning  stats  👳  lecture-notes  acm  kernels  learning-theory  deep-learning  frontier  init  ground-up  unit  dimensionality  vc-dimension  entropy-like  extrema  moments  online-learning  bandits  p:***  explore-exploit  advanced 
june 2016 by nhaliday
