nhaliday + acm + ai   4

[1709.06560] Deep Reinforcement Learning that Matters
I’ve been experimenting w/ various kinds of value function approaches to RL lately, and it’s striking how primitive and bad things seem to be
At first I thought it was just that my code sucks, but then I played with the OpenAI baselines and nope, it’s the children that are wrong.
And now, what comes across my desk but this fantastic paper: https://arxiv.org/abs/1709.06560 How long until the replication crisis hits AI?

Seriously I’m not blown away by the PhDs’ records over the last 30 years. I bet you’d get better payoff funding eccentrics and amateurs.
There are essentially zero fundamentally new ideas in AI; the papers are all grotesquely hyperparameter-tuned, and nobody knows why it works.
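The replication complaint above is exactly what arXiv:1709.06560 documents: deep RL results can swing substantially just from changing the random seed, with hyperparameters held fixed. A minimal toy stand-in (a two-armed bandit with epsilon-greedy value estimation, not deep RL — the variance is far milder here, but the mechanism is the same): the algorithm and all hyperparameters are identical across runs, only the seed changes, and the reported score still varies.

```python
import random

def run_bandit(seed, epsilon, steps=2000):
    # Two-armed bandit: arm 1 pays mean 1.0, arm 0 pays mean 0.5,
    # both with Gaussian noise (std 1.0). Hypothetical toy problem.
    rng = random.Random(seed)
    q = [0.0, 0.0]   # incremental value estimates per arm
    n = [0, 0]       # pull counts per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(2)              # explore
        else:
            a = 0 if q[0] >= q[1] else 1      # exploit current estimate
        reward = rng.gauss(0.5 if a == 0 else 1.0, 1.0)
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]        # running-mean update
        total += reward
    return total / steps

# Same algorithm, same hyperparameters -- only the seed differs.
scores = [run_bandit(seed, epsilon=0.1) for seed in range(5)]
print(min(scores), max(scores))
```

In deep RL the same seed-only experiment produces gaps large enough to flip which algorithm "wins", which is the paper's point about evaluating over many seeds rather than cherry-picking one.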

Deep Reinforcement Learning Doesn't Work Yet: https://www.alexirpan.com/2018/02/14/rl-hard.html
Once, on Facebook, I made the following claim.

Whenever someone asks me if reinforcement learning can solve their problem, I tell them it can’t. I think this is right at least 70% of the time.
papers  preprint  machine-learning  acm  frontier  speedometer  deep-learning  realness  replication  state-of-art  survey  reinforcement  multi  twitter  social  discussion  techtariat  ai  nibble  org:mat  unaffiliated  ratty  acmtariat  liner-notes  critique  sample-complexity  cost-benefit  todo 
september 2017 by nhaliday
New Theory Cracks Open the Black Box of Deep Learning | Quanta Magazine
A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

sounds like he's just talking about autoencoders?
news  org:mag  org:sci  popsci  announcement  research  deep-learning  machine-learning  acm  information-theory  bits  neuro  model-class  big-surf  frontier  nibble  hmm  signal-noise  deepgoog  expert  ideas  wild-ideas  summary  talks  video  israel  roots  physics  interdisciplinary  ai  intelligence  shannon  giants  arrows  preimage  lifts-projections  composition-decomposition  characterization  markov  gradient-descent  papers  liner-notes  experiment  hi-order-bits  generalization  expert-experience  explanans  org:inst  speedometer 
september 2017 by nhaliday
