machine-learning   19517

[1709.06009] Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been receiving increasing attention from the scientific community, leading to some high-profile success stories such as the much publicized Deep Q-Networks (DQN). In this article we take a big picture look at how the ALE is being used by the research community. We show how diverse the evaluation methodologies in the ALE have become with time, and highlight some key concerns when evaluating agents in the ALE. We use this discussion to present some methodological best practices and provide new benchmark results using these best practices. To further the progress in the field, we introduce a new version of the ALE that supports multiple game modes and provides a form of stochasticity we call sticky actions. We conclude this big picture look by revisiting challenges posed when the ALE was introduced, summarizing the state-of-the-art in various problems and highlighting problems that remain open.
machine-learning  benchmarking  games  to-write-about  nudge-targets  consider:performance-measures 
yesterday by Vaguery
[1611.00862] Quantile Reinforcement Learning
In reinforcement learning, the standard criterion to evaluate policies in a state is the expectation of the (discounted) sum of rewards. However, this criterion may not always be suitable, so we consider an alternative criterion based on the notion of quantiles. In the case of episodic reinforcement learning problems, we propose an algorithm based on stochastic approximation with two timescales. We evaluate our proposal on a simple model of the TV show Who Wants to Be a Millionaire.
machine-learning  performance-measure  representation  what-gets-measured-gets-fudged  to-write-about  nudge-targets 
yesterday by Vaguery
Training data creation and management system focused on information extraction
machine-learning  information-extraction  training-data 
yesterday by mjlassila
Weak Supervision
Discusses the problem of creating a sufficient amount of training data for deep learning.
machine-learning  theory  data-programming  snorkel  training 
yesterday by mjlassila
A Beginner’s Guide to AI/ML 🤖👶 – Machine Learning for Humans – Medium
The ultimate guide to machine learning. Simple, plain-English explanations accompanied by math, code, and real-world examples.
machinelearning  ai  machine-learning 
yesterday by andrewmarsh
