reinforcement   682

Commentary: Predictions and the brain: how musical sounds become rewarding
Did I just learn something big?

Prerecorded music has ABSOLUTELY NO SURVIVAL reward. Zero. It does not help with procreation (well, unless you're the one making the music, then you get endless sex) and it does not help with individual survival.
As such, one must seriously self-test (n=1) whether prerecorded music actually holds you back.
If you're reading this and you try no music for 2 weeks and fail, hit me up. I have some mind-blowing stuff to show you about how you can control others with
study  psychology  cog-psych  yvain  ssc  models  speculation  music  art  aesthetics  evolution  evopsych  accuracy  meta:prediction  neuro  neuro-nitgrit  neurons  error  roots  intricacy  hmm  wire-guided  machiavelli  dark-arts  predictive-processing  reinforcement 
11 days ago by nhaliday
Simple Reinforcement Learning with Tensorflow Part 1.5: Contextual Bandits
In Part 1 of my Simple RL series, we introduced the field of Reinforcement Learning, and I demonstrated how to build an agent which can solve the multi-armed bandit problem. In that situation, there…
deep  learning  reinforcement  bandit 
27 days ago by patmcnally
