41 bookmarks. First posted by nirum 25 days ago.
RL is devoted to estimate this credit-assignment problem, and great progress has been made in recent years. However, credit assignment is still difficult when the reward signals are sparse. In the real world, rewards can be sparse and noisy. Sometimes we are given just a single reward, like a bonus check at the end of the year, and depending on our employer, it may be difficult to figure out exactly why it is so low. For these problems, rather than rely on a very noisy and possibly meaningless gradient estimate of the future to our policy, we might as well just ignore any gradient information, and attempt to use black-box optimisation techniques such as genetic algorithms (GA) or ES.evolutionary geneticalgorithms AI reinforcementlearning deeplearning visualization algorithms tweetit
25 days ago by sachaa
tags1117 111217 ai algorithm algorithms biomimetics data-science deeplearning evolution evolutionary-algorithms evolutionary feedly genetic geneticalgorithm geneticalgorithms geneticprogramming genetic_algorithms ifttt infovis learning machine-learning machine machinelearning machine_learning ml neuralnet neuralnets neuralnetwork optimisation programming programming_evolution reading reinforcement-learning reinforcementlearning statistics teaching theory ththlink tweetit twitter visualization