policy optimization

Learning by Doing, This Time With Neural Networks

Read more

It's easy enough to navigate a 16x16 maze with tables and some dynamic programming, but how exactly do we extend that to play video games with millions of pixels as input, or board games like Go with more states than particals in the observable universe? The answer, as it often is, is deep reinforcement learning.