I recently started learning deep reinforcement learning and decided to build an open-source deep RL framework called ReiLS. I implemented a few popular actor-critic methods (DDPG, A3C, and the more recent PPO) and soon turned my attention to TRPO. The difficulty with TRPO is that it uses natural gradients, as oppo...
09 Jun 2018 - 7 minute read