Importance of learning rate when running Deep cross-entropy method

Today I have been trying to re-run my code and confirm that I can get 195 points playing CartPole. According to the OpenAI website, CartPole-v0 defines “solving” as getting an average reward of 195.0 over 100 consecutive trials.

It was actually quite challenging to figure out why my model was not training at all and kept predicting 50% probabilities for both actions over multiple iterations.

Initially I suspected issues in my solver (and I did find one), but in the end I realized that the learning rate was simply too small.

I was using the Adam optimiser with its default learning rate of 0.001, which is perfectly fine for most tasks. Here, though, I suspect I had to increase the learning rate to 0.01 because of the relatively small batch: with so few samples per iteration, steps of size 0.001 moved the policy too slowly to make any visible progress.
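For reference, this is roughly what the change looks like in MXNet.jl. It is a minimal sketch assuming a small two-layer policy network; the layer names and sizes are illustrative, not my exact model:

    using MXNet

    # A tiny policy network for CartPole: 4 observations in,
    # 2 action probabilities out (sizes here are illustrative).
    net = @mx.chain mx.Variable(:data)                =>
          mx.FullyConnected(name=:fc1, num_hidden=64) =>
          mx.Activation(name=:relu1, act_type=:relu)  =>
          mx.FullyConnected(name=:fc2, num_hidden=2)  =>
          mx.SoftmaxOutput(name=:softmax)

    model = mx.FeedForward(net, context=mx.cpu())

    # The fix: raise ADAM's learning rate from the 0.001 default to 0.01
    # to compensate for the relatively small batch per iteration.
    optimizer = mx.ADAM(lr=0.01)

Training then proceeds with mx.fit(model, optimizer, train_provider), where train_provider stands in for whatever data provider feeds the training pairs.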

Anyway, the problem is solved and CartPole is running great!


Playing CartPole with Deep cross-entropy method using Julia and MXNet

Hello and Happy New Year!

Recently I’ve been playing FrozenLake using the cross-entropy method implemented in Julia, but this time I have made the task more challenging and implemented the deep cross-entropy method using MXNet in order to play CartPole.
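In a nutshell, the idea is: play a batch of episodes, keep only the best-scoring ones, and train the network to imitate their actions. Here is a minimal sketch of that elite-selection step, assuming a simple episode type and a 70th-percentile cutoff (both illustrative rather than my actual implementation):

    using Statistics

    struct Episode
        observations::Vector{Vector{Float64}}
        actions::Vector{Int}
        reward::Float64
    end

    # Keep only episodes whose total reward clears the percentile cutoff;
    # the network is then trained on their (observation, action) pairs
    # as if they were supervised labels.
    function select_elites(batch::Vector{Episode}, percentile = 70)
        cutoff = quantile([ep.reward for ep in batch], percentile / 100)
        return filter(ep -> ep.reward >= cutoff, batch)
    end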

