Finally! I did it! I’ve been struggling for some time trying to make DQN work and could not succeed.
Today I managed to make it work and solve CartPole from OpenAI Gym using DQN. You know what the problem was? The size of my neural network!
I had been trying an MLP with hidden layers such as [10, 5] (and other combinations), but it only worked once I switched to a single hidden layer with two neurons! UPDATE: As I also wrote below, the MLP functionality in MXNet.jl failed to work. I implemented a custom NN with one hidden layer and a tanh activation function, which converged extremely well.
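For reference, the network that finally worked is tiny. Here is a minimal NumPy sketch of its forward pass (illustrative only: my actual code is Julia with MXNet.jl, and the weight values here are random placeholders, not trained ones):

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """One hidden layer with tanh activation, linear output (Q-values)."""
    h = np.tanh(x @ W1 + b1)   # hidden layer: 2 tanh units
    return h @ W2 + b2         # linear output: one Q-value per action

rng = np.random.default_rng(0)
# CartPole: 4-dimensional observation, 2 discrete actions
W1 = rng.normal(scale=0.1, size=(4, 2)); b1 = np.zeros(2)
W2 = rng.normal(scale=0.1, size=(2, 2)); b2 = np.zeros(2)

obs = np.array([0.01, -0.02, 0.03, 0.04])  # example observation
q = forward(obs, W1, b1, W2, b2)
print(q.shape)  # (2,)
```

The whole function approximator has only about a dozen parameters, which is apparently enough capacity for CartPole.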
If the hidden layer is removed entirely, it reaches 100/100 points within the first epoch, which is also cool!
Another issue I experienced was with the MXNet implementation in Julia. For some reason, MXNet.jl fails when the batch size is one, so I had to duplicate the training data to make the batch size two. This also affected the learning rate, which I had to decrease.
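The workaround amounts to something like this (again an illustrative NumPy sketch, not the actual Julia code):

```python
import numpy as np

x = np.array([[0.01, -0.02, 0.03, 0.04]])   # single training example, shape (1, 4)
y = np.array([[1.0, 0.0]])                  # its regression target, shape (1, 2)

# Duplicate the example so the effective batch size is two.
x_batch = np.repeat(x, 2, axis=0)           # shape (2, 4)
y_batch = np.repeat(y, 2, axis=0)           # shape (2, 2)
```

With a loss averaged over the batch, duplicating an example leaves the gradient unchanged; if the framework sums over the batch instead, the gradient doubles, which may be why the learning rate needed lowering.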
FYI: I have since realised this should not be the case, and the network should also converge with more layers in place. I will check whether everything works correctly on some simple examples.
P. P. S. I have found that the built-in mx.MLP function does not work well. Defining the network structure manually solves the problem and allows multiple hidden layers. I haven’t had time to do any extensive research yet.
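Defining the network manually also makes it trivial to stack extra hidden layers. A generic sketch of the idea (NumPy again, purely illustrative; sizes and initialisation are placeholders):

```python
import numpy as np

def mlp_forward(x, layers):
    """layers: list of (W, b, activation-or-None), applied in order."""
    for W, b, act in layers:
        x = x @ W + b
        if act is not None:
            x = act(x)
    return x

rng = np.random.default_rng(1)
sizes = [4, 10, 5, 2]                  # e.g. the [10, 5] hidden layout I tried first
layers = []
for i in range(len(sizes) - 1):
    W = rng.normal(scale=0.1, size=(sizes[i], sizes[i + 1]))
    b = np.zeros(sizes[i + 1])
    act = np.tanh if i < len(sizes) - 2 else None   # linear output layer
    layers.append((W, b, act))

q = mlp_forward(np.zeros(4), layers)
print(q.shape)  # (2,)
```

The same layer-by-layer structure is what I ended up writing by hand instead of calling mx.MLP.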