Hello and happy new year!
Recently I’ve been playing FrozenLake using the cross-entropy method implemented in Julia. This time I have made the task more complicated and implemented the deep cross-entropy method using MXNet in order to play CartPole.
You might ask: what’s the difference?
First of all, in FrozenLake, Taxi, or any similar game the state is described by a single integer value, but in CartPole it is described by 4 float values. That makes it impossible to build a simple lookup table mapping states to actions, so we need a more sophisticated approach.
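To make the difference concrete, here is a minimal sketch (in Python, for illustration; the state shapes match the Gym versions of FrozenLake and CartPole):

```python
# FrozenLake: the state is a single integer (0..15 on the 4x4 map),
# so a policy can be a plain lookup table from state to action.
tabular_policy = {state: 0 for state in range(16)}  # state -> action
action = tabular_policy[5]

# CartPole: the state is 4 continuous floats
# (cart position, cart velocity, pole angle, pole angular velocity),
# so it cannot index a finite table -- we need a function approximator.
cartpole_state = [0.01, -0.02, 0.03, 0.04]
```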
This is where neural networks come into play. I had two options: code everything myself or use one of the deep learning frameworks. For this specific task I decided to use MXNet with the following setup:
- A multi-class classification problem. CartPole has only 2 actions, but other games might have more.
- A simple MLP-style network with 24 and 48 neurons in the hidden layers. The number of neurons and layers can be supplied as an input parameter in the launch file.
- ReLU activation function.
- Uniform initialization on start.
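The setup above can be sketched as a plain-NumPy forward pass (a sketch for illustration, not the actual MXNet code; the layer sizes 4 → 24 → 48 → 2 follow the list above, and the uniform initialization range is an assumed example value):

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_init(n_in, n_out, scale=0.07):
    # Uniform initialization on start (the scale is an assumed example).
    return rng.uniform(-scale, scale, size=(n_in, n_out))

# 4 input features -> 24 -> 48 hidden neurons -> 2 actions (CartPole).
W1, b1 = uniform_init(4, 24), np.zeros(24)
W2, b2 = uniform_init(24, 48), np.zeros(48)
W3, b3 = uniform_init(48, 2), np.zeros(2)

def relu(x):
    return np.maximum(x, 0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def forward(state):
    h1 = relu(state @ W1 + b1)   # first hidden layer, ReLU activation
    h2 = relu(h1 @ W2 + b2)      # second hidden layer, ReLU activation
    return softmax(h2 @ W3 + b3) # probabilities over the actions

probs = forward(np.array([0.01, -0.02, 0.03, 0.04]))
```

The softmax output lets the agent sample actions in proportion to the network's confidence, which is what makes the multi-class formulation work for games with more than 2 actions.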
The code itself inherits a lot from the cross-entropy implementation. The major changes are support for states of any size and management of the neural network.
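The inherited cross-entropy part works the same way as in the tabular version: play a batch of episodes, keep the "elite" ones whose total reward is above a percentile threshold, and train the network on their (state, action) pairs. A minimal sketch of the elite-selection step (in Python for illustration; the percentile value is an assumed example, and `select_elites` is a hypothetical name, not the actual function in the repo):

```python
import numpy as np

def select_elites(episodes, percentile=70):
    """episodes: list of (states, actions, total_reward) tuples.
    Returns the (state, action) pairs from episodes whose total
    reward is at or above the given reward percentile."""
    rewards = [r for _, _, r in episodes]
    threshold = np.percentile(rewards, percentile)
    elite_states, elite_actions = [], []
    for states, actions, r in episodes:
        if r >= threshold:
            elite_states.extend(states)
            elite_actions.extend(actions)
    return elite_states, elite_actions
```

The network is then fit on the elite pairs as a supervised classification problem, which is why the multi-class setup above fits naturally.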
Another interesting thing about my implementation: we don’t really need GPUs for this task, but I succeeded in running MXNet on a GPU and cut the training time in half.
Practical RL: https://github.com/dmitrijsc/practical-rl