I have been so busy writing my book on Julia that I could not dedicate time to finalizing Space Invaders.
Previously I managed to train the model to reach a reward of about 200, but after I implemented a few changes, it dropped back to the level of a random policy and stalled at around 165.
I have worked on a number of changes to the environment and model:
- I have implemented a 3-channel model, where each channel corresponds to a frame.
- I have used the DQN definition from the NIPS 2013 paper.
- I have tried shrinking the frame size, taking every second pixel on both the x-axis and the y-axis, but it did not work well. The current implementation takes every second pixel on the y-axis and every pixel on the x-axis.
- I have tried retraining the model after every frame, mixing in frames from the past. I have since changed the implementation: my update interval is now 5, which means I update the model on every fifth frame.
- I am clipping the predictions so that they never fall below 0.
- I am also treating every reward I get as either 0 or 1.
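
To make the list above more concrete, here is a minimal sketch of the preprocessing pieces in plain Python. It is not my actual training code: all names (`downsample`, `FrameStack`, `clip_reward`, `clip_prediction`, `should_update`) are made up for illustration, frames are assumed to be nested lists of pixel values, the reward mapping assumes that any positive reward becomes 1, and the frame counter is assumed to start at 1.

```python
from collections import deque

FRAME_STACK = 3       # one channel per frame
UPDATE_INTERVAL = 5   # update the model on every fifth frame


def downsample(frame):
    """Keep every second row (y-axis) and every pixel in each row (x-axis)."""
    return [row[:] for row in frame[::2]]


def clip_reward(reward):
    """Assumed mapping: any positive reward becomes 1, everything else 0."""
    return 1 if reward > 0 else 0


def clip_prediction(q_values):
    """Clip predicted values so that none of them falls below 0."""
    return [max(0.0, q) for q in q_values]


class FrameStack:
    """Keeps the most recent frames; each stored frame is one input channel."""

    def __init__(self, size=FRAME_STACK):
        self.frames = deque(maxlen=size)

    def push(self, frame):
        small = downsample(frame)
        # On the very first frame, pad with copies so the state is always full.
        while len(self.frames) < self.frames.maxlen - 1:
            self.frames.append(small)
        # The deque drops the oldest frame automatically once it is full.
        self.frames.append(small)
        return list(self.frames)


def should_update(frame_number, interval=UPDATE_INTERVAL):
    """True on frames 5, 10, 15, ... when counting frames from 1."""
    return frame_number % interval == 0
```

In a training loop, each new frame would go through `FrameStack.push` to build the 3-channel state, the reward through `clip_reward`, and a model update would run only when `should_update` returns true.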
My model is in training. It has been 4500 episodes, and I am over 200 in reward. I will keep training it longer to see if that works.
Will be checking the code shortly!