In parallel with taking the Practical RL course, I am also reading a great book on reinforcement learning. I found this quote worth noting down on the blog; it explains the difference between evolutionary methods and methods that learn value functions.
For example, if the player wins, then all of its behavior in the game is given credit, independently of how specific moves might have been critical to the win. Credit is even given to moves that never occurred! Value function methods, in contrast, allow individual states to be evaluated. In the end, evolutionary and value function methods both search the space of policies, but learning a value function takes advantage of information available during the course of play.
Our topic for today is taking a random policy and enhancing it with genetic/evolutionary algorithms to score in different versions of FrozenLake.
About FrozenLake, from the OpenAI Gym documentation:
The agent controls the movement of a character in a grid world. Some tiles of the grid are walkable, and others lead to the agent falling into the water. Additionally, the movement direction of the agent is uncertain and only partially depends on the chosen direction. The agent is rewarded for finding a walkable path to a goal tile.
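To make the idea concrete, here is a minimal sketch of evolving a tabular policy by random mutation and selection. It is an illustration only, not the course's solution: to keep it self-contained it uses a hand-coded, deterministic 4x4 FrozenLake-style grid instead of the Gym environment (the real environment is slippery, so movement there is stochastic), and the map layout, mutation rate, and population size are all assumptions chosen for the example.

```python
import random

# Hand-coded 4x4 FrozenLake-style map (assumption: deterministic moves,
# unlike the slippery Gym version). S=start, F=frozen, H=hole, G=goal.
MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
N = 4
# Actions: 0=left, 1=down, 2=right, 3=up
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

def run_episode(policy, max_steps=100):
    """Return 1.0 if the policy reaches the goal, else 0.0."""
    r, c = 0, 0
    for _ in range(max_steps):
        dr, dc = MOVES[policy[r * N + c]]
        r = min(max(r + dr, 0), N - 1)  # clip moves at the grid edge
        c = min(max(c + dc, 0), N - 1)
        tile = MAP[r][c]
        if tile == "H":
            return 0.0
        if tile == "G":
            return 1.0
    return 0.0  # wandered without reaching the goal

def random_policy():
    """One action per grid cell, chosen uniformly at random."""
    return [random.randrange(4) for _ in range(N * N)]

def mutate(policy, p=0.1):
    """Resample each action with probability p (mutation rate is an assumption)."""
    return [random.randrange(4) if random.random() < p else a for a in policy]

def evolve(generations=50, pop_size=20):
    """Keep the best quarter of each generation, refill with mutated elites."""
    population = [random_policy() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=run_episode, reverse=True)
        elite = ranked[: pop_size // 4]
        population = elite + [
            mutate(random.choice(elite)) for _ in range(pop_size - len(elite))
        ]
    return max(population, key=run_episode)

random.seed(0)
best = evolve()
print("score of evolved policy:", run_episode(best))
```

Note that the fitness signal here is binary (win or lose), which is exactly the credit-assignment weakness the quote above describes: every action in a winning policy gets credit, whether or not it mattered.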
My name is Dmitry, and my goal is to sharpen my reinforcement learning skills. To achieve it, I will work through the Practical RL course on GitHub and solve all of its exercises.
The course provides 10 weeks of lectures and videos, so it should take me at least 3-4 months from start to finish. I will be sharing my code and the logic behind it. Improvements and PRs are welcome!