This paper focuses on applying reinforcement learning methods to solve the game Sokoban. The game is a popular puzzle that is relatively easy for humans to solve. However, it poses a significant challenge for computer algorithms because certain moves are irreversible. Predicting which actions will lead to such undesirable states is often difficult for a learning agent, a common problem in tasks that require planning. To address this issue, we propose using a Monte Carlo tree search (MCTS) algorithm together with a heuristic convolutional neural network (CNN) specially trained to separate undesirable, neutral, and desired game states. We experimented with different heuristic variations of the algorithms and compared them against each other. We implemented MCTS in two setups: one with a CNN trained on data obtained during the solving process, and one without such training. We also varied the number of rollouts for each move in MCTS and compared the results. The paper's research question was how to improve the performance of learning agents in tasks that require planning to avoid unwanted states.
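To make the notion of an irreversible move concrete, the sketch below checks for the simplest kind of undesirable Sokoban state: a box pushed into a wall corner that is not a goal can never be moved again. This is only an illustrative hand-written rule, not the CNN heuristic or the MCTS setup described in this paper. The function names and the ASCII level encoding (`#` wall, `$` box, `.` goal, `*` box on goal, `@` player, `+` player on goal) are assumptions chosen for the example.

```python
from typing import List, Set, Tuple

Pos = Tuple[int, int]  # (row, column)


def parse_level(rows: List[str]) -> Tuple[Set[Pos], Set[Pos], Set[Pos]]:
    """Split an ASCII level into wall, box, and goal position sets.

    Assumed encoding: '#' wall, '$' box, '.' goal, '*' box on a goal,
    '@' player, '+' player on a goal.
    """
    walls: Set[Pos] = set()
    boxes: Set[Pos] = set()
    goals: Set[Pos] = set()
    for r, row in enumerate(rows):
        for c, ch in enumerate(row):
            if ch == "#":
                walls.add((r, c))
            if ch in ("$", "*"):
                boxes.add((r, c))
            if ch in (".", "*", "+"):
                goals.add((r, c))
    return walls, boxes, goals


def is_corner_deadlock(box: Pos, walls: Set[Pos], goals: Set[Pos]) -> bool:
    """Return True if the box sits in a wall corner that is not a goal.

    A wall directly above or below rules out every vertical push, and a
    wall directly left or right rules out every horizontal push, so a box
    blocked on both axes is stuck for the rest of the game.
    """
    if box in goals:
        return False  # a box resting on a goal is fine even if immovable
    r, c = box
    blocked_vertically = (r - 1, c) in walls or (r + 1, c) in walls
    blocked_horizontally = (r, c - 1) in walls or (r, c + 1) in walls
    return blocked_vertically and blocked_horizontally


def has_corner_deadlock(rows: List[str]) -> bool:
    """Return True if any box in the level is corner-deadlocked."""
    walls, boxes, goals = parse_level(rows)
    return any(is_corner_deadlock(box, walls, goals) for box in boxes)
```

A rule like this only catches the most obvious dead ends; many undesirable states involve several boxes blocking one another, which is one reason a learned classifier of game states is attractive for this task.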