Game Solving with Online Fine-Tuning

November 13, 2023 · Entered Twilight · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: .autopep8, .clang-format, .githooks, .gitignore, .gitmodules, .opening.txt, CMakeLists.txt, README.md, chat, config, docs, game_solver, minizero, scripts, training

Authors Ti-Rong Wu, Hung Guei, Ting Han Wei, Chung-Chin Shih, Jui-Te Chin, I-Chen Wu arXiv ID 2311.07178 Category cs.AI: Artificial Intelligence Cross-listed cs.GT, cs.LG Citations 3 Venue Neural Information Processing Systems Repository https://github.com/rlglab/online-fine-tuning-solver ⭐ 6 Last Checked 6 days ago

Abstract

Game solving is a similar, yet more difficult task than mastering a game. Solving a game typically means to find the game-theoretic value (outcome given optimal play), and optionally a full strategy to follow in order to achieve that outcome. The AlphaZero algorithm has demonstrated super-human level play, and its powerful policy and value predictions have also served as heuristics in game solving. However, to solve a game and obtain a full strategy, a winning response must be found for all possible moves by the losing player. This includes very poor lines of play from the losing side, for which the AlphaZero self-play process will not encounter. AlphaZero-based heuristics can be highly inaccurate when evaluating these out-of-distribution positions, which occur throughout the entire search. To address this issue, this paper investigates applying online fine-tuning while searching and proposes two methods to learn tailor-designed heuristics for game solving. Our experiments show that using online fine-tuning can solve a series of challenging 7x7 Killall-Go problems, using only 23.54% of computation time compared to the baseline without online fine-tuning. Results suggest that the savings scale with problem size. Our method can further be extended to any tree search algorithm for problem solving. Our code is available at https://rlg.iis.sinica.edu.tw/papers/neurips2023-online-fine-tuning-solver.