GuiNN Checkers 2.04
by Jonathan Kreuzer
Nov 21, 2020
At long last, an update to Gui Checkers!
It's now called GuiNN, because it features a neural net evaluation in addition to having its own graphical user interface.
The source code has also had a major overhaul and a lot of clean-up. I had become embarrassed by the old code, so now it's at least in a better state even though it's still messy in places.
After a first pass it's very strong, and initial progress was very quick, but it's still far from unbeatable.
Zip : GuiNN-Checkers204.zip (1.5MB updated Nov 21, 2020)
gui64-avx.exe - For newer 64-bit processors that support AVX2. 30% faster.
gui64.exe - For older 64-bit processors.
gui64-avx.dll - For use in CheckerBoard on processors supporting AVX2.
gui64.dll - For use in CheckerBoard.
nets.gnn - The neural net weights used to evaluate the board.
This file *must* be in the same directory as exe or dll to play properly.
database.jef - 4-piece endgame bitbases (W/L/D). This also must be in the same directory as the exe or dll to be used.
Source Code : Github Link
Source License : Creative Commons Attribution-NonCommercial-ShareAlike.
My hope in releasing the source is that it might be useful as a learning tool, or as something to experiment with, have fun with, and create more free software with, so I don't consider this license strict. However, it does mean that if you are planning a commercial project or want to distribute a closed-source project,
you should avoid basing any part of it on this source code. Otherwise, feel free to experiment and alter as you wish!
GuiNN is based on my previous Gui Checkers 1.10, and includes :
Art by Josh Hess
code contributions by Ed Trice and Ed Gilbert
Gui Checkers - the old web page and program
Bitboard tutorial - an old tutorial page of mine
Automated tuning/learning in chess - my more recent work with chess.
Elo Difference Calculator - My online elo rating calculator.
Ed Gilbert's page - Download CheckerBoard interface to play matches between checkers engines & the very strong KingsRow engine.
Martin Fierz's page - Cake, a very strong engine I used in training. (Martin is also the original author of CheckerBoard, but I used the 64-bit one.)
GuiNN 2.04 vs Gui 1.1, 0.2s per move: 1169 wins, 31 losses, 1680 draws (+145 elo). Download Games (pdn) (3.5 MB)
GuiNN 2.04 vs Cake 1.88, 0.2s per move: 339 wins, 137 losses, 3556 draws (+17 elo). Download Games (pdn) (4 MB)
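The elo figures above follow from the win/loss/draw totals through the usual logistic elo formula. Here is a minimal Python sketch of that arithmetic; the function name `elo_difference` is made up for this example, and it is not necessarily how the online calculator linked above is implemented.

```python
import math

def elo_difference(wins, losses, draws):
    """Elo difference implied by a match score, using the standard logistic model."""
    games = wins + losses + draws
    # A draw counts as half a point.
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)

# The two match totals reported above.
vs_gui = elo_difference(1169, 31, 1680)
vs_cake = elo_difference(339, 137, 3556)
print(round(vs_gui), round(vs_cake))
```

With the totals above this rounds to +145 and +17, matching the reported numbers.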
Checkers is very drawish with strong evenly matched opponents, so the match conditions were chosen to reduce drawishness for better testing,
and to concentrate on the evaluation and search.
Given how much of the game a large endgame and opening database can cover in checkers, it's good to acknowledge that without large databases this was more about using checkers
to test general methods than about creating the strongest program. (And even with the same conditions for both sides, this testing is unfair to programs that have large databases available and aren't learning from the match.)
Given the much weaker starting point, I consider the learning quite successful, with a win versus Cake 1.88 in these somewhat unfair conditions.
For openings I played matches using the forced 3-move openings. I think this is the most common way to force more
variety into checkers play.
144 openings played for each color = 288 games per test.
288 games is far too few to get a reasonably accurate strength measurement. I was surprised at how much match results varied even with no book.
So I would usually play 10 matches of 288 games in parallel and combine the results.
- The programs used their own 4-piece endgame databases.
- Opening books were off.
- Time control was 0.2 seconds per move. I chose this because it resulted in enough decisive games to give a decent elo-change measurement, and made the training process quicker.
I could have gone even quicker, but it was already so fast it was hard to follow or notice any issues in the games as they played out.
- I used gui64-avx.dll
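To give a feel for why a single 288-game match is too noisy, here is a rough Python sketch that estimates a 95% interval for the elo difference from win/loss/draw counts. It treats every game as independent, which is only an approximation (each opening is played once with each color), and the function names are invented for this example.

```python
import math

def elo_from_score(score):
    """Elo difference for a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, losses, draws, games=None, z=1.96):
    """Approximate elo interval from W/L/D counts.

    If `games` is given, the interval is rescaled to a hypothetical match of
    that length with the same per-game win/loss/draw rates.
    """
    total = wins + losses + draws
    mean = (wins + 0.5 * draws) / total
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * (0.0 - mean) ** 2) / total
    stderr = math.sqrt(variance / (games if games is not None else total))
    return elo_from_score(mean - z * stderr), elo_from_score(mean + z * stderr)

# Combined result versus Cake, and the same rates over one 288-game match.
combined = elo_interval(339, 137, 3556)
single = elo_interval(339, 137, 3556, games=288)
print(combined, single)
```

Under these assumptions the combined 4032-game result comes out at roughly ±4 elo, while a single 288-game match at the same rates is closer to ±14 elo, which is larger than many of the changes being measured.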
Neural Nets & Training Process
GuiNN uses 4 rather small neural nets, and based on game stage it will choose one of these nets to evaluate the board.
Each net uses the same setup; I didn't experiment much with changes here. Smaller will be quicker to evaluate, but bigger will be more accurate.
(Note : In the beginning before you have enough data, bigger will just be overfitted to the limited data and probably be worse.)
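As a sketch of the "one net per game stage" idea, the Python below picks a net by total piece count. This is an assumption made for illustration: the text above does not say how the stages are defined, and the thresholds and names (`STAGE_MIN_PIECES`, `select_net_index`, `evaluate`) are hypothetical.

```python
# Hypothetical stage boundaries by total pieces on the board (24 at the start).
# Net 0 is used for the fullest boards, net 3 for the emptiest.
STAGE_MIN_PIECES = [19, 13, 7, 0]

def select_net_index(total_pieces):
    """Return the index of the net whose stage covers this piece count."""
    for index, minimum in enumerate(STAGE_MIN_PIECES):
        if total_pieces >= minimum:
            return index
    raise ValueError("piece count must be non-negative")

def evaluate(nets, inputs, total_pieces):
    """Evaluate a position with the net chosen for its game stage."""
    return nets[select_net_index(total_pieces)](inputs)

# Stand-in "nets" that just report which one was called.
nets = [lambda inputs, i=i: i for i in range(4)]
print(evaluate(nets, [], 24), evaluate(nets, [], 6))
```

Splitting by stage keeps each net small while letting it specialise, at the cost of needing enough training positions in every stage.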
This is the network structure :
whiteInputCount = 32 + 28; // 32 king squares, 28 checker squares (a checker never sits on its promotion row)
blackInputCount = 32 + 28;
network.SetInputCount(whiteInputCount + blackInputCount + 1); // +1 for side-to-move
network.AddLayer(192, AT_RELU, LT_INPUT_TO_OUTPUTS);
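To make the 121-input layout above concrete, here is a Python sketch of one way a position could be turned into that input vector. It is not the engine's C++ code. The square numbering (0 to 31, with white checkers promoting on 28 to 31 and black checkers on 0 to 3), the order of the four blocks, and the name `encode_position` are all assumptions for this example.

```python
KING_SQUARES = 32
CHECKER_SQUARES = 28
INPUT_COUNT = 2 * (KING_SQUARES + CHECKER_SQUARES) + 1  # 121, as in the structure above

def encode_position(white_checkers, white_kings, black_checkers, black_kings,
                    white_to_move):
    """Build a 0/1 input vector from lists of occupied squares (0..31)."""
    inputs = [0.0] * INPUT_COUNT
    offset = 0
    for sq in white_kings:
        inputs[offset + sq] = 1.0
    offset += KING_SQUARES
    for sq in white_checkers:
        # Assumed: white checkers never stand on 28..31 (they would be kings).
        if not 0 <= sq < CHECKER_SQUARES:
            raise ValueError("white checker on its promotion row")
        inputs[offset + sq] = 1.0
    offset += CHECKER_SQUARES
    for sq in black_kings:
        inputs[offset + sq] = 1.0
    offset += KING_SQUARES
    for sq in black_checkers:
        # Assumed: black checkers never stand on 0..3, so shift down by 4.
        if not 4 <= sq < 32:
            raise ValueError("black checker on its promotion row")
        inputs[offset + sq - 4] = 1.0
    offset += CHECKER_SQUARES
    inputs[offset] = 1.0 if white_to_move else 0.0  # side-to-move input
    return inputs

# Starting position under the assumed numbering: 12 checkers per side.
start = encode_position(range(12), [], range(20, 32), [], white_to_move=True)
print(len(start), sum(start))
```

Dropping the four promotion-row squares for checkers is what gives 28 rather than 32 inputs per side for that piece type.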
The learner can read in .pdn files of the games, step through each move, and for each position export the neural net inputs and target value. The target
value is based on the game result: 1, 0.5, or 0, with 0.5 being a draw. After exporting, it runs trainNet.py from the command line, which uses TensorFlow to train the weights for all the evaluation nets.
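The export step described above can be sketched roughly in Python as follows. This is an illustration only, not the learner's actual code: it reads the standard PDN `Result` tag, maps it to a target from the first player's point of view (an assumption; the real learner may orient targets differently), and pairs that target with every position's inputs. The helper names are hypothetical.

```python
import re

# PDN result strings mapped to training targets; 0.5 is a draw.
RESULT_TO_TARGET = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}

def game_target(pdn_game):
    """Return the target for one PDN game, or None if it has no usable result."""
    match = re.search(r'\[Result\s+"([^"]+)"\]', pdn_game)
    if match is None:
        return None
    # Unfinished games ("*") are not in the table, so they also yield None.
    return RESULT_TO_TARGET.get(match.group(1))

def training_rows(position_inputs, target):
    """Pair every position's input vector with the game's final result."""
    return [(inputs, target) for inputs in position_inputs]

game = '[Event "training match"]\n[Result "1/2-1/2"]\n1. 11-15 23-19 1/2-1/2'
rows = training_rows([[0.0, 1.0], [1.0, 0.0]], game_target(game))
print(rows)
```

Because every position in a game gets the same target, a wrong result call at the end of a game mislabels all of that game's positions.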
I ran training matches for 6 days on my home computer. Before having any data it of course lost every single game.
The game space of standard 8x8 checkers isn't that big, however, and after just half a day it was performing better than my old evaluation.
On the first day I trained against my old version Gui 1.2 and Cake 1.88. The next 5 days I trained solely against Cake, and let Cake call wins/losses.
I didn't make sure the program was bug-free, which led to some issues. I think early on there were some bad result calls,
and sometimes it was unable to win easily won positions, and sometimes it even played random, obviously bugged
moves that threw away the game. When I noticed this I turned off GuiNN's own result calling and let Cake call adjudications. Later on I fixed a couple of bugs causing bad moves and/or no progress, but I think
some issues remain. When the result changes near the end of the game because of bugs it can cause bad/noisy data, so it's good to make sure the process and endgame are accurate.
I would play multiple matches in parallel using a Ryzen CPU with 12 physical cores. The cycle wasn't fully automated; I'd have to manually start the CheckerBoard matches, so I just started them whenever I happened to be around my computer.
After beating my old version, the new GuiNN was initially at -110 elo to Cake. In 3 days it went to -24 elo vs Cake. After 3 more days it was at -8 elo, with no improvement in the last 2 matches. I later went back, and some more improvements (mostly search / bug-fix, but also a few more passes on net training) added a further 20 elo.
In chess, carefully measuring individual changes in actual matches with large numbers of games for small elo gains can eventually result in hundreds of elo of overall improvement.
I assume this careful method might show similar results in checkers, except that checkers is a lot more drawish, which compresses elo and would make slow progression slower and harder to measure.
So I didn't continue to refine the program beyond my quick untested updates based on what I thought would be better, and occasional attempted bug-fixes when I saw something obviously messed up in a game.
- Better Search. Do more actual testing of the search, and try out more of the search methods common in chess.
- Use hard coded eval for any clearly won position. There are a couple of reasons for this : Neural nets usually don't have much lopsided data so can have weird noisy evals for those positions, and they are usually much slower to evaluate than a hard-coded heuristic.
- Linux support. Probably not that difficult, but I would need to finish untangling all the Windows stuff from the engine, and fix a few non-portable parts.
- Multi-threaded search support.
- Continue Learning. I stopped after only 6 days. The methods could be improved.
- If even just the opponent was using a large endgame database,
the endgame results would be more accurate and the endgame net would be better trained, and as mentioned earlier bad endgame play can mean
the earlier positions are scored poorly.
- Experiment with different net phases and different size nets. Usually the elo difference is small, but it's unlikely my first guess for setup is the best.
- Make an opening book and support/use large endgame databases. As mentioned I avoided this on purpose for this experiment, but for making
a complete & stronger checkers program it should be a clear win.