GuiNN checkers 2.06

by Jonathan Kreuzer
Apr 3, 2021

At long last, an update to Gui Checkers!

It's now called GuiNN, because it features a neural net evaluation in addition to having its own graphical user interface.
The source code has also had a major overhaul and a lot of clean-up. I had become embarrassed by the old code, so now it's at least in a better state even though it's still messy in places.

After a first pass it's very strong, and initial progress was very quick, but it's still far from unbeatable.

Download

Zip : GuiNN-Checkers206.zip (1.6MB updated Apr 3, 2021)

	guiNN_64_avx.exe  - For newer 64-bit processors that support AVX2. 30% faster.
	guiNN_64_sse.exe  - For older 64-bit processors.
	guiNN_64_avx.dll  - For use in CheckerBoard on processors supporting AVX2.
	guiNN_64_sse.dll  - For use in CheckerBoard.
	Nets206.gnn       - The neural net weights used to evaluate the board. 
                            This file *must* be in the same directory as exe or dll to play properly.
	database.jef      - 4 piece endgame bitbases (W/L/D). This also must be in the same directory as exe or dll to be used.

Source Code : Github Link

Source License : Creative Commons Attribution-NonCommercial-ShareAlike.
Since my hope in releasing source is it might be useful as a learning tool, or something to experiment with / have fun and create more free software with, I don't consider this license strict. However it does mean if you are planning a commercial project or want to distribute a closed source project you should avoid basing any part on this source code itself. Otherwise feel free to experiment and alter as you wish!

GuiNN is based on my previous Gui Checkers 1.10, and includes :
Art by Josh Hess
code contributions by Ed Gilbert and Ed Trice

Links

Gui Checkers - the old web page and program
Bitboard tutorial - an old tutorial page of mine
Automated tuning/learning in chess - my more recent work with chess.
Elo Difference Calculator - My online elo rating calculator.
Ed Gilbert's page - Download CheckerBoard interface to play matches between checkers engines & the very strong KingsRow engine.
Martin Fierz's page - Cake, a very strong engine. I used a now-old version of it in training. (He is also the original author of CheckerBoard, but I used the 64-bit one.)

New Test Matches

GuiNN 2.06 vs GuiNN 2.04 .2s per-move 1173-563-6264  +27 elo (53.8%)  Download Games (pdn) (7.2 MB)
GuiNN 2.06 vs Cake 1.88  .2s per-move 1331-631-6038  +30 elo (54.4%)  Download Games (pdn) (6.5 MB)
GuiNN 2.06 vs Cake 1.89d .2s per-move 902-1132-7966  -8 elo  (48.9%)  Download Games (pdn) (9 MB)

Match Conditions

First 1000 11-man openings, played from both sides. This includes some won/lost positions, so only the overall elo difference matters.
6-pc W/L/D endgame database from Martin Fierz used by all programs. (Except poor GuiNN 2.04 only had its own 4pc.)
Fast time control of 0.2 seconds per-move. (Even testing at 0.5s the games started to become too drawish for me.)
This was a tougher test than the 2.04 test matches. A 6-pc database means it's closer to Cake's training conditions, and many new openings mean memorizing opening lines is less likely. GuiNN 2.04 lost to Cake 1.88 in my first tests under these conditions. But the biggest reason this was tougher is that Cake 1.89d is hugely stronger than Cake 1.88.
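
The percentage and elo figures in the tables follow from the win-loss-draw counts. A minimal sketch of that arithmetic, assuming the standard logistic elo model (the function names are made up for illustration and are not from the GuiNN source):

```cpp
#include <cmath>

// Score fraction from a win-loss-draw record: a win counts 1, a draw 0.5.
double ScoreFraction(int wins, int losses, int draws)
{
	const double games = wins + losses + draws;
	return (wins + 0.5 * draws) / games;
}

// Elo difference implied by a score fraction under the logistic model.
// Only defined for 0 < score < 1 (a 100% or 0% score has no finite elo).
double EloDifference(double score)
{
	return -400.0 * std::log10(1.0 / score - 1.0);
}
```

For example, 1173-563-6264 gives a score of about 53.8% and roughly +27 elo, which agrees with the first line of the New Test Matches table. Because draws count as half a point for each side, a very drawish match keeps the score close to 50% and the elo difference small.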

Old Test Matches

GuiNN 2.04 vs Gui  1.1  .2s per-move   1169-31-1680   +145 elo    Download Games (pdn) (3.5 MB)
GuiNN 2.04 vs Cake 1.88 .2s per-move   339-137-3556   +17 elo     Download Games (pdn) (4 MB)

Match Conditions

For openings I played matches using the forced 3-move openings. I think this is the most common way to force more variety into checkers play.
144 openings played for each color = 288 games per test.
288 games is far too few to get a reasonably accurate strength measurement. I was surprised at how much match results varied even with no book, so I would usually play 10 matches of 288 games in parallel and combine the results.
The programs used their own 4-piece endgame databases.
Opening books were off.
Time control was 0.2 seconds per-move. I chose this because it resulted in enough decisive games to give a decent elo-change measurement, and made the training process quicker. I could have gone even quicker, but it was already so fast it was hard to follow or notice any issues in the games as they played out.
I used gui64_avx.dll

Checkers is very drawish between strong, evenly matched opponents, so these decisions were made to reduce drawishness for better testing, and to concentrate on the evaluation and search. Given how much of the game a large endgame and opening database can cover in checkers, it's good to acknowledge that without large databases this is more about using checkers to test general methods I wanted to try than about creating the strongest program. (And even with the same conditions for both sides, this testing is unfair to programs that have large databases available and aren't learning from the match.) Given the much weaker starting point, I consider the learning quite successful, with a slight win versus Cake 1.88 in these somewhat unfair conditions.

Neural Nets & Training Process

GuiNN uses 4 rather small neural nets, and based on game stage it will choose one of these nets to evaluate the board. Each net uses the same setup; I didn't experiment much with changes here. Smaller will be quicker to evaluate, but bigger will be more accurate. (Note : In the beginning, before you have enough data, bigger will just be overfitted to the limited data and probably be worse.) This is the network structure :

	whiteInputCount = 32 + 28; // 32 king squares, 28 checker squares
	blackInputCount = 32 + 28;

	network.SetInputCount(whiteInputCount + blackInputCount + 1); // +1 for side-to-move
	network.AddLayer(192, eActivation::RELU, eLayout::SPARSE_INPUTS);
	network.AddLayer(32, eActivation::RELU);
	network.AddLayer(32, eActivation::RELU);
	network.AddLayer(1);
	network.Build();
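
To give a feel for how small these nets are, here is a rough parameter count. This is only a sketch: it assumes every layer is fully connected with one bias per output (the SPARSE_INPUTS layout is assumed to change how the first layer is computed, not how many weights it has), and CountParameters is a hypothetical helper, not part of the GuiNN source.

```cpp
#include <cstddef>
#include <vector>

// Weights plus biases for a stack of fully connected layers.
// layerSizes[0] is the input count; each later entry is a layer's output count.
int CountParameters(const std::vector<int>& layerSizes)
{
	int total = 0;
	for (std::size_t i = 1; i < layerSizes.size(); i++)
	{
		// one weight per (input, output) pair, plus one bias per output
		total += layerSizes[i - 1] * layerSizes[i] + layerSizes[i];
	}
	return total;
}
```

With 121 inputs (60 per side plus side-to-move), CountParameters({121, 192, 32, 32, 1}) returns 30689 under these assumptions, and most of that is the first layer.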

The learner can read in .pdn files of the games, step through each move, and for each position export the neural net inputs and target value. The target value is based on the game result: 1, 0.5, or 0, with 0.5 being a draw. After exporting, it runs trainNet.py from the command line, which uses TensorFlow to train the weights for all the evaluation nets.
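
As an illustration of what exporting the inputs and target for a position could look like, here is a minimal sketch. Everything in it is an assumption made for the example rather than GuiNN's actual layout: the Position struct, the square numbering (0..31, with white checkers never on squares 28..31 and black checkers never on squares 0..3, since a checker reaching the far row becomes a king), the order of the inputs, and scoring the result from white's point of view.

```cpp
#include <array>
#include <cstdint>

constexpr int kKingInputs = 32;     // a king can stand on any of the 32 squares
constexpr int kCheckerInputs = 28;  // a checker is never on its promotion row
constexpr int kSideInputs = kKingInputs + kCheckerInputs;
constexpr int kTotalInputs = 2 * kSideInputs + 1; // +1 for side-to-move

// Hypothetical position type: one bit per playable square.
struct Position
{
	uint32_t whiteCheckers, whiteKings, blackCheckers, blackKings;
	bool whiteToMove;
};

enum class GameResult { WhiteWin, Draw, BlackWin };

// One input per (piece type, square), set to 1 when that piece is there.
std::array<float, kTotalInputs> EncodeInputs(const Position& pos)
{
	std::array<float, kTotalInputs> in{}; // all zeros to start
	for (int sq = 0; sq < kKingInputs; sq++)
	{
		if (pos.whiteKings & (1u << sq)) in[sq] = 1.0f;
		if (pos.blackKings & (1u << sq)) in[kSideInputs + sq] = 1.0f;
	}
	for (int sq = 0; sq < kCheckerInputs; sq++)
	{
		// white checkers use squares 0..27 directly
		if (pos.whiteCheckers & (1u << sq)) in[kKingInputs + sq] = 1.0f;
		// black checkers use squares 4..31, shifted down to 0..27
		if (pos.blackCheckers & (1u << (sq + 4))) in[kSideInputs + kKingInputs + sq] = 1.0f;
	}
	in[kTotalInputs - 1] = pos.whiteToMove ? 1.0f : 0.0f;
	return in;
}

// Target value: 1 for a win, 0.5 for a draw, 0 for a loss (white's view here).
float TargetFromResult(GameResult result)
{
	switch (result)
	{
		case GameResult::WhiteWin: return 1.0f;
		case GameResult::Draw:     return 0.5f;
		case GameResult::BlackWin: return 0.0f;
	}
	return 0.5f; // not reached for valid enum values
}
```

Every position in a game gets the same target, the final result of that game, which is why a wrong result call near the end of a game pollutes the data for all of its earlier positions.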

I ran training matches for 6 days on my home computer. Before having any data it of course lost every single game. The game space of standard 8x8 checkers isn't that big however, and after just half a day it was performing better than my old evaluation. On the first day I trained against my old version Gui 1.2 and Cake 1.88. The next 5 days I trained solely against Cake, and let Cake call wins/losses.

I didn't make sure the program was bug-free, which led to some issues. I think early on there were some bad result calls, and sometimes it was unable to win easily won positions, and sometimes it even just played random, obviously bugged moves that threw away the game. When I noticed this I turned off GuiNN's own result calling and let Cake call adjudications. Later on I fixed a couple of bugs causing bad moves and/or no progress, but I think some issues remain. When the result changes near the end of the game because of bugs it can cause bad/noisy data, so it's good to make sure the process and endgame are accurate.

I would play multiple matches in parallel using a Ryzen CPU with 12 physical cores. The cycle wasn't fully automated; I'd have to manually start the CheckerBoard matches, so I just started them whenever I happened to be around my computer. After beating my old version, the new GuiNN was initially at -110 elo to Cake. In 3 days it went to -24 elo vs Cake. After 3 more days it was at -8 elo. This includes no improvement in the last 2 matches, and I decided this was a good stopping point to release rather than getting more involved.

In chess, carefully measuring individual changes in actual matches with large numbers of games for small elo gains can eventually result in hundreds of elo overall improvement. I assume this careful method might show similar results in checkers, except that checkers is a lot more drawish, which compresses elo and would make slow progression slower and harder to measure. So I didn't continue to refine the program beyond my quick untested updates based on what I thought would be better, and occasional attempted bug-fixes when I saw something obviously messed up in a game.

Possible Improvements to 2.04

Better Search. More actual testing of the search, and trying out more of the search methods common in chess.
Use hard coded eval for any clearly won position. There are a couple reasons for this : Neural nets usually don't have much lop-sided data so can have weird noisy evals for those positions, and they are usually much slower to evaluate than a hard-coded heuristic.
Linux support. Probably not that difficult, but I would need to finish untangling all the Windows stuff from the engine, and fix a few non-portable parts.
Multi-threaded search support.
Continue Learning. I stopped after only 6 days. The methods could be improved.
If even just the opponent was using a large endgame database, the endgame results would be more accurate and the endgame net would be better trained, and as mentioned earlier bad endgame play can mean the earlier positions are scored poorly.
Experiment with different net phases and different size nets. Usually the elo difference is small, but it's unlikely my first guess for setup is the best.
Make an opening book and support/use large endgame databases. As mentioned I avoided this on purpose for this experiment, but for making a complete & stronger checkers program it should be a clear win.

More about Neural Nets

I added this section for a quick explanation of how neural networks work, since it's simpler than I initially expected. This is example code for one layer of a dense neural network :

	// The neural network weights and biases for the layer. These are the values you train and save.
	float weights[NUM_INPUTS][NUM_OUTPUTS];
	float biases[NUM_OUTPUTS];

	// Compute outputs from the inputs using the weights and biases 
	void LayerCompute( float inputs[NUM_INPUTS], float outputs[NUM_OUTPUTS] )
	{
		for ( int output = 0; output < NUM_OUTPUTS; output++ )
		{
			float tempValue = biases[output];
			for ( int input = 0; input < NUM_INPUTS; input++ )
			{
				tempValue += inputs[input] * weights[input][output];	
			}

			// (RELU activation just means clamp values below 0 to 0)
			outputs[output] = tempValue < 0.0f ? 0.0f : tempValue;
		}
	}

For multiple layers, the outputs from a layer are fed to the inputs of the next layer. If you want a single evaluation, the last layer will have one output. A neural net library can start to look complicated with optimizations and support for various options that might only sometimes be used, but there isn't much to a barebones implementation. (Code for training nets, however, still doesn't seem that simple to me, though barebones training code can be surprisingly short too.)
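
A barebones version of that chaining could look like the sketch below. It is not the GuiNN implementation: the Layer struct and Evaluate function are made up for this example, it uses std::vector instead of fixed-size arrays, and it has none of the optimizations a real library would add. The weights keep the same [input][output] layout as the single-layer example.

```cpp
#include <cstddef>
#include <vector>

// One dense layer: the trained weights and biases, plus whether to apply RELU.
struct Layer
{
	std::vector<std::vector<float>> weights; // [numInputs][numOutputs]
	std::vector<float> biases;               // [numOutputs]
	bool relu;                               // false for a plain linear output layer
};

// Compute one layer's outputs from its inputs.
std::vector<float> LayerCompute(const Layer& layer, const std::vector<float>& inputs)
{
	std::vector<float> outputs(layer.biases); // start each output at its bias
	for (std::size_t in = 0; in < inputs.size(); in++)
	{
		for (std::size_t out = 0; out < outputs.size(); out++)
		{
			outputs[out] += inputs[in] * layer.weights[in][out];
		}
	}
	if (layer.relu)
	{
		for (float& value : outputs)
		{
			if (value < 0.0f) value = 0.0f; // clamp values below 0 to 0
		}
	}
	return outputs;
}

// Feed each layer's outputs into the next layer. The last layer is
// expected to have a single output, which is the evaluation.
float Evaluate(const std::vector<Layer>& net, std::vector<float> values)
{
	for (const Layer& layer : net)
	{
		values = LayerCompute(layer, values);
	}
	return values[0];
}
```

For the structure shown earlier, the net would hold four layers (192, 32, 32 and 1 outputs), with RELU on the first three and, as assumed here, no activation on the final one.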