Snowie 4 – Back to the 80s
The long-awaited Snowie 4 is out now, and I'd like to share some first observations on its play compared to other bots, especially its older brother Snowie 3. Each new bot has debunked its predecessor and improved the play and understanding of a whole gamut of positions. When Snowie 4 was announced, it was reported to have revolutionary knowledge of backgames and prime vs. prime games. Levermann published a couple of articles at the Snowie site (http://www.bgsnowie.com) with many examples, and the results were striking. Not only is its play much stronger in these phases, but so is its evaluation of such positions. The repercussions of these changes in evaluation run so deep that it has even changed its evaluation of the opening roll 21: while no amount of rollouts by the previous cyberkings would favor the 13/11 6/5 slot, Snowie 4 favors it in almost any situation, except when the match score calls for avoiding gammons more carefully. Snowie 4 seems to slot far more often than Snowie 3, leading a friend to dub it the Mad Slotter and to predict that it would bring back much of the older 80s style of play.
Even so, I rather expected at least the rollouts by its predecessors to hold up better, but I am seeing such impressive reversals in choices that I wonder how tough this must be for strong, experienced players. I am just starting out and make no claims to expertise, but someone who has spent long hours rationalizing the results of previous bots will be forced to unlearn and relearn, presuming Snowie 4 is right.
I was playing through the July 1999 annotated match published at GammonLine, which had the benefit of Jellyfish rollouts using no fewer than 1296 trials, and entered it into Snowie 4 to see what it said. Here is a sample of what I saw:
1996 World Cup Final – Match #2
White played 24/22 13/11* 6/4(2) and Jellyfish
declared it only 3rd best (not by much it's true) saying:
13/11* 8/4 6/4 +0.370 (-0.000)
13/11(2)* 6/4(2) +0.368 (-0.002)
24/22 13/11* 6/4(2) +0.354 (-0.016)
However, Snowie 4 at 3-ply
Precise, its strongest playing setting, says Jellyfish's top choice of the 3 is
in fact the worst by a margin:
  | # | Ply | Move | Equity
  | 1 | 3 | 13/11*(2) 6/4(2) | 0.525
  |   |   | 1.4% 28.7% 59.5% 40.5% 11.3% 0.6%
  |   |   | Speed Parameter: Precise.
* | 2 | 3 | 24/22 13/11* 6/4(2) | 0.504 (-0.021)
  |   |   | 1.3% 27.6% 59.8% 40.2% 10.7% 0.5%
  |   |   | Speed Parameter: Precise.
  | 3 | 3 | 13/11* 8/6 6/4(2) | 0.474 (-0.050)
  |   |   | 1.4% 28.7% 58.7% 41.3% 11.4% 0.6%
  |   |   | Speed Parameter: Precise.
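For readers who want to sanity-check the percentage line under each move, here is a minimal Python sketch that turns it into a cubeless money equity. It rests on two assumptions of mine that the article does not confirm: that the six figures are ordered win backgammon, win gammon, win, lose, lose gammon, lose backgammon, and that the gammon figures already include backgammons. The function name `cubeless_equity` is hypothetical. The result should not be expected to match the Equity column, which presumably also reflects cube ownership.

```python
def cubeless_equity(line: str) -> float:
    """Cubeless money equity from a six-figure probability line.

    Assumed order (not confirmed by the article):
    win bg, win gammon, win, lose, lose gammon, lose bg.
    Gammon figures are assumed to include backgammons.
    """
    values = [float(token.rstrip("%")) / 100.0 for token in line.split()]
    if len(values) != 6:
        raise ValueError("expected six percentages")
    win_bg, win_g, win, lose, lose_g, lose_bg = values
    # Each extra level (gammon, backgammon) is worth one more point.
    return (win - lose) + (win_g - lose_g) + (win_bg - lose_bg)


# Probability line of Snowie 4's top choice in the table above.
print(round(cubeless_equity("1.4% 28.7% 59.5% 40.5% 11.3% 0.6%"), 3))
```

Under those assumptions, the value for the top move lands close to the Jellyfish rollout figure quoted earlier rather than to Snowie's 0.525, which is consistent with the Equity column including cube value.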
A few moves later White doubled in the following position:
and we are told that the take was truly heroic due to the
very high gammon risk. There are no Jellyfish rollout equities provided, but
one assumes it must be a close call. Snowie 4
declares that not taking would be a huge blunder:
Cube action equity (3-Ply)
Money equity: 0.495
1.8% 33.7% 62.3% 37.7% 9.9% 0.5%
1. | Double, take | 0.737
2. | No double | 0.612 (-0.125)
3. | Double, pass | 1.000 (+0.263)
Proper cube action: Double, take
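As a rough illustration of how the three equities above lead to the "Double, take" verdict, here is a minimal Python sketch of the usual decision rule: the opponent picks the cheaper response, and the doubler turns the cube only if the result beats holding it. The function name and return shape are my own assumptions, not anything Snowie exposes.

```python
def cube_action(no_double: float, double_take: float, double_pass: float = 1.0):
    """Proper cube action from three equities, all from the doubler's side."""
    # The opponent chooses whichever response costs them less.
    opponent_takes = double_take < double_pass
    equity_if_doubled = min(double_take, double_pass)
    # The doubler turns the cube only if that beats holding it.
    should_double = equity_if_doubled > no_double
    if not should_double:
        action = "No double"
    elif opponent_takes:
        action = "Double, take"
    else:
        action = "Double, pass"
    # Cost of a wrong pass (meaningful only when taking is correct).
    pass_error = round(double_pass - double_take, 3)
    # Cost of not doubling (meaningful only when doubling is correct).
    no_double_error = round(equity_if_doubled - no_double, 3)
    return action, pass_error, no_double_error


# Figures from the cube table above.
print(cube_action(0.612, 0.737, 1.000))
```

With the table's figures this gives "Double, take", a pass error of 0.263 and a no-double error of 0.125, the same differences Snowie 4 shows in parentheses.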
But OK, that's Jellyfish, a much older program, so the comparison is not completely fair. Snowie 3 should be a different story, except of course in backgames, where disagreement would come as no surprise. I began doing the same with the more recent Special Annotated match (which I think is wonderful, by the way, in case anyone should think I'm suggesting otherwise), played in 2001 and complemented with Snowie 3 rollouts at 3-ply.
In Game 1 the two Snowies begin squabbling as of move 2 (I won't include the opening 21 slot):
2001 Pro-Am Doubles Tournament final
Woolsey/Arnold – Ballard/Huie, Game 1, Move 2
The commentary explains that "Snowie's 3-ply opinion has 13/11, 6/5 out in front,
the split a close second, and 11/9, 6/5 a distant third. It is interesting that
the rollout shows just how strong the builder on the nine point really is."
since the rollout said:
13/11 6/5 +0.134 (-0.000)
11/9 6/5 +0.133 (-0.001)
24/22 6/5 +0.100 (-0.034)
However, Snowie 4 has its
own opinion on the matter and says for both 3-Ply Precise and a rollout:
  | # | Ply | Move | Equity
  | 1 | 3 | 24/22 6/5 | 0.147
  |   |   | 0.7% 15.4% 54.2% 45.8% 13.3% 0.7%
  |   |   | Speed Parameter: Precise.
* | 2 | 3 | 13/11 6/5 | 0.146 (-0.002)
  |   |   | 0.8% 16.3% 53.8% 46.2% 13.3% 0.7%
  |   |   | Speed Parameter: Precise.
  | 3 | 3 | 11/9 6/5 | 0.110 (-0.038)
  |   |   | 0.7% 16.3% 52.5% 47.5% 13.3% 0.7%
  |   |   | Speed Parameter: Precise.
For the rollout, I set truncation to 13, which should be deep enough to see the resulting positions. Furthermore, according to the Snowie people, Snowie 4's 2-ply is slightly stronger than Snowie 3's 3-ply. Unfortunately, Snowie 4 cannot play according to score in rollouts (no doubt that will be reinstated in a future patch), but at 0-0 in a 21-point match that should not be a factor. Did the rollouts change the evaluation any? You bet:
  | # | Ply | Move | Equity
  | 1 | R | 24/22 6/5 | 0.175
  |   |   | 0.8% 15.6% 55.0% 45.0% 12.9% 0.7%
  |   |   | 95% confidence interval:
* | 2 | R | 13/11 6/5 | 0.145 (-0.031)
  |   |   | 0.9% 16.0% 54.0% 46.0% 13.7% 0.8%
  |   |   | 95% confidence interval:
  | 3 | R | 11/9 6/5 | 0.123 (-0.053)
  |   |   | 0.9% 16.7% 53.0% 47.0% 13.6% 0.8%
  |   |   | 95% confidence interval:
Now what was merely a matter of preference is an error, and 11/9 6/5 is considered to be even worse than at first sight. Astonishing.
However, Snowie 4 has more surprises in store, and it also helps give a clearer picture in several positions where even Snowie 3’s rollouts left one wondering.
Here Snowie 3 picked making the 8-point as best, but its indecisiveness, even in its rollouts, left one unsure how correct it was: “The rollouts have making the eight point on top, although not convincingly. Any of the approaches could work.” Here were the results:
11/8 9/8 +0.503 (-0.000)
11/8 6/5 +0.492 (-0.011)
9/5 +0.456 (-0.047)
6/5 6/3 +0.456 (-0.047)
Snowie 4’s 3-Ply Precise on the other hand had no doubts:
  | # | Ply | Move | Equity
  | 1 | 3 | 11/8 9/8 | 0.859
  |   |   | 0.9% 19.6% 69.4% 30.6% 6.8% 0.3%
  |   |   | Speed Parameter: Precise.
* | 2 | 3 | 6/3 6/5 | 0.803 (-0.057)
  |   |   | 0.9% 18.2% 68.5% 31.5% 6.6% 0.3%
  |   |   | Speed Parameter: Precise.
  | 3 | 3 | 11/8 6/5 | 0.762 (-0.098)
  |   |   | 1.0% 18.7% 68.2% 31.8% 6.6% 0.3%
  |   |   | Speed Parameter: Precise.
  | 4 | 3 | 11/7 | 0.753 (-0.107)
  |   |   | 0.9% 19.4% 66.7% 33.3% 6.8% 0.3%
  |   |   | Speed Parameter: Precise.
  | 5 | 3 | 11/10 6/3 | 0.736 (-0.124)
  |   |   | 0.9% 18.3% 66.4% 33.6% 7.2% 0.3%
  |   |   | Speed Parameter: Precise.
  | 6 | 3 | 9/5 | 0.727 (-0.132)
  |   |   | 0.9% 16.2% 67.8% 32.2% 6.6% 0.2%
  |   |   | Speed Parameter: Precise.
As neither the master commentator nor Snowie 4 thought 9/5 had any real merit, I ignored it and rolled out the top 3 moves just to be sure. Again Snowie 4 did not waver, and it even accentuated the difference:
  | # | Ply | Move | Equity
  | 1 | R | 11/8 9/8 | 0.834
  |   |   | 1.1% 18.8% 69.2% 30.8% 7.1% 0.3%
  |   |   | 95% confidence interval:
* | 2 | R | 6/3 6/5 | 0.760 (-0.074)
  |   |   | 0.9% 17.7% 67.9% 32.1% 7.5% 0.4%
  |   |   | 95% confidence interval:
  | 3 | R | 11/8 6/5 | 0.754 (-0.080)
  |   |   | 0.9% 18.1% 68.1% 31.9% 6.9% 0.4%
  |   |   | 95% confidence interval:
And as the game proceeded
this position appeared at move 9:
where the move played was adamantly defended by the commentator, “B/24*, 13/11 is more consistent with White's plan.” despite the fact that “The rollout has getting the blot to safety better than B/24*, 14/12.” Now human intuition and vision are shown to have been right on target, as Snowie 4 changes the verdict:
  | # | Ply | Move | Equity
* | 1 | 3 | bar/24* 13/11 | 0.541
  |   |   | 2.3% 31.3% 69.1% 30.9% 7.5% 0.4%
  |   |   | Speed Parameter: Precise.
  | 2 | 3 | bar/24* 14/12 | 0.529 (-0.012)
  |   |   | 2.5% 32.0% 68.4% 31.6% 7.9% 0.5%
  |   |   | Speed Parameter: Precise.
  | 3 | 3 | bar/24* 5/3 | 0.484 (-0.057)
  |   |   | 2.4% 31.3% 67.1% 32.9% 8.5% 0.5%
  |   |   | Speed Parameter: Precise.
And again a rollout only served to confirm its judgement:
  | # | Ply | Move | Equity
* | 1 | R | bar/24* 13/11 | 0.608
  |   |   | 1.9% 33.4% 70.5% 29.5% 5.9% 0.3%
  |   |   | 95% confidence interval:
  | 2 | R | bar/24* 14/12 | 0.584 (-0.024)
  |   |   | 2.2% 34.8% 69.3% 30.7% 7.0% 0.4%
  |   |   | 95% confidence interval:
  | 3 | R | bar/24* 5/3 | 0.584 (-0.024)
  |   |   | 2.1% 35.2% 69.3% 30.7% 7.2% 0.4%
  |   |   | 95% confidence interval:
And a few moves later, Snowie
4 again disagrees with the assessments of its sibling:
Here,
Snowie 3’s rollouts presented the following
evaluations:
8/2(2)* +1.361 (-0.000)
11/2* 5/2 +1.348 (-0.013)
11/8 10/7(2) 5/2* +1.346 (-0.015)
11/8(2) 10/7(2) +1.340 (-0.021)
11/5 10/7(2) +1.327 (-0.034)
14/8 10/7(2) +1.302 (-0.059)
Clearly it wasn’t very sure of itself, with unconvincing rollout results, and Woolsey comments: “I see that this play came out on top in the rollout, but I'm not inclined to believe it. 11/2*, 5/2 looks better to me. I think holding the eight point will turn out to be an asset, not a liability.” Snowie 4 once more agrees completely with the human view, though with a far more assertive evaluation than Snowie 3:
  | # | Ply | Move | Equity
  | 1 | 3 | 11/5 5/2*(2) | 1.348
  |   |   | 2.3% 52.2% 91.4% 8.6% 1.0% 0.1%
  |   |   | Speed Parameter: Precise.
  | 2 | 3 | 11/8(2) 10/7(2) | 1.317 (-0.031)
  |   |   | 1.9% 48.8% 91.7% 8.3% 1.1% 0.1%
  |   |   | Speed Parameter: Precise.
  | 3 | 3 | 8/2*(2) | 1.314 (-0.034)
  |   |   | 2.2% 51.5% 90.4% 9.6% 1.2% 0.1%
  |   |   | Speed Parameter: Precise.
  | 4 | 3 | 14/8 10/7(2) | 1.308 (-0.040)
  |   |   | 1.8% 47.1% 92.1% 7.9% 0.9% 0.1%
  |   |   | Speed Parameter: Precise.
  | 5 | 3 | 11/5 10/7(2) | 1.292 (-0.056)
  |   |   | 2.0% 49.0% 90.7% 9.3% 1.3% 0.1%
  |   |   | Speed Parameter: Precise.
* | 6 | 3 | 11/8 10/7(2) 5/2* | 1.278 (-0.070)
  |   |   | 2.3% 48.9% 90.2% 9.8% 1.7% 0.1%
  |   |   | Speed Parameter: Precise.
And once more the rollouts do not change this
verdict:
  | # | Ply | Move | Equity
  | 1 | R | 11/5 5/2*(2) | 1.349
  |   |   | 1.0% 51.0% 92.2% 7.8% 0.6% 0.0%
  |   |   | 95% confidence interval:
  | 2 | R | 8/2*(2) | 1.330 (-0.020)
  |   |   | 1.2% 50.7% 91.4% 8.6% 0.6% 0.0%
  |   |   | 95% confidence interval:
  | 3 | R | 11/8(2) 10/7(2) | 1.293 (-0.056)
  |   |   | 1.2% 45.5% 92.1% 7.9% 0.6% 0.0%
  |   |   | 95% confidence interval:
* | 4 | R | 11/8 10/7(2) 5/2* | 1.276 (-0.074)
  |   |   | 1.3% 47.8% 90.6% 9.4% 1.2% 0.1%
  |   |   | 95% confidence interval:
  | 5 | R | 14/8 10/7(2) | 1.264 (-0.085)
  |   |   | 0.7% 40.6% 93.0% 7.0% 0.5% 0.0%
  |   |   | 95% confidence interval:
So what’s the verdict? If we are to believe the comparative results, Snowie 4 seems to be far surer of itself in many positions where Snowie 3 was unable to give a confident evaluation. Is it the ultimate player, then? Perhaps. Its makers tout it as better than the best players on the planet, and I have no doubt that may very well be true, but that doesn't mean it is perfect or that it never makes mistakes.
For example, the following position
occurred in a game from a top final:
8th
Frank Talbot - Nishikawa Kiyokazu 3-0/15
Snowie 4, at 3-ply Precise, declares Talbot’s move a blunder in no uncertain terms (thanks to Ilia Guzei for spotting it):
  | # | Ply | Move | Equity
  | 1 | 3 | 24/23 11/5 | 1.346
  |   |   | 2.9% 66.1% 86.2% 13.8% 1.5% 0.1%
  |   |   | Speed Parameter: Precise.
  | 2 | 3 | 24/23 14/8 | 1.341 (-0.005)
  |   |   | 2.9% 65.9% 86.1% 13.9% 1.5% 0.1%
  |   |   | Speed Parameter: Precise.
  | 3 | 3 | 14/8 11/10 | 1.223 (-0.124)
  |   |   | 2.7% 63.5% 82.4% 17.6% 1.7% 0.1%
  |   |   | Speed Parameter: Precise.
* | 4 | 3 | 14/13 11/5 | 1.221 (-0.125)
  |   |   | 2.7% 63.6% 82.3% 17.7% 1.7% 0.1%
  |   |   | Speed Parameter: Precise.
  | 5 | 3 | 14/7 | 1.218 (-0.128)
  |   |   | 2.7% 63.5% 82.3% 17.7% 1.7% 0.1%
  |   |   | Speed Parameter: Precise.
  | 6 | 3 | 11/4 | 1.214 (-0.132)
  |   |   | 2.7% 63.3% 82.2% 17.8% 1.7% 0.1%
  |   |   | Speed Parameter: Precise.
Yet there is no logic in this. Right now, Talbot has 13 shots to move up to the 20 point: all 4s and 31. By bringing up the rearmost checker, Talbot would lose 2 more shots (the 31) and be left with only 11 shots to try to move up before collapsing. The rollouts, set at a truncation depth of 19 to avoid a horizon effect, bear this out completely: Talbot’s move moves up the list, whereas Snowie’s previous favorite is shown to be a serious mistake:
  | # | Ply | Move | Equity
  | 1 | R | 11/4 | 1.355
  |   |   | 2.0% 67.0% 87.0% 13.0% 1.1% 0.0%
  |   |   | 95% confidence interval:
  | 2 | R | 14/8 11/10 | 1.351 (-0.004)
  |   |   | 1.9% 66.7% 87.0% 13.0% 1.1% 0.0%
  |   |   | 95% confidence interval:
* | 3 | R | 14/13 11/5 | 1.350 (-0.005)
  |   |   | 2.0% 66.6% 87.1% 12.9% 1.1% 0.0%
  |   |   | 95% confidence interval:
  | 4 | R | 14/7 | 1.347 (-0.009)
  |   |   | 1.9% 66.7% 87.1% 12.9% 1.2% 0.0%
  |   |   | 95% confidence interval:
  | 5 | R | 24/23 11/5 | 1.271 (-0.084)
  |   |   | 2.0% 63.7% 84.8% 15.2% 1.4% 0.0%
  |   |   | 95% confidence interval:
  | 6 | R | 24/23 14/8 | 1.252 (-0.104)
  |   |   | 1.9% 63.3% 84.2% 15.8% 1.6% 0.0%
  |   |   | 95% confidence interval:
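The shot count behind that argument (13 rolls now, 11 after stepping up with the rear checker) can be checked by listing all 36 dice rolls. This is only a sketch: it takes the article's list of working rolls (any roll containing a 4, plus 31) as given and does not model the board, so doubles such as 22 or 11 are deliberately not counted. The helper name `count_rolls` is my own.

```python
from itertools import product


def count_rolls(works) -> int:
    """Count how many of the 36 ordered dice rolls satisfy `works`."""
    return sum(1 for a, b in product(range(1, 7), repeat=2) if works(a, b))


# Before moving the rear checker: any 4, or a 3-1 played either way round.
shots_now = count_rolls(lambda a, b: 4 in (a, b) or {a, b} == {3, 1})

# After 24/23: the 3-1 no longer works, only the 4s remain.
shots_after = count_rolls(lambda a, b: 4 in (a, b))

print(shots_now, shots_after, shots_now - shots_after)
```

The two rolls lost are the two ways of throwing 31, which is the difference the rollout appears to punish.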
So clearly, it is hardly infallible. However, with
its new and much more refined knowledge, many things are bound to be
re-evaluated, and it is clear that interesting times are ahead. I await with
interest the verdict of the experts.