Snowie 4 – Back to the 80s

 

by Albert Silver

 

 

The long-awaited Snowie 4 is out, and I'd like to share some first observations on its play compared with other bots, above all its older brother Snowie 3. Each new bot has debunked its predecessor and improved the play and understanding of a whole gamut of positions. When Snowie 4 was announced, it was reported to have revolutionary knowledge of backgames and prime vs. prime games. Levermann published a couple of articles at the Snowie site (http://www.bgsnowie.com) with many examples, and they were striking. Not only is its play much stronger in these phases, but so is its evaluation of such positions. The repercussions of these changes in evaluation run so deep that they have even changed its assessment of the opening roll 21: while no amount of rollouts by the previous cyberkings would favor the 13/11 6/5 slot, Snowie 4 favors it in almost any situation, except when the match score calls for avoiding gammons more carefully. Snowie 4 seems to slot far more often than Snowie 3, leading a friend to dub it the Mad Slotter and to predict that it would bring back much of the older 80s style of play.

 

As it was, I still rather expected at least the rollouts by its predecessors to hold up better, but I am seeing such impressive reversals in choices that I wonder how tough this must be for strong, experienced players. I am just starting out and make no claims to expertise, but someone who has spent long hours rationalizing the results of previous bots will be forced to unlearn and re-learn, presuming Snowie 4 is right.

 

I was playing through the July 1999 annotated match published at GammonLine, which had the benefit of Jellyfish rollouts using no fewer than 1296 trials, and entered it into Snowie 4 to see what it said. Here is a sample of what I saw:

 

1996 World Cup Final – Match #2

Kit Woolsey 1 – Malcolm Davis 0, Game 2, Move 3

 

 

White played 24/22 13/11* 6/4(2), and Jellyfish declared it only third best (not by much, it's true), saying:

 

13/11*  8/4  6/4                      +0.370 (-0.000)

13/11(2)*  6/4(2)                    +0.368  (-0.002)

24/22  13/11*  6/4(2)            +0.354  (-0.016)

 

However, Snowie 4 at 3-ply Precise, its strongest playing setting, says Jellyfish's top choice of the 3 is in fact the worst by a margin:

 

 

    #  Ply  Move                  Equity
    1   3   13/11*(2) 6/4(2)      0.525
            1.4%  28.7%  59.5%    40.5%  11.3%   0.6%
 *  2   3   24/22 13/11* 6/4(2)   0.504 (-0.021)
            1.3%  27.6%  59.8%    40.2%  10.7%   0.5%
    3   3   13/11* 8/6 6/4(2)     0.474 (-0.050)
            1.4%  28.7%  58.7%    41.3%  11.4%   0.6%

Speed Parameter (all moves): Precise.

 

A few moves later White doubled in the following position:

 

 

and we are told that the take was truly heroic due to the very high gammon risk. There are no Jellyfish rollout equities provided, but one assumes it must be a close call. Snowie 4 declares that not taking would be a huge blunder:

 

 

Cube action equity (3-Ply)

Money equity: 0.495
1.8%  33.7%  62.3%    37.7%   9.9%   0.5%

1. Double, take    0.737
2. No double       0.612  (-0.125)
3. Double, pass    1.000  (+0.263)

Proper cube action: Double, take
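The way the three cube equities above combine into a verdict can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the equities are taken from the doubler's point of view, and the rule is the textbook one (the opponent picks whichever answer leaves the doubler with less, and the doubler compares that with not doubling). It is not a description of how Snowie computes anything internally.

```python
def cube_action(no_double, double_take, double_pass=1.0):
    """Pick the cube action from three equities (doubler's point of view).

    Hypothetical helper for illustration; not Snowie's own code.
    """
    # The opponent answers a double with whatever leaves the doubler less.
    response = "take" if double_take <= double_pass else "pass"
    equity_if_doubled = min(double_take, double_pass)
    # Doubling is right when it does at least as well as holding the cube.
    action = "Double" if equity_if_doubled >= no_double else "No double"
    return f"{action}, {response}"


def error_sizes(no_double, double_take, double_pass=1.0):
    """Equity lost by a wrong pass and by a missed double (0.0 if not an error)."""
    wrong_pass = max(0.0, double_pass - double_take)
    missed_double = max(0.0, min(double_take, double_pass) - no_double)
    return round(wrong_pass, 3), round(missed_double, 3)


# Figures quoted in the article: no double 0.612, double/take 0.737, double/pass 1.000
print(cube_action(0.612, 0.737))   # prints "Double, take"
print(error_sizes(0.612, 0.737))
```

With the article's numbers, the two error sizes correspond to the (+0.263) and (-0.125) shown next to "Double, pass" and "No double".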

 

 

But OK, that's Jellyfish, a much older program, so the comparison is not completely fair. Snowie 3 should be a different story, except of course in backgames, where differences would come as no surprise. I began doing the same with the more recent Special Annotated match (which I think wonderful, by the way, in case anyone should think I'm suggesting otherwise), played in 2001 and complemented with Snowie 3 rollouts at 3-ply.

 

In Game 1 the two Snowies begin squabbling as of move 2 (I won't include the opening 21 slot):

 

2001 Pro-Am Doubles Tournament final

Woolsey/Arnold – Ballard/Huie, Game 1, Move 2

 

 

The commentary explains that "Snowie's 3-ply opinion has 13/11, 6/5 out in front, the split a close second, and 11/9, 6/5 a distant third. It is interesting that the rollout shows just how strong the builder on the nine point really is." since the rollout said:

 

13/11 6/5                   +0.134 (-0.000)

11/9    6/5                   +0.133 (-0.001)

24/22 6/5                   +0.100 (-0.034)

 

However, Snowie 4 has its own opinion on the matter and says for both 3-Ply Precise and a rollout:

 

 

    #  Ply  Move         Equity
    1   3   24/22 6/5    0.147
            0.7%  15.4%  54.2%    45.8%  13.3%   0.7%
 *  2   3   13/11 6/5    0.146 (-0.002)
            0.8%  16.3%  53.8%    46.2%  13.3%   0.7%
    3   3   11/9 6/5     0.110 (-0.038)
            0.7%  16.3%  52.5%    47.5%  13.3%   0.7%

Speed Parameter (all moves): Precise.

 

For the rollout, I set truncation to 13, which should be deep enough to see the resulting positions; furthermore, according to the Snowie people, Snowie 4's 2-ply is slightly stronger than Snowie 3's 3-ply. Unfortunately, Snowie 4 cannot play according to the score in rollouts (no doubt this will be reinstated in a future patch), but at 0-0 in a 21-point match that should not be a factor. Did the rollouts change the evaluation any? You bet:

 

 

 

    #  Ply  Move         Equity
    1   R   24/22 6/5    0.175
            0.8%  15.6%  55.0%    45.0%  12.9%   0.7%
            95% confidence interval, money cubeless eq.: 0.128 ±0.007; 720 games (equiv. 103602 games)
 *  2   R   13/11 6/5    0.145 (-0.031)
            0.9%  16.0%  54.0%    46.0%  13.7%   0.8%
            95% confidence interval, money cubeless eq.: 0.104 ±0.008; 720 games (equiv. 105711 games)
    3   R   11/9 6/5     0.123 (-0.053)
            0.9%  16.7%  53.0%    47.0%  13.6%   0.8%
            95% confidence interval, money cubeless eq.: 0.090 ±0.008; 720 games (equiv. 101478 games)

Rollout settings (all moves): Truncated rollout, depth 13, played 2-ply, random seed, with race database.
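The "money cubeless eq." figures in this rollout can be reproduced from the six percentages listed under each move. The sketch below assumes those percentages read, left to right, as the roller's backgammons, gammons and wins, followed by the opponent's wins, gammons and backgammons, with gammons included in wins and backgammons included in gammons. That layout is not spelled out in the article; it is an assumption that happens to match the reported numbers. The function name is hypothetical.

```python
def cubeless_equity(bg_won, g_won, won, lost, g_lost, bg_lost):
    """Cubeless money equity from six percentages (assumed column order).

    Each plain win counts 1 point, each gammon adds 1 more and each
    backgammon adds 1 more again, which is why cumulative percentages
    can simply be summed.
    """
    return ((won - lost) + (g_won - g_lost) + (bg_won - bg_lost)) / 100.0


# Percentages and reported cubeless equities copied from the rollout above.
rollout = {
    "24/22 6/5": ((0.8, 15.6, 55.0, 45.0, 12.9, 0.7), 0.128),
    "13/11 6/5": ((0.9, 16.0, 54.0, 46.0, 13.7, 0.8), 0.104),
    "11/9 6/5":  ((0.9, 16.7, 53.0, 47.0, 13.6, 0.8), 0.090),
}

for move, (percentages, reported) in rollout.items():
    computed = cubeless_equity(*percentages)
    print(f"{move:10s} computed {computed:+.3f}   reported {reported:+.3f}")
```

Because the displayed percentages are rounded to one decimal, the computed value can differ from the reported one by a couple of thousandths, as it does for 11/9 6/5. The larger "Equity" figure in the table is a different quantity that takes the cube into account, so it is not expected to match.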

 

Now what was merely a matter of preference is an error, and 11/9 6/5 is considered to be even worse than at first sight. Astonishing. However, Snowie 4 has more surprises in store, and it also helps give a clearer picture in several positions where even Snowie 3’s rollouts left one wondering.

 

 

Here Snowie 3 spotted making the 8-point as best; unfortunately, its indecisiveness even in its rollouts left one unsure as to how correct it was: “The rollouts have making the eight point on top, although not convincingly. Any of the approaches could work.” Here were the results:

 

11/8  9/8                     +0.503 (-0.000)

11/8  6/5                     +0.492 (-0.011)

9/5                               +0.456 (-0.047)

6/5  6/3                       +0.456 (-0.047)

 

Snowie 4’s 3-Ply Precise on the other hand had no doubts:

 

 

 

    #  Ply  Move         Equity
    1   3   11/8 9/8     0.859
            0.9%  19.6%  69.4%    30.6%   6.8%   0.3%
 *  2   3   6/3 6/5      0.803 (-0.057)
            0.9%  18.2%  68.5%    31.5%   6.6%   0.3%
    3   3   11/8 6/5     0.762 (-0.098)
            1.0%  18.7%  68.2%    31.8%   6.6%   0.3%
    4   3   11/7         0.753 (-0.107)
            0.9%  19.4%  66.7%    33.3%   6.8%   0.3%
    5   3   11/10 6/3    0.736 (-0.124)
            0.9%  18.3%  66.4%    33.6%   7.2%   0.3%
    6   3   9/5          0.727 (-0.132)
            0.9%  16.2%  67.8%    32.2%   6.6%   0.2%

Speed Parameter (all moves): Precise.

 

 

As neither the master commentator nor Snowie 4 thought 9/5 had any real merits, I ignored it and rolled out the top 3 moves just to be sure. Again Snowie 4 did not waver and even accentuated the difference:

 

 

    #  Ply  Move         Equity
    1   R   11/8 9/8     0.834
            1.1%  18.8%  69.2%    30.8%   7.1%   0.3%
            95% confidence interval, money cubeless eq.: 0.509 ±0.009; 720 games (equiv. 56762 games)
 *  2   R   6/3 6/5      0.760 (-0.074)
            0.9%  17.7%  67.9%    32.1%   7.5%   0.4%
            95% confidence interval, money cubeless eq.: 0.466 ±0.010; 720 games (equiv. 44655 games)
    3   R   11/8 6/5     0.754 (-0.080)
            0.9%  18.1%  68.1%    31.9%   6.9%   0.4%
            95% confidence interval, money cubeless eq.: 0.478 ±0.009; 720 games (equiv. 63181 games)

Rollout settings (all moves): Truncated rollout, depth 13, played 2-ply, random seed, with race database.

 

 

And as the game proceeded this position appeared at move 9: 

 

 

where the move played was adamantly defended by the commentator (“B/24*, 13/11 is more consistent with White's plan.”) despite the fact that “The rollout has getting the blot to safety better than B/24*, 14/12.” Now human intuition and vision are shown to have been right on target, as Snowie 4 changes the verdict:

 

 

    #  Ply  Move             Equity
 *  1   3   bar/24* 13/11    0.541
            2.3%  31.3%  69.1%    30.9%   7.5%   0.4%
    2   3   bar/24* 14/12    0.529 (-0.012)
            2.5%  32.0%  68.4%    31.6%   7.9%   0.5%
    3   3   bar/24* 5/3      0.484 (-0.057)
            2.4%  31.3%  67.1%    32.9%   8.5%   0.5%

Speed Parameter (all moves): Precise.

 

And again a rollout only served to confirm its judgement:

 

 

    #  Ply  Move             Equity
 *  1   R   bar/24* 13/11    0.608
            1.9%  33.4%  70.5%    29.5%   5.9%   0.3%
            95% confidence interval, money cubeless eq.: 0.701 ±0.011; 1080 games (equiv. 56067 games)
    2   R   bar/24* 14/12    0.584 (-0.024)
            2.2%  34.8%  69.3%    30.7%   7.0%   0.4%
            95% confidence interval, money cubeless eq.: 0.682 ±0.011; 1080 games (equiv. 58105 games)
    3   R   bar/24* 5/3      0.584 (-0.024)
            2.1%  35.2%  69.3%    30.7%   7.2%   0.4%
            95% confidence interval, money cubeless eq.: 0.683 ±0.012; 1080 games (equiv. 54185 games)

Rollout settings (all moves): Truncated rollout, depth 17, played 2-ply, random seed, with race database.

 

 

 

 

And a few moves later, Snowie 4 again disagrees with the assessments of its sibling:

 

 

Here, Snowie 3’s rollouts presented the following evaluations:

 

8/2(2)*                        +1.361 (-0.000)

11/2* 5/2                    +1.348 (-0.013)

11/8 10/7(2) 5/2*       +1.346 (-0.015)

11/8(2) 10/7(2)          +1.340 (-0.021)

11/5 10/7(2)               +1.327 (-0.034)

14/8 10/7(2)               +1.302 (-0.059)

 

 

Clearly it wasn’t very sure of itself, given the unconvincing rollout results, and Woolsey comments: “I see that this play came out on top in the rollout, but I'm not inclined to believe it. 11/2*, 5/2 looks better to me. I think holding the eight point will turn out to be an asset, not a liability.” Snowie 4 once more completely agrees, though with a far more assertive evaluation than Snowie 3:

 

 

    #  Ply  Move                 Equity
    1   3   11/5 5/2*(2)         1.348
            2.3%  52.2%  91.4%     8.6%   1.0%   0.1%
    2   3   11/8(2) 10/7(2)      1.317 (-0.031)
            1.9%  48.8%  91.7%     8.3%   1.1%   0.1%
    3   3   8/2*(2)              1.314 (-0.034)
            2.2%  51.5%  90.4%     9.6%   1.2%   0.1%
    4   3   14/8 10/7(2)         1.308 (-0.040)
            1.8%  47.1%  92.1%     7.9%   0.9%   0.1%
    5   3   11/5 10/7(2)         1.292 (-0.056)
            2.0%  49.0%  90.7%     9.3%   1.3%   0.1%
 *  6   3   11/8 10/7(2) 5/2*    1.278 (-0.070)
            2.3%  48.9%  90.2%     9.8%   1.7%   0.1%

Speed Parameter (all moves): Precise.

 

And once more the rollouts do not change this verdict:

 

 

    #  Ply  Move                 Equity
    1   R   11/5 5/2*(2)         1.349
            1.0%  51.0%  92.2%     7.8%   0.6%   0.0%
            95% confidence interval, money cubeless eq.: 1.357 ±0.009; 900 games (equiv. 41308 games)
    2   R   8/2*(2)              1.330 (-0.020)
            1.2%  50.7%  91.4%     8.6%   0.6%   0.0%
            95% confidence interval, money cubeless eq.: 1.341 ±0.008; 900 games (equiv. 45180 games)
    3   R   11/8(2) 10/7(2)      1.293 (-0.056)
            1.2%  45.5%  92.1%     7.9%   0.6%   0.0%
            95% confidence interval, money cubeless eq.: 1.302 ±0.008; 900 games (equiv. 48549 games)
 *  4   R   11/8 10/7(2) 5/2*    1.276 (-0.074)
            1.3%  47.8%  90.6%     9.4%   1.2%   0.1%
            95% confidence interval, money cubeless eq.: 1.291 ±0.009; 900 games (equiv. 46640 games)
    5   R   14/8 10/7(2)         1.264 (-0.085)
            0.7%  40.6%  93.0%     7.0%   0.5%   0.0%
            95% confidence interval, money cubeless eq.: 1.268 ±0.008; 900 games (equiv. 48134 games)

Rollout settings (all moves): Truncated rollout, depth 17, played 2-ply, random seed, with race database.

 

So what’s the verdict? If we are to believe the comparative results, Snowie 4 seems to be far surer of itself in many positions where Snowie 3 was unable to give a confident evaluation. Is it the ultimate player then? Perhaps: its makers tout it as better than the best players on the planet, and I have no doubt that may very well be true, but that doesn't mean it is perfect or that it never makes mistakes.

 

            For example, the following position occurred in a game from a top final:

 

8th Japan Open Finals (2002)

Frank Talbot - Nishikawa Kiyokazu 3-0/15

 

 

Snowie 4, at 3-ply Precise, declares Talbot’s move a blunder in no uncertain terms (thanks to Ilia Guzei for spotting it):

 

 

    #  Ply  Move           Equity
    1   3   24/23 11/5     1.346
            2.9%  66.1%  86.2%    13.8%   1.5%   0.1%
    2   3   24/23 14/8     1.341 (-0.005)
            2.9%  65.9%  86.1%    13.9%   1.5%   0.1%
    3   3   14/8 11/10     1.223 (-0.124)
            2.7%  63.5%  82.4%    17.6%   1.7%   0.1%
 *  4   3   14/13 11/5     1.221 (-0.125)
            2.7%  63.6%  82.3%    17.7%   1.7%   0.1%
    5   3   14/7           1.218 (-0.128)
            2.7%  63.5%  82.3%    17.7%   1.7%   0.1%
    6   3   11/4           1.214 (-0.132)
            2.7%  63.3%  82.2%    17.8%   1.7%   0.1%

Speed Parameter (all moves): Precise.

 

Yet there is no logic in this. Right now, Talbot has 13 shots to move up to the 20 point: all 4s and 31. By bringing up the rearmost checker, Talbot would lose 2 shots (the 31) and be left with only 11 shots to try to move up before collapsing. The rollouts, set at a truncation depth of 19 to avoid a horizon effect, bear this out completely: Talbot’s move climbs the list, whereas Snowie’s previous favorite is shown to be a serious mistake.
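The shot count in this argument can be checked by enumerating the 36 possible dice rolls. The Python sketch below counts only what the text counts: any roll containing a 4, plus 3-1, for the checker on the 24 point, and any roll containing a 3 once it has moved to the 23 point. Whether other combinations (2-2, 1-1, 2-1) would play depends on the position, which is not reproduced here, so leaving them out is an assumption taken from the article's own count. The variable names are illustrative.

```python
from itertools import product

# All 36 ordered rolls of two dice, e.g. (3, 1) and (1, 3) are distinct.
ROLLS = list(product(range(1, 7), repeat=2))

# From the 24 point: a direct 4, or the 3-1 combination named in the text.
shots_from_24 = sum(1 for roll in ROLLS if 4 in roll or sorted(roll) == [1, 3])

# After 24/23: only a direct 3 is counted (combinations assumed unplayable).
shots_from_23 = sum(1 for roll in ROLLS if 3 in roll)

print(f"from the 24 point: {shots_from_24}/36 = {shots_from_24 / 36:.1%}")
print(f"from the 23 point: {shots_from_23}/36 = {shots_from_23 / 36:.1%}")
```

Under these assumptions the counts come out at 13 and 11, the two figures used in the argument.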

 

 

    #  Ply  Move           Equity
    1   R   11/4           1.355
            2.0%  67.0%  87.0%    13.0%   1.1%   0.0%
            95% confidence interval, money cubeless eq.: 1.416 ±0.015; 1296 games (equiv. 17503 games)
    2   R   14/8 11/10     1.351 (-0.004)
            1.9%  66.7%  87.0%    13.0%   1.1%   0.0%
            95% confidence interval, money cubeless eq.: 1.416 ±0.015; 1296 games (equiv. 17226 games)
 *  3   R   14/13 11/5     1.350 (-0.005)
            2.0%  66.6%  87.1%    12.9%   1.1%   0.0%
            95% confidence interval, money cubeless eq.: 1.418 ±0.015; 1296 games (equiv. 16376 games)
    4   R   14/7           1.347 (-0.009)
            1.9%  66.7%  87.1%    12.9%   1.2%   0.0%
            95% confidence interval, money cubeless eq.: 1.414 ±0.015; 1296 games (equiv. 17747 games)
    5   R   24/23 11/5     1.271 (-0.084)
            2.0%  63.7%  84.8%    15.2%   1.4%   0.0%
            95% confidence interval, money cubeless eq.: 1.337 ±0.008; 1296 games (equiv. 78674 games)
    6   R   24/23 14/8     1.252 (-0.104)
            1.9%  63.3%  84.2%    15.8%   1.6%   0.0%
            95% confidence interval, money cubeless eq.: 1.320 ±0.008; 1296 games (equiv. 75648 games)

Rollout settings (all moves): Truncated rollout, depth 19, played 2-ply, random seed, with race database.
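One way to read the confidence intervals in this rollout is to ask which gaps between moves are larger than the sampling noise. The sketch below is a rough check and not something Snowie reports: it assumes the rollouts of different moves are statistically independent and that the ± figures are 95% half-widths on the cubeless equity, and it treats two moves as separated when their gap exceeds the two half-widths combined in quadrature. The function name is hypothetical.

```python
import math


def separated(eq_a, half_a, eq_b, half_b):
    """True when two rollout equities differ by more than their combined
    95% half-width (independent rollouts assumed)."""
    return abs(eq_a - eq_b) > math.hypot(half_a, half_b)


# Cubeless equities and 95% half-widths copied from the rollout above.
rollouts = {
    "11/4":        (1.416, 0.015),
    "14/8 11/10":  (1.416, 0.015),
    "14/13 11/5":  (1.418, 0.015),   # the move actually played
    "14/7":        (1.414, 0.015),
    "24/23 11/5":  (1.337, 0.008),
    "24/23 14/8":  (1.320, 0.008),
}

played_eq, played_half = rollouts["14/13 11/5"]
for move, (eq, half) in rollouts.items():
    if move == "14/13 11/5":
        continue
    verdict = "clearly different" if separated(played_eq, played_half, eq, half) else "within noise"
    print(f"14/13 11/5 vs {move:11s} gap {played_eq - eq:+.3f}  {verdict}")
```

On these numbers the four plays that keep the checker on the 24 point cannot be told apart, while both 24/23 plays fall well outside the noise.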

 

So clearly, it is hardly infallible. However, with its new and much more refined knowledge, many things are bound to be re-evaluated, and it is clear that interesting times are ahead. I await with interest the verdict of the experts.