Backgammon Rollouts

Cautionary tale

From:   Kit Woolsey
Address:   kwoolsey@netcom.com
Date:   24 September 1995
Subject:   Re: X to play 6-3
Forum:   rec.games.backgammon
Google:   kwoolseyDFFEy1.DuE@netcom.com

David Montgomery (monty@cs.umd.edu) wrote:

> +24-23-22-21-20-19-+---+18-17-16-15-14-13-+
> | X     X  O  O  O |   | X  O             |
> |          O  O  O |   | X  O             |
> |                  |   |    O             |
> |                  |   |    O             |
> |                  |   |                  |
> |                  |   |                  | [1]
> |                  |   |                  |
> |                  |   |                O |
> |                  |   | X              O |
> |                  |   | X              O |
> |          X  X  X |   | X              O |
> |          X  X  X |   | X           X  O |
> +-1--2--3--4--5--6-+---+-7--8--9-10-11-12-+
> Money game.  X to play 6-3.
>
> Thanks for all the responses to this problem.
> This is a position from _Costa Rica 1993_.
> Wilcox Snellings played 22/13.
>
> My preference before seeing any rollouts
> or analysis was for 11/5 7/4.  This was also
> the choice of Herb Gurland, a top Boston player.
> The authors of _Costa Rica 1993_ also preferred
> 11/5 7/4.
>
> Wilcox Snellings rolled two plays out by hand
> 108 times with the following results:
>
>     11/5 7/4       -.42
>     22/13          -.50
>
> The authors rolled another play out 108 times:
>
>     11/8 24/18     -.50
>
> I rolled all of these plays out (and several others)
> 3888 times on Jellyfish (no truncation, duplicate
> dice, 3 sets of 1296 with seeds 2430, 2431, and 2432).
>
> Jellyfish cubeless equities:
>
>     22/13          -.363
>     22/16 11/8     -.405
>     22/16 7/4      -.416
>     24/18 11/8     -.446
>     11/5 7/4       -.449
>     24/18 7/4      -.456
>     24/15          -.458
>
> I wasn't really that surprised that 22/13 came out
> on top, although it wasn't the play that I would
> have made.  But I was *very* surprised that it came
> out right by so much.  This mistake actually costs
> about 2/10 of a point when the cube is figured in.
>
> I would welcome any further illumination on why
> 22/13 is so much better than the other plays,
> especially 11/5 7/4.
>
> David Montgomery
> monty on FIBS

While Jellyfish rollouts are usually accurate and quite informative, occasionally they can give us wrong information. One of the dangers is that the program is simply misplaying the position, and this affects one of the plays being rolled out more than the other one. Keep in mind that for the rollout the program is playing with only 1-ply (that is the same as level 5).
This is necessary for speed purposes -- to use 2-ply in the rollouts would make the rollouts take far longer. The program still plays pretty well at 1-ply, but not nearly as well as 2-ply and therefore is more likely to be doing something wrong in the play. Most of the time this will not matter (particularly in play vs. play problems), since these errors in play tend to cancel out and generally are not huge anyway. Occasionally the two plays being tested lead to different types of positions, where one play gives the program a chance to make an error which the other play doesn't.

When I saw David's results, I thought this might be happening. I thought after playing 11/5, 7/4 the program might be making the defensive three point if it rolled a two. I also thought this might be the wrong strategy -- hanging back on the ace point with the back man and springing the other checker could be better.

So, I decided to run a test. I had X play 11/5, 7/4 with the 6-3, and gave O a 6-1 (played 13/6). This left the following position:

  13 14 15 16 17 18       19 20 21 22 23 24
+------------------------------------------+
|              O  X|    |  O  O  O  X     X|
|              O  X|    |  O  O  O         |
|              O   |    |  O               |
|              O   |    |                  |
|                  |    |                  |
|                  |    |                  |
|  O               |    |                  |
|  O              X|    |     X  X         |
|  O              X|    |  X  X  X         |
|  O              X|    |  X  X  X         |
+------------------------------------------+
  12 11 10  9  8  7        6  5  4  3  2  1

Now I gave X a 4-2 to play, and looked at Jellyfish's 1-ply opinion. I also rolled out the three logical plays 2952 times each, duplicate dice. These were the results:

    Play            1-ply    Rollout
    24/22, 7/3      -.428    -.514
    7/3, 5/3        -.501    -.486
    22/18, 5/3      -.510    -.412

These results confirmed my suspicions. Jellyfish was thematically misplaying the position in its rollouts after playing 11/5, 7/4 with the original 6-3. However after playing 22/13 with the 6-3 the program didn't have the opportunity to make this sort of misplay, since there was no way to make the 22 point so there was no incentive to move the back checker.
This misplay might be sufficient to turn the rollout results of the 6-3 around, and certainly explains why 22/13 came out so much better than 11/5, 7/4 in David's rollout.

Any time you are suspicious about the results of a rollout, it is vital to examine how the program is playing at least the next couple of rolls before accepting the results of the rollout as gospel. The rollouts are good, but we still have to keep our eyes open or we may fall into some unexpected traps.

Kit
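
The check on the 4-2 described in the post can be written down as a small calculation. The sketch below is only an illustration: the equities are copied from the 4-2 table in the post, while the names plays and best_play are made up for this example and are not part of Jellyfish. It picks the play the 1-ply evaluation prefers, picks the play the rollout prefers, and measures by the rollout figures how much equity the 1-ply choice gives up.

```python
# Equities copied from the 4-2 table in the post:
# play -> (1-ply evaluation, rollout result).
plays = {
    "24/22, 7/3": (-0.428, -0.514),
    "7/3, 5/3": (-0.501, -0.486),
    "22/18, 5/3": (-0.510, -0.412),
}


def best_play(table, column):
    """Return the play with the highest equity in one column.

    column 0 is the 1-ply evaluation, column 1 is the rollout result.
    This helper is hypothetical and exists only for this illustration.
    """
    return max(table, key=lambda play: table[play][column])


one_ply_choice = best_play(plays, 0)   # what the program plays inside a 1-ply rollout
rollout_choice = best_play(plays, 1)   # what the rollout says is best

# Equity given up, measured by the rollout, when the 1-ply choice is played.
cost = plays[rollout_choice][1] - plays[one_ply_choice][1]

print("1-ply choice:  ", one_ply_choice)
print("rollout choice:", rollout_choice)
print("cost of 1-ply choice:", round(cost, 3))
```

When the two choices differ, as they do here, the size of the cost gives a rough idea of how much that one decision can pull down the rollout of the play that leads to it. It says nothing by itself about how often the decision comes up, so it is a reason to look closer rather than a corrected result.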
