Backgammon Rollouts

From:   David Montgomery
Address:   monty@cs.umd.edu
Date:   7 April 1996
Subject:   Re: How best to do Jellyfish rollouts? (long)
Forum:   rec.games.backgammon
Google:   4k8njk$6lc@twix.cs.umd.edu

> I just bought the Jellyfish analyzer 2.0 and am trying to
> figure out the best way to perform rollouts. Depending on how I set
> the variables I get very different and even conflicting results.

[ rolling out 2 plays:
    Play 1) 24/20/16* 13/9(2)
    Play 2) 24/20/16* 8/4(2)
  after an opening 4-1 played 13/9 6/5 ]

[ results so far:
    Play 1) JF7 evaluation              .486
            Level 6 (36x)               .516
            Level 6 (36x)               .451
            Level 6 (106x)              .439
            Level 5 truncated (7776x)   .484
    Play 2) JF7 evaluation              .461
            Level 6 (36x)               .413
            Level 6 (36x)               .559
            Level 6 (106x)              .530
            Level 5 truncated (7776x)   .491 ]

> I don't like the idea of truncated rollouts because they rely
> heavily on JF's evaluation of the position. If it is incorrectly
> evaluating the position then the results are not worth much. It doesn't
> seem to evaluate backgames well and the above position easily turns into
> one.

Well, it's true that truncated rollouts rely on JF's evaluations, but most of the time, and for most positions, this isn't much of a problem. This is because the errors in JF's evaluation will in large part cancel out -- sometimes the evaluation will be too high and other times too low. And JF evaluations are really pretty good. Better than human evaluations, anyway.

Some error may remain if the game tends to develop into positions in which there is some bias in JF's evaluation. By itself, this usually isn't too much of a problem, because most positions tend to branch out into a wide variety of types of positions, and the positions which don't, and for which JF's evaluations are off, are often positions that you can't trust JF with anyway.

If you review the rollouts of Robertie's _Advanced_Backgammon_, you can get a good feeling for the amount of error that typically arises from using truncated rollouts.

For the position in question, there should be very little trouble with using truncated JF rollouts.
JF understands opening checker play very well, and the game is likely to evolve into a wide variety of different kinds of positions, so there should be relatively little bias due to truncation.

I disagree that this position will "easily" become a backgame. The first player should generally be very much trying to avoid this scenario, and will usually succeed. Certainly, with JF at the helm, this will very rarely become a backgame.

The main advantages of truncated rollouts are two: 1) they are faster, and 2) they have lower variance. That is, they converge toward the "infinite rollout" equity with fewer trials, on average. Item two just means that you need fewer trials to get your answer, so the advantage of truncated rollouts comes down to just one thing, which is that they are faster.

The disadvantage of truncated rollouts is that sometimes they are biased. This is less of a problem in a checker play rollout (which is also when speed is more of a concern), but very important for cube rollouts.

But the more significant disadvantage to truncated rollouts is that JF does not give you "live cube" figures with truncated rollouts, which it does with non-truncated rollouts. This is obviously a problem when you are rolling out a cube action problem, but also a factor in many checker play problems (see, for example, Jeremy Bagai's excellent article in the Jan-Feb Inside Backgammon, or the solution to Inside Backgammon quiz problem #110).

For these reasons, I almost always do complete rollouts, but truncated rollouts are not as suspect as you think.

> I'm new to this rollout business and am not making much out of
> the above results. I'm also starting to think that JF rollouts are
> way overrated. I studied the JF rollouts of Robertie's Advanced
> Backgammon and I find Robertie's logic far more convincing than the
> rollouts in the vast majority of the problems.

Well, my guess is that you're overrating Robertie's logic.
The fact is, most interesting backgammon problems cannot be tackled by logic. Over the board, we reason as best we can, but ultimately we are just guessing based on our experience.

Robertie recognizes this himself. A few years back he sharply criticized a problem solution by Kleinman (which was based on reasoning from general principles), and backed up his criticism with (hand) rollouts. Robertie wrote that backgammon was not "an exercise in deductive logic" but rather, at least for correctly analyzing positions, an exercise in empirical science. Rollout data is exactly what is needed to determine the correct play, most of the time.

The fact is that many of Robertie's solutions are after-the-fact. Long propositions were played, and Robertie learned the result and saved the position. In his book, he justifies the solution based on logic or reasoning or breaking down the rolls or emphasizing one very important feature of the position. In doing this, he is showing the reader how one might approach the problem over the board, which is exactly what you want to know to play better backgammon. But the important thing to realize is that the empirical data came first, and the reasoning to point you to the correct play is derivative.

Kit Woolsey has also often emphasized this point, by saying how he has learned a lot from trying to figure out rollout results which at first seemed unintuitive.

Now, as to whether JF rollouts are overrated -- I guess it depends on the person and the position. JF rollouts are a tremendous source of empirical data for a wide variety of positions. But they do have their limitations.

First of all, any rollout is subject to statistical variation. So when results come out very close, there is very good reason to be skeptical about the results' significance. JF gives the standard deviations of the rollouts it performs, so this can be a guide for that.

Secondly, any position can be misplayed.
Putting aside for the moment major thematic errors, small mistakes can be made favoring one side or the other, and these small mistakes should add a little more doubt to the significance of close results, even in positions that we believe JF handles well.

Now, turning to the question of thematic errors, it's well documented that JF has a few of these. Here are the ones that come to mind right now:

- JF gets low results with outside primes -- the further outside, the more irrelevant the results. JF doesn't completely understand how to walk a prime home against a single trapped checker.

- JF doesn't understand well how and when to try for a second checker after a bearoff hit.

- JF gets high results in many backgames. However, I think this bias has been overemphasized.

  In backgames nearing resolution, where the timing issue has been resolved, as is the case in many forward (e.g., 34 or 45) backgames, JF's results are not that far off. In these cases, JF may give up a little due to having to walk its prime home after a hit, but probably not much.

  In deeper backgames, JF gives up more because capturing a second checker may be a significant consideration. Also, JF doesn't always understand when to split its rear checkers to generate more shots.

  In positions where the timing issue is not yet resolved, or where there is still significant forward equity, as in a two-way game, JF *may* give up significant equity because it often will avoid the backgame strategy that a human would choose. I emphasize "may" because I think JF is often right in avoiding the backgame, and that human players are often wrong about this.

  JF probably gives up the most in well-timed deep backgames where the leader is still a long way from the bearin.

- JF can get weird results in noncontact positions. This problem has probably been reduced by the bearoff database in JF2.0, but JF still isn't the best tool for these kinds of positions.

- JF gets low results for many priming positions against one back, even when the prime is deep in the board. This is especially true when slotting the back of the prime is important. JF very often doesn't do this when it is correct.

- Wilcox Snellings thinks JF gets high results vs deep anchor games, especially vs ace point games. I don't know whether this is true or not, but it's plausible. Part of the equity of acepoint games comes from capturing a second checker after a late hit.

- JF can get results that are off in what I call "runaround" positions. These are positions where one side is trying to navigate the last few checkers around the opposition. An example is: side A has 4 checkers each on the 1, 2, and 3 points, and 1 checker each on the 4, 17 and 18 points; side B has a closed board, and 1 checker each on the 18, 19, and 20 points. JF doesn't count shots, so sometimes it makes significant checker play errors when rolling these positions out.

- JF gets low results in bearoff hit positions in which there is a lot of play. For example,

  X O O . X . | | . . . . X . [2] O O | | O O | | O O | | | | X X | | X X X . . O X O X | | X X X . X X
  X's home board. O has 5 off

  With O owning a 2-cube, X's equity is about 0.70. JF gets .261 cubeless, .345 after doubling to 2 (3888 trials). Interestingly, humans tend to overrate the value of these positions.

- many purely technical decisions are less amenable to rollouts, whether by JF or humans. This is especially true if the technical decision tends to repeat itself.

- because of the way JF uses the cube in live cube rollouts, sometimes its cube numbers are way off. A common example is a position where the trailer has a busted board, one checker back at the edge of a five prime, and the leader has checkers back in the trailer's home board. In this situation, the trailer may leap the prime and obtain a double-in (which JF doesn't recognize), only to obtain a huge cash one roll later.
  In general, if the trailer has only one common recube variation, and this variation yields mostly weak doubles-in, JF's live cube algorithm will not give accurate results.

- Another common live cube error ends up with the cube owner doing *worse* owning the cube. Apparently when this happens JF has erroneously played on for the gammon some of the time.

So yes, JF rollouts cannot be trusted implicitly. However, for most positions JF rollouts are the best source for equities, and considered carefully, the best tool for improving your game.

An interesting corollary to the fact that JF misplays the above situations is that JF plays other types of positions *better* than a human of overall equivalent strength would. This shows up most prominently in the play of attacking positions, where JF frequently gets results that are higher than humans get.

> I would appreciate some advice from those more experienced
> with rollouts as to how better utilize the program. What parameters
> work best for the above rollout?

Here's my advice:

- Always do rollouts in multiples of 36 (unlike the 106 game rollout) and in multiples of 1296 if doing level 5 rollouts.

- If you have time and a fast enough computer, do complete rollouts. This way you avoid any bias and get the cube numbers as well.

- When doing checker play rollouts, set the seed the same for all the plays under consideration.

- Don't regard checker play results that are within 2 standard deviations as anything significant. If you don't want to bother to look at the standard deviations, as a rule of thumb, consider differences of .10 significant for rollouts of 1296, .07 for 2592, .06 for 3888.

- There are decreasing returns as you roll positions out more times. You will reduce the standard deviation, but if the equities are still close, the errors in checker play are probably more significant than the random error. I usually don't roll plays out more than 3888 times.

- When rolling out checker plays, go ahead and roll out all those plays that fit the themes of the position, even if you don't think they are candidate plays. Occasionally one of these plays you didn't like will actually turn out to be best, and you'll learn something. If you're short on time, do small short truncated rollouts first to identify the real candidates.

- Look at both the cube numbers and the cubeless numbers.

- If you really want to understand what's going on in a cube action situation, roll out several variations of the position, so that you can see how they affect the equity. Use the same seed for all of these rollouts.

- DON'T just believe the rollout results as though they came from on high. But try to understand the sorts of positions where the results are off, and why, so that you can know when you can trust JF and when you should be skeptical, and the probable direction of the error.

- If you suspect that the rollout is biased, you can look at how JF plays the first numbers, or set up a few important variations to see how it plays those. You may find that with level 6 it does a better job, in which case use that. If it still seems to be playing the position wrong on level 6, use the interactive rollout feature. One approach would be to play it 36x with you playing one side, JF level 6 the other, and then another 36x with you playing the other side and JF level 6 the first. If you're right that JF is screwing the position up (and you're not), you'll see it in the results.

- Be careful about interpreting the rollout results for a particular match score. JF does all its rollouts based on choosing the best cubeless plays, with gammons and backgammons counting (and counting equally for both sides), so the results may not be valid in a match situation. For many match scores, there is no satisfactory way to set the JF cashing parameter to give a reasonable match live cube rollout, so you are better off interpreting the cubeless numbers.
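
The rule-of-thumb thresholds in the advice above (.10 for 1296 trials, .07 for 2592, .06 for 3888) are consistent with random error shrinking in proportion to one over the square root of the number of trials. What follows is a minimal Python sketch of that scaling, not anything JF itself provides. The function names, and the extension of the rule to other trial counts, are assumptions made for illustration; when JF reports standard deviations for a rollout, those remain the better guide.

```python
import math

# Hypothetical helper names; the only figures taken from the post are the
# anchor "a difference of .10 is significant for rollouts of 1296".
BASE_TRIALS = 1296
BASE_THRESHOLD = 0.10


def rule_of_thumb_threshold(trials):
    """Smallest equity difference to treat as significant for `trials` games.

    Assumes random error scales as 1/sqrt(trials), anchored at .10 for 1296.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    return BASE_THRESHOLD * math.sqrt(BASE_TRIALS / trials)


def looks_significant(equity_a, equity_b, trials):
    """True if two rollout equities differ by more than the rule of thumb."""
    return abs(equity_a - equity_b) > rule_of_thumb_threshold(trials)


if __name__ == "__main__":
    for n in (1296, 2592, 3888):
        print(n, round(rule_of_thumb_threshold(n), 2))
    # Made-up equities, purely to show the call:
    print(looks_significant(0.50, 0.45, 3888))
    print(looks_significant(0.50, 0.42, 3888))
```

Rounded to two places, the three trial counts named in the post give .10, .07 and .06, matching the figures quoted there. The sketch says nothing about systematic checker play errors, which, as noted above, can matter more than random error once the equities are close.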

My experience is mostly with using JF level 5 rollouts, but it may well be better to use JF level 6 by default. It certainly plays better on level 6, and with JF's variance reduction algorithm, it's not a lot slower, effectively.

Hope this is of some use to you,

David Montgomery
monty on FIBS
