Some guidelines

From:   Kit Woolsey
Date:   7 April 1996
Subject:   Re: How best to do Jellyfish rollouts?

>       I just bought the Jellyfish analyzer 2.0 and am trying to
> figure out the best way to perform rollouts.  Depending on how I set
> the variables I get very different and even conflicting results.  As
> an example I used the opening position from Barclay Cooke's _Paradoxes
> and Probabilities._ Red opened with 4 1 and played 13/9 6/5.  White
> now has 4 4 to play.  I want to compare these two possible replies:
> 1)  24/20* 20/16* 13/9 13/9
> 2)  24/20* 20/16* 8/4  8/4
> Move 2 seems superior to me but Jellyfish Level 7 verified
> lookahead likes move 1 (.486 vs .461).
> I rolled it out 36 times on level 6 and got these results:
>   Move 1  .516
>   Move 2  .413
> I changed the random number seed and did another 36 rollouts:
>   Move 1  .451
>   Move 2  .559
> Since these results are conflicting I assumed 36 games is not
> enough to draw a conclusion.  I rolled it out 106 times at Level 6
> with a complete new seed:
>   Move 1 .439
>   Move 2 .530
> I then tried 7776 truncated rollouts on Level 5 with Horizon 20:
>   Move 1  .484
>   Move 2  .491
>       I don't like the idea of truncated rollouts because they rely
> heavily on JF's evaluation of the position.  If it is incorrectly
> evaluating the position then the results are not worth much.  It doesn't
> seem to evaluate backgames well and the above position easily turns into
> one.

As you are finding out, you may need a pretty big sample size to get
accurate results with a full rollout.  The luck factor can be pretty
large.  For example, when I first had access to a rollout program I tried
rolling out an opening 4-2 1296 times and got the startling result that
13/11, 13/9 was a bit better than 8/4, 6/4!  As you might guess, this was
way off the end of the bell curve, and a longer rollout quickly set things
straight, but it does give an idea of how large a sample size one might
need to be comfortable with full rollout results.
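
To put rough numbers on this: the standard error of a rollout average
shrinks only with the square root of the number of games.  The sketch
below assumes a per-game standard deviation of 1.0 point, which is a
round figure picked for illustration and not a measured Jellyfish value;
`standard_error` and `trials_needed` are hypothetical helper names.

```python
import math

# ASSUMPTION: per-game standard deviation of the result, in points.
# 1.0 is a round illustrative figure, not a measured Jellyfish value.
ASSUMED_SD = 1.0

def standard_error(sd, n_games):
    """Standard error of the mean of n_games independent game results."""
    return sd / math.sqrt(n_games)

def trials_needed(sd, target_se):
    """Approximate number of games needed to reach the target standard error."""
    return math.ceil((sd / target_se) ** 2)

for n in (36, 1296, 10000):
    se = standard_error(ASSUMED_SD, n)
    print(f"{n:6d} games: standard error about {se:.3f}")
```

Under that assumption 36 games give a standard error of about 0.167, so
swings of .1 between seeds, like those in the question, are not
surprising.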

Truncated rollouts have two advantages.  First of all, they obviously
take much less time.  Secondly, the luck factor is cut down
considerably.  This is because you aren't dependent on the lucky rolls at
the end of the game which determine the winner -- that is factored into
the Jellyfish estimates.  My experience has been that 1296 trials,
truncation 7, is quite sufficient for most play vs. play problems and
leads to good results.  As an experiment, try taking some play vs. play
problem (avoid backgames -- JF does have problems there), and roll out
the two plays 1296 times truncation 7 (same seed to get the duplicate
dice, of course).  Then try rolling them each out 10,000 times on a full
rollout (again, same seed).  I predict that the relative results will be
very similar -- i.e. if the truncated rollout says that play A is .03
better than play B then the full rollout will say about the same.  Note
that the truncated rollouts may give bad estimates for absolute equities
for various reasons, but for play vs. play problems they are very good.
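
The value of using the same seed can be seen in a toy model.  The sketch
below is not Jellyfish and plays no backgammon: each "game" is just a base
equity plus a luck term drawn from the dice stream plus a smaller
play-specific term.  All names and numbers in it (`toy_game`,
`rollout_pair`, the 0.03 edge, the 1.0 and 0.2 spreads) are assumptions
chosen for illustration.  It compares the spread of the A-minus-B
difference when both plays see the same dice with the spread when they
see different dice.

```python
import random
import statistics

def toy_game(base_equity, dice_rng, noise_rng):
    """Toy stand-in for one rollout game (hypothetical model, not Jellyfish)."""
    luck = dice_rng.gauss(0.0, 1.0)     # luck that comes from the dice stream
    noise = noise_rng.gauss(0.0, 0.2)   # variation the dice stream does not share
    return base_equity + luck + noise

def rollout_pair(n_games, seed_a, seed_b):
    """Per-game differences (play A minus play B) for the given dice seeds."""
    dice_a = random.Random(seed_a)
    dice_b = random.Random(seed_b)      # same seed as A means duplicate dice
    noise_rng = random.Random(12345)
    diffs = []
    for _ in range(n_games):
        a = toy_game(0.03, dice_a, noise_rng)   # play A: assumed .03 better
        b = toy_game(0.00, dice_b, noise_rng)
        diffs.append(a - b)
    return diffs

same_dice = rollout_pair(1296, seed_a=7, seed_b=7)
different_dice = rollout_pair(1296, seed_a=7, seed_b=8)
print("spread of difference, same seed:      ", statistics.stdev(same_dice))
print("spread of difference, different seeds:", statistics.stdev(different_dice))
```

In this model the shared luck term cancels exactly when the seeds match,
which overstates the benefit: in a real rollout the two plays lead to
different positions, so the same dice only partly cancel.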

There are many things that can screw up a JF rollout.  The most common is
that it is making some big thematic mistake on the first roll or two,
which it will be repeating over and over.  If you are really curious
about a position I suggest you see how it plays the first couple of
moves.  Also, the program may have trouble handling the overall position
decently -- this is often a problem in some end-game positions,
particularly backgames.  In general, however, the program plays quite
well even on level 5 (which you have to use to get the fast rollouts),
and for most normal positions the results are very accurate.
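
The difference between luck and a repeated mistake can be illustrated with
another toy model, again a sketch under made-up assumptions rather than
anything Jellyfish does.  Each game result is the true equity plus a fixed
`bias` (standing in for a thematic error the program repeats every game)
plus random luck.  The equity, the bias and the function name
`rollout_estimate` are all hypothetical.

```python
import random
import statistics

def rollout_estimate(true_equity, bias, n_games, seed):
    """Return (mean, standard error) of a toy rollout with a constant bias."""
    rng = random.Random(seed)
    # Every game carries the same bias; only the luck term varies.
    results = [true_equity + bias + rng.gauss(0.0, 1.0) for _ in range(n_games)]
    mean = statistics.fmean(results)
    std_err = statistics.stdev(results) / n_games ** 0.5
    return mean, std_err

TRUE_EQUITY = 0.10   # hypothetical true equity of the position
BIAS = -0.05         # hypothetical systematic error from a repeated misplay

for n in (100, 10000):
    mean, se = rollout_estimate(TRUE_EQUITY, BIAS, n, seed=1)
    print(f"{n:6d} games: estimate {mean:+.3f}, standard error {se:.3f}")
```

More trials shrink the standard error, but the estimate settles near the
true equity plus the bias, not near the true equity.  That is why it pays
to watch how the program plays the first couple of moves.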

As for not believing rollouts, you have to be careful.  Sure, it is easy
to be convinced by an expert's arguments.  He is probably convinced
himself.  The problem is that his arguments may be based on false
premises, which can lead to false conclusions.  Except for certain
end-games or technical plays it is very difficult to *prove* that a play
is correct -- one can argue the plusses and minuses of a play, but if the
weighting of the parameters is wrong you can get the wrong result.

Let's look at your actual example: the 4-4 response to the 4-1 opening.
We might find Robertie saying:

    Making the four point is clear.  When the opponent has two men on the
    bar, it is a must to go for the throat.  The swing if he rolls a four
    is enormous.  The tactical gains outweigh the slight positional
    disadvantage of giving up the eight point.

On the other hand, we might find Woolsey saying:

    Making the nine point is clear.  You have a very strong advantage, and
    it is time to solidify things.  Bringing the builders down gives you
    ammunition to pounce wherever your opponent enters, and the solidity
    of the nine and eight points will hold your advantage whatever
    happens.  The positional gains from this play outweigh the slight
    temporary advantage of making the four point.

Both arguments are reasonable, but which one is right?  The answer is we
don't know!  Only our judgment and experience can guide us here.  Your
initial impression was that Robertie's arguments are correct, and the
tactical considerations are overriding.  The rollouts showed the plays to
be very close (which, in fact, I think they are).  You have learned
something -- in this sort of position you are overweighting the tactical
considerations.  Now that you have this benchmark result down pat, you
can use it to help you with other similar plays.  For example, suppose
your opponent opens with a 5-4, plays 13/9, 13/8, and you roll 4-4.
Should you play 24/16*, 8/4(2) or 24/16*, 13/9(2)?  My guess from your
comment about the original position is that, before seeing the rollout
results, you would have considered this a close call.  Not any more!  If
it was close with two men on the bar, then with only one man on the bar
the positional play of making the nine point must be considerably better
(which I believe it is).  This is the way you improve your backgammon play.

When a rollout result is considerably different from what you would
expect, don't be quick to disbelieve it.  Think about the position, and
see if maybe your weights of the relevant parameters are not correct.
Look hard -- there is almost always a reason for an unexpected rollout
result and if you can find that reason you will have learned a lot.  I
have seen many experts (myself included) make plays which were .150 or
more worse than another play without having any idea that they were
making an error.  There is a lot we have to learn about backgammon, and
the tool of the Jellyfish rollout is by far the most valuable tool we
have today.  The results are definitely not always gospel, but there is
often a lot of truth in them.


