> I just bought the Jellyfish analyzer 2.0 and am trying to
> figure out the best way to perform rollouts. Depending on how I set
> the variables I get very different and even conflicting results.
[ rolling out 2 plays:
Play 1) 24/20/16* 13/9(2)
Play 2) 24/20/16* 8/4(2)
after an opening 4-1 played 13/9 6/5 ]
[ results so far:
  Play 1) JF7 evaluation               .486
          Level 6 (36x)                .516
          Level 6 (36x)                .451
          Level 6 (106x)               .439
          Level 5 truncated (7776x)    .484
  Play 2) JF7 evaluation               .461
          Level 6 (36x)                .413
          Level 6 (36x)                .559
          Level 6 (106x)               .530
          Level 5 truncated (7776x)    .491
]
> I don't like the idea of truncated rollouts because they rely
> heavily on JF's evaluation of the position. If it is incorrectly
> evaluating the position then the results are not worth much. It doesn't
> seem to evaluate backgames well and the above position easily turns into
> one.
Well, it's true that truncated rollouts rely on JF's evaluations,
but most of the time, and for most positions, this isn't much of
a problem. This is because the errors in JF's evaluation will
in large part cancel out -- sometimes the evaluation will be too
high and other times too low. And JF evaluations are really pretty
good. Better than human evaluations, anyway. Some error may remain
if the game tends to develop into positions in which there is some
bias in JF's evaluation. By itself, this usually isn't too much
of a problem, because most positions tend to branch out into a wide
variety of types of positions, and the positions which don't, and
for which JF's evaluations are off, are often positions that you
can't trust JF with anyway. If you review the rollouts of Robertie's
_Advanced_Backgammon_, you can get a good feeling for the amount
of error that typically arises from using truncated rollouts.
For the position in question, there should be very little trouble
with using truncated JF rollouts. JF understands opening checker
play very well, and the game is likely to evolve into a wide
variety of different kinds of positions, so there should be
relatively little bias due to truncation. I disagree that this
position will "easily" become a backgame. The first player should
generally be very much trying to avoid this scenario, and will usually
succeed. Certainly, with JF at the helm, this will very rarely
become a backgame.
The main advantages of truncated rollouts are two:
1) they are faster, and
2) they have lower variance. That is, they converge toward the
"infinite rollout" equity with fewer trials, on average.
Item two just means that you need fewer trials to get your
answer, so the advantage of truncated rollouts comes down
to just one thing, which is that they are faster.
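As a rough illustration of item two, here is a minimal sketch of how
per-trial variance translates into the number of trials needed. It
assumes independent trials, so the standard error of the average falls
as 1/sqrt(n). The per-trial standard deviations are made-up numbers
chosen only to show the arithmetic; they are not JF measurements.

```python
import math

def standard_error(per_trial_sd, trials):
    """Standard error of the mean of `trials` independent trials."""
    return per_trial_sd / math.sqrt(trials)

def trials_needed(per_trial_sd, target_se):
    """Trials required to bring the standard error down to target_se."""
    return (per_trial_sd / target_se) ** 2

# Hypothetical per-trial standard deviations (illustration only).
FULL_SD = 1.0        # assumed spread of a complete rollout trial
TRUNCATED_SD = 0.5   # assumed spread of a truncated rollout trial

if __name__ == "__main__":
    for name, sd in (("full", FULL_SD), ("truncated", TRUNCATED_SD)):
        n = trials_needed(sd, target_se=0.05)
        print(f"{name}: about {n:.0f} trials for a standard error of 0.05")
```

Under these assumed numbers, halving the per-trial standard deviation
cuts the required trials by a factor of four.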
The disadvantage of truncated rollouts is that sometimes they
are biased. This is less of a problem in a checker play rollout
(which is also when speed is more of a concern), but very
important for cube rollouts. But the more significant disadvantage
to truncated rollouts is that JF does not give you "live cube"
figures with truncated rollouts, which it does with non-truncated
rollouts. This is obviously a problem when you are rolling out
a cube action problem, but also a factor in many checker play
problems (see, for example, Jeremy Bagai's excellent article in
the Jan-Feb Inside Backgammon, or the solution to Inside Backgammon
quiz problem #110). For these reasons, I almost always do
complete rollouts, but truncated rollouts are not as suspect
as you think.
> I'm new to this rollout business and am not making much out of
> the above results. I'm also starting to think that JF rollouts are
> way overrated. I studied the JF rollouts of Robertie's Advanced
> Backgammon and I find Robertie's logic far more convincing than the
> rollouts in the vast majority of the problems.
Well, my guess is that you're overrating Robertie's logic. The
fact is, most interesting backgammon problems cannot be tackled
by logic. Over the board, we reason as best we can, but ultimately
we are just guessing based on our experience. Robertie recognizes
this himself. A few years back he sharply criticized a problem
solution by Kleinman (which was based on reasoning from general
principles), and backed up his criticism with (hand) rollouts.
Robertie wrote that backgammon was not "an exercise
in deductive logic" but rather, at least for correctly analyzing
positions, an exercise in empirical science. Rollout data is
exactly what is needed to determine the correct play, most
of the time.
The fact is that many of Robertie's solutions are after-the-fact.
Long propositions were played, and Robertie learned the result
and saved the position. In his book, he justifies the solution
based on logic or reasoning or breaking down the rolls or
emphasizing one very important feature of the position. In doing
this, he is showing the reader how one might approach the problem
over the board, which is exactly what you want to know to play
better backgammon. But the important thing to realize is that
the empirical data came first, and the reasoning to point you to
the correct play is derivative. Kit Woolsey has also often
emphasized this point, by saying how he has learned a lot from
trying to figure out rollout results which at first seemed unintuitive.
Now, as to whether JF rollouts are overrated -- I guess it depends
on the person and the position. JF rollouts are a tremendous source
of empirical data for a wide variety of positions. But they do
have their limitations. First of all, any rollout is subject to
statistical variation. So when results come out very close, there
is very good reason to be skeptical about the results' significance.
JF gives the standard deviations of the rollouts it performs, so
this can be a guide for that.
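For readers who want to see where such a standard deviation comes
from, here is a minimal sketch that computes the average equity and
its standard error from a list of per-game results. The function name
and the sample data are hypothetical, and this is not a description of
how JF computes its own figures.

```python
import math
import statistics

def rollout_summary(outcomes):
    """Return (mean equity, standard error of the mean) for a rollout.

    `outcomes` holds one result per game, in points, e.g. +1 for a
    plain win, -2 for a gammon loss.
    """
    mean = statistics.fmean(outcomes)
    sample_sd = statistics.stdev(outcomes)       # spread of single games
    std_err = sample_sd / math.sqrt(len(outcomes))
    return mean, std_err

if __name__ == "__main__":
    # Made-up results for 12 games, for illustration only.
    games = [1, -1, 2, -1, 1, 1, -2, 1, -1, 1, -1, 2]
    mean, se = rollout_summary(games)
    print(f"equity {mean:+.3f} +/- {se:.3f} (1 standard error)")
```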
Secondly, any position can be misplayed. Putting aside for the
moment major thematic errors, small mistakes can be made favoring
one side or the other, and these small mistakes should add a little
more doubt to the significance of close results, even in positions
that we believe JF handles well.
Now, turning to the question of thematic errors, it's well documented
that JF has a few of these. Here are the ones that come to mind
right now:
- JF gets low results with outside primes -- the further outside,
the more irrelevant the results. JF doesn't completely understand
how to walk a prime home against a single trapped checker.
- JF doesn't understand well how and when to try for a second checker
after a bearoff hit.
- JF gets high results in many backgames. However, I think this
bias has been overemphasized. In backgames nearing resolution,
where the timing issue has been resolved, as is the case in many
forward (e.g., 34 or 45) backgames, JF's results are not that
far off. In these cases, JF may give up a little due to having
to walk its prime home after a hit, but probably not much. In
deeper backgames, JF gives up more because capturing a second checker
may be a significant consideration. Also, JF doesn't always
understand when to split its rear checkers to generate more shots.
In positions where the timing issue is not yet resolved, or where
there is still significant forward equity, as in a two-way game,
JF *may* give up significant equity because it often will avoid
the backgame strategy that a human would choose. I emphasize may
because I think JF is often right in avoiding the backgame, and that
human players are often wrong about this. JF probably gives up
the most in well-timed deep backgames where the leader is still
a long ways from the bearin.
- JF can get weird results in noncontact positions. This problem
has probably been reduced by the bearoff database in JF2.0, but
JF still isn't the best tool for these kinds of positions.
- JF gets low results for many priming positions against one back,
even when the prime is deep in the board. This is especially
true when slotting the back of the prime is important. JF very
often doesn't do this when it is correct.
- Wilcox Snellings thinks JF gets high results vs deep anchor
games, especially vs ace point games. I don't know whether this
is true or not, but it's plausible. Part of the equity of acepoint
games comes from capturing a second checker after a late hit.
- JF can get results that are off in what I call "runaround" positions.
These are positions where one side is trying to navigate the last
few checkers around the opposition. An example is: side A has
4 checkers each on the 1, 2, and 3 points, and 1 checker each on
the 4, 17 and 18 points; side B has a closed board, and 1 checker each
on the 18, 19, and 20 points. JF doesn't count shots, so sometimes
it makes significant checker play errors when rolling these positions
out.
- JF gets low results in bearoff hit positions in which there is
a lot of play. For example,
X O O . X . | | . . . . X . [2]
O O | |
O O | |
O O | |
| |
X X | | X X X
. . O X O X | | X X X . X X
X's home board. O has 5 off
With O owning a 2-cube, X's equity is about 0.70. JF gets
.261 cubeless, .345 after doubling to 2 (3888 trials).
Interestingly, humans tend to overrate the value of these
positions.
- Many purely technical decisions are less amenable to rollouts,
whether by JF or humans. This is especially true if the technical
decision tends to repeat itself.
- Because of the way JF uses the cube in live cube rollouts, sometimes
its cube numbers are way off. A common example is a position where
the trailer has a busted board, one checker back at the edge of a
five prime, and the leader has checkers back in the trailer's home
board. In this situation, the trailer may leap the prime and obtain
a double-in (which JF doesn't recognize), only to obtain a huge
cash one roll later. In general, if the trailer has only one common
recube variation, and this variation yields mostly weak doubles-in,
JF's live cube algorithm will not give accurate results.
- Another common live cube error ends up with the cube owner doing
*worse* owning the cube. Apparently when this happens JF has
erroneously played on for the gammon some of the time.
So yes, JF rollouts cannot be trusted implicitly. However, for
most positions JF rollouts are the best source for equities, and
considered carefully, the best tool for improving your game.
An interesting corollary to the fact that JF misplays the above
situations is that JF plays other types of positions *better*
than a human of overall equivalent strength would. This shows
up most prominently in the play of attacking positions, where
JF frequently gets results that are higher than humans get.
> I would appreciate some advice from those more experienced
> with rollouts as to how better utilize the program. What parameters
> work best for the above rollout?
Here's my advice:
- Always do rollouts in multiples of 36 (unlike your 106-game rollout)
and in multiples of 1296 if doing level 5 rollouts.
- If you have time and a fast enough computer, do complete rollouts.
This way you avoid any bias and get the cube numbers as well.
- When doing checker play rollouts, set the seed the same for all
the plays under consideration.
- Don't regard checker play results that are within 2 standard
deviations as anything significant. If you don't want to bother
to look at the standard deviations, as a rule of thumb, consider
differences of .10 significant for rollouts of 1296, .07 for
2592, .06 for 3888.
- There are decreasing returns as you roll positions out more times.
You will reduce the standard deviation, but if the equities are
still close, the errors in checker play are probably more significant
than the random error. I usually don't roll plays out more than
3888 times.
- When rolling out checker plays, go ahead and roll out all those
plays that fit the themes of the position, even if you don't think
they are candidate plays. Occasionally one of these plays you
didn't like will actually turn out to be best, and you'll learn
something. If you're short on time, do short truncated
rollouts first to identify the real candidates.
- Look at both the cube numbers and the cubeless numbers.
- If you really want to understand what's going on in a cube action
situation, roll out several variations of the position, so that
you can see how they affect the equity. Use the same seed for
all of these rollouts.
- DON'T just believe the rollout results as though they came
from on high. But try to understand the sorts of positions
where the results are off, and why, so that you can know
when you can trust JF and when you should be skeptical, and
the probable direction of the error.
- If you suspect that the rollout is biased, you can look at how
JF plays the first numbers, or set up a few important variations
to see how it plays those. You may find that with level 6 it
does a better job, in which case use that. If it still seems
to be playing the position wrong on level 6, use the interactive
rollout feature. One approach would be to play it 36x with
you playing one side, JF level 6 the other, and then another
36x with you playing the other side and JF level 6 the first.
If you're right that JF is screwing the position up (and you aren't
misplaying it yourself), you'll see it in the results.
- Be careful about interpreting the rollout results for a
particular match score. JF does all its rollouts based on
choosing the best cubeless plays, with gammons and backgammons
counting (and counting equally for both sides), so the results
may not be valid in a match situation. For many match scores,
there is no satisfactory way to set the JF cashing parameter
to give a reasonable match live cube rollout, so you are better
off interpreting the cubeless numbers.
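A few of the numeric points in the list above can be checked with a
short sketch. It assumes the rule-of-thumb thresholds scale as
1/sqrt(trials) from the .10 figure at 1296 trials, and it reads "within
2 standard deviations" as comparing the difference between two plays
against twice their combined standard error. Both are interpretations,
and all function names are hypothetical.

```python
import math

BASE_TRIALS = 1296
BASE_THRESHOLD = 0.10   # rule-of-thumb figure for 1296 trials

def rule_of_thumb_threshold(trials):
    """Scale the 1296-trial threshold by 1/sqrt(trials) (assumed)."""
    return BASE_THRESHOLD * math.sqrt(BASE_TRIALS / trials)

def is_significant(eq_a, se_a, eq_b, se_b, sigmas=2.0):
    """True if two equities differ by more than `sigmas` combined
    standard errors. Treats the rollouts as independent, which is
    conservative when both plays were rolled out with the same seed."""
    return abs(eq_a - eq_b) > sigmas * math.hypot(se_a, se_b)

def valid_trial_count(trials, level5=False):
    """Multiples of 36, or of 1296 for level 5 rollouts."""
    return trials % (1296 if level5 else 36) == 0

if __name__ == "__main__":
    for n in (1296, 2592, 3888):
        print(n, round(rule_of_thumb_threshold(n), 2))
    print(valid_trial_count(106))                # the 106-game rollout
    print(valid_trial_count(7776, level5=True))  # the level 5 rollout
```

The scaled thresholds round to .07 at 2592 trials and .06 at 3888, in
line with the figures quoted in the list.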
My experience is mostly with using JF level 5 rollouts, but it
may well be better to use JF level 6 by default. It certainly
plays better on level 6, and with JF's variance reduction algorithm,
it's not a lot slower, effectively.
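The earlier advice about using the same seed for all the plays under
consideration can be illustrated with a toy model. This is only a
sketch: it assumes each game's result is the play's true edge, plus
dice luck shared by every play rolled out with the same seed, plus
some noise specific to the play. The model, its weights, and the
function names are invented for illustration and say nothing about
JF's internals.

```python
import random
import statistics

def toy_rollout(label, play_edge, trials, seed):
    """Average result of `trials` toy games for one candidate play."""
    dice = random.Random(seed)               # luck shared across plays
    own = random.Random(f"{seed}-{label}")   # luck specific to this play
    total = 0.0
    for _ in range(trials):
        total += play_edge + dice.gauss(0.0, 1.0) + 0.3 * own.gauss(0.0, 1.0)
    return total / trials

def difference_spread(shared_seed, repeats=200, trials=36):
    """Spread of (play A - play B) over many repeated small rollouts."""
    diffs = []
    for r in range(repeats):
        seed_b = r if shared_seed else r + 10_000
        a = toy_rollout("A", 0.05, trials, r)
        b = toy_rollout("B", 0.00, trials, seed_b)
        diffs.append(a - b)
    return statistics.stdev(diffs)

if __name__ == "__main__":
    print("same seed:      ", round(difference_spread(True), 3))
    print("different seeds:", round(difference_spread(False), 3))
```

In this toy model the difference between the two plays is much less
noisy when both share a seed, because the shared luck cancels out of
the comparison.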
Hope this is of some use to you,
David Montgomery
monty on FIBS