GNU Backgammon

Even-ply/odd-ply effect

From:   Tom Keith
Address:   tom@bkgm.com
Date:   17 October 2003
Subject:   GnuBG: Fractional-ply evaluators
Forum:   rec.games.backgammon
Google:   b650e130.0310171628.2bc5fafe@posting.google.com

Let me describe an experiment I did comparing zero-ply and one-ply
evaluations in GnuBG.

When GnuBG evaluates a position, you can tell it how far ahead you
want it to look.  A zero-ply evaluation does no lookahead -- you just
get the output of the program's neural net.  A one-ply evaluation
looks ahead one roll:  it looks at all 21 possible rolls, makes what
it believes is the best play for each, and takes a weighted average of
the resulting positions.  Each additional ply of lookahead takes about
21 times as long as the previous level.
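
A minimal Python sketch of the one-ply averaging described above.  The
callback `value_after_best_play` is hypothetical (it is not a GnuBG
function); it is assumed to return the on-roll player's winning chance
after the best play for a given roll.  The weights follow from the
dice: each double occurs 1 time in 36, each non-double 2 times in 36.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def distinct_rolls():
    """Return the 21 distinct rolls, each paired with its probability."""
    rolls = []
    for a, b in combinations_with_replacement(range(1, 7), 2):
        # A double (a == b) can be thrown one way; any other roll two ways.
        weight = Fraction(1, 36) if a == b else Fraction(2, 36)
        rolls.append(((a, b), weight))
    return rolls


def one_ply(position, value_after_best_play):
    """Weighted average, over all 21 rolls, of the value after the best play.

    value_after_best_play(position, roll) is a hypothetical callback
    standing in for "pick the best play and evaluate the result".
    """
    return sum(
        weight * value_after_best_play(position, roll)
        for roll, weight in distinct_rolls()
    )


def relative_cost(plies):
    """Rough cost of an n-ply evaluation relative to zero-ply (about 21**n)."""
    return 21 ** plies
```

The cost function only restates the "about 21 times as long" estimate;
real timings depend on caching and move filters.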

When you do rollouts in GnuBG, one of the parameters you can set is
what level of evaluation to use for checker plays.  Presumably one-ply
evaluation plays better than zero-ply, and two-ply plays better than
one-ply, etc.  However, there has been some discussion over the years
about whether odd-ply evaluations are as reliable as even-ply.  (See
http://www.bkgm.com/rgb/rgb.cgi?view+1061).

I thought I'd try an experiment comparing zero-ply and one-ply
evaluations.  Here's what I did:

1.  I collected a large number of backgammon games between good players,
    some human-vs-human, some human-vs-computer.  From these I took
    a representative sample of positions. (However, duplicate positions
    were deleted, so early-game positions are under-represented.)

2.  I rolled out each position to the end of the game thirty-six times
    using cubeless zero-ply evaluation. Variance reduction was applied.

3.  I took the root-mean-square average of the differences between
    GnuBG's zero-ply evaluation and the rollout results, and between
    GnuBG's one-ply evaluation and the rollout results.  I looked
    only at game-winning chances; I didn't look at gammons or
    backgammons.
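
The error measure in step 3 can be sketched in Python as follows.  The
function and argument names are illustrative assumptions; the inputs
are taken to be game-winning chances expressed as fractions between 0
and 1, one pair per position.

```python
import math


def rms_error(estimates, rollout_results):
    """Root-mean-square difference between evaluations and rollout results."""
    if len(estimates) != len(rollout_results) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, rollout_results)]
    return math.sqrt(sum(squared) / len(squared))
```

Because the differences are squared before averaging, a few large
misses raise this figure more than many small ones.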

These are the results:

    Zero-ply evaluation:  Average error = 0.0300
    One-ply evaluation:   Average error = 0.0284

So one-ply evaluation does do better on average.  This is to be
expected; being able to look ahead one ply should be a help,
especially in volatile positions.

In certain games GnuBG's evaluation seems to oscillate back and forth
according to which side's turn it is to play.  When this happens, a
one-ply evaluation (which essentially looks at the game from the other
player's side) can give quite different numbers than a zero-ply
evaluation.  You might expect that, when zero-ply and one-ply
evaluations differ by a lot, the true value of the position is probably
somewhere in between.  I thought it would be interesting to see what
would happen if you had an evaluator that used the average of zero-ply
and one-ply.  I called this a "0.5-ply evaluation."
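
As a sketch, the 0.5-ply evaluator is just the midpoint of the two
evaluations.  The function name is an assumption made for illustration;
both inputs are winning chances for the same player.

```python
def half_ply(zero_ply_value, one_ply_value):
    """Average the zero-ply and one-ply estimates of the same position."""
    return 0.5 * (zero_ply_value + one_ply_value)
```

When the two evaluations err in opposite directions, their errors
partly cancel in the average, which is the effect the idea relies on.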

    0.5-ply evaluation:  Average error = 0.0245

So 0.5-ply does do better!  In fact, it does enough better to make you
wonder if it does even better than two-ply. (I didn't look into this.)

Can we do even better?  Something I noticed is that you can often
predict whether zero-ply or one-ply is better for a particular
position by looking at the relative pipcount.  (The relative pipcount
is your own pipcount minus your opponent's pipcount.)  When the
relative pipcount is between -160 and -40, one-ply usually does
better; when the relative pipcount is between 40 and 150, zero-ply
usually does better.  Let's call an evaluator based on this idea a
"hybrid evaluator."  How well does the hybrid evaluator perform?

    Hybrid evaluator:  Average error = 0.0225
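
One way the hybrid rule might look in Python.  The pipcount ranges come
from the description above; two things are assumptions, since the post
does not specify them: the range endpoints are treated as inclusive,
and positions outside both ranges fall back to the 0.5-ply average.

```python
def hybrid_eval(zero_ply_value, one_ply_value, relative_pipcount):
    """Choose an evaluation based on the relative pipcount.

    relative_pipcount is your own pipcount minus your opponent's.
    """
    if -160 <= relative_pipcount <= -40:
        return one_ply_value   # one-ply usually does better here
    if 40 <= relative_pipcount <= 150:
        return zero_ply_value  # zero-ply usually does better here
    # Assumption: outside both ranges, use the average of the two.
    return 0.5 * (zero_ply_value + one_ply_value)
```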

It should be noted that these tests show how well GnuBG performs at
computing the ABSOLUTE equity of a position.  They may or may not
indicate an improvement in GnuBG's ability to *play* a position, since
playing depends on having accurate RELATIVE equities.  Nevertheless,
I'm guessing that the 0.5-ply and hybrid evaluators play better than
the integer-ply evaluators too.

Tom

Tom Keith  writes:

To follow up on my own post ...

It has been pointed out by some (most notably Robert-Jan Veldhuizen),
that GnuBG seems to handle cube decisions better at zero-ply than at
one-ply.

To test this, I selected positions from actual games in which the
player-on-roll's game-winning chances + gammons/2 was between 70% and
80%. In other words, I was trying to select positions in which a player
might be thinking about doubling, or his opponent might have to think
about whether to take or drop.
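
The selection rule can be sketched in Python as below.  The names are
illustrative, and it is an assumption that "gammons" means the
on-roll player's gammon-winning chance and that the 70% and 80% bounds
are inclusive.

```python
def is_cube_likely(win_prob, gammon_prob, low=0.70, high=0.80):
    """True if wins + gammons/2 for the player on roll lies in [low, high]."""
    score = win_prob + gammon_prob / 2
    return low <= score <= high
```

For example, `is_cube_likely(0.70, 0.10)` gives a score of 0.75, so
that position would be kept.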

Comparing 0-ply evaluations with untruncated rollouts gave a standard
error of 0.0235.
Comparing 1-ply evaluations with untruncated rollouts gave a standard
error of 0.0288.

So 0-ply does do significantly better in cube-likely positions.  This
despite the fact that 1-ply does better at estimating absolute equity
in general.

Comparing the hybrid evaluator described in my previous post with the
untruncated rollouts gave a standard error of 0.0207.  So the hybrid
evaluator still does better than zero-ply, even on positions that
zero-ply seems to be particularly good at.

Tom Keith
Backgammon Galore!
http://www.bkgm.com
 