Programming

Forum Archive : Programming

 
Writing a backgammon program

From:   Gary Wong
Address:   gary@cs.arizona.edu
Date:   20 January 1999
Subject:   Re: Writing a bg program
Forum:   rec.games.backgammon
Google:   wtemopth5j.fsf@brigantine.CS.Arizona.EDU

Whigdon2 writes:
> My intuition tells me to start at the end, and work backwards, i.e.
> develop a program that plays a near perfect end game strategy, and
> is knowledgeable with the cube, and then work my way backwards.  I
> think that I would like the program to make decisions based upon
> it's own simulations, where many of the moves are based upon prior
> simulations.  I realize that this may be a bit idealistic of me, and
> that I am likely to face some memory constraints.

I believe this approach is feasible; I am working along similar lines
myself.  The best finished player I have is only of intermediate
strength (it plays as Costello on FIBS, rated around 1600 +/- 50) but
I have bits and pieces of a newer player which I expect will be
significantly stronger.  Using strong evaluations of subsequent
positions to train evaluators of positions early in the game appears
to work well in practice.  In some sense, TD training can be
considered as doing this; a more explicit approach I am investigating
is to break the set of all positions into fine-grained classes, use a
distinct evaluator for each class, establish a partial ordering on
those classes, roll out a large number of positions in those classes
according to the partial ordering, and use supervised training to
obtain the evaluators of each other class in turn.

"Knowledgeable with the cube" may or may not be important: the
standard heuristics for deriving cube behaviour given cubeless
evaluations seem to work adequately if you are careful.  Ideally it
would be nice to have a genuinely cubeful evaluator, but at present it
doesn't appear to be worth the headaches (ie. correcting the absolute
error in current evaluators still appears to be more fruitful than
making those evaluators cubeful).

Memory constraints have not been a problem in my experience: for
bearoff classes I use a couple of databases mmap()ed into memory,
though this could just as easily be read off disk as required; for
most other classes I use neural net evaluators which need of the order
of 100KB each.

> I am hoping to borrow from the experiences of others.  Please share your
> references,  ideas, or suggestions.

References:
-----------
  There isn't a huge amount in the academic literature on backgammon
  playing programs (compared to chess or othello for instance), but
  it would be worth your time tracking down papers by Tesauro (for
  applying TD training to backgammon) and Berliner (general ideas
  about what metrics to derive from positions).

  There have been many articles posted here which are collectively
  invaluable; search on Deja News or look for archived posts at:

    http://www.bkgm.com/rgb/rgb.cgi?menu+programming

  More references on the Web are available at:

    http://forum.swarthmore.edu/~jay/learn-game/systems/gammon.html

  Some of my code can be grabbed from:

    ftp://ftp.cs.arizona.edu/people/gary/

    (still very badly organised, sorry -- may or may not compile depending
    when you look at it :-)

Ideas:
------
  In the long term I want to take advantage of the fact that supervised
  training is highly parallelisable (in that almost all of the time is
  spent acquiring accurate training data by rolling out positions; once
  the training data is there it only takes a few hours to train a net
  from it).  So far I have been generating all training data myself, but
  once I am able to release a player strong enough to generate
  accurate rollout data to anybody who wants it, I want to collect the
  results from those who can spare the CPU time to roll out what they
  can; distribute subsequent evaluators trained on that data; use them
  to generate subsequent results, etc. etc.

Suggestions:
------------
  If you want to get started quickly it might be worth taking a look at
  my code for administrative fluff like move generation, bearoff
  databases, generic neural net code etc. so you can spend more time on
  interesting problems.

  Grab a few free players (eg. xgammon, Tesauro's pubeval, mine) and
  use them to benchmark your player against.

  If using a neural net evaluator, the approach I take to training (which
  may be a long way from optimal, but it's very slow trying different
  things to see what works best!) is to start with random weights; use
  TD training until the network stops improving; then generate LOTS
  (hundreds of thousands) of rollouts and start supervised training on
  those.  I find it very hard to manage the quantity/quality tradeoff.
  My approach so far has been not to go overboard on the number of
  trials per position: estimate the standard error you'd expect your
  evaluator to have after training and don't bother reducing the
  standard deviation of your rollouts too far below that; I suspect
  you'd be better spending your time rolling out more varied positions.
  In the past I have used no lookahead in the rollouts just so I could
  get a large amount of data in reasonable time; now I'm looking at
  using 1-ply lookahead with variance reduction, mainly to lower the
  _systematic_ error in the data.  Neural nets are very good at
  ignoring statistical noise in training data, but are hurt by
  systematic error.  My rough guess is that a standard deviation
  of ~0.01 from 36 or 72 rollouts per position is more than sufficient.

I hope some of the above has been useful -- yell out of you want more
detail in any area.

Cheers,
Gary.
--
        Gary Wong, Department of Computer Science, University of Arizona
            gary@cs.arizona.edu     http://www.cs.arizona.edu/~gary/
 
Did you find the information in this article useful?          

Do you have any comments you'd like to add?     

 

Programming

Adjusting to a weaker opponent  (Brian Sheppard, July 1997) 
Anticomputer positions  (Bill Taylor+, June 1998) 
BKG 9.8 vs. Villa  (Raccoon+, Aug 2006) 
BKG 9.8 vs. Villa  (Andreas Schneider, June 1992) 
BKG beats world champion  (Marty Storer, Sept 1991) 
Backgames  (David Montgomery+, June 1998)  [Long message]
Blockading feature  (Sam Pottle+, Feb 1999)  [Long message]
Board encoding for neural network  (Brian Sheppard, Feb 1997) 
Bot weaknesses  (Douglas Zare, Mar 2003) 
Building and training a neural-net player  (Brian Sheppard, Aug 1998) 
How to count plies?  (Chuck Bower+, Jan 2004)  [GammOnLine forum]
How to count plies?  (tanglebear+, Mar 2003) 
Ideas for improving computer play  (David Montgomery, Feb 1994) 
Ideas on computer players  (Brian Sheppard, Feb 1997) 
Introduction  (Gareth McCaughan, Oct 1994) 
Measuring Difficulty  (John Robson+, Feb 2005)  [GammOnLine forum]
Methods of encoding positions  (Gary Wong, Jan 2001) 
N-ply algorithm  (eXtreme Gammon, Jan 2011) 
Neural net questions  (Brian Sheppard, Mar 1999) 
Pruning the list of moves  (David Montgomery+, Feb 1994) 
Search in Trees with Chance Nodes  (Thomas Hauk, Feb 2004) 
Source code  (Gary Wong, Dec 1999) 
TD-Gammon vs. Robertie  (David Escoffery, June 1992) 
Training for different gammon values  (Gerry Tesauro, Feb 1996) 
Training neural nets  (Walter Trice, Nov 2000) 
Variance reduction in races  (David Montgomery+, Dec 1998)  [Long message]
Variance reduction of rollouts  (Michael J. Zehr+, Aug 1998)  [Long message]
Variance reduction of rollouts  (Jim Williams, June 1997) 
What is a "neural net"?  (Gary Wong, Oct 1998) 
Writing a backgammon program  (Gary Wong, Jan 1999) 

[GammOnLine forum]  From GammOnLine       [Long message]  Long message       [Recommended reading]  Recommended reading       [Recent addition]  Recent addition
 

  Book Suggestions
Books
Cheating
Chouettes
Computer Dice
Cube Handling
Cube Handling in Races
Equipment
Etiquette
Extreme Gammon
Fun and frustration
GNU Backgammon
History
Jellyfish
Learning
Luck versus Skill
Magazines & E-zines
Match Archives
Match Equities
Match Play
Match Play at 2-away/2-away
Miscellaneous
Opening Rolls
Pip Counting
Play Sites
Probability and Statistics
Programming
Propositions
Puzzles
Ratings
Rollouts
Rules
Rulings
Snowie
Software
Source Code
Strategy--Backgames
Strategy--Bearing Off
Strategy--Checker play
Terminology
Theory
Tournaments
Uncategorized
Variations

 

Return to:  Backgammon Galore : Forum Archive Main Page