Forum Archive: Programming

Training for different gammon values
Gerry Tesauro, February 1996

Bob Koca wrote:
> On a related note, is this a good idea for neural nets?
>
> Train separately for double match point and for one-way gammon
> positions. It seems like it should not be too much extra effort,
> since previously obtained weights for money play could be used
> as the starting point and the differences will not be huge.
> I think the gain in playing ability near the end of a match
> could be substantial.
This seems like a potentially good idea, because the regular
neural net trained for money play will give you win and gammon
estimates *assuming money-type plays are made* by both sides.
Thus it's conceivable that a win estimator trained under double
match point conditions, where gammons don't count, might differ
from the win estimator that comes out of money-condition training.
However, this appears not to be the case empirically. I once
trained a double match point version of TD-Gammon, and compared
it to regular TD with the gammon estimates turned off.
There seemed to be no measurable advantage in playing ability
of the DMP version, at least at the 1-ply level. One-way
gammons I don't know about, but I'd have to guess that you
get reasonably correct play just by taking the regular
money-play net and re-weighting the win and gammon outputs
appropriately. Perhaps Fredrik or Harald could contribute
further to the discussion.
-- Gerry Tesauro (tesauro@watson.ibm.com)
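
To make the re-weighting idea above concrete, here is a small sketch,
again not from the original post: a five-output evaluator with
cumulative probabilities is assumed (p_win includes gammon and
backgammon wins, p_win_g includes backgammon wins, and likewise for
losses), and the function names are illustrative. The point is only
that the same money-play outputs are combined with score-dependent
gammon values, which at double match point all drop to zero.

# Re-weighting one money-play net's outputs for different gammon values.

def equity(probs, g_win=1.0, bg_win=1.0, g_lose=1.0, bg_lose=1.0):
    """Cubeless equity with adjustable extra values for gammons.

    g_win is the extra value of winning a gammon beyond a plain win;
    bg_win is the extra value of a backgammon beyond a gammon (both 1
    for money play), and likewise for losses.
    """
    p_win, p_win_g, p_win_bg, p_lose_g, p_lose_bg = probs
    p_lose = 1.0 - p_win
    return ((p_win - p_lose)
            + g_win * p_win_g + bg_win * p_win_bg
            - g_lose * p_lose_g - bg_lose * p_lose_bg)

def dmp_equity(probs):
    """Double match point: only the win probability matters."""
    return equity(probs, g_win=0.0, bg_win=0.0, g_lose=0.0, bg_lose=0.0)

def one_way_gammon_equity(probs, my_gammon_value):
    """'One-way gammon' score: our gammons carry extra match value
    (expressed in units of a plain win), the opponent's carry none."""
    return equity(probs, g_win=my_gammon_value, bg_win=0.0,
                  g_lose=0.0, bg_lose=0.0)

if __name__ == "__main__":
    # Made-up outputs for some hypothetical middle-game position.
    probs = (0.55, 0.15, 0.01, 0.10, 0.005)
    print("money equity:", equity(probs))                       # 0.155
    print("DMP equity:  ", dmp_equity(probs))                   # 0.100
    print("gammon-go:   ", one_way_gammon_equity(probs, 0.8))   # 0.220

Move selection would then rank candidate plays by whichever of these
functions matches the match score, leaving the money-play network
itself untouched.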