The March-April (Volume 2, Number 2) and May-June (Volume 2, Number 3)
issues of "Inside Backgammon" featured a two-part article by two-time world
champion Bill Robertie, in which he reports on the state of backgammon
computers and annotates his match against Dr. Gerald Tesauro's program,
TD-Gammon. I trust that Kent will not mind my reproducing some of
Robertie's article for the benefit of those on the net.
Dr. Gerald Tesauro is employed at the IBM Research Labs in White
Plains, New York. His latest backgammon program is called TD-Gammon and is
based on neural network theory. TD-Gammon runs on an RS-6000 workstation.
Initially, it was programmed only with the rules of backgammon and the
ability to generate legal moves. No knowledge was programmed into it as to
what constituted a good or a bad move. It "knew" nothing about making
points, hitting blots, or anything else. Over the course of several months,
it played 300,000 games against itself. At first, it picked its moves at
random from the list of legal moves. After each game, a logic routine made
informed guesses as to which moves in that game may have been errors, based
upon a sophisticated mathematical theory of learning. The program's
positional evaluator routine continuously modified itself, based upon its
results in previous games.
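
To give a feel for how an evaluator can improve purely from its own play,
here is a minimal sketch of temporal-difference learning, the method the
"TD" in TD-Gammon refers to. It is not Tesauro's code. Everything in it is
an assumption made for illustration: the "game" is a toy random walk
rather than backgammon, the evaluator is a plain table rather than a neural
network, and the function name and parameters are hypothetical.

```python
import random

def td0_random_walk(n_states=7, episodes=5000, alpha=0.05, seed=0):
    """Learn win probabilities for a toy random-walk 'game' with TD(0).

    Toy stand-in only: positions are numbered 0..n_states-1, position 0 is
    a loss, the last position is a win, and every move is a random step.
    """
    rng = random.Random(seed)
    # values[s] is the current estimate of the chance of winning from s.
    # Non-terminal positions start with an uninformed guess of 0.5.
    values = [0.5] * n_states
    values[0], values[-1] = 0.0, 1.0  # terminal outcomes are known
    for _ in range(episodes):
        s = n_states // 2  # every game starts in the middle
        while 0 < s < n_states - 1:
            s_next = s + rng.choice((-1, 1))
            # Nudge the estimate for s toward the estimate for the
            # position that actually followed it.
            values[s] += alpha * (values[s_next] - values[s])
            s = s_next
    return values

if __name__ == "__main__":
    for position, estimate in enumerate(td0_random_walk()):
        print(position, round(estimate, 2))
```

In this toy the true chance of winning from position s is s/6, and the
learned estimates settle close to those values without the program ever
being told them, which is the flavor of what Robertie describes.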
The article and the games are interesting, as Robertie annotates and
analyzes each move. At the end he summarizes:
"TD-Gammon and I played most of a day, a total of 31 games. I won 19
points, an average of 0.61 points per game. On balance, I was lucky. Game
16, as you saw, could easily have gone the other way at the end, a 16-point
swing by itself [Robertie had to throw a double on his last roll to win].
My estimate, after reviewing the entire session, was that I would do well
to average 0.20 to 0.25 points per game. This figure makes TD-Gammon the
strongest backgammon program in existence, most likely better than
Berliner's program of 13 years ago, although that's no longer available for
comparison.
"Not only is TD-Gammon interesting as a backgammon program, it represents
an astonishing achievement for the neural network approach to artificial
intelligence. Remember that this program has no human knowledge built into
it. Everything it "knows", it deduced by playing against itself, then
improving by applying sophisticated mathematical learning algorithms to the
results of its games.
"Just before going to press, we received word that Malcolm Davis and Paul
Magriel made journeys up to White Plains to match wits with TD-Gammon.
Malcolm Davis broke even in 12 games, although TD-Gammon won 8 out of 12.
Paul Magriel got backgammoned while playing a favorable back game and ended
up negative for the session."
Robertie relates one anecdote that I found interesting. He notes that, even
after its 300,000 games of experience, TD-Gammon has not "learned" to slot
the 5-point with an opening 4-1; it regards the split on an opening 2-1,
4-1, or 5-1 as superior to the slot. After rolling out the opening position
1000 times, the program finds that while 13/9 24/23 makes it exactly even
money, slotting the 5-point leaves it an underdog by 0.05 points.
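
For anyone unfamiliar with rollouts, the idea is simply to play a position
out many times and average the results. The sketch below shows that
averaging step along with the standard error of the estimate. It is an
illustration under stated assumptions, not the procedure TD-Gammon uses:
`rollout` and `coin_flip_game` are hypothetical names, and the placeholder
game just awards +1 or -1 point at random instead of playing backgammon.

```python
import random
import statistics

def rollout(play_out, trials=1000, seed=0):
    """Estimate a position's value by averaging many simulated games.

    play_out(rng) must return the points won (negative if lost) in one
    simulated game from the position being studied.
    """
    rng = random.Random(seed)
    results = [play_out(rng) for _ in range(trials)]
    mean = statistics.mean(results)
    # Standard error of the mean: sample spread shrunk by sqrt(trials).
    std_err = statistics.stdev(results) / trials ** 0.5
    return mean, std_err

def coin_flip_game(rng):
    """Placeholder for a real play-out: a dead-even 1-point game."""
    return rng.choice((1, -1))

if __name__ == "__main__":
    mean, std_err = rollout(coin_flip_game)
    print("estimated value:", mean, "standard error:", std_err)
```

Even in this simplified stand-in, 1000 trials leave a standard error of
roughly 0.03 points, and real games that can end in gammons and backgammons
would vary more, so a 0.05-point gap from a rollout of this size is
probably best read as suggestive rather than conclusive.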
===========================================================================
David Escoffery Tel: (408) 427-7718
The Santa Cruz Operation Internet: davide@sco.COM
P.O. Box 1900
Santa Cruz, CA 95061
===========================================================================