What is a "neural net"?

From:   Gary Wong
Address:   gary@cs.arizona.edu
Date:   7 October 1998
Subject:   Re: neural-nets and backgammon
Forum:   rec.games.backgammon
Google:   wtemskcntt.fsf@brigantine.CS.Arizona.EDU

> Can anyone explain me how neural-nets work?

A neural net is essentially a collection of connected neurons, where
each neuron is a simple dedicated computing element.  Each neuron can
have many inputs but only a single output (analogous to cellular
automata).  Actually, a better analogy is probably cells in a
spreadsheet -- each neuron has a "formula" based on the values in
lots of other cells, and produces a value which can be used by
subsequent cells.

In some areas it is fashionable to model neural computation on the
activity of biological brain cells, although the neural nets that have
become useful in practical terms have a shaky resemblance at best to
`real' neural nets (in particular, backpropagation of errors is
utterly implausible in biological terms).  Since these `real' neural
nets are an important but distinct topic, the kind we're talking about
are more properly known as artificial neural nets.  A former colleague
of mine from New Zealand (an expert on ANNs) disliked the term
"neural net" and preferred to call them "superinterpolators".

How they work (as opposed to what they are) is a little bit much to go
into here, but there's a huge amount of information available.  Look at
Sumeet's Neural Network Page (Sumeet Gupta) at:
  http://www.geocities.com/CapeCanaveral/Lab/3765/neural.html
for more links than you ever wanted to follow :-)

To go into a bit more detail on the spreadsheet analogy: suppose you
want to construct a backgammon position-evaluating neural net in a
spreadsheet.  One possible layout is this: enter all kinds of
information about the position into column A (for instance, A1 might
be the number of chequers X has borne off; A2 is the number of
chequers O has on the bar; A3 is a measure of how strong O's prime is,
etc).  In ANN terms, column A is the "input layer"; a typical
backgammon evaluator might use hundreds of cells.  Now, create some
cells in column B which all contain a similar formula: each one is
f( w . A ), where f() is some non-linear function, w is some weight
vector, and A is the vector of cells in column A.  The "." is a vector
dot product.  Ignoring the details for now, let's just assume that
column B contains dozens of these cells, each with the same formula
f() and vector A, but with different weights w.  The important thing
to notice is that there will be many cells in this column, each
producing a different value somehow based on the values in column A.
In ANN terms again, column B is the "hidden layer" and each cell in it
is a "hidden neuron".  Lastly, create a few cells in column C, each
with the formula f( w . B ) (again, a non-linear function of the dot
product of some weight vector with the vector made up of all the cells
in column B).  These cells are given meanings by the designer; for
instance cell C1 might be the estimated probability that O wins, C2
the probability that O wins a gammon, etc.  This last column is
known as the "output layer", and its values form the output vector.
If we somehow magically
set all those "w" weight vectors in the network to appropriate values,
then we can create a net that will give reasonably accurate output
values in column C corresponding to the position we enter in column A.
What we have created in our spreadsheet is an ordinary multi-layer
perceptron (MLP) with a single hidden layer (column B).
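
Here's a rough sketch of that spreadsheet net in Python (the layer
sizes are invented, and I'm assuming the common sigmoid for f()):

  import numpy as np

  def sigmoid(x):
      # One popular choice for the non-linear function f().
      return 1.0 / (1.0 + np.exp(-x))

  def evaluate(position, w_hidden, w_output):
      # position  -- "column A": the vector of input features
      # w_hidden  -- one weight vector w per "column B" cell
      # w_output  -- one weight vector w per "column C" cell
      hidden = sigmoid(w_hidden @ position)  # column B: f( w . A )
      output = sigmoid(w_output @ hidden)    # column C: f( w . B )
      return output  # e.g. [ P(O wins), P(O wins gammon), ... ]

  # Invented sizes: 200 inputs, 40 hidden neurons, 5 outputs.
  rng = np.random.default_rng(0)
  w_hidden = rng.normal(size=(40, 200))
  w_output = rng.normal(size=(5, 40))
  print(evaluate(rng.random(200), w_hidden, w_output))

With random weights the output is junk, of course -- which brings us
to the training problem.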

Magically calculating the weights is the hard part.  If we start with
completely random weights, then our network will obviously produce
random output (and play backgammon incredibly badly).  However, if we
have some idea of what the desired output ("training signal") is for
particular positions, then we can gradually improve ("train") our
network to approximate the desired function like this: set the input
vector appropriately and observe the output.  Compare the output
values our untrained or partially trained net gives to the training
signal: any difference is the "error".  We want to adjust the weights
to make this error slightly lower (it doesn't help to attempt to
reduce the error by a large amount in one go, because making coarse
changes like that would tend to make it "forget" earlier training;
we want it to improve gradually over the course of a huge number
of training positions, so it can experience a wide range of types
of backgammon positions and not just the current one).  The explanation
gets too detailed to go into here (see the links above), but there
are solutions to the "credit assignment problem" which essentially
tell you how strongly each individual weight is responsible for the
observed error; by tweaking each of them slightly in proportion to
this amount, you can reduce the overall error.  If you repeatedly
perform this operation for a large number of positions, then our
spreadsheet neural net can be made to approximate the "true" values
for most positions reasonably well.
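
To make that concrete, here's one training step for the sketch above
(plain backpropagation with a squared error -- one standard solution
to the credit assignment problem, though not the only one):

  def train_step(position, target, w_hidden, w_output, rate=0.1):
      # Forward pass, keeping the intermediate values.
      hidden = sigmoid(w_hidden @ position)
      output = sigmoid(w_output @ hidden)

      # Credit assignment: backpropagate the error to find how
      # strongly each individual weight is responsible for it.
      delta_out = (output - target) * output * (1.0 - output)
      delta_hid = (w_output.T @ delta_out) * hidden * (1.0 - hidden)

      # Tweak each weight slightly, in proportion to its share.
      w_output -= rate * np.outer(delta_out, hidden)
      w_hidden -= rate * np.outer(delta_hid, position)

The small learning rate is exactly the "slightly lower" above: each
position nudges the weights a little, so no single one dominates.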

There's a good web page "The Neural Net Backgammon Programs" (Jay Scott)
at:
  http://forum.swarthmore.edu/~jay/learn-game/systems/gammon.html
that describes how several real backgammon nets address these issues
in practice.

> and why they are the "best represntasion for backgammon"?

I'd guess you'd get a different answer from everyone you ask :-)  I get
the impression from Gerry Tesauro's papers (see the web page above) that
he advocates TD training as being particularly applicable to backgammon,
but since plenty of others have had good results with other training
schemes I'm sure there's more to it than that.  In general, neural nets
tend to be good at pattern recognition, generalisation, and interpreting
noisy input.  They are capable of approximating arbitrary continuous
functions, but in order to have a good chance of succeeding in practice
the function needs to be as smooth as possible and you need some way of
obtaining lots of reasonably accurate training data spread throughout
the domain of interest.
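
The TD ("temporal difference") idea, very roughly, is that you don't
need an external training signal at all: the target for one position
is the net's own evaluation of the position after the next move, or
the actual result once the game is over.  A heavily simplified
TD(0)-style sketch (real TD(lambda) training, as in Tesauro's papers,
also smears each update over earlier positions using eligibility
traces, which I'm skipping here):

  def td_target(next_position, final_result, game_over,
                w_hidden, w_output):
      # The "training signal" for the current position is the net's
      # own estimate of the next one -- or the truth, at the end.
      if game_over:
          return final_result   # e.g. [1, 0, ...] if O actually won
      return evaluate(next_position, w_hidden, w_output)

Feed that target to train_step() above for every move of every game,
and the net can bootstrap itself from nothing but self-play.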

In my opinion, the reason why ANNs work well for evaluating backgammon
equities is that the equity function is VERY smooth and ordered.  The
most important thing is that small changes in input produce only small
changes in output (take a random position; move a single chequer from
X's 9 point to her 8 point; chances are you are left with
practically the same position as before and the equities will be very
similar).  This is quite distinct from most other games (for instance,
take a random chess position; move that white knight on f6 to f5;
chances are you've made a significant change to the tactical balance
of that part of the board and quite possibly changed it from a winning
position for black into a winning position for white, even though you
only moved a single piece a single square).  This is what I mean by
smoothness of the equity function.  Smooth functions can be represented
very efficiently by ANNs.  The "pattern recognition" concept plays a
large role, too; recall that the main operation performed in every
neuron is the vector (dot) product of the previous layer and a weight
vector.  This can be regarded as a weighted sum of the activity of the
previous layer, but an equally good interpretation is that it's a measure
of "similarity" between the previous layer and some "ideal" vector (since
the dot product of two normalised vectors is the cosine of the angle
between them).  It is fairly obvious that this is a very useful
property in evaluating backgammon positions, since steering toward
winning "patterns" is essentially what strategic play is all about.
A single neuron in the hidden layer can calculate how "closely" the
input pattern matches a (learned) successful blitzing template, for
instance -- regarding the weight vector as an "ideal blitz" to be
compared for similarity against the input vector is exactly equivalent
to regarding each individual weight as a measure of how important
the corresponding feature is for a successful blitz.
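
In code, that "similarity" reading is just the cosine rule for dot
products (a toy sketch; the template here stands in for a learned
weight vector):

  import numpy as np

  def similarity(input_vector, template):
      # The dot product of two normalised vectors is the cosine of
      # the angle between them: 1.0 for a perfect match, near 0 for
      # unrelated patterns.
      a = input_vector / np.linalg.norm(input_vector)
      b = template / np.linalg.norm(template)
      return float(a @ b)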

> And how should i represnt the backgammon game in your idea.

I don't understand what you mean.  Whose and what idea?  What exactly
do you want to represent?

Cheers,
Gary.
--
        Gary Wong, Department of Computer Science, University of Arizona
            gary@cs.arizona.edu     http://www.cs.arizona.edu/~gary/
 

 
