Backgammon Programming

Backgames

From:   David Montgomery
Address:   monty@cs.umd.edu
Date:   2 June 1998
Subject:   Re: JellyFish and Snowie and backgames
Forum:   rec.games.backgammon
Google:   6l1n6v$t9@twix.cs.umd.edu

Maverick writes:

> There have been some postings about whether:
> A:Jellyfish plays backgames well
> B:Whether Jelly plays them better than Snowie
>
> There seems to be a general (I think erroneous) opinion that
> JF doesn't play backgames well.
> Well I'd like to challenge those guys who make this claim to come up
> with some evidence. I play an even game with JF Level 7 (Tested over
> 10 sets of 100 games) and although backgames don't occur frequently,
> when they do I find JF has no problem defending against them at all.
> It certainly knows how to bust my timing and certainly knows how to
> recirculate checkers and capture additional ones as I have found out
> to my cost.

There is a big difference between the strategies of defending against a backgame and playing a backgame. In general, programs (JF, SW, EXBG) have been criticized much more for their play of the backgame side than for how they defend against a backgame. When someone says that a program doesn't play backgames well, they are almost certainly talking about playing the backgame side, as opposed to the defender side.

Backgames come in a multitude of varieties, and the appropriate strategies can differ greatly. I think one of the most poorly understood classes of positions consists of incipient/potential backgames in which the trailer has some small chance of going forward, but generally has a poor game, and in which the timing issue is undecided. I think these positions are probably played poorly by everyone, top humans and bots included.

The bots mostly play these positions by adamantly refusing to play a backgame strategy. Assuming they play backgames poorly, this makes sense. Humans are much more likely to go into 'backgame mode' too readily, giving up substantial equity. Until we have stronger programs, I don't think we will be able to accurately resolve many of the issues arising in these sorts of positions.
If we look at positions where the timing issue is resolved, where the defender is bearing in or bearing off, the game becomes much clearer, and with the existing tools we can learn a lot about the right decisions. Here the issues are primarily:

  How likely is the leader to win a gammon, when hit and when not hit?
  How likely is the leader to win a backgammon?
  How often will the trailer hit?
  How many checkers will the leader have off when hit?
  How strong will the backgame player's offense be after a hit?
  Can the backgame player capture a second checker?

There is certainly one area of play where the bots are indisputably worse than top humans right now. This is when a side gets hit bearing off, has many checkers borne off (say, 10-13), and the hitting side has a strong blocking structure and can try for a second checker. This is a fairly obscure patch of the backgammon universe, but it is a significant part of the equity a backgame player has. To the degree that this is a significant variation, the bots will do worse in backgames. Mostly this comes up in well-timed deep backgames.

In two other areas the bots seem to be a little worse than people in backgames. First, in arranging their checkers to get a hit (and avoid the gammon and backgammon). The backgame player often must break one or more anchors to get the best chance for a hit (or an early, effective hit). Second, in bringing home the win after a hit. This often involves slotting parts of a prime or rolling a prime home. The bots aren't too bad at these things relative to people (in typical positions), but they do seem a little worse.

> I don't own a copy of Snowie but if you do, how about importing some
> backgame matches Jelly has played and seeing if Snowie can indeed
> spot some blunders.

It is very hard to find games where JF opts to play a deep backgame. JF has learned not to play them.
> Where some disagreements may be occuring on JF's playing of backgames
> are probably resulting from comparing rollouts to actual games.
> As the rollouts are limited to either L5 L6 the data may not reflect
> JF's true strength at playing back games as is plays a MUCH stronger
> game at L7 which it may "need" in order to play backgames properly.

Although JF probably plays backgames better on level 7 than on level 6, I don't think that is the main problem.

One way of looking at why neural nets so easily get so good at backgammon is to realize that for most positions, most of the time, backgammon is easy. The things that are good are good, and the things that are bad are bad. Opponent on the roof, good. Me on the roof, bad. Opp leaves shots, good. Me leaving shots, bad. My points are good, the opponent's are bad. Being ahead in the race is good. Having a strong board is good. Having a strong blockade is good. Having many checkers back is bad. Having checkers flexibly placed (say, 3 to a point) is good -- having them stacked up (5+ to a point) is bad.

These things are almost always true. All you have to do is to figure out the proper weighting for the things -- how important is this good thing vs. that good thing? And the nets are very good at learning from experience how to properly weight these things, so they never (in "normal" positions) weight these things drastically wrong (unlike people, who do).

The positions where the programs have had the most trouble are those where the things that are normally good aren't good any more, or the things that are normally bad are now good. For example, say you're trying to walk a prime home from the opponent's outfield 18-13. Normally, the best point on the board is the 6 point. But here it would be a huge mistake to make the 6 point (say, with 44, 55, or 66) -- the most important point right now is the 12 point, which is a point usually worth little.
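As a rough illustration of the weighting idea, here is a minimal Python sketch. It assumes a plain weighted sum over a handful of made-up features; the feature names, the weights, and the separate "backgame" weight table are all hypothetical and are not taken from JF, Snowie, or any other program (which use neural nets, not a hand-written table). It only shows how a weighting tuned for normal positions can mark down exactly the features a well-timed backgame depends on.

```python
# Hypothetical feature names and weights, invented purely for illustration.
# Positive weight = normally good for the side being evaluated.
NORMAL_WEIGHTS = {
    "opp_on_bar": 0.30,      # opponent on the roof: good
    "own_on_bar": -0.30,     # me on the roof: bad
    "race_lead": 0.20,       # being ahead in the race: good
    "home_points": 0.15,     # strong board: good
    "checkers_back": -0.25,  # many checkers back: bad
}

# Assumed alternative weighting for well-timed backgames, where having
# checkers back and trailing in the race stop being liabilities.
BACKGAME_WEIGHTS = dict(NORMAL_WEIGHTS, checkers_back=0.10, race_lead=-0.05)


def evaluate(features, weights):
    """Weighted sum of feature values; higher is better for the side evaluated."""
    return sum(weights[name] * value for name, value in features.items())


# A made-up backgame-like position: far behind in the race, many checkers back.
well_timed_backgame = {
    "opp_on_bar": 0.0,
    "own_on_bar": 1.0,
    "race_lead": -1.0,
    "home_points": 0.5,
    "checkers_back": 1.0,
}

normal_score = evaluate(well_timed_backgame, NORMAL_WEIGHTS)
backgame_score = evaluate(well_timed_backgame, BACKGAME_WEIGHTS)
print(f"normal weighting:   {normal_score:+.3f}")
print(f"backgame weighting: {backgame_score:+.3f}")
```

With these invented numbers the same position scores noticeably worse under the normal weighting than under the backgame one, because the sign of two features has flipped. Tuning the size of the normal weights more accurately would not fix that.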
Usually, closing out your opponent is good, especially if you're not primed. But this is often not the case if your opponent has 10 checkers off and you might be able to capture a second checker. In fact, it can be right to take a point in your prime or strong board and turn it into two blots in order to prevent your opponent from safetying a second checker! This kind of play is rarely right normally.

Similarly, in backgames, it may not be bad at all to have many checkers sent back, if you have timing. Being behind in the race is part of the strength of the position. Having checkers on the roof can be helpful. Having a strong inner board is almost worthless if it's a long time before you can hit effectively. So all the things that a net has learned are good are not necessarily good at all here.

This problem is remarkably minor in backgammon, which is one reason that it has been easy for programs to learn good evaluation functions. In chess, you can't say that having a certain piece on a certain square is good or bad in itself -- at all! In backgammon, 99% of the time, owning the six point is a good thing, until you are legally obligated to abandon it.

Thus I don't think adding another ply of lookahead is as big a deal. If the overall understanding of the priorities is wrong, then weighting your assumed priorities more accurately won't help much.

There are workarounds for these problems, of course. JF 3.0 has some added functionality that helps it roll home distant primes. Snowie (beta) doesn't have this yet. JF 3 plays backgames much more like people than earlier versions, and I would guess that it has a different evaluation function being used in many backgame situations. As more and better workarounds are added, the bots will eventually play even these "weird" positions much better.

> There's already a lot of hullabaloo about what Snowie will cost and
> whether it will actually play a better allround game than Jelly.
> My guess is it will play some positions better and some worse.

True.

> I'm not familiar with neural net technology but if you are, could you
> explain to me whether if a neural nets inputs are being tweaked in one
> area to improve its game whether it affects another part of its game
> even if the sum of the two changes still makes the net overall better
> ?

Yes, if you are using the same evaluation function everywhere, this will generally be true. For example, in some experiments people tried training not just from the opening position, but also from positions with more checkers back and more potential backgames. In one case they trained from the Nackgammon position. The programs then played better in these sorts of positions, but their overall play got worse.

Actually, this isn't the same as changing the inputs, as you mention, but it is similar. If you change the inputs, and then train one evaluation function to be used everywhere, the evaluations are likely to be changed (at least somewhat) in lots of areas of the game, not just the one you intended to change.

David Montgomery
monty@cs.umd.edu
monty on FIBS

David Montgomery writes:

I wrote:

> > There is certainly one area of play where the bots are indisputably
> > worse than top humans right now. This is when a side gets hit bearing
> > off, has many checkers borne off (say, 10-13), and the hitting
> > side has a strong blocking structure and can try for a second checker.
> > This is a fairly obscure patch of the backgammon universe, but it is
> > a significant part of the equity a backgame player has. To the degree
> > that this is a significant variation, the bots will do worse in
> > backgames. Mostly this comes up in well-timed deep backgames.

maverick wrote:

> Sorry but I have to disagree. I played agame only the other day with
> 12 checkers off and JF had only the 2 3 4 an 6 and 8 points made.I
> entered on the ace but unable to move the second part of the roll. It
> subsequently slotted both the 5 and 7 points with its next play, I was
> forced to hit and eventually it made a prime followed by some
> reciculation and a closeout. It played this game as far as I could see
> perfectly on L7.
> Could you give me some evidence where JF doesnt play this position
> correctly?

If you post the exact position I will try to do so -- assuming that it does play incorrectly -- I don't know since I haven't seen the position. (If both of your unhit checkers were on your ace point, so that capturing a second checker is virtually impossible, then JF plays the position quite well.)

Later in this post I'll show you a position that I play better than the bots (although I figure that I'm actually botching it). First, let me explain one way (not the only way) to figure out that the computer players are playing worse, in some types of positions.

You are certainly right that it can be difficult to figure out whether the expert or the program is playing the backgame better, since there might be something else going on, like the expert playing the other side *worse*. However, in some positions it seems clear that one side has the more difficult checker play.
To take an extreme example, let's say that one side has only 1 checker left on the board, and the other side is trying to contain this checker and then win. Without a doubt all the skill lies with the 15 checker side. If program A can win more with the 15 checker side than program B, we can conclude that program A plays this position better.

Here's an example where both sides have 15 checkers on the board, but clearly all the skill is for one side:

    |===========================================| 30
    | O O ... X X | | X X X O ... |
    | O O ... X X | | X X X ... |
    | O O ... X X | | X X X ... |
    | O O . . | | . . . |
    | O O . . | | . . . |
    | O O | | |
    | O | | |
    | O | | |   X on roll
    | | | |
    | | | |
    | . . . | | . . . |
    | . . . | | . . . |
    | ... ... ... | | ... ... ... |
    | ... ... ... | | ... ... ... |
[2] | ... ... ... | | ... ... ... |
    |===========================================| 270

With good human play, X can cash. With JF v1 and v2, O has a monster beaver, followed by a cash regardless of X's roll (hmmm... or maybe O is supposed to play on -- I don't remember). With JF v3 handling the checker play (but not the cube), X is close to a double, but the take is easy.

Could JF v3 really play O that much better than people? And v2 so much more so? No. JF screws this up. People play it better.

In developing backgames/potential backgames, there is great potential for skillful play on both sides, and so it can be very hard to figure out who is getting it right and who is getting it wrong. However, once the backgame side is committed to bearing in (recirculation is over), perhaps even bearing off, then the timing issues have been resolved and most of the opportunity for skillful play lies with the backgame side.

The player bearing in will have some decisions on how to arrange spares, and on how to balance long and short term safety against ripping checkers. These are important decisions, but on the other hand, many of the backgame defender's plays will be forced.
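The comparison described above (roll out the same position with two programs playing the skilled side, and see which gets the better result) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the per-game results are invented sample data, the games are treated as independent, and the other side is assumed to be played the same way in both rollouts.

```python
import math
import statistics


def rollout_summary(results):
    """Return (mean result, standard error) for a list of per-game results.

    Each result is the number of points won (negative if lost) by the side
    whose play is being tested.
    """
    mean = statistics.fmean(results)
    stderr = statistics.stdev(results) / math.sqrt(len(results))
    return mean, stderr


def compare_rollouts(results_a, results_b):
    """Return (difference in mean result, standard error of the difference)."""
    mean_a, se_a = rollout_summary(results_a)
    mean_b, se_b = rollout_summary(results_b)
    # Independent samples: the standard errors combine in quadrature.
    return mean_a - mean_b, math.hypot(se_a, se_b)


# Invented sample data, NOT real rollout results.
program_a = [1, 1, -1, 1, 2, 1, -1, 1]
program_b = [-1, 1, -1, -1, 1, -2, -1, 1]

diff, se_diff = compare_rollouts(program_a, program_b)
print(f"A minus B: {diff:+.3f} points per game (standard error {se_diff:.3f})")
```

A difference that is small compared with its standard error says little either way, and even a large one only points to better play of the skilled side if the other side's plays really are mostly forced.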
The backgame player will also have some choices in arranging checkers to get a hit, though fewer, but these plays can be quite significant and the programs often get them wrong (trivial example: JF sometimes doesn't vacate the 24 point when the opponent has only a 2 point stack left).

After a hit, much more skill is required of the backgame player than the defender. The hitter must contain the hit checker, perhaps slot and build a prime, walk the prime home, perhaps try for a second checker, and then bear off in a position where speed may be as important as safety. Here is an example:

    |===========================================| 180
    | ... ... O O O | | O ... ... |
    | ... ... O O O | | O ... ... |
[2] | ... ... ... | | ... ... ... |
    | . . . | | . . . |
    | . . . | | . . . |
    | | | |   X on Roll
    | | | |
    | . . . | | . . . |
    | . X X . | | . . . |
    | ... X X ... | | ... ... ... |
    | O O X X X | | ... ... ... |
    | O O X X X | | ... O O O |
    |===========================================| 40
    X off: 5

In these kinds of positions, the majority of the skill is on the side of the backgame player, and so if program A gets a better result than program B for the backgame side, I believe program A is playing it better. For example, JF v3 gets a lower result for X here than JF v2. Therefore I believe that v3 plays O better.

Within reason, of course. If the results aren't that far apart, we might want to consider whether program B is playing the defender better, and so forth.

Finally, there are positions with extensive skill required of both sides, but which nonetheless seem to require disproportionate skill of one side or the other. Here's an example:

    |===========================================| 253
    | ... ... O O | | ... ... ... |
    | ... ... O O | | ... ... ... |
    | ... ... ... | | ... ... ... |
    | . . . | | . . . |
    | . . . | | . . . |
    | | | |
[1] | X | | |
    | . X X | | . . . |   X on Roll
    | . X O X | | . . . |
    | ... X O X | | ... ... ... |
    | O O O X O X | | X X ... ... |
    | O O O X O X | | X X ... O |
    |===========================================| 86

I have played this a lot as a prop, and I think it's not too hard to win against a weaker player from both sides, because while the backgame player is a favorite with good play, he is a dog with average play.

The defender here does have some tough decisions on when to hit inside to try to start a 3rd point, risking the hit of a second checker, but based on my experience playing it, I'm sure it's harder to play O accurately. JF v3 gets a better result for O than does JF v2, so I believe that JF v3 plays it better than JF v2.

Okay, here is a position of the type I meant in the first paragraph:

    +24-23-22-21-20-19-+---+18-17-16-15-14-13-+
  13| O | | X |
   O| | | |
   O| | | |
   O| | | |
   O| | | |
   O| | O | |   O on Roll
    | | | |
    | | | |
    | X | | |
    | X | | |
    | X X X | | X |
    | X X X X | | X X X X |
    +-1--2--3--4--5--6-+---+-7--8--9-10-11-12-+

This position is a follow-up to one that was posted here in April. The original post was "stay on the 1pt against 3 on 2pt without a prime", by Harald Retter. Here X stayed, playing 66 18/12 14/8 13/7 9/3, O rolled 61 playing 2/off 2/1*, and X hit with a 62 played bar/23*/17, producing the position above.

I wasn't sure what the cube action was here, but I knew that I would take over the board. Kit Woolsey wrote regarding this variation:

KW> 2) O rolls something-ace, and X hits with a two. O probably has
KW> a play-on for a bit, but it could get treacherous fast.

I was very surprised by this comment, Kit suggesting a play-on when I was taking, so I set up the above continuation. (Note that 62 is probably X's weakest hitting number.)

I rolled it out both manually (using JF's interactive rollout feature) and with JF v3 L6. I don't have my results handy, but I know that I got a *much* lower result for O than did JF. Since almost all of the skill here is for X, I think that is proof that I play this position better than JF.

JF doesn't understand how important it is for X to capture the second checker.
Arranging ways to make this happen often takes precedence over making the best play to contain the first checker.

I don't think I played the position particularly well, btw. My recollection is that O should not double, and that X does not quite have a beaver in the above position. I believe if I had the time and motivation to study this position enough (say, if someone offered me a long proposition contract) then I could play X well enough to make it a beaver. With JF playing X, O should probably double, although X's take is easy.

David Montgomery
monty@cs.umd.edu
monty on FIBS
