Cube error rates

From:   Joe Russell
Address:   ez2bblue@aol.com
Date:   21 July 2009
Subject:   Cube error rates
Forum:   BGonline.org Forums

If you are like me you get dinged more than anything else for not doubling
when you should. Often I will miss a cube by .05 or so for several shakes
in a row. That has led me to consider the fairness of the grading of these
errors.

Say you had a position that was static for 25 rolls and for each roll you
made a .05 error by not doubling. Your cumulative error would be 1.25, but
you would have only been .05 better off if you had doubled at any time. Now
say in another 25-roll static position it was wrong to double by .05 and
you doubled on the first roll. Now your cumulative error is only .05,
one twenty-fifth of that in the other position, but the true cost of the
errors was identical.

I realize in one situation you made one bad decision and in the other you
made 25 bad decisions, but the cost in MWC of the 25 was truly no more than
the cost of the one.

Maik Stiebler  writes:

In completely static positions, doubling is always either wrong or
optional. So if you are dinged 25 times in a row for 0.050,

  - your bot is bad at evaluating cube errors in static positions

or

  - your position was not quite static, and you were unlucky to get an
    awful error rating for repeatedly getting into situations that you
    didn't understand. Can happen, but it averages out in the long term.

I think your statement, "... but you would have only been .05 better off if
you had doubled at any time," hinges on the position being completely
static. If not, I don't see in which sense it is true.

Gregg Cattanach  writes:

Your analysis of repeating missed doubles implies that each opportunity
you had to double was the same decision. It WAS NOT. Both players had
rolled and moved, creating a new position. Other than perhaps when one
player is closed out, the position is effectively different and requires a
new double/no double decision process.

Thus you are making different no-double errors each time, not the same one,
and you should be dinged for each one.

Frank Berger  writes:

That's right, but nevertheless IMHO Joe's point is still true. If I
repeatedly miss a .05 double I add up an unreasonable error rate. IMHO it
is totally irrelevant whether the situation is static or, due to some fairy
dust, it stays in that area. If I miss a 0.05 double 25 times, I lose more
than a point in error rates, and that is ridiculous. And the early cube is
penalized just once.

David Levy  writes:

Let's assume the missed doubles were not consecutive, but in different
games in the same match. Is there a problem with multiple dinging? What if
they were in different matches?

I suspect Joe is thinking, "I am a Snowie ER 3 player, but these multiple
dings gave me an error rate of 6 and I don't believe it."

The Snowie error rate in any match reflects how well the player understands
the positions that came up in the match and the frequency those positions
came up. If a poorly-understood position comes up a lot (cumulative missed
doubles), the error rate is higher.

I have a great error rate in a match consisting only of five-anchor holding
games. But I know I'm not any better after seeing that low error rate.

Moral: Don't expect too much of the Snowie error rate, particularly for a
single match.

Matt Reklaitis  writes:

My take on this situation is that a bot's evaluation represents values
related to its own play. So when calculating the size of any individual
error, the calculation inherently assumes that your future play is similar
to the bot's. The more your play differs from the bot's, the more the model
breaks down.

Matt Cohn-Geier  writes:

Let's say your error here in not doubling is .05.

      24  23  22  21  20  19      18  17  16  15  14  13
     +---+---+---+---+---+---+---+---+---+---+---+---+---+
     | O   O   O   O   O   X |   |             O   O   O |
     | O   O   O   O         |   |                     O |
     |                       |   |                       |
     |                       |   |                       |
     |                       | X |                       |
     |                       |   |                       |  X on roll
     |                       | O |                       |
     |                       |   |                       |
     |                       |   |                       |
     | X   X   X   X         |   |                     X |
     | X   X   X   X   X   O |   |             X   X   X |
     +---+---+---+---+---+---+---+---+---+---+---+---+---+
       1   2   3   4   5   6       7   8   9  10  11  12

This assumes that after fan/fan you will cube. But after fan/fan you won't
cube...so your error is compounded more than .05 would indicate. So
regarding the error as just .05 isn't sufficient.

If, on the other hand, you doubled a position where the ND error was .05,
you would deprive yourself of a chance to make future mistakes.

Maik Stiebler  writes:

Yes, not doubling in this and all the repeated situations does cost more
than not doubling in this situation and reconsidering after fan/fan. The
latter cost is what the bot reports, and if it is 0.05, the former cost is
approx. 0.062. The difference here is not large, because an exact repeat of
the position only happens with a probability of 256/1296. In the typical
"static situation", the effect may be much larger.

If you reach this position, you will usually get dinged for 0.05 in total,
but on a very bad day you can be dinged for something like 0.50. Those
occasional high dings are needed for unbiased feedback, because the typical
0.05 ding is too small. The average ding is just right. On the other hand,
I can see why it is regarded as unfair that you can be dinged 0.05 or 0.50
for the same error.
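
The step from 0.05 to roughly 0.062 quoted above can be reproduced with a
geometric series. This is a minimal sketch under one reading of the post:
it assumes the same 0.05 double is missed every time the position recurs,
and that each recurrence (fan/fan) has probability 256/1296. The variable
names are illustrative only.

```python
from fractions import Fraction

# Figures quoted in the post above.
reported_error = Fraction(5, 100)   # bot's ding for one missed double
repeat_prob = Fraction(256, 1296)   # chance of fan/fan, i.e. an exact repeat

# Assumption: the double is missed on every repeat, so the total cost is
# 0.05 * (1 + p + p^2 + ...) = 0.05 / (1 - p).
total_cost = reported_error / (1 - repeat_prob)

print(float(total_cost))  # about 0.0623
```

The closer the repeat probability gets to 1, the larger the gap between
the single ding and the total cost, which matches the remark about typical
static situations.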

Douglas Zare discusses this in his GV article "Unbiased Nonsense" and
proposes a variance reduction method on the error rate. I don't think that
that would be accepted by the bg community, but it is an interesting
concept.

Maik Stiebler  writes:

I had prepared a puzzle involving a discrete random walk to post at some
point in this thread:

Players A and B play a game of StaticishRace. A game consists of tossing a
fair coin and changing the score by +1 or -1 respectively based on the
result of the coin toss. The starting score is 0. Player A wins one point
and the game ends when the score reaches 50. Player B wins one point and
the game ends when the score reaches -50.

Before each coin toss, ONLY Player A (to keep things simple) is given the
opportunity to double the stakes, after which Player B can either take or
drop (ending the game and losing a single point). A double is allowed only
once in a game.

1. What is the optimal strategy for both players?

2. What is the theoretical (assuming optimal play from both sides) equity
of the starting position?

3. Assume Player A deviates from optimal strategy by doubling if and only
if the current score is +30. What is the practical equity of the starting
position then?

4. How much does Player A's deviation from perfect play cost him on average
per game?

Now put yourself in the position of a bot that knows the optimal strategy
and observes the game, not knowing Player A's complete strategy, but noting
the wrong plays that follow from the strategy.

5. (a) At which points in the game does Player A, following the non-optimal
       strategy, blunder away theoretical equity by making a wrong play?
   (b) How much equity does each of these wrong plays lose?

6. How often will the opportunity for Player A to make a wrong (equity
losing) play arise in a game? Compute both an average value (a) and a
distribution (b).

7. Verify that the average number of blunder opportunities (6a) times the
cost of a blunder (5b) equals the total cost of Player A's misguided
strategy (4).

Bob Koca  writes:

> 1. What is the optimal strategy for both players?

Player A doubles at +25, at which point B has an optional take/pass. That is
because B then has the required 1/4 winning chance. A does not double
before then as there is no market loss.
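
The 1/4 take point can be checked with the standard result for a fair
+1/-1 walk between two barriers. The sketch below assumes, as in the
puzzle, that the cube is dead after A's double (only one double per game),
so a take simply doubles the stakes. The function names are hypothetical.

```python
from fractions import Fraction

LOW, HIGH = -50, 50  # B wins at -50, A wins at +50

def prob_a_wins(score: int) -> Fraction:
    """Chance a fair +1/-1 walk from `score` reaches HIGH before LOW."""
    return Fraction(score - LOW, HIGH - LOW)

def take_equity_for_b(score: int) -> Fraction:
    """B's equity after taking a double at `score` (stakes of 2, dead cube)."""
    p_b = 1 - prob_a_wins(score)
    return 2 * (p_b - (1 - p_b))

# Passing costs B exactly 1 point; taking at +25 is worth the same.
print(prob_a_wins(25), take_equity_for_b(25))  # 3/4 -1
```

Below +25 the take is better than -1 for B, and above +25 it is worse, so
+25 is the point where take and pass are equal.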

> 2. What is the theoretical (assuming optimal play from both sides) equity
> of the starting position?

A has +1/3 equity. Going from -50 to +25 is 75 and A starts 2/3 of the way
there. 2/3 - 1/3 = 1/3.

> 3. Assume Player A deviates from optimal strategy by doubling if and only
> if the current score is +30. What is the practical equity of the starting
> position then?

A wins 50/80 of the games for an equity of +1/4.

> 4. How much does Player A's deviation from perfect play cost him on
> average per game?

1/3 - 1/4 = 1/12
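
Answers 2 to 4 can be reproduced with exact fractions. This sketch assumes
that reaching the doubling point is worth exactly +1 to A (a cash, or a
take with the same value) and that reaching -50 first is worth -1. The
helper names are illustrative.

```python
from fractions import Fraction

def reach_upper_first(start: int, lower: int, upper: int) -> Fraction:
    """Chance a fair +1/-1 walk from `start` hits `upper` before `lower`."""
    return Fraction(start - lower, upper - lower)

def equity_when_cashing_at(double_point: int) -> Fraction:
    """A's equity from score 0 if A wins 1 point on reaching `double_point`
    and loses 1 point if the score reaches -50 first."""
    p = reach_upper_first(0, -50, double_point)
    return p - (1 - p)

optimal = equity_when_cashing_at(25)   # double at +25
late = equity_when_cashing_at(30)      # double only at +30
print(optimal, late, optimal - late)   # 1/3 1/4 1/12
```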

> Now put yourself in the position of a bot that knows the optimal strategy
> and observes the game, not knowing Player A's complete strategy, but
> noting the wrong plays that follow from the strategy.
>
> 5. (a) At which points in the game does Player A, following the
>        non-optimal strategy, blunder away theoretical equity by making a
>        wrong play?
>    (b) How much equity does each of these wrong plays lose?

Only when the game is at exactly 25 does A lose equity. Values below +25
are a no-double and values above +25 are an optional double, since the cash
cannot be lost on the next toss.

At +25 B has an optional take/pass. Let's suppose he would pass. Then the
equity lost by not doubling at +25 is 1/2 the equity lost by the game
reaching 24 (if it reaches +26 it is a cash anyway). At +24, A's
theoretical equity is 74/75 - 1/75 = 73/75, a loss of 2/75. So failing to
double at +25 costs 1/75 equity.
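
The 1/75 figure can be rechecked by averaging the two positions that
follow a missed double at +25. The sketch assumes A plays optimally after
the miss, so +26 is still worth +1 and +24 is worth the equity of racing
to +25 before -50. Names are illustrative.

```python
from fractions import Fraction

def equity_below_25(score: int) -> Fraction:
    """A's equity with optimal play from `score` <= 25:
    +1 if +25 is reached before -50, otherwise -1."""
    p = Fraction(score + 50, 75)
    return 2 * p - 1

eq_after_tails = equity_below_25(24)   # 73/75
eq_after_heads = Fraction(1)           # at +26 A can still cash next turn
eq_no_double = (eq_after_tails + eq_after_heads) / 2

loss = 1 - eq_no_double
print(loss)  # 1/75
```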

> 6. How often will the opportunity for Player A to make a wrong (equity
> losing) play arise in a game? Compute both an average value (a) and a
> distribution (b).

It happens at least once in 2/3 of the games. Suppose that the game is at
exactly +25. We need to find the probability of that state occurring again.
This is harder than the previous questions. When you are at +25 you go to
+26 half the time. From there you go to 25 before 30 with probability 4/5.
The other half of the time you go to 24, and then the chance of returning
to 25 before -50 is 74/75. (1/2)(4/5) + (1/2)(74/75) = 67/75 chance of a
repeat visit to 25 if you start from 25. Given a first visit, the number of
visits until there is no repeat has a geometric distribution with
p = 8/75 and an expected value of 1/(8/75) = 75/8. The expected total
number of visits is thus (2/3)(75/8) = 25/4.
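
The chain of fractions above can be followed step by step in code. This is
a sketch using the same barriers as the answer (the game stops at +30,
where A finally doubles, or at -50); the variable names are illustrative.

```python
from fractions import Fraction

# Chance of ever reaching +25 from 0 before hitting -50.
p_first_visit = Fraction(0 + 50, 25 + 50)          # 2/3

# From +26: back down to 25 before reaching 30.
p_return_from_26 = Fraction(30 - 26, 30 - 25)      # 4/5
# From +24: back up to 25 before reaching -50.
p_return_from_24 = Fraction(24 + 50, 25 + 50)      # 74/75

p_return = (p_return_from_26 + p_return_from_24) / 2   # 67/75

# Geometric distribution: mean number of visits once +25 has been reached.
visits_given_one = 1 / (1 - p_return)              # 75/8
expected_visits = p_first_visit * visits_given_one

print(p_return, visits_given_one, expected_visits)  # 67/75 75/8 25/4
```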

The distribution is as follows:

Exactly 0 visits occurs with probability 1/3.
Exactly 1 visit occurs with probability (2/3)(8/75).
Exactly 2 visits occurs with probability (2/3)(67/75)(8/75).
Exactly 3 visits occurs with probability (2/3)(67/75)**2(8/75).
...
Exactly n visits occurs with probability (2/3)(67/75)**(n-1)(8/75).
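
As a sanity check on the distribution above, the probabilities should sum
to 1 and have mean 25/4. The sketch below truncates the infinite sum at
400 terms, an arbitrary cutoff chosen so the remaining tail is negligible.

```python
from fractions import Fraction

P_REACH = Fraction(2, 3)     # at least one visit to +25
P_REPEAT = Fraction(67, 75)  # another visit follows
P_STOP = 1 - P_REPEAT        # 8/75: no further visit

def prob_visits(n: int) -> Fraction:
    """Probability of exactly n visits to +25 in one game."""
    if n == 0:
        return 1 - P_REACH
    return P_REACH * P_REPEAT ** (n - 1) * P_STOP

CUTOFF = 400  # arbitrary truncation point for the infinite sums
total = float(sum(prob_visits(n) for n in range(CUTOFF)))
mean = float(sum(n * prob_visits(n) for n in range(CUTOFF)))

print(total, mean)  # very close to 1 and 6.25
```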

(This was jointly done with Chris Yep).

> 7. Verify that the average number of blunder opportunities (6a) times the
> cost of a blunder (5b) equals the total cost of Player A's misguided
> strategy (4).

(25/4)(1/75) = 1/12.
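
The same identity can be checked empirically with a small simulation. This
is a rough sketch, not part of the original answer: it assumes B passes
the double at +30 (B's winning chance there is 1/5, below the 1/4 needed
to take), charges a 1/75 ding for every decision A makes at +25, and uses
an arbitrary seed and sample size.

```python
import random

def play_game(rng: random.Random) -> tuple[int, int]:
    """Play one game where A doubles only at +30 and B passes.

    Returns (A's result in points, number of decisions made at +25).
    """
    score, visits = 0, 0
    while True:
        if score == 25:
            visits += 1          # A fails to double here: a 1/75 blunder
        if score == 30:
            return 1, visits     # A doubles, B passes
        if score == -50:
            return -1, visits    # B wins a single point
        score += 1 if rng.random() < 0.5 else -1

rng = random.Random(2009)        # arbitrary fixed seed for repeatability
games = [play_game(rng) for _ in range(2000)]
avg_equity = sum(result for result, _ in games) / len(games)
avg_ding = sum(visits for _, visits in games) / len(games) / 75

print(avg_equity, avg_ding)      # should land near 1/4 and 1/12
```

Individual games vary a lot in how many dings they collect, but the
average ding per game should settle near the 1/12 that the late double
actually costs.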
 

