Forum Archive : Ratings

Ratings variation

From:   Ed Rybak
Address:   rybak@sequent.com
Date:   27 September 1994
Subject:   A Random Walk down FIBS Street ;-)
Forum:   rec.games.backgammon
Google:   1994Sep27.223554.17331@sequent.com

A few weeks ago I became curious about the variability in FIBS ratings
that I was seeing among a number of players:  for example, when I began
playing on FIBS, fatboy was in the low 1600's, climbed to the low 1700's,
fell to the high 1600's, rose to about 1770, and now is in the mid-to-low
1600's again.  What kind of variability should one expect in one's FIBS
ratings based on chance?

This is important because so many of us base our estimate of improvement
on our FIBS ratings!  And how many of us have changed our play in
frustration after losing a fourth or fifth match in a row?  Can we really
measure our progress with our FIBS rating?

So I ran a series of Monte Carlo simulations using the FIBS ratings
formula.  In the first series I had two players of equivalent skill
and FIBS ratings (1700) play one another in 1000 5-point matches.  I
assumed that each player's skill remains constant while the rating
varies with the result of each match.  The results were fascinating:
the ratings of the players varied tremendously, often changing by over
a hundred points over a few hundred matches.  I ran 100 simulations of
1000 5-point matches, and the average minimum and average maximum of
the ratings were 1626 and 1780 respectively.
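For anyone who wants to try this kind of experiment, here is a minimal
sketch in Python.  It is not the original simulation code, which was not
posted.  It assumes the usual published form of the FIBS formula (win
probability 1/(1 + 10^(D*sqrt(N)/2000)) and a stake of 4*K*sqrt(N) with
K = 1 for experienced players), and it treats every match between the two
equally skilled players as a fair coin flip.  All function names are made
up for illustration.

```python
import math
import random

def win_prob(rating_a, rating_b, match_len):
    """Chance that A beats B according to the (assumed) FIBS formula."""
    exponent = (rating_b - rating_a) * math.sqrt(match_len) / 2000.0
    return 1.0 / (1.0 + 10 ** exponent)

def update(rating_a, rating_b, a_wins, match_len, k=1.0):
    """Return the new (rating_a, rating_b) after one match.

    k = 1 is the assumed experience factor for established players.
    """
    p_a = win_prob(rating_a, rating_b, match_len)
    stake = 4.0 * k * math.sqrt(match_len)
    delta = stake * (1.0 - p_a) if a_wins else -stake * p_a
    return rating_a + delta, rating_b - delta

def random_walk(n_matches=1000, match_len=5, start=1700.0, rng=random):
    """Two players of equal, constant skill; return A's min and max rating."""
    a = b = start
    lo = hi = a
    for _ in range(n_matches):
        a, b = update(a, b, rng.random() < 0.5, match_len)
        lo, hi = min(lo, a), max(hi, a)
    return lo, hi

def average_extremes(n_runs=100, seed=1994):
    """Average the per-run minimum and maximum over n_runs simulations."""
    rng = random.Random(seed)
    runs = [random_walk(rng=rng) for _ in range(n_runs)]
    avg_lo = sum(lo for lo, _ in runs) / n_runs
    avg_hi = sum(hi for _, hi in runs) / n_runs
    return avg_lo, avg_hi

if __name__ == "__main__":
    lo, hi = average_extremes()
    print(f"average minimum {lo:.0f}, average maximum {hi:.0f}")
```

The exact figures depend on the seed and on the assumptions above, so they
need not match the numbers quoted in the post.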

So a 1700 player, playing with no improvement/diminishment of skill,
can expect to see their rating fluctuate by roughly plus or minus 80
points over the course of 1000 matches.  Add in some variability due to
fatigue, or steaming, or "bathwater" adjustments to the skill level and
it could be even greater.

I ran a whole bunch of other simulations.  One simulation asked:  if I'm
a 1700 player (rated and skilled) and I miraculously improved my play
(i.e. skill) by 100 points, how long before it will be reflected in my
rating?  The average number of 5-point matches required to reach the true
rating of 1800 was about 280.
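The "how long until my rating catches up" question can be sketched the
same way.  Again this is only an illustration under stated assumptions,
not the code behind the 280 figure:  the player's true skill is 1800, the
rating starts at 1700, every opponent is a fresh, correctly rated 1700
player, the real chance of winning comes from the same formula applied to
skill, and K = 1.  The function names and the safety cap are made up.

```python
import math
import random

def win_prob(rating_a, rating_b, match_len):
    """Chance that A beats B according to the (assumed) FIBS formula."""
    exponent = (rating_b - rating_a) * math.sqrt(match_len) / 2000.0
    return 1.0 / (1.0 + 10 ** exponent)

def matches_to_reach(target=1800.0, start=1700.0, skill=1800.0,
                     opp=1700.0, match_len=5, rng=random, cap=100_000):
    """Count matches until the rating first reaches the target."""
    rating = start
    stake = 4.0 * math.sqrt(match_len)          # K = 1 assumed
    p_true = win_prob(skill, opp, match_len)    # real chance of winning
    for played in range(1, cap + 1):
        p_rated = win_prob(rating, opp, match_len)  # what the formula expects
        if rng.random() < p_true:
            rating += stake * (1.0 - p_rated)
        else:
            rating -= stake * p_rated
        if rating >= target:
            return played
    return cap  # safety stop; not expected to be hit in practice

def average_matches(n_runs=100, seed=1994):
    """Average the catch-up time over n_runs independent simulations."""
    rng = random.Random(seed)
    return sum(matches_to_reach(rng=rng) for _ in range(n_runs)) / n_runs

if __name__ == "__main__":
    print(f"average matches to reach 1800: {average_matches():.0f}")
```

Different choices of opponent (for example one fixed opponent whose rating
also moves) will change the answer, so treat the output as a rough guide.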

The length of the match doesn't seem to make a difference.  Nor does the
true rating of one's opponent...it's all adjusted for in the ratings
formula.

One interesting result:  the fastest way to improve one's rating is by
playing players that are overrated (i.e. those whose ratings are higher
than their skill).  This assumes, of course, that you can identify
overrated players.  I ran one simulation where you played only players
that were 50 points overrated;  the simulations converged much more
quickly than with a 100-point skill increase,  allowing a 1700 skilled
player to be rated in the high 1800's.  The only readily identifiable
class of overrated players is new players (statistically speaking), which
explains why they are played so readily by some ;-)
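The last two observations can be checked without any dice by computing
the expected rating change for a single match.  This is a sketch under
the same assumptions as before (the usual published FIBS formula, K = 1,
and real winning chances given by the formula applied to skill); the
function names are made up.  It only shows the direction of the drift and
makes no claim about where the rating ends up.

```python
import math

def win_prob(rating_a, rating_b, match_len):
    """Chance that A beats B according to the (assumed) FIBS formula."""
    exponent = (rating_b - rating_a) * math.sqrt(match_len) / 2000.0
    return 1.0 / (1.0 + 10 ** exponent)

def expected_change(my_rating, my_skill, opp_rating, opp_skill, match_len=5):
    """Average rating change for one match, given ratings and true skills."""
    p_true = win_prob(my_skill, opp_skill, match_len)     # real chance
    p_rated = win_prob(my_rating, opp_rating, match_len)  # formula's view
    stake = 4.0 * math.sqrt(match_len)                    # K = 1 assumed
    gain_if_win = stake * (1.0 - p_rated)
    loss_if_lose = stake * p_rated
    return p_true * gain_if_win - (1.0 - p_true) * loss_if_lose

if __name__ == "__main__":
    # Correctly rated opponents of any strength: no drift on average.
    for opp in (1500.0, 1700.0, 1900.0):
        print(opp, round(expected_change(1700.0, 1700.0, opp, opp), 6))
    # Opponent rated 1700 but only 1650 in skill: drift is upward.
    print(expected_change(1700.0, 1700.0, 1700.0, 1650.0))
```

When everyone is correctly rated the two terms cancel whatever the
opponent's rating or the match length, which is one way to read "it's all
adjusted for in the ratings formula".  Against an overrated opponent the
real winning chance exceeds what the formula expects, so the average
change is positive.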

It appears to me that FIBS rating is not a very good indicator of skill
except in the broad sense.  If I lose four matches in a row to players
of my level, I probably shouldn't adjust my game and I definitely
shouldn't lose my composure.  I have taken to using other indicators
(matchquiz, analysis of my play, etc.) to measure myself.

Ed Rybak
Sequent Computer Systems
15450 SW Koll Parkway
Beaverton, OR 97006
phone: (503) 578-4336
fax: (503) 578-3811

Tom Keith writes:

Douglas Zare and Adam Stocks comment on this posting in their article
"Ratings: A Mathematical Study"

Constructing a ratings system  (Matti Rinta-Nikkola, Dec 1998) 
Converting to points-per-game  (David Montgomery, Aug 1998)  [Recommended reading]
Cube error rates  (Joe Russell+, July 2009)  [Long message]
Different length matches  (Jim Williams+, Oct 1998) 
Different length matches  (Tom Keith, May 1998)  [Recommended reading]
ELO system  (seeker, Nov 1995) 
Effect of droppers on ratings  (Gary Wong+, Feb 1998) 
Emperical analysis  (Gary Wong, Oct 1998) 
Error rates  (David Levy, July 2009) 
Experience required for accurate rating  (Jon Brown+, Nov 2002) 
FIBS rating distribution  (Gary Wong, Nov 2000) 
FIBS rating formula  (Patti Beadles, Dec 2003) 
FIBS vs. GamesGrid ratings  (Raccoon+, Mar 2006)  [GammOnLine forum]
Fastest way to improve your rating  (Backgammon Man+, May 2004) 
Field size and ratings spread  (Daniel Murphy+, June 2000)  [Long message]
Improving the rating system  (Matti Rinta-Nikkola, Nov 2000)  [Long message]
KG rating list  (Daniel Murphy, Feb 2006)  [GammOnLine forum]
KG rating list  (Tapio Palmroth, Oct 2002) 
MSN Zone ratings flaw  (Hank Youngerman, May 2004) 
No limit to ratings  (David desJardins+, Dec 1998) 
On different sites  (Bob Newell+, Apr 2004) 
Opponent's strength  (William Hill+, Apr 1998) 
Possible adjustments  (Christopher Yep+, Oct 1998) 
Rating versus error rate  (Douglas Zare, July 2006)  [GammOnLine forum]
Ratings and rankings  (Chuck Bower, Dec 1997)  [Long message]
Ratings and rankings  (Jim Wallace, Nov 1997) 
Ratings on Gamesgrid  (Gregg Cattanach, Dec 2001) 
Ratings variation  (Kevin Bastian+, Feb 1999) 
Ratings variation  (FLMaster39+, Aug 1997) 
Ratings variation  (Ed Rybak+, Sept 1994) 
Strange behavior with large rating difference  (Ron Karr, May 1996) 
Table of ratings changes  (Patti Beadles, Aug 1994) 
Table of win rates  (William C. Bitting, Aug 1995) 
Unbounded rating theorem  (David desJardins+, Dec 1998) 
What are rating points?  (Lou Poppler, Apr 1995) 
Why high ratings for one-point matches?  (David Montgomery, Sept 1995) 
