Forum Archive: Rollouts
From: Stick
Address: checkmugged@yahoo.com
Date: 15 October 2007
Subject: Explaining (J)SD w/GNU & rollouts
Forum: BGonline.org Forums
Pretend I know little to nothing about statistical significance (Chuck can
vouch for that), standard deviations, joint standard deviations, etc. Can
you explain them using the example/rollout below? (Or do you know of a link
where it has already been done? I can't find one with what I'm looking for.)
13 14 15 16 17 18 19 20 21 22 23 24
+---+---+---+---+---+---+---+---+---+---+---+---+---+
| X O | | O O X X |
| X O | | O O |
| X | | O |
| X | | O |
| | | | +---+
| | | | | 1 |
| | | | +---+
| O | | X |
| O | | X |
| O X | | X |
| O X | | X O |
| O X X | | X O |
+---+---+---+---+---+---+---+---+---+---+---+---+---+
12 11 10 9 8 7 6 5 4 3 2 1
X rolls 1-1.
Rollout:
Play Equity Diff. Std.Err.
-------------------- ------- ------- --------
1. 8/7(2), 6/5(2) +0.0773 0.0043
2. 23/22, 11/10, 6/5(2) +0.0615 -0.0158 0.0042
Brice writes:
So no one has touched this yet, guess I'll give it a shot. It's been about
5 years since my last stats course, so others should correct me if I say
something stupid.
The cubeful equity is listed as .0773, with a standard error (SE) of .0043.
The actual position after 8/7(2), 6/5(2) has some exact equity A, but we
don't know what it is, so we run trials to try to estimate it. The best
guess after these trials is .0773. The standard error is a description of
how good our measurement is and lets us construct confidence intervals to
tell us where the actual value A might be. 1 SE around our estimate
corresponds to about 68%, 2 SE corresponds to 95%*, and 3 SE is about
99.7%. So for this first position we might say
There is a 68% chance that A lies in the interval
(.0773-1*.0043, .0773+1*.0043) = (.0730, .0816)
There is a 95% chance that A lies in the interval
(.0773-2*.0043, .0773+2*.0043) = (.0687, .0859)
There is a 99.7% chance that A lies in the interval
(.0773-3*.0043, .0773+3*.0043) = (.0644, .0902)
More trials = smaller SE = smaller intervals = better idea what the actual
value is. One could make similar intervals for the second position.
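A rough sketch of the interval arithmetic above, in Python. The function name `confidence_interval` is just an illustrative helper of mine, not anything provided by GNU Backgammon; the numbers are the equity and standard error of play 1 from the rollout.

```python
def confidence_interval(estimate, std_err, k):
    """Return (low, high) for an interval reaching k standard errors each way."""
    return (estimate - k * std_err, estimate + k * std_err)

equity, se = 0.0773, 0.0043  # play 1: 8/7(2), 6/5(2)

# 1, 2 and 3 standard errors correspond to roughly 68%, 95% and 99.7%.
for k, level in [(1, "68%"), (2, "95%"), (3, "99.7%")]:
    low, high = confidence_interval(equity, se, k)
    print(f"{level}: ({low:.4f}, {high:.4f})")
```

Swapping in the second play's figures (.0615 and .0042) gives the corresponding intervals for that position.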
When you're comparing two positions, what you actually want is the
difference between the two equities: we'll call this A - B. Our best guess
for this difference is, not surprisingly, the difference between the
guesses for A and B: .0158. It turns out you can't just add the two errors
of A and B to get the error for (A - B): it's given by SE(A - B) =
squareroot(SE(A)^2 + SE(B)^2)**. (This is the joint standard error.) In our
case this comes out to about .00601. Suppose we were to now make a 95%
confidence interval for A - B: it would be
(.0158 - 2*.0060, .0158 + 2*.0060) = (.0038, .0278)
The entire interval is greater than 0; that means we can be 95% sure the
actual difference between the equities is positive. Actually, you can even
say that we are 97.5% sure by symmetry (there is a 2.5% chance it's below
.0038, and a 2.5% chance it's higher than .0278--in the latter case it's
still positive)***.
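The joint standard error calculation can be sketched the same way. This assumes the two rollouts are independent of each other; `joint_standard_error` is a hypothetical helper name, and the inputs are the two plays from the rollout.

```python
import math

def joint_standard_error(se_a, se_b):
    """Combine two standard errors, assuming the two estimates are independent."""
    return math.sqrt(se_a ** 2 + se_b ** 2)

equity_a, se_a = 0.0773, 0.0043  # 8/7(2), 6/5(2)
equity_b, se_b = 0.0615, 0.0042  # 23/22, 11/10, 6/5(2)

diff = equity_a - equity_b
jse = joint_standard_error(se_a, se_b)

# Interval of 2 joint standard errors either side of the estimated difference.
low, high = diff - 2 * jse, diff + 2 * jse
print(f"difference = {diff:.4f}, JSE = {jse:.5f}")
print(f"approx. 95% interval for A - B: ({low:.4f}, {high:.4f})")
```

If `low` comes out above zero, the whole interval is positive, which is the situation described above.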
Instead of creating intervals, it's usually easier to go in the reverse
direction: you have your estimate of A - B (.0158), you have SE(A - B) =
.00601, so you can see how many (joint) standard errors you are away from
0. In this case we're at .0158/.00601 = 2.63 standard errors; you can
convert this to a probability by doing an integral numerically**** to
discover that there's a 99.6% chance that A is better than B, and a .4%
chance that B is better than A. Reasonably good estimates to remember:
0 JSEs: 50% chance that top position is better (in other words: TCTC)
1 JSE: 84.1% chance that top position is better
2 JSEs: 97.7% chance that top position is better
3 JSEs: 99.9% chance that top position is better
Of course, it is up to you what percentage you think is significant. You'd
probably get eyed suspiciously claiming something is true with only 1 JSE,
but with 3+ JSEs few would doubt the veracity of your claim.
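For anyone who would rather compute the probability than look it up, here is a minimal sketch using only Python's standard library. It assumes the rollout error is approximately normally distributed; `prob_top_play_better` is an illustrative name of mine, and the integral is evaluated through `math.erf`.

```python
import math

def prob_top_play_better(diff, jse):
    """Normal CDF evaluated at diff/jse: chance that the true difference is > 0."""
    z = diff / jse
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# The rollout above: a difference of .0158 with a joint standard error of .00601.
print(f"{prob_top_play_better(0.0158, 0.00601):.1%}")

# The same function at whole numbers of joint standard errors.
for n_jse in (0, 1, 2, 3):
    print(n_jse, f"{prob_top_play_better(n_jse, 1.0):.1%}")
```

Passing a negative difference gives the chance that the second-listed play is actually the better one.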
I hope that helped?
--Bryce
* 95% is actually more like 1.96 standard errors.
** This is because, while standard errors do not add, the variances (which
are the squares of the standard errors) do, provided the two rollouts are
independent: Var(A-B) = Var(A+B) = Var(A) + Var(B).
*** I might be confusing one- and two-tailed tests here, but it looks right
to me.
**** Or just use this applet:
http://psych.colorado.edu/~mcclella/java/normal/normz.html
and enter "2.63" or "-2.63" into the z-score box.
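The claim in footnote ** (variances add, standard errors do not) can be checked with a small simulation. This is only a sketch under stated assumptions: the errors of the two rollouts are modelled as independent normal draws, and the seed and sample size are arbitrary choices of mine.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
se_a, se_b = 0.0043, 0.0042
n = 100_000

# Simulated error of rollout A minus simulated error of rollout B.
diffs = [rng.gauss(0.0, se_a) - rng.gauss(0.0, se_b) for _ in range(n)]

observed = statistics.pstdev(diffs)          # spread of the simulated differences
predicted = (se_a ** 2 + se_b ** 2) ** 0.5   # square root of the summed variances
naive_sum = se_a + se_b                      # what simply adding the errors would give

print(f"observed {observed:.5f}, predicted {predicted:.5f}, plain sum {naive_sum:.5f}")
```

The observed spread should land close to the predicted value and well below the plain sum.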