Search interval

Abstract

This study shows the effect of search interval on the precision of the analysis. It focuses only on 3-ply eXtreme Gammon analysis. It shows the difference between Normal search interval and analyzing all 32 top moves results in a loss of 0.0034 equity per game and a loss of 3 Elo points.

Search Interval

Important note: this article was initial writen at the time of eXtreme Gammon version 1. for version 2 data check at the end of the article.
Backgammon programs rely on search interval to reduce the amount of moves to be analyzed at higher level. This allows much faster analyze as typically only 4 moves get analyzed in 3-ply. The downside of it is that it is possible the best move is missed. Here are how the Normal (default) search interval work in eXtreme Gammon in 3-ply (see theAppendix for detail about other plies):

  • 1. All moves are analyzed in 1-ply (direct neural network output) cubeless.
  • 2. The top 32 moves are analyzed in 1-ply cubeful.
  • 3. Up to 8 moves within 0.160 equity of the top move are analyzed in 2-ply
  • 4. Up to 4 moves within 0.080 equity of the top move are analyzed in 3-ply
The large interval uses a factor 1.5 for all values in steps 3 and 4:
  • Up to 12 moves within 0.240 equity of the top move are analyzed in 2-ply
  • Up to 6 moves within 0.120 equity of the top move are analyzed in 3-ply
The huge interval uses a factor 2 for all values in steps 3 and 4:
  • Up to 16 moves within 0.320 equity of the top move are analyzed in 2-ply
  • Up to 8 moves within 0.160 equity of the top move are analyzed in 3-ply
For the purpose of this study a new search interval is defined: infinite. It analyzes all 32 moves after step 1 in 3-ply. This mode is not available to users.


Sample set

The sample used for this study is made of

  • 4236 money games
  • 924 one point matches
  • 616 3 point matches
  • 308 5 point matches
  • 308 7 point matches
  • 308 11 point matches

  • These games are obtained by letting eXtreme Gammon self play using 3-ply, infinite interval.
    These represent 11,406 games and 190,076 not trivial move decisions.


    Measurements

    To determine the cost of a specific search interval we check the move that would have been played using that search interval and compare to the one made by the infinite mode.
    The equity lost is aggregated. Note that is possible that at higher level (XGR+ for instance) that the best move can be different than the one pick by 3-ply infinite, however we are making the assumption that the sum of these errors will, in the long run, tend to 0.


    Result

    Here are the results obtained:

      Total equity lost Equity lost per game Equity lost per move Elo difference to 3-ply infinite
    3-ply Normal 38.634 0.0034 0.00017 2.99
    3-ply Large 17.833 0.0016 0.00008 1.37
    3-ply Huge 10.281 0.0009 0.00005 0.81
    For the normal interval:
  • There are 4163 moves that are played differently which is one every 2.74 games
  • Of these, 500 are errors (more than 0.020 equity lost), which is one error every 22.81 games
  • Of these, 49 are blunders (more than 0.080 equity lost), which is one error every 232.8 games

  • Confidence of the analyze:

    As stated the average equity lost per game is 0.0034, the standard deviation is 0.0179. This result in a 95% confidence interval of 0.0003, in term of Elo it means 0.29.
    So we can be confident that the normal search interval error is between 0.0031 and 0.0037 (or the Elo cost between 2.70 and 3.28)

    Distribution of total errors (normal interval) per game

    Error % of games
    No error 80.17%
    Less than 0.005 equity 7.00%
    Between 0.005 and 0.010 4.22%
    Between 0.010 and 0.020 3.93%
    Between 0.020 and 0.030 1.71%
    Between 0.030 and 0.040 1.01%
    Between 0.040 and 0.050 0.60%
    Between 0.050 and 0.060 0.36%
    Between 0.060 and 0.070 0.26%
    Between 0.070 and 0.080 20%
    Between 0.080 and 0.090 0.09%
    Between 0.090 and 0.100 0.07%
    More than 0.100 0.38%
    The worse game has a whooping 0.652 equity lost with 10 errors, 5 of them being blunder. The game is an extreme backgame (71 moves for each player) with all 15 checkers being in the opponent home board at one point. It is interesting to compare the error due to the Normal search interval (0.0034 per game) to the average error made by the 3-ply evaluation (from Michael Depreli study: 0.0271 per game). The average error is 8 times bigger than the error added by the search interval.


    Time consideration

    Increasing the search interval slows the analysis considerably.

    Interval Speed compare to 3-ply Normal Elo Gained
    3-ply Large 1.31 time slower 1.62
    3-ply Huge 1.66 time slower 2.18
    3-ply infinite 5 time slower 2.99
    Obviously the choice of use larger interval than normal is ultimately the choice of the user, but our advice is to use the Normal search


    Rollout consideration and a practical example

    You can choose which search interval to use in a rollout. The rollout will be slower as shown in the table above.

    Again, the user has ultimate choice but our recommendation is to use 3-ply normal, the time gained compare to 3-ply huge can be used to increase the rollout length (resulting in 1.29 reduction of the confidence interval).

    In average there is a difference every 45 moves. If there is an error that could make a noticeable effect it will need to be in the next move or the following one. I think a much better solution than to use larger interval is to set the first move to use a higher ply: 4-ply will analyze up to 8 moves in 3-ply (within 0.160) so it is equivalent to a Huge interval. On top of that it still makes a 4-ply analyze. The speed cost is not that bad as the first move need to be calculated only 21 times (the rest of the time the program will get the result from the cache). For the rest of the rollout using the Normal interval keep the rollout fast.

    Let’s examine an example: consider the following position and its 2-ply analyze





     is Player 2

    score: 0
    pip: 75
    Money session
    pip: 51
    score: 0

     is Player 1
    XGID=-CDBBaBB------a----bbbbcb-:1:-1:1:62:0:0:0:0:10
     to play 62

    1. 2 ply 7/1 7/5* eq: +0.309
    Player:
    Opponent:
    67.21% (G:1.04% B:0.00%)
    32.79% (G:1.20% B:0.02%)
    2. 2 ply 7/1 4/2 eq: +0.191 (-0.118)
    Player:
    Opponent:
    62.36% (G:0.05% B:0.00%)
    37.64% (G:1.27% B:0.02%)
    3. 2 ply 7/1 3/1 eq: +0.179 (-0.130)
    Player:
    Opponent:
    61.83% (G:0.05% B:0.00%)
    38.17% (G:1.36% B:0.02%)
    4. 1 ply 7/1 6/4 eq: -0.382 (-0.691)
    Player:
    Opponent:
    42.73% (G:0.10% B:0.00%)
    57.27% (G:4.20% B:0.06%)

    In this case the 3-ply Normal search interval will not analyze any further as the 2nd move is more than 0.080 away from the first. The Large interval will analyze the 2nd move only. and the Huge interval will analyze the top 3 moves.

    1. 3 ply 7/1 4/2 eq: +0.225
    Player:
    Opponent:
    63.63% (G:0.06% B:0.00%)
    36.37% (G:1.14% B:0.01%)
    2. 3 ply 7/1 3/1 eq: +0.221 (-0.004)
    Player:
    Opponent:
    63.47% (G:0.04% B:0.00%)
    36.53% (G:1.21% B:0.01%)
    3. 3 ply 7/1 7/5* eq: +0.125 (-0.100)
    Player:
    Opponent:
    58.92% (G:1.92% B:0.01%)
    41.08% (G:1.27% B:0.02%)
    4. 1 ply 7/1 6/4 eq: -0.382 (-0.607)
    Player:
    Opponent:
    42.73% (G:0.10% B:0.00%)
    57.27% (G:4.20% B:0.06%)

    It now realizes the other moves are much better (rollout shows the blunder to be 0.085).

    Let’s see how this has an effect on rollouts. Let’s back up the position one move and roll the choice leading to the position above.

    Out of 21 possible dice, the 62 will make an error of 0.085. The other rolls are played properly.

    It means the average error for the next roll is 0.085/18=0.0047

    We can expect that error to be carried around to the RO.





     is Player 2

    score: 0
    pip: 81
    Money session
    pip: 51
    score: 0

     is Player 1
    XGID=aCDBB-BB-----a-----bbbbcb-:1:-1:-1:51:0:0:0:0:10
     to play 51

    1. Rollout1 Bar/20 12/11 eq: -0.778
    Player:
    Opponent:
    10.50% (G:0.11% B:0.00%)
    89.50% (G:0.81% B:0.01%)
    Conf: ± 0.001 (-0.779...-0.777)
    Duration: 1 minute 04 seconds
     
    1 20736 Games rolled with Variance Reduction.
    Rolled for both No double and Double
    Moves and cube decisions: 3 ply

    Now let’s use the huge interval


    1. Rollout1 Bar/20 12/11 eq: -0.782
    Player:
    Opponent:
    10.24% (G:0.12% B:0.00%)
    89.76% (G:0.73% B:0.01%)
    Conf: ± 0.001 (-0.783...-0.781)
    Duration: 1 minute 36 seconds
     
    1 20736 Games rolled with Variance Reduction.
    Rolled for both No double and Double
    Moves and cube decisions: 3 ply
    Search interval: Huge

    As expected the error was carried on because the error happens on the very 1st move (note that the rollout was 1.5 slower).

    Let’s try using 4-ply for the first move (normal interval) and 3-ply for the rest (normal interval)


    1. Rollout1 Bar/20 12/11 eq: -0.782
    Player:
    Opponent:
    10.25% (G:0.12% B:0.00%)
    89.75% (G:0.78% B:0.01%)
    Conf: ± 0.001 (-0.783...-0.781)
    Duration: 1 minute 12 seconds
     
    1 20736 Games rolled with Variance Reduction.
    Rolled for both No double and Double
    First 1 moves: 4 ply, cube decisions: 3 ply
    Remaining moves and cube Decisions: 3 ply

    Using that mode find the best move and the error is not showing. Note that this is faster than the 3-ply Huge.

    Now let’s back up another move and let’s see the influence of the error when it happens 2 moves ahead





     is Player 2

    score: 0
    pip: 81
    Money session
    pip: 60
    score: 0

     is Player 1
    XGID=aCDBB-BA-----a--A--bbbbcb-:1:-1:1:54:0:0:0:0:10
     to play 54

    1. Rollout1 16/7 eq: +0.784
    Player:
    Opponent:
    89.37% (G:1.74% B:0.01%)
    10.63% (G:0.11% B:0.00%)
    Conf: ± 0.001 (+0.783...+0.785)
    Duration: 1 minute 19 seconds
     
    1 20736 Games rolled with Variance Reduction.
    Moves and cube decisions: 3 ply

    1. Rollout1 16/7 eq: +0.784
    Player:
    Opponent:
    89.41% (G:1.73% B:0.01%)
    10.59% (G:0.11% B:0.00%)
    Conf: ± 0.001 (+0.783...+0.785)
    Duration: 1 minute 50 seconds
     
    1 20736 Games rolled with Variance Reduction.
    Moves and cube decisions: 3 ply
    Search interval: Huge

     The error is not visible anymore. This makes sense as the 0.085 error happens only four time of 1296, creating a error of 0.000262 equity (or 0.013% which can be seen in the winning percentage)


    Note about cube

    The search interval affects only checker play. However when using eXtreme Gammon 3-ply, cube decisions are first analyzed in 1-ply (cubeful). If the decision of not doubling is 0.200 more than doubling, the analysis stops. If less than 0.200 then a 3-ply is preformed.
    In the process of this study we check also the effect of the 0.200 threshold by forcing all cube decisions to go to 3-ply.
    On the sample set, there are 21434 non obvious cube decisions (non obvious meaning less than 0.200 equity between No Double and Double)
    Using the 0.200 threshold results in 3 cube error in match play (0.019, 0.019 and 0.017) and 3 in Money games (0.052, 0.044 and 0.033)
    This represents an error every 3692 decisions, or every 1901 games. The estimated cost in Elo term is 0.06.
    It could be an idea of improvement to apply the search interval factor to the 0.200 threshold. In this case the large interval (0.300 thresholds) would result in a single error, while the huge (0.400 threshold) would catch them all. But again, the gain is very minimal.
    Note: Version 1.14 did introduce the fact that the search interval modify the cube threshold. So with a Huge interval the threshold to use 3-ply is 0.400.


    Conclusion

    The default search interval yield very good results and provide an excellent compromise between speed and accuracy.
    The effect on the analysis is less than 3 Elo points (or 0.09 PR). For rollouts, in rare case it can generate a little bias on the results (0.005 about once every 3800 positions).
    My advice is to keep it that way and use the time gain to either switch to a 2 passes analysis or increase the length of your Rollouts.


    Version 2 data

    Version 2 is much stronger and accurate than version 1 (for instance the error rate of 3-ply was almost divided by 2). Because the evaluation is better the effect of serach interval is much lesser than with version 1.
    For 3 ply, Normal Search interval is only 0.95 Elo (0.03 PR) weaker than with an infinite Search Interval (XG1 was 2.99 weaker).
    For XGR++ Normal Search interval is 0.59 Elo (0.018 PR) weaker than with an infinite Search Interval


    Appendix

    The following table show how the search interval works for every level.

    A line such as “3-ply 8 moves – 0.160” means: Run a 3-ply on the top 8 moves within the 0.160 equity of the best move.

    Normal Large x1.5 Huge x2 Gigantic x4

    3-ply 2-ply 8 moves – 0.160
    3-ply 4 moves – 0.080
    2-ply 12 moves – 0.240
    3-ply 6 moves – 0.120
    2-ply 16 moves – 0.320
    3-ply 8 moves – 0.160
    2-ply 32 moves – 0.640
    3-ply 16 moves – 0.320

    4-ply 2-ply 16 moves – 0.320
    3-ply 8 moves – 0.160
    4-ply 4 moves - 0.080
    2-ply 24 moves – 0.480
    3-ply 12 moves – 0.240
    4-ply 6 moves - 0.120
    2-ply 32 moves – 0.640
    3-ply 16 moves – 0.320
    4-ply 8 moves - 0.160
    2-ply 32 moves – 1.280
    3-ply 32 moves – 0.640
    4-ply 16 moves - 0.320

    5-ply 2-ply 32 moves – 0.320
    3-ply 16 moves – 0.160
    4-ply 8 moves - 0.080
    5-ply 4 moves - 0.080
    2-ply 32 moves – 0.480
    3-ply 24 moves – 0.240
    4-ply 12 moves - 0.120
    5-ply 6 moves - 0.120
    2-ply 32 moves – 0.640
    3-ply 32 moves – 0.320
    4-ply 16 moves - 0.160
    5-ply 8 moves - 0.160
    2-ply 32 moves – 1.280
    3-ply 32 moves – 0.640
    4-ply 32 moves - 0.320
    5-ply 16 moves - 0.320

    XGR/XGR+ 2-ply 16 moves – 0.320
    3-ply 8 moves – 0.160
    XGR(+) 4 moves - 0.040
    2-ply 24 moves – 0.480
    3-ply 12 moves – 0.240
    XGR(+) 6 moves - 0.060
    2-ply 32 moves – 0.640
    3-ply 16 moves – 0.320
    XGR(+) 8 moves - 0.080
    2-ply 32 moves – 1.280
    3-ply 32 moves – 0.640
    XGR(+)16 moves - 0.160

    XGR++ 2-ply 16 moves – 0.320
    3-ply 8 moves – 0.160
    4-ply 4 moves - 0.080
    XGR++ 4 moves - 0.040
    2-ply 24 moves – 0.480
    3-ply 12 moves – 0.240
    4-ply 6 moves - 0.120
    XGR++ 6 moves - 0.060
    2-ply 32 moves – 0.640
    3-ply 16 moves – 0.320
    4-ply 8 moves - 0.160
    XGR++ 8 moves - 0.080
    2-ply 32 moves – 1.280
    3-ply 32 moves – 0.640
    4-ply 16 moves - 0.320
    XGR++ 16 moves - 0.160