Score not propagating between moves #26

CivilizationalAgency · 2024-09-12T01:30:35Z

Since yesterday I've noticed that scores no longer propagate accurately back between moves the way they used to (while still allowing some loss/regression toward 0 to account for uncertainty). As a result, the score contradicts itself from one move to the next and the move ranking is completely wrong, because the score of a move is no longer given by the evaluation of the final move of the best line.

noobpwnftw · 2024-09-12T03:02:29Z

I've changed the score backup function to a more well-defined weighted averaging scheme. It is expected to propagate leaf scores back to the root more accurately; however, this change can take some time to reach every line.

Bratish971 · 2024-09-13T07:22:09Z

Does this mean that if the score at the end of the main line is 0, the score at the start of the line can be different from 0?

CivilizationalAgency · 2024-09-13T23:25:27Z

It does not seem to be propagating even between consecutive moves. For instance, in the chess database the strongest first move at the time of writing has a score of 6, but the strongest responses from Black have a score of -1. Previously the largest discrepancy between consecutive moves was 2 points, if I recall correctly.

robertnurnberg · 2024-09-14T12:52:49Z

Note that the score of the best move is no longer equal to the "evaluation" of that position on cdb. The evaluation of the position is now a weighted average of the move scores, with weights based on the softmax function (https://en.wikipedia.org/wiki/Softmax_function). For the position after 1. d4, we get this weighted average:

> python cdbeval.py --san "1. d4"
move:  g8f6, score:   -1, weight: 1.000000
move:  d7d5, score:   -1, weight: 1.000000
move:  e7e6, score:   -3, weight: 0.818731
move:  c7c6, score:   -9, weight: 0.449329
move:  d7d6, score:  -13, weight: 0.301194
move:  g7g6, score:  -17, weight: 0.201897
move:  f7f5, score:  -21, weight: 0.135335
move:  a7a6, score:  -26, weight: 0.082085
move:  c7c5, score:  -29, weight: 0.060810
move:  b8c6, score:  -29, weight: 0.060810
move:  h7h6, score:  -71, weight: 0.000912
move:  a7a5, score:  -72, weight: 0.000825
move:  b8a6, score:  -75, weight: 0.000611
move:  b7b6, score:  -79, weight: 0.000410
move:  g8h6, score: -105, weight: 0.000030
move:  h7h5, score: -114, weight: 0.000012
move:  b7b5, score: -126, weight: 0.000004
move:  e7e5, score: -140, weight: 0.000001
move:  f7f6, score: -143, weight: 0.000001
move:  g7g5, score: -227, weight: 0.000000
Weighted eval:  -5.971027695491816

If you want to test this for other positions as well, you can use this script: cdbeval.py.
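
The weights printed above are consistent with weight = exp((score - best) / T) for T = 10. That value is an inference from the printed numbers, not something read from cdbeval.py or cdb itself. A minimal sketch under that assumption (the names `softmax_eval` and `SCORES_AFTER_D4` are hypothetical) reproduces the weighted eval:

```python
import math

# Scores of Black's replies after 1. d4, copied from the output above.
SCORES_AFTER_D4 = [-1, -1, -3, -9, -13, -17, -21, -26, -29, -29,
                   -71, -72, -75, -79, -105, -114, -126, -140, -143, -227]


def softmax_eval(scores, temperature=10.0):
    """Softmax-weighted average of move scores.

    Assumption: temperature=10 is inferred from the printed weights
    (e.g. exp((-3 - (-1)) / 10) = 0.818731); it is not taken from cdb.
    """
    best = max(scores)
    # Subtracting the best score makes the top move's weight exactly 1.
    weights = [math.exp((s - best) / temperature) for s in scores]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


eval_after_d4 = softmax_eval(SCORES_AFTER_D4)
print(f"Weighted eval: {eval_after_d4:.2f}")  # about -5.97, as in the output above
```

Seen from White's side this is roughly +5.97, which would be consistent with the score of 6 reported earlier in the thread for the strongest first move, if that move is 1. d4, even though Black's best replies score -1.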

CivilizationalAgency · 2024-09-14T19:12:46Z

Thank you for the response! I understand that giving greater weight to suboptimal moves makes the score more robust to an incorrectly evaluated best response, so scores should be more stable; the tradeoff is that the weight of the strongest move is diluted. Intuitively, this is most useful for moves evaluated to a shallower depth, where uncertainty is greater; conversely, for moves evaluated to a greater depth you would just use the best response. I see the temperature parameter in the script. Where does it come from? It would make sense to me if it were inversely related to evaluation depth, but this doesn't seem to be the case, since even for the first moves (e.g. 1. d4) the weight of the best response is diluted. Or is it just because the change hasn't reached these lines yet, as @noobpwnftw mentioned?
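
The dilution described here can be illustrated with a small sketch. It assumes the evaluation is a softmax-weighted average with weights of the form exp((score - best) / T), as the printed weights suggest. The function name `weighted_eval` and the temperatures other than 10 are hypothetical choices for illustration, not values used by cdb.

```python
import math

# Black's reply scores after 1. d4, from the cdbeval.py output earlier in this thread.
scores = [-1, -1, -3, -9, -13, -17, -21, -26, -29, -29,
          -71, -72, -75, -79, -105, -114, -126, -140, -143, -227]


def weighted_eval(scores, temperature):
    """Softmax-weighted average of scores at the given temperature."""
    best = max(scores)
    weights = [math.exp((s - best) / temperature) for s in scores]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Illustrative temperatures only; 10 is the value inferred from the printed weights.
evals = {t: weighted_eval(scores, t) for t in (1.0, 10.0, 100.0)}
for t, value in evals.items():
    print(f"temperature {t:>5}: weighted eval {value:.2f}")
```

As the temperature approaches 0, the result approaches the best reply's score (-1 here), so the position's evaluation again equals the score of its best move; larger temperatures pull it toward the plain average of all replies.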

robertnurnberg · 2024-09-14T20:03:55Z

Yes, the script uses the same (global) temperature as cdb. For a more detailed discussion of the pros and cons you could join the chessdb channel on the stockfish discord server: https://discord.com/channels/435943710472011776/1101022188313772083

CivilizationalAgency · 2024-09-14T21:06:23Z

Has the use of a dynamic temperature as a function of PV depth been considered to restore a more useful score for positions with a high eval depth/low uncertainty?

noobpwnftw · 2024-09-14T23:09:15Z

I don't have a way to estimate that. I guess that, given time, the problem will solve itself.
