Apply flooring and half-millisecond-adjustments to hit windows #33882

Open
wants to merge 2 commits into
base: master

bdach
Collaborator

@bdach bdach commented Jun 25, 2025

Split out from #33078

This is a "two-birds-with-one-stone" change, which addresses both #28744 and #11311 simultaneously.

  • The replay stability issue caused by time instants being rounded to the nearest integer is fixed by this: flooring the hit window threshold and then subtracting/adding 0.5 makes it impossible for a judgement to switch to anything else after replay rounding is applied. All hit windows are always a full integer plus 0.5 milliseconds, which immunizes them against rounding-to-full-ms issues.

  • The direction of applying the 0.5 adjustment additionally fixes the disparity with stable - in osu! and taiko 0.5 is subtracted as hit window ranges in those rulesets are exclusive on stable, while in mania 0.5 is added, as the hit window ranges there are inclusive on stable.
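The two points above can be sketched in a few lines of Python. This is a minimal illustration rather than the code from this PR: the function names are hypothetical, hit offsets are compared with `<=` against the adjusted window, and object times are assumed to sit on whole milliseconds so that rounding a timestamp is equivalent to rounding the hit offset. Exact `.5` ties are skipped because their outcome depends on the rounding mode.

```python
import math


def adjusted_window(nominal_ms: float, inclusive: bool) -> float:
    """Floor the nominal window, then shift it by half a millisecond.

    inclusive=False models the exclusive ranges described for osu! and taiko
    (0.5 subtracted); inclusive=True models mania (0.5 added).
    """
    floored = math.floor(nominal_ms)
    return floored + 0.5 if inclusive else floored - 0.5


def within(offset_ms: float, window_ms: float) -> bool:
    """True if a hit at this offset from the object falls inside the window."""
    return abs(offset_ms) <= window_ms


def count_flips(window_ms: float, lo_ms: int, hi_ms: int) -> int:
    """Count offsets (in 0.01 ms steps) whose judgement changes after rounding."""
    flips = 0
    for hundredths in range(lo_ms * 100, hi_ms * 100):
        if hundredths % 100 == 50:
            continue  # exact .5 ties depend on the rounding mode; skipped here
        offset = hundredths / 100
        if within(offset, window_ms) != within(round(offset), window_ms):
            flips += 1
    return flips


nominal = 300.9  # fractional window taken from the example in this description
print(count_flips(nominal, 298, 303))                          # unadjusted window
print(count_flips(adjusted_window(nominal, False), 298, 303))  # floored, minus 0.5
```

With the unadjusted 300.9 ms window, offsets between 300.5 ms and 300.9 ms count as hits before rounding but land on 301 ms afterwards, so the judgement flips. With the window at 299.5 ms, every non-tie offset rounds to an integer on the same side of the edge.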

As should be obvious, this materially changes hit windows. To what degree this is a significant change is up for discussion; I would say it is not, since hitting half-a-millisecond changes would require 2000fps input recording, and we're still timestamping inputs using the update thread's clock, which gives a 1ms resolution at best.

In the worst case, in osu! and taiko, this can change a hit window range by 1.5ms (e.g. 300.9ms -> floored to 300ms -> 299.5ms after subtraction of the half). It's more than the best-case resolution of input timestamps, but not by much. Considering how cleanly this resolves the issues in question, I see it as an acceptable tradeoff.
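The size of the change can be worked through directly. The sketch below uses hypothetical helper names and assumes only the floor-then-shift rule described above. For osu! and taiko the window shrinks by at least 0.5 ms and by strictly less than 1.5 ms; the 300.9 ms example comes out at 1.4 ms, and 1.5 ms is the limit approached as the fractional part nears 1.

```python
import math


def osu_taiko_shrink(nominal_ms: float) -> float:
    """How much an osu!/taiko window shrinks: nominal - (floor(nominal) - 0.5)."""
    return nominal_ms - (math.floor(nominal_ms) - 0.5)


def mania_growth(nominal_ms: float) -> float:
    """How much a mania window grows: (floor(nominal) + 0.5) - nominal."""
    return (math.floor(nominal_ms) + 0.5) - nominal_ms


for nominal in (300.0, 300.5, 300.9, 300.999):
    print(nominal, round(osu_taiko_shrink(nominal), 3), round(mania_growth(nominal), 3))
```

Note that under this rule a fractional mania window can also end up slightly smaller than before (negative growth), because the floor removes more than the added half millisecond restores.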

Notably some tests are adjusted here. The replay stability tests needed adjustments because hit windows have been materially changed. What matters in the replay stability tests is covering the time instants near the hit window edges and ensuring that re-encode doesn't mutate the resulting judgements, not what the particular numbers used are.

Stable cross-reference, if you desire it:

bdach added 2 commits June 25, 2025 11:44
@Detze
Contributor

Detze commented Jun 25, 2025

> As should be obvious, this materially changes hit windows. To what degree this is a significant change is up for discussion; I would say it is not, since hitting half-a-millisecond changes would require 2000fps input recording, and we're still timestamping inputs using the update thread's clock, which gives a 1ms resolution at best.

Is it actually true that (on integer ODs [1]) the real world time hit windows are unchanged before and after this 0.5 ms rounding change?

> In the worst case, in osu! and taiko, this can change a hit window range by 1.5ms (e.g. 300.9ms -> floored to 300ms -> 299.5ms after subtraction of the half). It's more than the best-case resolution of input timestamps, but not by much. Considering how cleanly this resolves the issues in question, I see it as an acceptable tradeoff.

Are (assuming the above is true, only non-integer OD [1]) scores set with and without this change going to be handled differently during pp calculation? Otherwise, past scores (as well as scores set by simply choosing to play on an older client?) are advantaged (disadvantaged on mania) compared to future scores. Creating disparities like these is certainly something that should be avoided. It's not just related to pp either, but also to beatmap leaderboard position (/score/accuracy [2]/etc.). Non-integer OD [1] is not at all niche or uncommon (especially with Difficulty Adjust, which is fortunately not currently ranked). Any non-astronomical difference in real world time hit windows, even 1 ms or 0.5 ms, is a significant change in hit accuracy, as each score usually contains hundreds if not thousands of hits, and many of them will inevitably be at hit window edges; the replay playback inaccuracy issue serves as a clear illustration of that.

Personally, I would suggest only applying this change to the Classic mod, as "legacy hit windows". IMO lazer's mapping of OD to real world time hit windows is superior to stable's (simpler and more intuitive, particularly at non-1x speed rates). Current lazer scores already use this mapping; creating a situation where it varies depending on client version does not sound like a good idea. Restricting the change to Classic also sidesteps the above issue of certain scores being advantaged.

Footnotes

  1. To be precise, any situation where nominal hit windows are / are not fractional. Non-integer OD is not the only case of non-integer hit windows; lazer mania Perfect judgement hit windows are another example.

  2. Including players who care about specifically SS scores.

@bdach
Collaborator Author

bdach commented Jun 25, 2025

> Is it actually true that (on integer ODs) the real world time hit windows are unchanged before and after this 0.5 ms rounding change?

No? I've never claimed that the windows are unchanged. At most I've argued that they're essentially unchanged because of input timestamping foibles.

> Are (assuming the above is true, only non-integer OD) scores set with and without this change going to be handled differently during pp calculation?

My honest approach to this is to say no, nothing should be handled specially, however "theoretically wrong" it is. Just for simplicity's sake.

My position is: We're going back to how stable does things. That's it. I argue that nobody should be significantly harmed by this. That statement probably isn't completely true, and probably cannot be proven or disproven at this point, but it is made in the interest of decreasing complexity and preserving our collective sanity. I don't want to go on a five-month expedition to determine how best to fairly recompense anyone who may have been wronged by any of this.

If I'm brutally honest, your reply (or what I understood of it, which is admittedly not much) sounds like an attempt to lobby towards having your solution (#26452) go forward. To me, the fact that it's been sitting there for over a year and a half, unreviewed by anyone but me, is an indicator that your approach, however "theoretically correct", is just too complicated to go forward with.

@Natelytle
Contributor

For what it's worth, I agree with bdach here as a neutral(-ish) third party.

Fractional hit windows just don't really come with any benefits, and it would mean that unless replays got changed to store double value hit timings (which I expect would cause a lot of floating point nightmare scenarios in the long run), all lazer scores set from this PR onward would continue to be plagued by unfixable replay errors.

Additionally, as someone with experience doing work on the PP system, the magnitude of the advantage is too small to matter IMO. Running a score of mine with this PR, it does go up by 7x100, however that amounts to a difference of 12pp. Considering there has been no uprising about balancing concerns between the current lazer accuracy system and the stable accuracy system, I doubt there will be an uprising about this minuscule "advantage" either.

@peppy
Member

peppy commented Jun 26, 2025

Here's a simple way of looking at this (we do not need to consider input recording frames or anything else – it's irrelevant as at the end of the day frame pacing will cause this to all be random anyways):

Until now, lazer has been more lenient for osu! and osu!taiko, and less lenient for osu!mania. This change reduces lazer hit windows to match stable. The reductions are all within (or just outside of) error margin:

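osu! (OD 5)

| Judgement | Reduction (%) |
| --- | --- |
| GREAT | 1.00% |
| OK | 0.50% |
| MEH | 0.33% |

osu! (OD 9)

| Judgement | Reduction (%) |
| --- | --- |
| GREAT | 2.50% |
| OK | 1.27% |
| MEH | 0.74% |

osu!mania (OD 5)

| Judgement | Reduction (%) |
| --- | --- |
| PERFECT | -0.26% |
| GREAT | -0.51% |
| GOOD | -0.61% |
| OK | -0.45% |
| MEH | -0.37% |
| MISS | -0.29% |

osu!mania (OD 9.3)

| Judgement | Reduction (%) |
| --- | --- |
| PERFECT | -0.58% |
| GREAT | -0.83% |
| GOOD | -0.65% |
| OK | -0.40% |
| MEH | -0.20% |
| MISS | -0.12% |

osu!taiko (OD 7.8)

| Judgement | Reduction (%) |
| --- | --- |
| GREAT | 2.07% |
| OK | 1.11% |
| MISS | 0.31% |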
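As a sanity check, the osu! (OD 5) figures are consistent with a plain 0.5 ms subtraction. The sketch below assumes nominal windows of 50, 100 and 150 ms for GREAT, OK and MEH at OD 5; those values are not stated in this thread and are an assumption, as is the helper name. It does not attempt to reproduce the other tables, whose underlying window values are not given here.

```python
import math

# Assumed nominal osu! hit windows at OD 5, in milliseconds (not from this thread).
ASSUMED_OSU_OD5_WINDOWS_MS = {"GREAT": 50.0, "OK": 100.0, "MEH": 150.0}


def reduction_percent(nominal_ms: float) -> float:
    """Relative shrink of a window after flooring and subtracting 0.5 ms."""
    adjusted_ms = math.floor(nominal_ms) - 0.5
    return (nominal_ms - adjusted_ms) / nominal_ms * 100


for judgement, window_ms in ASSUMED_OSU_OD5_WINDOWS_MS.items():
    print(f"{judgement}: {reduction_percent(window_ms):.2f}%")
```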

But we also need to consider that players hit with a roughly normal distribution around the actual hitobject time, meaning that for an average user, the hits actually affected by these changes are not uniformly distributed across the window. As in, you cannot say that with these changes, on a non-perfect play, the change in judgement accuracy should match the numbers above. The actual percentage changes will be much smaller.
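This point can be illustrated with a rough model. The sketch below assumes hit errors follow a zero-mean normal distribution with a hypothetical standard deviation of 20 ms, and uses an assumed 50 ms window shrinking to 49.5 ms; none of these numbers or names come from the thread. It compares the relative window reduction with the share of hits that actually land in the removed half-millisecond band.

```python
import math


def fraction_within(window_ms: float, sigma_ms: float) -> float:
    """P(|error| <= window) for error ~ Normal(0, sigma)."""
    return math.erf(window_ms / (sigma_ms * math.sqrt(2)))


def fraction_affected(old_window_ms: float, new_window_ms: float, sigma_ms: float) -> float:
    """Share of all hits that fall between the old and the new window edge."""
    lo, hi = sorted((old_window_ms, new_window_ms))
    return fraction_within(hi, sigma_ms) - fraction_within(lo, sigma_ms)


SIGMA_MS = 20.0                  # hypothetical player timing spread
OLD_MS, NEW_MS = 50.0, 49.5      # assumed window before and after the change

window_reduction = (OLD_MS - NEW_MS) / OLD_MS
affected = fraction_affected(OLD_MS, NEW_MS, SIGMA_MS)
print(f"window reduction: {window_reduction:.2%}")
print(f"hits affected:    {affected:.2%}")
```

Under these assumptions the share of affected hits is roughly an order of magnitude smaller than the window reduction, because few hits land that far out in the tail. A player with a wider timing spread would see a larger share.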

If anyone actually thinks this is an issue, I'd recommend testing on some very long replays for yourself. But let's also remember that this is bringing things back in line with stable, meaning that the only scores affected are those set on lazer to date.

4 participants