
Implement swarm testing for st.one_of #4322


Status: Open. Wants to merge 2 commits into master.

Conversation

tybug
Member

@tybug tybug commented Mar 26, 2025

Part of #2643. I think swarm testing is great and would love to get it into more places in Hypothesis. There are still some problems with this implementation (see review comments).
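For readers unfamiliar with the idea: swarm testing disables a random subset of the available features (here, branches of st.one_of) for each test case, so that combinations which rarely co-occur under uniform selection get exercised. A minimal illustrative sketch with hypothetical names, not the Hypothesis implementation:

```python
import random

def swarm_draw(options, rng):
    # Swarm-testing sketch: pick a per-test-case disabling probability,
    # disable each option independently with that probability, then
    # draw only from the survivors. Hypothetical helper, not Hypothesis code.
    p_disabled = rng.random()
    enabled = [o for o in options if rng.random() >= p_disabled]
    if not enabled:
        # Keep at least one option enabled so a draw is always possible.
        enabled = [rng.choice(options)]
    return rng.choice(enabled)

rng = random.Random(0)
draws = [swarm_draw(["integers", "floats", "text"], rng) for _ in range(20)]
```

Because p_disabled is redrawn per test case, some runs use nearly all options and others use only one or two, which is the diversity swarm testing is after.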

Comment on lines +65 to 71
# this really messes up our deduplication tracking, because all 255
# draws are unique. But we more or less have to choose whether something
# is enabled on-demand with a prior probability, rather than choosing what
# is enabled up front, because the latter results in a very large choice
# sequence when there are lots of possibilities.
# (a tradeoff might be selecting up front when there are <= 3 options?)
self.__p_disabled = self.__data.draw_integer(0, 254) / 255
Member Author

See comment; not too sure what to do here. Here's one example of this added redundancy:

import re
from hypothesis import given, strategies as st

@given(st.from_regex(re.compile("[ab]*", re.IGNORECASE), fullmatch=True))
def f(v):
    print(v)
f()

which prints "A" once on master but ~8 times on this branch.
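To make the deduplication concern concrete, here is a toy model (hypothetical names, not the PR's code) of why drawing the prior probability from the choice sequence creates many distinct sequences for the same generated value:

```python
import random

def draw_value(rng):
    # The up-front integer draw creates 255 distinct choice-sequence
    # prefixes, even though the eventual value space is tiny, so any
    # deduplication keyed on the choice sequence sees each run as novel.
    p_disabled = rng.randrange(255) / 255
    enabled = rng.random() >= p_disabled
    return "A" if enabled else "a"

values = {draw_value(random.Random(seed)) for seed in range(100)}
assert values <= {"A", "a"}  # only two outcomes despite 255 prefixes
```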

Member


if self.__p_disabled is only ever used as the p= argument to draw_bool(), and is strictly between 0 and 1, do we need to choose it via the choice sequence at all? vs. data.random(), or 0.5 if that's not available?

Member Author

@tybug tybug Mar 26, 2025


Sort of. If we don't draw this from the choice sequence, DataTree thinks the later is_disabled draw is flaky because the p kwarg passed to draw_boolean has changed. It's not actually flaky, because the values in the choice sequence don't depend on p (as long as we're careful about p=0.0 and p=1.0), but DataTree doesn't know that.

We could imagine some kind of "ignore this param for flakiness" api, but I'm not sure how that would look.

Alternatively, hardcode that p in draw_boolean shouldn't be checked for flakiness? It's controlling the distribution, not the domain, so arguably consumers should be able to change it without flakiness...

Member


I think ignoring distribution-only arguments in flakiness checks makes sense; we can still replay perfectly.
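A sketch of what ignoring distribution-only parameters in the flakiness comparison could look like (the names and table are hypothetical, not the actual DataTree internals):

```python
# Distribution-only parameters per draw type: changing these alters the
# distribution but not the domain of values, so replay stays valid.
DISTRIBUTION_ONLY = {"boolean": {"p"}}

def kwargs_match(draw_type, recorded, replayed):
    # Compare recorded vs replayed kwargs, skipping any parameter that
    # only controls the distribution for this draw type.
    ignored = DISTRIBUTION_ONLY.get(draw_type, set())
    keys = (recorded.keys() | replayed.keys()) - ignored
    return all(recorded.get(k) == replayed.get(k) for k in keys)

# A changed p= no longer looks flaky, but a changed domain bound does.
assert kwargs_match("boolean", {"p": 0.3}, {"p": 0.7})
assert not kwargs_match("integer", {"min_value": 0}, {"min_value": 1})
```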

Comment on lines +699 to 703
self.enabled_branches_strategy = st.shared(
FeatureStrategy(self.original_strategies),
key=("one_of swarm testing", self.original_strategies),
)

Member Author


I think that using st.shared for the FeatureStrategy means that any separately-defined strategies which happen to have the same key will have the same swarm behavior. i.e. two separate usages of st.integers() | st.floats() will use the same swarm decision of what to disable. But I believe they will still draw separate random choices below the swarm step, so the values won't be the same, just the disabling decision.

I think diversity in whether swarm decisions are shared here would be best for bug-finding. What we might be able to do is draw a meta-p which chooses, for each occurrence of the same-keyed strategy s, whether to reuse the swarm behavior or not. But I don't know whether this would be easy to implement.
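The sharing semantics described above can be modeled as a per-test-case cache keyed on the st.shared key; this toy sketch (hypothetical class, not the Hypothesis internals) shows why two separately-defined same-keyed strategies get the same swarm decision while later draws stay independent:

```python
import random

class SharedDraws:
    # Toy model of st.shared: the first draw for a key is cached, and
    # every later draw with the same key reuses it within one test case.
    def __init__(self, rng):
        self.rng = rng
        self.cache = {}

    def draw_shared(self, key, draw_fn):
        if key not in self.cache:
            self.cache[key] = draw_fn(self.rng)
        return self.cache[key]

draws = SharedDraws(random.Random(1))
# Two separately-defined one_of strategies with the same key see the
# same swarm decision (which branches are disabled)...
a = draws.draw_shared("one_of swarm", lambda r: r.random())
b = draws.draw_shared("one_of swarm", lambda r: r.random())
assert a == b
# ...but draws made below the swarm step are not cached, so the
# generated values themselves remain independent.
```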

Member


Sounds complicated; let's leave that for a follow-up PR.

Member Author

tybug commented Apr 17, 2025

This is a core change to our generation logic, which we also just changed in #4356. Since I'm not confident this PR will be regression-free (in either performance or bug-finding ability), I'm tempted to defer it for a bit and let any issues with the constants pool shake out first.
