Implement swarm testing and use it for rule based stateful tests #2238
Merged
12 commits, all by DRMacIver:
a6e5944 Make sure to freeze data after use.
d8634e4 We can have two statistical events due to aborted tests
c645ccd Add a strategy for swarm testing
5b00908 Add test to demonstrate the problem
8032120 Use swarm testing to enable/disable rules in stateful testing
493bb25 Add release file
7750be4 Fix format for reproduce failure tests
bc9cd73 That needs to be a list on Python 2
5d69404 Fix typo
bce9d2a Add warning comment about order of checks
02e4a30 Improve RELEASE.rst
42ee3be Allow explicit construction of FeatureFlags
@@ -0,0 +1,8 @@
RELEASE_TYPE: minor

This release significantly improves the data distribution in :ref:`rule based stateful testing <stateful_testing>`,
by using a technique called `Swarm Testing (Groce, Alex, et al. "Swarm testing."
Proceedings of the 2012 International Symposium on Software Testing and Analysis. ACM, 2012.) <https://agroce.github.io/issta12.pdf>`_
to select which rules are run in any given test case. This should allow it to find many issues that it would previously have missed.

This change is likely to be especially beneficial for stateful tests with large numbers of rules.
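As a rough illustration of the idea in the release note, the sketch below picks a random subset of rules for each test case and then builds the step sequence from that subset only. It is not the code from this PR: `swarm_sequence` and the rule names are hypothetical, and the coin-flip selection follows the scheme attributed to the original paper rather than the shrink-friendly one implemented in the diff.

```python
import random


def swarm_sequence(rules, length, rng):
    """Choose a random subset of rules, then draw every step from that subset."""
    # Each rule is switched on independently with probability 1/2.
    # (Assumption: this mirrors the paper's scheme, not the PR's.)
    enabled = [r for r in rules if rng.random() < 0.5]
    if not enabled:
        # Keep at least one rule so that a sequence can still be generated.
        enabled = [rng.choice(rules)]
    steps = [rng.choice(enabled) for _ in range(length)]
    return enabled, steps


rng = random.Random(0)
rules = ["push", "pop", "peek", "clear"]  # hypothetical rule names
enabled, steps = swarm_sequence(rules, 20, rng)
```

Because a disabled rule never appears in a given sequence, some test cases contain long runs that a uniform choice over all rules would rarely produce, for example many `push` steps with no `clear` in between.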
129 changes: 129 additions & 0 deletions
hypothesis-python/src/hypothesis/searchstrategy/featureflags.py
@@ -0,0 +1,129 @@
# coding=utf-8
#
# This file is part of Hypothesis, which may be found at
# https://github.com/HypothesisWorks/hypothesis/
#
# Most of this work is copyright (C) 2013-2019 David R. MacIver
# (david@drmaciver.com), but it contains contributions by others. See
# CONTRIBUTING.rst for a full list of people who may hold copyright, and
# consult the git log if you need to determine who owns an individual
# contribution.
#
# This Source Code Form is subject to the terms of the Mozilla Public License,
# v. 2.0. If a copy of the MPL was not distributed with this file, You can
# obtain one at https://mozilla.org/MPL/2.0/.
#
# END HEADER

from __future__ import absolute_import, division, print_function

import hypothesis.internal.conjecture.utils as cu
from hypothesis.searchstrategy.strategies import SearchStrategy

FEATURE_LABEL = cu.calc_label_from_name("feature flag")


class FeatureFlags(object):
    """Object that can be used to control a number of feature flags for a
    given test run.

    This enables an approach to data generation called swarm testing (
    see Groce, Alex, et al. "Swarm testing." Proceedings of the 2012
    International Symposium on Software Testing and Analysis. ACM, 2012), in
    which generation is biased by selectively turning some features off for
    each test case generated. When there are many interacting features this can
    find bugs that a pure generation strategy would otherwise have missed.

    FeatureFlags are designed to "shrink open", so that during shrinking they
    become less restrictive. This allows us to potentially shrink to smaller
    test cases that were forbidden during the generation phase because they
    required disabled features.
    """

    def __init__(self, data=None, enabled=(), disabled=()):
        self.__data = data
        self.__decisions = {}

        for f in enabled:
            self.__decisions[f] = 0

        for f in disabled:
            self.__decisions[f] = 255

        # In the original swarm testing paper they turn features on or off
        # uniformly at random. Instead we decide the probability with which to
        # enable features up front. This can allow for scenarios where all or
        # no features are enabled, which are vanishingly unlikely in the
        # original model.
        #
        # We implement this as a single 8-bit integer and enable features whose
        # score (255 - value) is >= that value. In particular when
        # self.__baseline is 0, all features will be enabled. This is so that we
        # shrink in the direction of more features being enabled.
        if self.__data is not None:
            self.__baseline = data.draw_bits(8)
        else:
            # If data is None we're in example mode, so only the enabled/disabled
            # lists matter: baseline 1 keeps value 0 enabled and 255 disabled.
            self.__baseline = 1

    def is_enabled(self, name):
        """Tests whether the feature named ``name`` should be enabled on this
        test run."""
        if self.__data is None or self.__data.frozen:
            # Feature set objects might hang around after data generation has
            # finished. If this happens then we just report all new features as
            # enabled, because that's our shrinking direction and they have no
            # impact on data generation if they weren't used while it was
            # running.
            try:
                return self.__is_value_enabled(self.__decisions[name])
            except KeyError:
                return True

        data = self.__data

        data.start_example(label=FEATURE_LABEL)
        if name in self.__decisions:
            # If we've already decided on this feature then we don't actually
            # need to draw anything, but we do write the same decision to the
            # input stream. This allows us to lazily decide whether a feature
            # is enabled, because it means that if we happen to delete the part
            # of the test case where we originally decided, the next point at
            # which we make this decision just makes the decision it previously
            # made.
            value = self.__decisions[name]
            data.draw_bits(8, forced=value)
        else:
            # If the baseline is 0 then everything is enabled so it doesn't
            # matter what we have here and we might as well make the shrinker's
            # life easier by forcing it to zero.
            if self.__baseline == 0:
                value = 0
                data.draw_bits(8, forced=0)
            else:
                value = data.draw_bits(8)
            self.__decisions[name] = value
        data.stop_example()
        return self.__is_value_enabled(value)

    def __is_value_enabled(self, value):
        """Check if a given value drawn for a feature counts as enabled. Note
        that low values are more likely to be enabled. This is again in aid of
        shrinking open. In particular a value of 0 is always enabled."""
        return (255 - value) >= self.__baseline

    def __repr__(self):
        enabled = []
        disabled = []
        for k, v in self.__decisions.items():
            if self.__is_value_enabled(v):
                enabled.append(k)
            else:
                disabled.append(k)
        return "FeatureFlags(enabled=%r, disabled=%r)" % (enabled, disabled)


class FeatureStrategy(SearchStrategy):
    def do_draw(self, data):
        return FeatureFlags(data)
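The threshold check above can be explored in isolation. The following is a minimal sketch that copies only the comparison `(255 - value) >= baseline` into free functions. The names `is_value_enabled` and `enabled_fraction` are hypothetical helpers, and the fraction is only meaningful as a probability under the assumption that undecided feature values are spread evenly over 0..255.

```python
def is_value_enabled(value, baseline):
    """Same comparison as in the diff: low values are more likely enabled."""
    return (255 - value) >= baseline


def enabled_fraction(baseline):
    """Fraction of the 256 possible 8-bit values that count as enabled."""
    count = sum(1 for v in range(256) if is_value_enabled(v, baseline))
    return count / 256.0


# One entry per sample baseline; exactly 256 - baseline values pass the check.
fractions = {b: enabled_fraction(b) for b in (0, 1, 128, 255)}
```

A baseline of 0 enables every value, while a baseline of 255 enables only the value 0. This is why both the baseline and the per-feature values shrink towards "more enabled" as they shrink towards zero.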
@@ -0,0 +1,84 @@
# coding=utf-8
#
# This file is part of Hypothesis, which may be found at
# https://github.com/HypothesisWorks/hypothesis/
#
# Most of this work is copyright (C) 2013-2019 David R. MacIver
# (david@drmaciver.com), but it contains contributions by others. See
# CONTRIBUTING.rst for a full list of people who may hold copyright, and
# consult the git log if you need to determine who owns an individual
# contribution.
#
# This Source Code Form is subject to the terms of the Mozilla Public License,
# v. 2.0. If a copy of the MPL was not distributed with this file, You can
# obtain one at https://mozilla.org/MPL/2.0/.
#
# END HEADER

from __future__ import absolute_import, division, print_function

from hypothesis import given, strategies as st
from hypothesis.internal.compat import hrange
from hypothesis.searchstrategy.featureflags import FeatureFlags, FeatureStrategy
from tests.common.debug import find_any, minimal

STRAT = FeatureStrategy()


def test_can_all_be_enabled():
    find_any(STRAT, lambda x: all(x.is_enabled(i) for i in hrange(100)))


def test_can_all_be_disabled():
    find_any(STRAT, lambda x: all(not x.is_enabled(i) for i in hrange(100)))


def test_minimizes_open():
    features = hrange(10)

    flags = minimal(STRAT, lambda x: [x.is_enabled(i) for i in features])

    assert all(flags.is_enabled(i) for i in features)


def test_minimizes_individual_features_to_open():
    features = list(hrange(10))

    flags = minimal(
        STRAT, lambda x: sum([x.is_enabled(i) for i in features]) < len(features)
    )

    assert all(flags.is_enabled(i) for i in features[:-1])
    assert not flags.is_enabled(features[-1])


def test_marks_unknown_features_as_enabled():
    x = find_any(STRAT, lambda v: True)

    assert x.is_enabled("fish")


def test_by_default_all_enabled():
    f = FeatureFlags()

    assert f.is_enabled("foo")


@given(st.data())
def test_repr_can_be_evalled(data):
    flags = data.draw(STRAT)

    features = data.draw(st.lists(st.text(), unique=True))

    for f in features:
        flags.is_enabled(f)

    flags2 = eval(repr(flags))

    for f in features:
        assert flags2.is_enabled(f) == flags.is_enabled(f)

    more_features = data.draw(st.lists(st.text().filter(lambda s: s not in features)))

    for f in more_features:
        assert flags2.is_enabled(f)
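The explicit-construction and repr round-trip behaviour exercised by the last two tests can be modelled without Hypothesis installed. The class below, `MiniFlags`, is a hypothetical stand-in and not the real `FeatureFlags`: it keeps only the decisions dictionary and the fixed baseline of 1 that the diff uses when `data` is None, and it omits all interaction with the data stream.

```python
class MiniFlags(object):
    """Hypothetical stand-in for FeatureFlags in example mode (data=None)."""

    BASELINE = 1  # the constant the diff assigns when data is None

    def __init__(self, enabled=(), disabled=()):
        self.decisions = {}
        for f in enabled:
            self.decisions[f] = 0  # 0 is the "most enabled" value
        for f in disabled:
            self.decisions[f] = 255  # 255 is the "most disabled" value

    def is_enabled(self, name):
        # Features that were never decided are reported as enabled.
        if name not in self.decisions:
            return True
        return (255 - self.decisions[name]) >= self.BASELINE

    def __repr__(self):
        enabled = [k for k in self.decisions if self.is_enabled(k)]
        disabled = [k for k in self.decisions if not self.is_enabled(k)]
        return "MiniFlags(enabled=%r, disabled=%r)" % (enabled, disabled)


flags = MiniFlags(enabled=["push"], disabled=["pop"])
copy = eval(repr(flags))  # the repr is a valid constructor call
```

The round trip works because the repr records only which side of the threshold each decided feature fell on, which is all that example mode needs in order to answer `is_enabled`.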