
Conversation

@austereantelope (Contributor) commented Jan 24, 2022

Patch description
Hi,

The test test_sampled_equals_unsampled_when_biased_against_non_sampled_positions in sampled_softmax_loss_test.py has an assertion bound (assert (abs(pct_error) < 0.001)) that is too loose. This means a potential bug in the code could still pass the original test.

To quantify this I conducted some experiments where I generated multiple mutations of the source code under test and ran each mutant and the original code 100 times to build a distribution of their outputs. Each mutant was generated using simple mutation operators (e.g. > can become <) on source code covered by the test. I used a KS-test to find mutants that produced a different distribution from the original, and used those mutants as a proxy for bugs that could be introduced. In the graph below I show the distribution of both the original code and the mutants with a different distribution.
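For concreteness, here is a minimal sketch of that filtering step. run_test, original_code, and mutant_code are placeholder names for my experiment harness, not code in this repo; scipy.stats.ks_2samp is just one way to run the KS-test.

```python
# Hedged sketch of the KS-test filtering step; only scipy.stats.ks_2samp is a
# real library call, the rest are illustrative placeholders.
from scipy.stats import ks_2samp

def behaves_differently(original_outputs, mutant_outputs, alpha=0.05):
    """Return True when the mutant's output distribution differs from the original's."""
    result = ks_2samp(original_outputs, mutant_outputs)
    return result.pvalue < alpha

# original_outputs = [run_test(original_code) for _ in range(100)]
# mutant_outputs   = [run_test(mutant_code) for _ in range(100)]
# A mutant is kept as a bug proxy only when behaves_differently(...) is True.
```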

In that graph we see that the bound of 0.001 is too loose, since the original distribution (in orange) lies well below 0.001. Furthermore, many mutants (proxies for bugs) also fall below the bound, which is undesirable since the test should aim to catch potential bugs in the code. I quantify the "bug detection" ability of this assertion by varying the bound in the trade-off graph below.

In this graph, I plot the mutant catch rate (ratio of mutant outputs that fail the test) against the original pass rate (ratio of original outputs that pass the test). The original bound of 0.001 (red dotted line) has a catch rate of 0.51.
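For reference, the two rates can be computed like this (a sketch with placeholder names; original_errors and mutant_errors are assumed to hold the |pct_error| values collected over the 100 runs per variant):

```python
def catch_rate(mutant_errors, bound):
    """Fraction of mutant outputs that violate the assertion, i.e. are caught."""
    return sum(abs(e) >= bound for e in mutant_errors) / len(mutant_errors)

def pass_rate(original_errors, bound):
    """Fraction of original outputs that satisfy the assertion."""
    return sum(abs(e) < bound for e in original_errors) / len(original_errors)

# Sweeping bound over a range of candidate values (e.g. 0.0001 to 0.001)
# traces out the trade-off curve plotted above.
```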

To improve this test, I propose tightening the bound to 0.0003 (the blue dotted line). The new bound has a catch rate of 0.66 (a +0.15 increase compared to the original) while still having a >99% pass rate (the test is not flaky: I ran the updated test 500 times and observed a 100% pass rate). I think this is a good balance between improving the bug-detection ability of the test and keeping its flakiness low.
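Concretely, the change only tightens the constant in the existing assertion (the rest of the test in sampled_softmax_loss_test.py is unchanged):

```python
# before
# assert abs(pct_error) < 0.001
# after
assert abs(pct_error) < 0.0003
```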

Do you guys think this makes sense? Please let me know if this looks good or if you have any other suggestions or questions.

My Environment:

python=3.7.11
pytorch=1.10.0

my allennlp Experiment SHA:
2cdb8742c8c8c3c38ace4bdfadbdc750a1aa2475

Before submitting

  • I've read and followed all steps in the Making a pull request
    section of the CONTRIBUTING docs.
  • I've updated or added any relevant docstrings following the syntax described in the
    Writing docstrings section of the CONTRIBUTING docs.
  • If this PR fixes a bug, I've added a test that will fail without my fix.
  • If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.

After submitting

  • All GitHub Actions jobs for my pull request have passed.
  • codecov/patch reports high test coverage (at least 90%).
    You can find this under the "Actions" tab of the pull request once the other checks have finished.

@AkshitaB (Contributor) commented Feb 1, 2022

@austereantelope This is great! Thanks for being so thorough!

@AkshitaB AkshitaB enabled auto-merge (squash) February 1, 2022 08:16
@AkshitaB AkshitaB merged commit 3c2299a into allenai:main Feb 10, 2022