This repository was archived by the owner on Nov 17, 2023. It is now read-only.

[CI][Flaky test][nightly] flaky test in estimator nightly test  #15199

Open
@roywei


The nightly test test_sentiment_rnn.py appears to be flaky.
See the failing pipeline run:
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/detail/master/342/pipeline

The test is adapted from https://github.com/d2l-ai/d2l-en/blob/master/chapter_natural-language-processing/sentiment-analysis-rnn.md and uses the following setup:
Dataset: IMDB
Optimizer: Adam
Learning rate: 0.01
Initializer: Xavier
Platform: Ubuntu GPU (AWS p3.2xlarge)

I was able to reproduce the error by running this test 1000 times locally. The accuracy in the failing run below looks abnormal, while other runs seem fine: in those, accuracy increased from 0.8 to 0.9 over 5 epochs.
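For anyone who wants to repeat the reproduction, a minimal sketch of a rerun loop follows. It is not the harness used for this report: the helper name `count_failures` is hypothetical, and it assumes the test script signals failure through a non-zero exit code (as an uncaught AssertionError does).

```python
import subprocess
import sys


def count_failures(cmd, runs):
    """Run `cmd` `runs` times and return the indices of runs that exited non-zero."""
    failed = []
    for i in range(runs):
        # Capture output so a long loop does not flood the terminal.
        result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if result.returncode != 0:
            failed.append(i)
    return failed
```

Under those assumptions, `count_failures([sys.executable, "test_sentiment_rnn.py"], runs=1000)` would return the indices of the failing runs, which gives a rough failure rate for the flaky test.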

[Epoch 0] Finished in 61.230s, train accuracy: 0.7166, train softmaxcrossentropyloss: 0.5390, validation accuracy: 0.8188, validation softmaxcrossentropyloss: 0.4100
[Epoch 1] Finished in 60.762s, train accuracy: 0.5248, train softmaxcrossentropyloss: 0.6945, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7103
[Epoch 2] Finished in 61.041s, train accuracy: 0.5001, train softmaxcrossentropyloss: 0.7150, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7381
[Epoch 3] Finished in 60.868s, train accuracy: 0.5078, train softmaxcrossentropyloss: 0.7176, validation accuracy: 0.5044, validation softmaxcrossentropyloss: 0.6928
[Epoch 4] Finished in 60.850s, train accuracy: 0.5050, train softmaxcrossentropyloss: 0.7097, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7254

Traceback (most recent call last):
  File "test_sentiment_rnn.py", line 287, in <module>
    test_estimator_gpu(**kwargs)
  File "test_sentiment_rnn.py", line 268, in test_estimator_gpu
    assert acc.get()[1] > 0.70
AssertionError
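When scanning the output of many runs, it may help to locate the epoch at which validation accuracy drops to chance level (0.5 for this two-class task). The following is a rough sketch that parses log lines in the format shown above. The function name `first_collapsed_epoch` and the tolerance of 0.02 are assumptions for illustration, not part of the test.

```python
import re

# Matches the epoch index, validation accuracy and validation loss
# in log lines of the format printed by the failing run.
EPOCH_RE = re.compile(
    r"\[Epoch (\d+)\].*validation accuracy: ([\d.]+), "
    r"validation softmaxcrossentropyloss: ([\d.]+)"
)

# Two lines copied from the failing run in this issue.
SAMPLE = """\
[Epoch 0] Finished in 61.230s, train accuracy: 0.7166, train softmaxcrossentropyloss: 0.5390, validation accuracy: 0.8188, validation softmaxcrossentropyloss: 0.4100
[Epoch 1] Finished in 60.762s, train accuracy: 0.5248, train softmaxcrossentropyloss: 0.6945, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7103
"""


def first_collapsed_epoch(log_text, chance=0.5, tol=0.02):
    """Return (epoch, val_acc, val_loss) for the first epoch whose validation
    accuracy is within `tol` of `chance`, or None if no epoch qualifies."""
    for match in EPOCH_RE.finditer(log_text):
        epoch = int(match.group(1))
        val_acc = float(match.group(2))
        val_loss = float(match.group(3))
        if abs(val_acc - chance) <= tol:
            return epoch, val_acc, val_loss
    return None
```

Applied to the log above, this points at epoch 1 as the first collapsed epoch. The losses from that epoch onward stay close to ln 2 ≈ 0.693, the cross-entropy of a uniform prediction over two classes, which is consistent with training having diverged rather than the metric being miscomputed.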
