This repository was archived by the owner on Nov 17, 2023. It is now read-only.
[CI][Flaky test][nightly] flaky test in estimator nightly test #15199
Open
Description
The nightly estimator test test_sentiment_rnn.py seems to be flaky.
See the pipeline:
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/detail/master/342/pipeline
The test is adapted from https://github.com/d2l-ai/d2l-en/blob/master/chapter_natural-language-processing/sentiment-analysis-rnn.md
Dataset: IMDB
Optimizer: Adam
Learning rate: 0.01
Initializer: Xavier
Platform: Ubuntu GPU (AWS p3.2xlarge)
I was able to reproduce the error by running this test 1000 times locally. The accuracy in the failing run below is abnormal: it drops to about 0.5 after the first epoch and never recovers. The other runs seem fine, with accuracy increasing from 0.8 to 0.9 in 5 epochs.
[Epoch 0] Finished in 61.230s, train accuracy: 0.7166, train softmaxcrossentropyloss: 0.5390, validation accuracy: 0.8188, validation softmaxcrossentropyloss: 0.4100
[Epoch 1] Finished in 60.762s, train accuracy: 0.5248, train softmaxcrossentropyloss: 0.6945, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7103
[Epoch 2] Finished in 61.041s, train accuracy: 0.5001, train softmaxcrossentropyloss: 0.7150, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7381
[Epoch 3] Finished in 60.868s, train accuracy: 0.5078, train softmaxcrossentropyloss: 0.7176, validation accuracy: 0.5044, validation softmaxcrossentropyloss: 0.6928
[Epoch 4] Finished in 60.850s, train accuracy: 0.5050, train softmaxcrossentropyloss: 0.7097, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7254
Traceback (most recent call last):
  File "test_sentiment_rnn.py", line 287, in <module>
    test_estimator_gpu(**kwargs)
  File "test_sentiment_rnn.py", line 268, in test_estimator_gpu
    assert acc.get()[1] > 0.70
AssertionError
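To pick out runs like this one from a large batch of local runs, the per-epoch log lines can be scanned for a sudden drop to chance-level accuracy. The following is only a minimal sketch and not part of the test itself: the helper names `parse_epochs` and `first_collapse` and the thresholds `tol` and `min_drop` are assumptions chosen for illustration. The chance level of 0.5 assumes the binary, balanced IMDB sentiment labels.

```python
import re

# Matches the per-epoch lines in the log format shown above.
EPOCH_RE = re.compile(
    r"\[Epoch (?P<epoch>\d+)\].*?"
    r"train accuracy: (?P<train_acc>[\d.]+).*?"
    r"validation accuracy: (?P<val_acc>[\d.]+)"
)


def parse_epochs(log_text):
    """Return a list of (epoch, train_acc, val_acc) tuples found in log_text."""
    return [
        (int(m["epoch"]), float(m["train_acc"]), float(m["val_acc"]))
        for m in EPOCH_RE.finditer(log_text)
    ]


def first_collapse(rows, chance=0.5, tol=0.05, min_drop=0.1):
    """Return the first epoch whose validation accuracy lands within `tol`
    of `chance` after falling by at least `min_drop` from the previous
    epoch, or None if no such epoch exists. Thresholds are illustrative."""
    for prev, cur in zip(rows, rows[1:]):
        dropped = prev[2] - cur[2] >= min_drop
        at_chance = abs(cur[2] - chance) <= tol
        if dropped and at_chance:
            return cur[0]
    return None


# First three epochs of the failing run, copied from the log above.
LOG = """\
[Epoch 0] Finished in 61.230s, train accuracy: 0.7166, train softmaxcrossentropyloss: 0.5390, validation accuracy: 0.8188, validation softmaxcrossentropyloss: 0.4100
[Epoch 1] Finished in 60.762s, train accuracy: 0.5248, train softmaxcrossentropyloss: 0.6945, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7103
[Epoch 2] Finished in 61.041s, train accuracy: 0.5001, train softmaxcrossentropyloss: 0.7150, validation accuracy: 0.5000, validation softmaxcrossentropyloss: 0.7381
"""

rows = parse_epochs(LOG)
collapse_epoch = first_collapse(rows)
print("collapse at epoch:", collapse_epoch)
```

For the run above this flags epoch 1, where validation accuracy falls from 0.8188 to 0.5000 while the loss rises to roughly ln 2 ≈ 0.693, which is what a binary classifier producing uninformative predictions would give. That pattern suggests training diverged after the first epoch rather than the assertion threshold being too tight, though the log alone does not confirm the cause.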