This repository was archived by the owner on Nov 17, 2023. It is now read-only.
[Optimizer][Bug] Gradient is mutated in the Adam optimizer #15759
Closed
Description
In the implementation of Adam, grad, mean, and var are all mutated (see https://github.com/apache/incubator-mxnet/blob/master/src/operator/optimizer_op-inl.h#L1307-L1315). However, the FMutateInputs flag is set to {2, 3} only, when it should be {1, 2, 3} (see https://github.com/apache/incubator-mxnet/blob/master/src/operator/optimizer_op.cc#L699).
To reproduce the bug, you can check the value of the gradient before and after adam_update (https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/optimizer/optimizer.py#L1226-L1227).
We can add the following to optimizer.py around that call:

```python
import numpy as np

grad1 = grad.asnumpy()                                  # gradient before the update
adam_update(weight, grad, mean, var, out=weight, ...)   # remaining arguments as in the original call
grad2 = grad.asnumpy()                                  # gradient after the update
np.testing.assert_allclose(grad1, grad2)                # fails: grad was mutated in place
```
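The same check can also be run outside optimizer.py by calling the operator directly. The snippet below is a minimal sketch, assuming MXNet 1.x and the mx.nd.adam_update operator; a non-zero wd is what makes the in-place modification of grad observable.

```python
import mxnet as mx
import numpy as np

weight = mx.nd.ones((5,))
grad = mx.nd.ones((5,)) * 0.1
mean = mx.nd.zeros((5,))
var = mx.nd.zeros((5,))

grad_before = grad.asnumpy()
mx.nd.adam_update(weight, grad, mean, var, lr=0.01, wd=1e-3, out=weight)
grad_after = grad.asnumpy()

# With the current FMutateInputs setting this assertion fails,
# because adam_update has modified grad in place.
np.testing.assert_allclose(grad_before, grad_after)
```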
Alternatively, create an Adam optimizer with wd=1E-3 and train any model; you will find that the gradient has been mutated.
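For reference, a minimal end-to-end reproduction along those lines could look as follows (a sketch, assuming the Gluon API; any model and data will do):

```python
import mxnet as mx
from mxnet import gluon, autograd

net = gluon.nn.Dense(1)
net.initialize()
trainer = gluon.Trainer(net.collect_params(), 'adam',
                        {'learning_rate': 0.01, 'wd': 1e-3})

x = mx.nd.random.uniform(shape=(4, 8))
y = mx.nd.random.uniform(shape=(4, 1))
loss_fn = gluon.loss.L2Loss()

with autograd.record():
    loss = loss_fn(net(x), y)
loss.backward()
# trainer.step() ends up calling adam_update; with the patch to
# optimizer.py above, the assertion would fire here.
trainer.step(4)
```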