This repository was archived by the owner on Nov 17, 2023. It is now read-only.

LSTM and GRU layers without DNNL enabled give wrong gradients #17898

Open
@xziya

Description

Currently, we have two implementations of RNN layers on the CPU backend:

- the fused implementation based on the DNNL library;
- the native CPU implementation.

Both of them can be invoked from mx.sym.RNN, mx.rnn.FusedRNNCell, or mx.gluon.rnn.LSTM/GRU/RNN. The DNNL fusion provides a more efficient forward and backward pass, while the native one serves as a fallback for devices and environments that cannot use the DNNL library.
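
A minimal sketch of the Gluon entry point (the hidden size, number of layers, and input shape below are illustrative): the same user code dispatches to whichever CPU kernel the build provides, DNNL-fused or native.

```python
import mxnet as mx

# Fused GRU layer from the Gluon front end. The same call reaches either the
# DNNL kernel or the native CPU kernel, depending on how MXNet was built.
layer = mx.gluon.rnn.GRU(hidden_size=16, num_layers=2)
layer.initialize()

x = mx.nd.random.uniform(shape=(10, 4, 8))   # (seq_len, batch, input_size), layout 'TNC'
y = layer(x)
print(y.shape)                               # (10, 4, 16)
```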

Recently, we found some problems that lead to incorrect gradient calculations in the native implementation. This issue tracks the problem, and it will be fixed as soon as possible.
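
As a hedged way to observe the problem on a CPU-only build without DNNL, one can compare the analytic input gradient against a central finite-difference estimate (the layer sizes, shapes, and epsilon below are illustrative, and this is a generic numerical check rather than the project's own test); on an affected build the two would be expected to diverge.

```python
import numpy as onp
import mxnet as mx
from mxnet import autograd

mx.random.seed(0)

# Small fused LSTM layer. On a CPU build without DNNL, forward and backward
# go through the native implementation whose gradients are in question.
lstm = mx.gluon.rnn.LSTM(hidden_size=4, num_layers=1)
lstm.initialize()

x = mx.nd.random.uniform(shape=(3, 2, 5))    # (seq_len, batch, input_size)
x.attach_grad()

with autograd.record():
    loss = lstm(x).sum()
loss.backward()
analytic = x.grad.asnumpy()

# Central finite differences of sum(lstm(x)) with respect to each input element.
eps = 1e-3
numeric = onp.zeros_like(analytic)
base = x.asnumpy()
for i in range(base.size):
    xp = base.copy(); xp.flat[i] += eps
    xm = base.copy(); xm.flat[i] -= eps
    fp = lstm(mx.nd.array(xp)).sum().asscalar()
    fm = lstm(mx.nd.array(xm)).sum().asscalar()
    numeric.flat[i] = (fp - fm) / (2 * eps)

print("max abs diff:", onp.abs(analytic - numeric).max())
```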
