This repository was archived by the owner on Nov 17, 2023. It is now read-only.
LSTM and GRU layers without DNNL enabled give wrong gradients #17898
Open
Description
Currently, we have two implementations of RNN layers on the CPU backend:
- the native fusion implementation,
- the fusion enabled by the DNNL library (https://intel.github.io/mkl-dnn/dev_guide_rnn.html).
Both of them can be invoked from `mx.sym.RNN`, `mx.rnn.FusedRNNCell`, and `mx.gluon.rnn.LSTM/GRU/RNN`. The DNNL fusion provides a more efficient Forward and Backward, while the native one serves as a fallback for devices or environments that cannot use the DNNL library.
Recently, we found some problems that lead to incorrect gradient calculation in the native implementation. Just tracking the issue here; it will be fixed ASAP.
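
A minimal sketch (not part of the original report) of how the two CPU code paths can be exercised and their gradients compared. It assumes the `MXNET_USE_MKLDNN_RNN` environment variable toggles the DNNL-fused RNN kernel on CPU; run the script once with each value and diff the saved gradient files.

```python
# Sketch: run a fused LSTM on CPU, record input/parameter gradients, and save them
# so that a run with the native path ("0") can be compared against the DNNL path ("1").
import os
os.environ.setdefault("MXNET_USE_MKLDNN_RNN", "0")  # assumption: "0" = native, "1" = DNNL

import mxnet as mx
from mxnet import autograd, gluon, nd

mx.random.seed(0)
ctx = mx.cpu()

# Fused LSTM layer; default layout 'TNC': (seq_len, batch, input_size) -> (seq_len, batch, 2*hidden_size)
lstm = gluon.rnn.LSTM(hidden_size=8, num_layers=2, bidirectional=True)
lstm.initialize(mx.init.Xavier(), ctx=ctx)

x = nd.random.normal(shape=(5, 4, 3), ctx=ctx)
x.attach_grad()

with autograd.record():
    y = lstm(x)
    loss = y.sum()
loss.backward()

# Save the input gradient and all parameter gradients for offline comparison.
nd.save("grads_%s.nd" % os.environ["MXNET_USE_MKLDNN_RNN"],
        [x.grad] + [p.grad() for p in lstm.collect_params().values()])
```

With correct backends, the two runs should produce gradients that match within numerical tolerance; a significant mismatch points to the problem described above.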