batch normalization with buckets #2663
Description
I want to implement an LSTM with batch normalization. When I run a sequence-to-sequence example with bucketing, MXNet fails with:
```
Traceback (most recent call last):
  File "/home/odin/Documents/sent2mat_mx/model.py", line 206, in <module>
    model(config)
  File "/home/odin/Documents/sent2mat_mx/model.py", line 201, in model
    lr_decay],
  File "/home/odin/local/mxnet/python/mxnet/model.py", line 788, in fit
    sym_gen=self.sym_gen)
  File "/home/odin/local/mxnet/python/mxnet/model.py", line 222, in _train_multi_device
    executor_manager.load_data_batch(data_batch)
  File "/home/odin/local/mxnet/python/mxnet/executor_manager.py", line 387, in load_data_batch
    shared_group=self.execgrp)
  File "/home/odin/local/mxnet/python/mxnet/executor_manager.py", line 224, in __init__
    shared_data_arrays=self.shared_data_arrays[i])
  File "/home/odin/local/mxnet/python/mxnet/executor_manager.py", line 170, in _bind_exec
    assert aux_shape[i] == a.shape
IndexError: list index out of range
```
It seems to me that when the executor manager is created (model.py line 184), it binds with the default bucket key, which is the longest bucket. However, when a data batch is fed (model.py line 222), its bucket length is not necessarily that of the longest bucket. Since each BatchNorm layer in the unrolled network carries its own auxiliary states (moving mean and variance), a shorter bucket's symbol presumably has fewer auxiliary states than the shared executor was bound with, so the shape check fails.
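The mismatch described above can be illustrated with a minimal, self-contained sketch (plain Python, no MXNet required; the helper names `aux_shapes_for_bucket` and `bind_check` are hypothetical, not MXNet APIs). It models one BatchNorm per unrolled time step, so the number of auxiliary arrays grows with the bucket length, and indexing the shorter bucket's shape list with the longer executor's positions reproduces the same `IndexError`:

```python
def aux_shapes_for_bucket(seq_len, hidden=4):
    """Aux shapes for an LSTM unrolled seq_len steps, one BatchNorm per step.

    Each BatchNorm contributes two auxiliary arrays (moving_mean, moving_var),
    so the list has 2 * seq_len entries.
    """
    return [(hidden,) for _ in range(2 * seq_len)]

def bind_check(exec_aux, sym_aux_shapes):
    # Mirrors the failing loop in executor_manager.py:_bind_exec, which
    # indexes the current symbol's aux_shape list at the shared executor's
    # auxiliary-state positions.
    for i, a in enumerate(exec_aux):
        assert sym_aux_shapes[i] == a

longest = aux_shapes_for_bucket(30)   # executor bound with the default (longest) bucket
shorter = aux_shapes_for_bucket(10)   # symbol generated for a shorter bucket

try:
    bind_check(longest, shorter)      # 60 executor aux arrays vs 20 expected shapes
except IndexError as e:
    print("IndexError:", e)           # prints: IndexError: list index out of range
```

In real training the counts come from `sym_gen` unrolling a different number of steps per bucket, but the arithmetic is the same: the auxiliary-state lists of two buckets have different lengths, not just different shapes.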
Is it possible to solve this problem? Any help would be appreciated.