Closed
Description
System Info
Latest version of transformers, Colab TPU, tensorflow 2.
- Colab TPU
- transformers: 4.21.0
- tensorflow: 2.8.2 / 2.6.2
- Python 3.7
Who can help?
@LysandreJik, @Rocketknight1, @san
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
I am facing some issues while trying to fine-tune a TensorFlow DeBERTa model, microsoft/deberta-v3-base, on TPU.
I have created some Colab notebooks showing the errors. Note that the second and third notebooks already include workarounds for the errors shown in the earlier notebooks.
- ValueError with partially known TensorShape with latest take_along_axis change: FineTuning_TF_DeBERTa_TPU_1
- Output shape mismatch of branches with custom dropout: FineTuning_TF_DeBERTa_TPU_2
- XLA compilation error because of dynamic/computed tensor shapes: FineTuning_TF_DeBERTa_TPU_3
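For context on the first item, here is a minimal sketch of what a gather along the last axis can look like when it relies only on the tensor rank rather than on fully known dimensions. The helper name take_along_last_axis is hypothetical and this is not necessarily what transformers ships; it only assumes that the input and the indices share all leading dimensions, and it has been tried on static toy inputs, not on a TPU.

```python
import tensorflow as tf


def take_along_last_axis(x, indices):
    """Hypothetical helper: np.take_along_axis(x, indices, axis=-1) for matching leading dims."""
    # Treat every axis except the last as a batch axis, so the gather
    # only needs the static rank, not the concrete batch size.
    return tf.gather(x, indices, batch_dims=x.shape.rank - 1)


# The same helper traced with an unknown batch dimension (assumed toy signature).
traced = tf.function(
    take_along_last_axis,
    input_signature=[
        tf.TensorSpec([None, 3], tf.int32),
        tf.TensorSpec([None, 2], tf.int32),
    ],
)

x = tf.constant([[10, 20, 30], [40, 50, 60]])
idx = tf.constant([[2, 0], [1, 1]])
print(take_along_last_axis(x, idx).numpy())
print(traced(x, idx).numpy())
```

Whether this variant compiles efficiently under XLA on TPU is a separate question, and the related issues listed below suggest that gathers can be slow there.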
I have seen similar issues when using microsoft/deberta-base.
I believe the following issues are related:
- TF2 DeBERTaV2 runs super slow on TPUs #18239
- Debertav2 debertav3 TPU : socket closed #18276. From this issue I took the fix for take_along_axis.
Thanks!
Expected behavior
Fine-tuning should work on TPU just as it does on GPU.