
Fine tuning TensorFlow DeBERTa fails on TPU #18476

Closed
@tmoroder


System Info

Latest version of transformers, running on a Colab TPU with TensorFlow 2:

  • Colab TPU
  • transformers: 4.21.0
  • tensorflow: 2.8.2 / 2.6.2
  • Python 3.7

Who can help?

@LysandreJik, @Rocketknight1, @san

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I am running into errors when trying to fine-tune the TensorFlow DeBERTa model microsoft/deberta-v3-base on a TPU.

I have created some Colab notebooks that show the errors. Note that the second and third notebooks already include workarounds for the errors seen in the earlier notebooks.

I have seen similar issues when using microsoft/deberta-base.
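
For readers without access to the notebooks, the kind of setup described here can be sketched roughly as follows. This is not the code from the notebooks. The helper names (make_strategy, build_model), the sequence-classification head, num_labels=2, and the learning rate are all assumptions chosen for illustration; only the checkpoint name comes from this report.

```python
"""Hypothetical sketch of a TPU fine-tuning setup (not the notebooks from this issue)."""

MODEL_NAME = "microsoft/deberta-v3-base"  # checkpoint named in this report


def make_strategy():
    """Return a TPUStrategy when a TPU is reachable, else the default strategy."""
    # Imported lazily so this file can be loaded on machines without TensorFlow.
    import tensorflow as tf

    try:
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    except ValueError:
        # Typically raised when no TPU is attached (for example a CPU/GPU runtime).
        return tf.distribute.get_strategy()

    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    return tf.distribute.TPUStrategy(resolver)


def build_model(strategy, num_labels=2, learning_rate=2e-5):
    """Create and compile the model inside the strategy scope.

    Assumption: a sequence-classification task with `num_labels` classes;
    the original report does not say which task head was used.
    """
    import tensorflow as tf
    from transformers import TFAutoModelForSequenceClassification

    with strategy.scope():
        model = TFAutoModelForSequenceClassification.from_pretrained(
            MODEL_NAME, num_labels=num_labels
        )
        model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            metrics=["accuracy"],
        )
    return model


if __name__ == "__main__":
    strategy = make_strategy()
    print("Replicas in sync:", strategy.num_replicas_in_sync)
    model = build_model(strategy)
    # model.fit(train_dataset, epochs=1)  # train_dataset: a batched tf.data.Dataset
```

TPU training generally expects static shapes, so in a sketch like this the dataset would usually be padded to a fixed sequence length and batched with drop_remainder=True. Whether that is related to the errors reported here is not established.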

I believe the following issues are related:

Thanks!

Expected behavior

Fine-tuning should work on a TPU, as it does on a GPU.
