Skip to content

Replace input_values_processing with unpack_inputs #21502

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

amyeroberts
Copy link
Collaborator

@amyeroberts amyeroberts commented Feb 7, 2023

What does this PR do?

Updates Hubert and Wav2Vec2 to use the repo standard unpack_inputs decorator instead of input_values_processing. Previously, TensorFlow models use input_values_processing, a function which processed inputs ready to be passed to the TF layers. This was removed and replaced with unpack inputs, see #16051. For some reason, wav2vec2 wasn't part of this update. A bug was reported, but the conversation and details can't be found in the discord 😢

This PR also updates any relevant tests:

  • Adds unittest.skip decorator so tests are marked as skipped rather than passed
  • Modifies batch sizes for tests that were previously skipped for OOM to make sure logic is fully tested.
  • Adds back tests which shouldn't have been skipped. This caught the issue with loss not being calculated (see below)

Note: The loss calculation for TFHubertForCTC wasn't being calculated properly. Without the unpack_inputs decorator, when model.fit was being called, labels were passed into the model as part of the input_values dictionary. These were then processed and part of the inputs dictionary. However, labels was always taken from the function kwargs and so loss calculation skipped. This is why the previous test 'returned the wrong shape'. The issue was resolved for TFWav2Vec2ForCTC in a previous PR #18014

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Feb 7, 2023

The documentation is not available anymore as the PR was closed or merged.

@amyeroberts amyeroberts force-pushed the unpack-inputs-hubert-wav2vec2 branch from 66b426a to 1b3dae2 Compare February 8, 2023 11:28
@amyeroberts amyeroberts changed the title Replace input_values_prrocessing with unpack_inputs Replace input_values_processing with unpack_inputs Feb 8, 2023
@amyeroberts amyeroberts marked this pull request as ready for review February 9, 2023 10:49
Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the cleanup!

Copy link
Member

@Rocketknight1 Rocketknight1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks so much cleaner now

@amyeroberts amyeroberts merged commit cb56590 into huggingface:main Feb 10, 2023
@gante gante mentioned this pull request Feb 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants