Description
System Info
transformers==4.20.1, torch==1.9.0, tensorflow==2.9
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Steps to reproduce the behavior
- Download the BERT-large weights fine-tuned on SQuAD v1.1 from: https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad
- Successfully reproduce the inference F1 score by running the PyTorch example (`examples/pytorch/question-answering/run_qa.py`).
- But fail to reproduce the inference F1 score by running the TensorFlow example (`examples/tensorflow/question-answering/run_qa.py`).
- The reason is that the TensorFlow example is missing the `token_type_ids` input. Without it, the model presumably falls back to all-zero token type IDs, so the question/context segmentation that BERT's segment embeddings encode is lost. I added this input at the following position to solve the problem (see the sketch after the code block):
https://github.com/huggingface/transformers/blob/main/examples/tensorflow/question-answering/run_qa.py#L640
```python
tensor_keys = ["attention_mask", "token_type_ids", "input_ids"]
eval_inputs = {
    "input_ids": tf.ragged.constant(processed_datasets["validation"]["input_ids"]).to_tensor(),
    "token_type_ids": tf.ragged.constant(processed_datasets["validation"]["token_type_ids"]).to_tensor(),
    "attention_mask": tf.ragged.constant(processed_datasets["validation"]["attention_mask"]).to_tensor(),
}
predict_inputs = {
    "input_ids": tf.ragged.constant(processed_datasets["test"]["input_ids"]).to_tensor(),
    "token_type_ids": tf.ragged.constant(processed_datasets["test"]["token_type_ids"]).to_tensor(),
    "attention_mask": tf.ragged.constant(processed_datasets["test"]["attention_mask"]).to_tensor(),
}
```
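To illustrate why this input matters (a minimal sketch, not part of the example script; the question/context strings are invented), the tokenizer for this checkpoint emits `token_type_ids` that mark question tokens as segment 0 and context tokens as segment 1, and the fine-tuned model expects that segmentation:

```python
# Minimal sketch: what token_type_ids encode for a BERT question/context pair.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "bert-large-uncased-whole-word-masking-finetuned-squad"
)

enc = tokenizer("Who wrote Hamlet?", "Hamlet is a play by Shakespeare.")
# Question tokens (plus [CLS]/[SEP]) get segment 0, context tokens segment 1.
print(enc["token_type_ids"])  # [0, 0, ..., 0, 1, 1, ..., 1]
```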
Expected behavior
Both the PyTorch and TensorFlow examples produce the same F1 score with these weights.
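For a quick parity check (a sketch under assumptions, not from this report: the input text is invented and it compares raw logits rather than F1), the two checkpoints can be run side by side on one example with `token_type_ids` passed to both:

```python
# Sketch: compare PyTorch and TF start/end logits for a single QA input.
import numpy as np
import torch
from transformers import (
    AutoTokenizer,
    BertForQuestionAnswering,
    TFBertForQuestionAnswering,
)

name = "bert-large-uncased-whole-word-masking-finetuned-squad"
tokenizer = AutoTokenizer.from_pretrained(name)
pt_model = BertForQuestionAnswering.from_pretrained(name)
tf_model = TFBertForQuestionAnswering.from_pretrained(name)

# return_tensors="np" keeps input_ids, token_type_ids and attention_mask.
enc = tokenizer("Who wrote Hamlet?", "Hamlet is a play by Shakespeare.",
                return_tensors="np")

with torch.no_grad():
    pt_out = pt_model(**{k: torch.tensor(v) for k, v in enc.items()})
tf_out = tf_model(dict(enc))

# With token_type_ids fed to both models the logits should agree closely;
# dropping the key from the TF inputs makes them diverge.
print(np.abs(pt_out.start_logits.numpy() - tf_out.start_logits.numpy()).max())
print(np.abs(pt_out.end_logits.numpy() - tf_out.end_logits.numpy()).max())
```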