This repository was archived by the owner on Dec 16, 2022. It is now read-only.

Update transformers requirement from <4.9,>=4.1 to >=4.1,<4.10 #5326

Merged
dirkgr merged 4 commits into main from dependabot/pip/transformers-gte-4.1-and-lt-4.10 on Jul 26, 2021

Conversation

@dependabot dependabot bot commented on behalf of github Jul 22, 2021

Updates the requirements on transformers to permit the latest version.
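In version-specifier terms, the update simply widens the upper bound by one minor version; as a sketch, the change to the requirement line (wherever this repository declares it) looks like:

- transformers<4.9,>=4.1
+ transformers>=4.1,<4.10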

Release notes

Sourced from transformers' releases.

v4.9.0: TensorFlow examples, CANINE, tokenizer training, ONNX rework

ONNX rework

This version introduces a new package, transformers.onnx, which can be used to export models to ONNX. Contrary to the previous implementation, this approach is meant as an easily extendable package where users may define their own ONNX configurations and export the models they wish to export.

python -m transformers.onnx --model=bert-base-cased onnx/bert-base-cased/
Validating ONNX model...
        -[✓] ONNX model outputs' names match reference model ({'pooler_output', 'last_hidden_state'})
        - Validating ONNX Model output "last_hidden_state":
                -[✓] (2, 8, 768) matches (2, 8, 768)
                -[✓] all values close (atol: 0.0001)
        - Validating ONNX Model output "pooler_output":
                -[✓] (2, 768) matches (2, 768)
                -[✓] all values close (atol: 0.0001)
All good, model saved at: onnx/bert-base-cased/model.onnx
  • [RFC] Laying down building stone for more flexible ONNX export capabilities #11786 (@mfuntowicz)
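As a quick sanity check of an export like the one above, the saved model can be loaded back with onnxruntime; a minimal sketch, assuming onnxruntime is installed and the export command above was run:

from transformers import AutoTokenizer
from onnxruntime import InferenceSession

# The exported graph expects the same input names the tokenizer produces
# (input_ids, attention_mask, token_type_ids), as NumPy arrays.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
session = InferenceSession("onnx/bert-base-cased/model.onnx")

inputs = tokenizer("Using BERT through ONNX", return_tensors="np")
outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
print(outputs[0].shape)  # (batch, sequence, 768) for bert-base-cased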

CANINE model

Four new model classes are released in PyTorch as part of the CANINE implementation: CanineForSequenceClassification, CanineForMultipleChoice, CanineForTokenClassification, and CanineForQuestionAnswering.

The CANINE model was proposed in CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation by Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting. It’s among the first papers that train a Transformer without using an explicit tokenization step (such as Byte Pair Encoding (BPE), WordPiece, or SentencePiece). Instead, the model is trained directly at a Unicode character level. Training at a character level inevitably comes with a longer sequence length, which CANINE solves with an efficient downsampling strategy, before applying a deep Transformer encoder.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=canine
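A minimal sketch of using one of the new classes with the pretrained encoder. Since CANINE is tokenization-free, the tokenizer simply maps characters to Unicode code points; note that the classification head on top of google/canine-s is freshly initialized and would need fine-tuning before its outputs mean anything:

import torch
from transformers import CanineTokenizer, CanineForSequenceClassification

tokenizer = CanineTokenizer.from_pretrained("google/canine-s")
# The sequence-classification head is randomly initialized here.
model = CanineForSequenceClassification.from_pretrained("google/canine-s", num_labels=2)

inputs = tokenizer("CANINE operates directly on characters.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 2])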

Tokenizer training

This version introduces a new method to train a tokenizer from scratch based on the configuration of an existing tokenizer.

from datasets import load_dataset
from transformers import AutoTokenizer
dataset = load_dataset("wikitext", name="wikitext-2-raw-v1", split="train")
# We train on batches of texts, 1000 at a time here.
batch_size = 1000
corpus = (dataset[i : i + batch_size]["text"] for i in range(0, len(dataset), batch_size))
tokenizer = AutoTokenizer.from_pretrained("gpt2")
new_tokenizer = tokenizer.train_new_from_iterator(corpus, vocab_size=20000)
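The result is an ordinary fast tokenizer; for instance (directory name hypothetical), it can be saved and reloaded like any pretrained tokenizer:

new_tokenizer.save_pretrained("wikitext-2-tokenizer")
reloaded = AutoTokenizer.from_pretrained("wikitext-2-tokenizer")
print(reloaded.tokenize("A tokenizer retrained on wikitext-2"))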

... (truncated)

Commits

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Updates the requirements on [transformers](https://github.com/huggingface/transformers) to permit the latest version.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.1.0...v4.9.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Jul 22, 2021

dirkgr commented Jul 23, 2021

@AkshitaB, do you have an updated version of TensorCache that we can drop in here so I don't have to investigate what's wrong with the old one?

dependabot bot commented on behalf of github Jul 26, 2021

A newer version of transformers exists, but since this PR has been edited by someone other than Dependabot I haven't updated it. You'll get a PR for the updated version as normal once this PR is merged.

@dirkgr dirkgr merged commit fd429b2 into main Jul 26, 2021
@dirkgr dirkgr deleted the dependabot/pip/transformers-gte-4.1-and-lt-4.10 branch July 26, 2021 21:16