Update longformer.md #37622

Merged on Apr 21, 2025 (6 commits)

JihadHammoud02
Contributor

Refactored the Longformer docs
Added examples for Pipeline, AutoModel, and the CLI
Added a quantization example
Did not add an attention visualizer: from what I researched, Longformer isn't supported. If that's not the case, I'm happy to add it!
Added a note about versions < 4.37.0.dev

@github-actions github-actions bot marked this pull request as draft April 18, 2025 19:47
Contributor

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@JihadHammoud02 JihadHammoud02 marked this pull request as ready for review April 18, 2025 19:53
@github-actions github-actions bot requested a review from stevhliu April 18, 2025 19:54
Member

@stevhliu stevhliu left a comment

Nice, thanks for adding!


For more information please also refer to [`~LongformerModel.forward`] method.
Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the [Quantization](../quantization/overview) overview for more available quantization backends.
Member

I don't think we need a Quantization example here since the model isn't that big

- [Question answering task guide](../tasks/question_answering)
- [Masked language modeling task guide](../tasks/masked_language_modeling)
- [Multiple choice task guide](../tasks/multiple_choice)
- If you're using Transformers < 4.37.0.dev, set `trust_remote_code=True` in [~AutoModel.from_pretrained]. Otherwise, make sure you update Transformers to the latest stable version.
Member

Not necessary to include this note. Instead, add the below

  • Longformer is based on RoBERTa and doesn't have token_type_ids. You don't need to indicate which token belongs to which segment. You only need to separate the segments with the separation token </s> or tokenizer.sep_token.
  • You can set which tokens can attend locally and which tokens attend globally with the global_attention_mask at inference (see this example for more details). A value of 0 means a token attends locally and a value of 1 means a token attends globally.
  • [LongformerForMaskedLM] is trained like [RobertaForMaskedLM] and should be used as shown below.
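    # Longformer uses RoBERTa's `<mask>` token rather than `[MASK]`, so take it from the tokenizer.
    input_ids = tokenizer.encode(f"This is a sentence from {tokenizer.mask_token} training data", return_tensors="pt")
    mlm_labels = tokenizer.encode("This is a sentence from the training data", return_tensors="pt")
    # Pass the targets through `labels` (the older `masked_lm_labels` argument is deprecated).
    loss = model(input_ids, labels=mlm_labels)[0]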
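
The 0/1 convention for `global_attention_mask` described in the note above can be sketched without loading a model. This is only an illustration: `build_global_attention_mask` is a hypothetical helper, not part of Transformers, and the choice of which positions attend globally (here the first token and one other) is an assumption made for the example.

```python
from typing import Iterable, List


def build_global_attention_mask(seq_len: int, global_positions: Iterable[int]) -> List[int]:
    """Return a 0/1 mask: 0 = token attends locally, 1 = token attends globally.

    Hypothetical helper for illustration; it is not a Transformers API.
    """
    mask = [0] * seq_len
    for pos in global_positions:
        if not 0 <= pos < seq_len:
            raise IndexError(f"position {pos} is outside a sequence of length {seq_len}")
        mask[pos] = 1
    return mask


# Example: an 8-token sequence in which the first token and token 3 attend
# globally, while every other token keeps local (sliding-window) attention.
mask = build_global_attention_mask(8, [0, 3])
print(mask)  # [1, 0, 0, 1, 0, 0, 0, 0]
```

To use a mask like this at inference, it would need a batch dimension and the same shape as `input_ids` (for example `torch.tensor([mask])`), and is then passed to the model as `global_attention_mask` alongside `input_ids`.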

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@stevhliu stevhliu left a comment

Thanks again! 🤗

@stevhliu stevhliu merged commit b2db54f into huggingface:main Apr 21, 2025
10 checks passed
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
* Update longformer.md

* Update longformer.md

* Update docs/source/en/model_doc/longformer.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/model_doc/longformer.md

Co-authored-by: Steven Liu <[email protected]>

* Update longformer.md

---------

Co-authored-by: Steven Liu <[email protected]>