
use the enable_gqa param in torch.nn.functional.scaled_dot_product_at… #39412


Open

wants to merge 3 commits into base: main

Conversation

sywangyi
Contributor

GQA can be accelerated in torch.nn.functional.scaled_dot_product_attention: this PyTorch API offers an enable_gqa parameter. See https://docs.pytorch.org/docs/2.7/generated/torch.nn.functional.scaled_dot_product_attention.html#torch-nn-functional-scaled-dot-product-attention

Signed-off-by: Wang, Yi A <[email protected]>
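
For illustration, a minimal sketch of the API the description points to (not code from this PR; the shapes and head counts are made up): with enable_gqa=True, key/value tensors keep their smaller number of KV heads and SDPA broadcasts them internally, so the caller no longer has to repeat the KV heads up to the query head count.

```python
import torch
import torch.nn.functional as F

# Illustrative GQA shapes: 32 query heads sharing 8 key/value heads.
batch, q_heads, kv_heads, seq, head_dim = 2, 32, 8, 128, 64

query = torch.randn(batch, q_heads, seq, head_dim)
key = torch.randn(batch, kv_heads, seq, head_dim)
value = torch.randn(batch, kv_heads, seq, head_dim)

# Without enable_gqa the caller has to expand K/V to the query head count first.
key_rep = key.repeat_interleave(q_heads // kv_heads, dim=1)
value_rep = value.repeat_interleave(q_heads // kv_heads, dim=1)
out_repeat = F.scaled_dot_product_attention(query, key_rep, value_rep, is_causal=True)

# With enable_gqa=True (available in recent torch releases) SDPA broadcasts the
# KV heads itself, avoiding the materialized repeat above.
out_gqa = F.scaled_dot_product_attention(query, key, value, is_causal=True, enable_gqa=True)

torch.testing.assert_close(out_repeat, out_gqa, rtol=1e-4, atol=1e-4)
```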
@liangan1

@LuFinch please help to review this PR.

@sywangyi
Contributor Author

FAILED tests/models/nougat/test_image_processing_nougat.py::NougatImageProcessingTest::test_slow_fast_equivalence_batched - AssertionError: 0.005013074725866318 not less than or equal to 0.005

This failure is unrelated to this PR; the failing test does not use SDPA attention.

@vasqu
Contributor

vasqu commented Jul 15, 2025

Please see #35235 (comment)

The enable_gqa kwarg is pretty restrictive and would need proper checks around it (version, mask) to ensure we do not fall back to the math kernel / use unsupported features of older torch.
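
To make the concern concrete, here is a rough sketch of the kind of guard being asked for (the helper names and the 2.5 version cutoff are illustrative assumptions, not the PR's actual code): only pass enable_gqa when the installed torch is new enough and the head counts actually call for it, and keep the existing repeat-KV path as the fallback.

```python
from packaging import version

import torch

# Assumption in this sketch: enable_gqa requires torch >= 2.5; older versions
# raise a TypeError for the unknown keyword argument.
_TORCH_SUPPORTS_ENABLE_GQA = version.parse(torch.__version__.split("+")[0]) >= version.parse("2.5")


def should_use_enable_gqa(query: torch.Tensor, key: torch.Tensor) -> bool:
    """Illustrative check: request GQA broadcasting only when it is safe and useful."""
    num_q_heads, num_kv_heads = query.shape[1], key.shape[1]
    # A real implementation would also inspect the attention mask, since some
    # mask shapes can push SDPA off the fused kernels onto the math backend.
    return (
        _TORCH_SUPPORTS_ENABLE_GQA
        and num_q_heads != num_kv_heads
        and num_q_heads % num_kv_heads == 0
    )
```

The caller would then pass enable_gqa=should_use_enable_gqa(query, key) to F.scaled_dot_product_attention and skip the manual KV repeat when it returns True.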

Signed-off-by: Wang, Yi A <[email protected]>
@sywangyi
Contributor Author

Please see #35235 (comment)

The enable_gqa kwarg is pretty restrictive and would need proper checks around it (version, mask) to ensure we do not fall back to the math kernel / use unsupported features of older torch.

Thanks for the review; check added.

3 participants