Add `inputs_k` and `inputs_v` args to attention layer #3379 (Closed)
Currently, the `MultiHeadDotProductAttention` layer's call method signature is `MultiHeadDotProductAttention.__call__(inputs_q, inputs_kv, mask=None, deterministic=None)`. As discussed in #1737, there are some cases where passing separate values for the keys and values is desired, which isn't possible with the current API. This PR adds two more arguments, `inputs_k` and `inputs_v`, to the call method and changes the signature to the following: `MultiHeadDotProductAttention.__call__(inputs_q, inputs_k=None, inputs_v=None, *, inputs_kv=None, mask=None, deterministic=None)`. Note that the `inputs_kv`, `mask` and `deterministic` args are now keyword-only arguments. The defaults resolve as follows (see the sketch after this list):

- If `inputs_k` and `inputs_v` are both `None`, they will both copy the value of `inputs_q` (i.e. self-attention).
- If only `inputs_v` is `None`, it will copy the value of `inputs_k` (same behavior as the previous API, i.e. `module.apply(inputs_q=query, inputs_k=key_value, ...)` is equivalent to `module.apply(inputs_q=query, inputs_kv=key_value, ...)`).
- If `inputs_kv` is not `None`, both `inputs_k` and `inputs_v` will copy the value of `inputs_kv`.
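
To make these defaults concrete, here is a minimal sketch of the three call patterns. It assumes a Flax build that includes this PR; the shapes, `num_heads`, and `qkv_features` values are arbitrary illustrations.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

# Arbitrary illustrative shapes: (batch, seq_len, features).
query = jnp.ones((2, 4, 8))
key = jnp.ones((2, 6, 8))
value = jnp.zeros((2, 6, 8))

layer = nn.MultiHeadDotProductAttention(num_heads=2, qkv_features=8)
params = layer.init(jax.random.PRNGKey(0), query, key, value)

# Separate key and value tensors, which the old API couldn't express.
out_cross = layer.apply(params, query, key, value)

# inputs_v is None, so values copy inputs_k (matches the old inputs_kv path).
out_shared = layer.apply(params, query, key)

# inputs_k and inputs_v are both None, so both copy inputs_q: self-attention.
out_self = layer.apply(params, query)
```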
Users can still use `inputs_kv`, but a `DeprecationWarning` will be raised and `inputs_kv` will be removed in the future.

Since self-attention can be done using this new API, the `SelfAttention` layer will also raise a `DeprecationWarning` and will be removed in the future.

Check out #3389 to see examples of how to port your code over to the new API; a rough sketch of such a port follows below.
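
The authoritative porting examples live in #3389; the following is only a sketch, under the same illustrative assumptions as above, of moving a `SelfAttention` call and an `inputs_kv` call onto the new API.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

x = jnp.ones((2, 4, 8))   # arbitrary (batch, seq_len, features)
kv = jnp.ones((2, 6, 8))
rng = jax.random.PRNGKey(0)

# Before: a dedicated SelfAttention layer (deprecated by this PR).
old_self = nn.SelfAttention(num_heads=2, qkv_features=8)
_ = old_self.apply(old_self.init(rng, x), x)

# After: plain MultiHeadDotProductAttention; with a single positional
# input, inputs_k and inputs_v default to inputs_q (self-attention).
attn = nn.MultiHeadDotProductAttention(num_heads=2, qkv_features=8)
params = attn.init(rng, x)  # kernel shapes depend only on feature dims
_ = attn.apply(params, x)

# Before: inputs_kv still works here, but raises a DeprecationWarning.
_ = attn.apply(params, x, inputs_kv=kv)

# After: pass the tensor as inputs_k; inputs_v copies it automatically.
_ = attn.apply(params, x, kv)
```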