Fix to use is_initializing for init-detection. #2486

Merged
merged 1 commit into main from fix_batch_norm_init on Oct 18, 2022

Conversation

yotarok
Contributor

@yotarok commented Sep 27, 2022

What does this PR do?

The logic used for detecting initialization mode in linen.BatchNorm is currently based on the mutability of the params collection.
There is a cleaner way to do this via Module.is_initializing, and the previous logic does not work if the user provides mutable=True.
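
A minimal sketch of the idea (illustrative, not the exact diff):

```python
# Old check: "initializing" was inferred from the params collection being
# mutable. That is also true during apply(..., mutable=True), so a regular
# apply could be misclassified as initialization.
initializing = self.is_mutable_collection('params')

# New check: ask the module directly whether it is running inside Module.init.
initializing = self.is_initializing()
```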

Checklist

  • This PR fixes a minor issue (e.g., a typo or small bug) or improves the docs (you can dismiss the other checks if that's the case).
  • This change is discussed in a GitHub issue/discussion (please add a link).
    • (No GitHub issue posted yet)
  • The documentation and docstrings adhere to the documentation guidelines.
  • This change includes necessary high-coverage tests. (No quality testing = no merge!)
    • Covered by the existing tests on BatchNorm (see the test sketch after this list).
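
For illustration, a minimal regression test for the mutable=True case might look like this (hypothetical; not taken verbatim from the Flax test suite):

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

def test_batch_norm_updates_stats_with_mutable_true():
  model = nn.BatchNorm(use_running_average=False, momentum=0.9)
  x = jnp.full((4, 8), 2.0)
  variables = model.init(jax.random.PRNGKey(0), x)
  # With the fix, apply(..., mutable=True) is no longer mistaken for init,
  # so the running statistics must move toward the batch statistics.
  _, updates = model.apply(variables, x, mutable=True)
  assert not jnp.allclose(variables['batch_stats']['mean'],
                          updates['batch_stats']['mean'])
```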

@cgarciae
Collaborator

@levskaya was there a reason not to do the change proposed in this PR when you introduced is_initializing?

@cgarciae
Collaborator

@yotarok can you merge/rebase with main to fix the tests?

@yotarok force-pushed the fix_batch_norm_init branch from a3b5a6d to 1c6e197 on September 29, 2022 at 02:23
@yotarok
Contributor Author

yotarok commented Sep 29, 2022

@cgarciae Thanks for the comment. Rebased and force-pushed.

@levskaya
Collaborator

> @levskaya was there a reason not to do the change proposed in this PR when you introduced is_initializing?

Mostly I was overwhelmed and dropped it. Since it is a subtle change in semantics, we need to test this against existing users to make sure we're not introducing a potentially invisible breaking change... and probably announce the behavior change.

@yotarok
Contributor Author

yotarok commented Sep 30, 2022

I agree; once this PR is approved, we (or the Flax core devs) should announce the behavior change.
FYI, the issue I described in the PR description is also invisible: when a user provides mutable=True to apply, BatchNorm silently stops updating its statistics. This is hard to notice because the return value still contains the non-updated batch_stats, and no warning is emitted.
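
A hypothetical repro of the silent failure under the old logic (shapes and values are illustrative):

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

model = nn.BatchNorm(use_running_average=False, momentum=0.9)
x = jnp.full((4, 8), 2.0)
variables = model.init(jax.random.PRNGKey(0), x)

# mutable=True makes every collection mutable, including 'params', so the
# old mutability-based check treated this apply as initialization and
# skipped the running-statistics update.
y, updates = model.apply(variables, x, mutable=True)

# Before the fix, this printed the initial (unchanged) running mean,
# with no warning that the update had been skipped.
print(updates['batch_stats']['mean'])
```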

@codecov-commenter

Codecov Report

Merging #2486 (1c6e197) into main (cc88a73) will decrease coverage by 0.05%.
The diff coverage is 60.00%.

@@            Coverage Diff             @@
##             main    #2486      +/-   ##
==========================================
- Coverage   78.81%   78.76%   -0.06%     
==========================================
  Files          48       48              
  Lines        5066     5068       +2     
==========================================
- Hits         3993     3992       -1     
- Misses       1073     1076       +3     
Impacted Files                 Coverage Δ
flax/training/checkpoints.py   60.93% <40.00%> (-0.43%) ⬇️
flax/linen/normalization.py    97.08% <100.00%> (-0.03%) ⬇️
flax/serialization.py          69.09% <100.00%> (-0.19%) ⬇️


@yotarok
Contributor Author

yotarok commented Oct 4, 2022

Ping.
Or should I work on improving coverage to get this reviewed?

@jheek
Member

jheek commented Oct 10, 2022

@yotarok I'm going to run a large internal test suite to check whether this breaks anything in surprising ways.

@jheek
Member

jheek commented Oct 10, 2022

This doesn't cause any internal breakages. I think we can merge this.
