ENH: Allow FSDP ignored modules to be regex #3698
Conversation
Description

For FSDP, there is an option to indicate ignored_modules, which should be a list of modules that are ignored by FSDP. Even though this argument was supported in accelerate, it was not very usable:

1. Listing all modules can be tricky, especially with something like PEFT, where the whole model is wrapped and thus the module structure changes.
2. When configuring this argument, accelerate takes a detour via environment variables. These can only be strings, so passing a list of modules is not feasible.

Moreover, I noticed that the environment variable for ignored_modules was not even set, so configuring this argument didn't work at all.

Status

Don't merge yet: this PR is lacking tests. I would be happy for pointers on how to add those.

Context

When using PEFT with LoRA and the new target_parameters feature, I ran into an issue training such a model with FSDP. The only working fix I found was to ignore the layers targeted by LoRA. However, I could not configure accelerate to do that. With this PR, it is possible: I could successfully train such a PEFT model that targets q_proj and v_proj by setting fsdp_ignored_modules: '.*\.(q_proj$|v_proj$)'.
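To illustrate the intended behavior, here is a minimal, self-contained sketch of how a regex like the one above could be resolved into the concrete module list that FSDP's ignored_modules expects. The helper resolve_ignored_modules and the toy model are hypothetical and only mirror the idea; they are not the actual implementation in this PR.

```python
import re

import torch.nn as nn


def resolve_ignored_modules(model: nn.Module, pattern: str) -> list[nn.Module]:
    """Collect all submodules whose fully qualified name matches the regex.

    Hypothetical helper; not the actual implementation in this PR.
    """
    regex = re.compile(pattern)
    return [module for name, module in model.named_modules() if regex.fullmatch(name)]


# Toy stand-in for a transformer attention block so the example is self-contained.
class ToyAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.q_proj = nn.Linear(8, 8)
        self.k_proj = nn.Linear(8, 8)
        self.v_proj = nn.Linear(8, 8)


model = nn.Sequential(ToyAttention(), ToyAttention())

# The regex from the PR description matches q_proj/v_proj regardless of how deeply
# they are nested, e.g. after PEFT wraps the model and the module prefixes change.
ignored = resolve_ignored_modules(model, r".*\.(q_proj$|v_proj$)")
assert len(ignored) == 4  # q_proj and v_proj from both blocks
# `ignored` could then be handed to FSDP as `ignored_modules=ignored`.
```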
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
The failing test is unrelated and caused by an error in transformers that makes it not work with torch < 2.4. It should be patched soon.
Thanks for fixing this! You can have a look at the tests/fsdp/test_fsdp.py file. cc @S1ro1 for more info.
@SunMarc Thanks for the pointer, but the issue I encountered is the following: I want to initialize an
I couldn't find any test that would do something along these lines. There are some tests that use configs, like in
Thanks for the pointer @S1ro1 and sorry for my basic questions, but I'm still struggling to see how I can test the change. The two problems I have when using said function as a starting point:
I guess I could try to add 3 tests: 1) Check that the config correctly sets the corresponding env var. 2) A test similar to
I think it's reasonable to test only the path after the config file; the config_file -> env path is tested quite okay, plus it's usually a place that is touched once and never again. IMO, testing from env/fsdp_plugin -> model wrapper is the way to go.
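To make that suggestion more concrete, here is a rough sketch of what such a test could check. The environment variable name FSDP_IGNORED_MODULES is an assumption about what this PR introduces (the real name may differ), and the regex resolution is inlined here as a stand-in for whatever FullyShardedDataParallelPlugin ends up doing before handing ignored_modules to FSDP.

```python
import os
import re

import torch.nn as nn


class ToyAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.q_proj = nn.Linear(8, 8)
        self.v_proj = nn.Linear(8, 8)


def test_fsdp_ignored_modules_regex():
    # Assumed env var name introduced by this PR; the real name may differ.
    os.environ["FSDP_IGNORED_MODULES"] = r".*\.(q_proj$|v_proj$)"

    model = nn.Sequential(ToyAttention(), ToyAttention())

    # Stand-in for the resolution the FSDP plugin would perform before wrapping the
    # model; the real test would instead build a FullyShardedDataParallelPlugin (or
    # call accelerator.prepare) and inspect what is passed to FSDP as ignored_modules.
    pattern = re.compile(os.environ["FSDP_IGNORED_MODULES"])
    ignored = [m for n, m in model.named_modules() if pattern.fullmatch(n)]

    expected = {model[0].q_proj, model[0].v_proj, model[1].q_proj, model[1].v_proj}
    assert set(ignored) == expected
```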
Who can review?
@SunMarc