Add configurable normalization schemes to SigLIP image processors #38444
Summary
Addresses issue #38318 by adding configurable normalization schemes to SigLIP image processors, allowing users to choose between official SigLIP normalization and traditional ImageNet normalization while maintaining full backwards compatibility.
Problem
Users reported that SigLIP models may perform better for feature clustering when using traditional ImageNet normalization values instead of the official SigLIP values:
Official SigLIP: mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]
Traditional ImageNet: mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
However, changing the default values would break backwards compatibility and contradict official SigLIP documentation.
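To make the difference between the two sets of values concrete, here is a minimal, library-free sketch of per-channel normalization, out = (x - mean) / std, applied to a pure black and a pure white pixel. The helper name normalize_pixel is hypothetical and exists only for illustration; it is not part of transformers.

```python
# Per-channel normalization: out = (x - mean) / std,
# where x has already been rescaled from [0, 255] to [0, 1].
SIGLIP_MEAN, SIGLIP_STD = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
IMAGENET_MEAN, IMAGENET_STD = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean, std):
    """Normalize one RGB pixel (hypothetical helper, illustration only)."""
    return [(x - m) / s for x, m, s in zip(rgb, mean, std)]


black = [0.0, 0.0, 0.0]
white = [1.0, 1.0, 1.0]

# Same input pixels, two different output ranges.
siglip_black = normalize_pixel(black, SIGLIP_MEAN, SIGLIP_STD)
siglip_white = normalize_pixel(white, SIGLIP_MEAN, SIGLIP_STD)
imagenet_black = normalize_pixel(black, IMAGENET_MEAN, IMAGENET_STD)
imagenet_white = normalize_pixel(white, IMAGENET_MEAN, IMAGENET_STD)

print("siglip  :", siglip_black, siglip_white)
print("imagenet:", imagenet_black, imagenet_white)
```

With the SigLIP values every channel lands in [-1, 1]. With the ImageNet values the range is wider and differs per channel, roughly -2.1 to 2.6, so the model sees inputs on a different scale than the one it was trained with.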
Solution
Added a normalization_scheme parameter that provides user choice without breaking existing functionality.
Key Features:
"siglip" and "imagenet" schemes
image_mean / image_std
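The resolution logic could look roughly like the sketch below. It is not the code from this PR: the function resolve_normalization and the table NORMALIZATION_SCHEMES are hypothetical names, and the rule that an explicit image_mean or image_std takes precedence over the scheme is an assumption made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Scheme values as quoted in the Problem section.
NORMALIZATION_SCHEMES = {
    "siglip": ([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
    "imagenet": ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
}


def resolve_normalization(
    normalization_scheme: str = "siglip",
    image_mean: Optional[Sequence[float]] = None,
    image_std: Optional[Sequence[float]] = None,
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: turn a scheme name into (mean, std).

    Assumption: explicitly passed image_mean / image_std win over the scheme,
    so existing configs that set them keep behaving as before.
    """
    if normalization_scheme not in NORMALIZATION_SCHEMES:
        valid = ", ".join(sorted(NORMALIZATION_SCHEMES))
        raise ValueError(
            f"Unknown normalization_scheme {normalization_scheme!r}; expected one of: {valid}"
        )
    scheme_mean, scheme_std = NORMALIZATION_SCHEMES[normalization_scheme]
    mean = list(image_mean) if image_mean is not None else list(scheme_mean)
    std = list(image_std) if image_std is not None else list(scheme_std)
    return mean, std


# Default call keeps the official SigLIP values.
default_mean, default_std = resolve_normalization()
# Opting in to ImageNet statistics.
imagenet_mean, imagenet_std = resolve_normalization("imagenet")
```

Defaulting to "siglip" is what keeps the change backwards compatible: callers that never pass the new parameter get the same values as before.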
Usage Examples
Default Usage (No Changes Required)
Better Clustering Performance
Auto-Detection from Configs
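The description does not spell out how auto-detection works. One plausible reading, sketched below under that assumption, is that the scheme name is inferred by comparing the image_mean and image_std stored in a preprocessor config against the known schemes. The function detect_normalization_scheme and the plain-dict config are hypothetical stand-ins, not the PR's actual code.

```python
import math

# Scheme values as quoted in the Problem section.
KNOWN_SCHEMES = {
    "siglip": ([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
    "imagenet": ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
}


def _all_close(a, b, tol):
    """True if both sequences have the same length and match element-wise."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


def detect_normalization_scheme(config, tol=1e-6):
    """Hypothetical helper: infer the scheme name from a config dict.

    Returns "siglip" or "imagenet" when the stored statistics match a known
    scheme, and None for missing or custom values.
    """
    mean = config.get("image_mean")
    std = config.get("image_std")
    if mean is None or std is None:
        return None
    for name, (ref_mean, ref_std) in KNOWN_SCHEMES.items():
        if _all_close(mean, ref_mean, tol) and _all_close(std, ref_std, tol):
            return name
    return None


existing_config = {"image_mean": [0.5, 0.5, 0.5], "image_std": [0.5, 0.5, 0.5]}
print(detect_normalization_scheme(existing_config))
```

Returning None for custom statistics, rather than guessing the nearest scheme, leaves user-supplied values untouched.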
Files Changed
src/transformers/models/siglip/image_processing_siglip.py
src/transformers/models/siglip/image_processing_siglip_fast.py
Testing
The implementation has been validated to ensure:
Related
Fixes #38318
This solution serves both use cases: researchers can opt in to ImageNet normalization, which users report may improve clustering, while the official SigLIP normalization remains the default.