Reduce binary size of refine functions #1095

Open
wants to merge 3 commits into
base: branch-25.08

tfeher (Contributor) commented Jul 9, 2025

The refine functions that work with GPU data use IVF-Flat under the hood to perform the refinement operation. This PR adds extern template declarations for ivfflat_interleaved_scan and uses these in the refine functions. This way we avoid recompiling the IVF-Flat search kernels, and save binary size.
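The extern template mechanism described above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `interleaved_scan_sketch` and `refine_sketch` are hypothetical stand-ins, not the actual `ivfflat_interleaved_scan` signature or the cuVS refine code, and the three parts that would normally live in a header and two separate source files are shown in one file for brevity.

```cpp
#include <cassert>
#include <cstddef>

// --- Shared header (hypothetical) -------------------------------------------
// Stand-in for a heavy templated search routine. In the real code base this
// would be the expensive-to-compile kernel launcher.
template <typename T>
float interleaved_scan_sketch(const T* data, std::size_t n)
{
  assert(data != nullptr || n == 0);  // precondition: no null data unless empty
  float acc = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    acc += static_cast<float>(data[i]);
  }
  return acc;
}

// Explicit instantiation declaration: tells every translation unit that
// includes this header NOT to instantiate the float version implicitly,
// because some other object file provides it.
extern template float interleaved_scan_sketch<float>(const float*, std::size_t);

// --- Search translation unit (hypothetical) ---------------------------------
// Explicit instantiation definition: the single place where the float
// version is actually compiled.
template float interleaved_scan_sketch<float>(const float*, std::size_t);

// --- Refine translation unit (hypothetical) ---------------------------------
// The caller only references the already-compiled symbol; it does not
// generate its own copy of the template body.
float refine_sketch(const float* data, std::size_t n)
{
  return interleaved_scan_sketch<float>(data, n);
}
```

With this layout, each set of template arguments used by a caller has to match an explicit instantiation definition somewhere in the linked binary. If a caller uses arguments that are declared `extern` but never explicitly instantiated, the build fails at link time with an undefined symbol.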

@tfeher tfeher requested a review from a team as a code owner July 9, 2025 17:29
@tfeher tfeher self-assigned this Jul 9, 2025
@github-actions github-actions bot added the cpp label Jul 9, 2025
@tfeher tfeher added improvement Improves an existing functionality non-breaking Introduces a non-breaking change and removed cpp labels Jul 9, 2025
@github-actions github-actions bot added the cpp label Jul 9, 2025
tfeher (Contributor, Author) commented Jul 9, 2025

The ivfflat_interleaved_scan function is expected to be the largest contributor to the binary size of refine_device. We still have a few other kernel calls in refine_device; in a separate PR I will check whether we can get rid of those.

tfeher (Contributor, Author) commented Jul 9, 2025

The binary size is nicely reduced, but tests fail due to undefined symbols. It worked locally; I will look into it.

| filename | compile time | binary size |
|---|---|---|
| refine_device_half_float.cu.o | 112.149 s | 2.185 MB |
| refine_device_float_float.cu.o | 111.532 s | 2.263 MB |
| refine_device_uint8_t_float.cu.o | 111.155 s | 2.183 MB |
| refine_device_int8_t_float.cu.o | 110.015 s | 2.183 MB |

tfeher (Contributor, Author) commented Jul 14, 2025

The error is related to the filter type used for instantiating the search kernels. I am looking into the details.

Labels: cpp; improvement (Improves an existing functionality); non-breaking (Introduces a non-breaking change)