Smarter LLM Post-Training Quantization using End Loss Guidance, boosting the performance of
state-of-the-art weight-only scalar, weight-only vector, and weight-and-activation quantization methods.
- May 2025: GuidedQuant is accepted to ICML 2025.
GuidedQuant enhances LLM quantization by integrating gradient information from the end loss into the quantization objective, boosting the performance of state-of-the-art weight-only scalar, weight-only vector, and weight-and-activation quantization methods. Additionally, we introduce LNQ, a non-uniform scalar quantization algorithm that is guaranteed to monotonically decrease the quantization objective value.
To be released soon.
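As a rough illustration of the idea above, the following is a minimal PyTorch sketch of a layer-wise quantization error weighted by end-loss gradients. The function name `guided_layer_objective`, the tensor shapes, and the exact element-wise weighting are illustrative assumptions for this sketch, not the paper's precise formulation; standard layer-wise post-training quantization instead minimizes the unweighted output error, which treats all output dimensions as equally important to the end loss.

```python
import torch

def guided_layer_objective(y, y_q, grad_y):
    """Gradient-weighted layer-wise quantization error (illustrative sketch).

    y:      original layer outputs, shape (num_tokens, d_out)
    y_q:    outputs of the same layer computed with quantized weights
    grad_y: gradients of the end loss w.r.t. y, e.g. captured with a
            backward hook over a small calibration set

    Weighting each squared output error by the squared end-loss gradient
    focuses the objective on the activations that actually influence the
    final loss, rather than treating all outputs equally.
    """
    return (grad_y.pow(2) * (y - y_q).pow(2)).sum()

# Toy usage with random tensors standing in for calibration data.
if __name__ == "__main__":
    torch.manual_seed(0)
    y = torch.randn(128, 4096)             # original layer outputs
    y_q = y + 0.01 * torch.randn_like(y)   # outputs after weight quantization
    g = torch.randn(128, 4096)             # end-loss gradients w.r.t. y
    print(guided_layer_objective(y, y_q, g).item())
```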
Please cite our paper if you find our work useful:
@inproceedings{kim2025guidedquant,
  title={GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance},
  author={Jinuk Kim and Marwa El Halabi and Wonpyo Park and Clemens JS Schaefer and Deokjae Lee and Yeonhong Park and Jae W. Lee and Hyun Oh Song},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025},
}