[Feature] optimize group gemm #3323

zhyncs · 2025-02-05T22:56:43Z

Rewrite the Grouped GEMM used by LoRA with cuBLAS 12.5 in sgl-kernel for improved speed.
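
For readers unfamiliar with the term, here is a minimal pure-Python sketch of what a grouped GEMM computes: a list of independent matrix products whose shapes may differ from group to group, which is the pattern that arises when a batch mixes requests with different token counts and LoRA ranks. It only illustrates the semantics. It is not the sgl-kernel implementation and does not call cuBLAS; the helper names `matmul` and `grouped_gemm` and the sample shapes are assumptions made for this example.

```python
def matmul(a, b):
    """Plain row-major product of a (m x k) and b (k x n), as nested lists."""
    k = len(b)
    assert all(len(row) == k for row in a), "inner dimensions must match"
    n = len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def grouped_gemm(xs, ws):
    """Reference semantics of a grouped GEMM: one independent product per
    (x, w) pair. Shapes may differ between groups. A fused GPU kernel would
    launch all groups together instead of looping like this."""
    assert len(xs) == len(ws), "need one weight matrix per input group"
    return [matmul(x, w) for x, w in zip(xs, ws)]


# Two hypothetical requests with different token counts and adapter ranks.
xs = [
    [[1.0, 2.0]],               # request 0: 1 token, hidden size 2
    [[1.0, 0.0], [0.0, 1.0]],   # request 1: 2 tokens, hidden size 2
]
ws = [
    [[1.0], [1.0]],             # adapter 0: hidden 2 -> rank 1
    [[2.0, 3.0], [4.0, 5.0]],   # adapter 1: hidden 2 -> rank 2
]
outs = grouped_gemm(xs, ws)
print(outs)
```

The Python loop stands in for the part a grouped kernel would fuse: every group is independent, so a single launch can cover all of them rather than issuing one small GEMM per adapter.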

Fridge003 · 2025-02-05T22:59:46Z

Thanks, this seems to be a good idea!

zhyncs added the performance label Feb 5, 2025

zhyncs assigned Fridge003 Feb 5, 2025

Fridge003 mentioned this issue Feb 5, 2025

[Feature] Lora Development Roadmap #2929

Open

16 tasks

Fridge003 added the lora label Feb 5, 2025

zhyncs added the high priority label Feb 12, 2025

Fridge003 mentioned this issue Feb 17, 2025

[Feature] Apply Cublas Grouped Gemm kernel #3629

Merged

6 tasks

Fridge003 closed this as completed Feb 20, 2025
