
[Feature] Enable inference support for Deepseekr1-w8a8-MTP #1834

Open
wants to merge 7 commits into
base: main
from

Conversation

@Irving11-BKN commented Jul 17, 2025

  1. Support inference for the Deepseekr1-w8a8-mtp model, whose MTP layers use a statically quantized shared_head.

Signed-off-by: curryliu [email protected]
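
For readers unfamiliar with the term, here is a minimal sketch of what "statically quantized" W8A8 generally means: int8 weights combined with int8 activations whose scale is fixed ahead of time instead of being computed per input. This is a generic illustration, not the code in this PR; `quantize_static`, `w8a8_linear`, and the scale values are hypothetical names and numbers chosen for the example.

```python
def quantize_static(values, scale):
    """Quantize floats to the int8 range using a fixed, precomputed scale.

    'Static' here means the scale is decided offline (e.g. during calibration)
    and is not recomputed from each incoming activation tensor.
    """
    return [max(-128, min(127, round(v / scale))) for v in values]


def w8a8_linear(x, w_q, x_scale, w_scale):
    """Hypothetical W8A8 matrix-vector product: int8 weights, int8 activations."""
    x_q = quantize_static(x, x_scale)
    out = []
    for row in w_q:
        acc = sum(wi * xi for wi, xi in zip(row, x_q))  # integer accumulation
        out.append(acc * x_scale * w_scale)             # dequantize the result
    return out


# Assumed example values: weights already stored as int8 with w_scale = 0.25,
# i.e. the float weights are [[1.0, 0.5], [-0.25, 0.75]].
w_q = [[4, 2], [-1, 3]]
y = w8a8_linear([0.5, -1.0], w_q, x_scale=0.0625, w_scale=0.25)
print(y)
```

Because the activation scale is a constant, values outside the calibrated range are clipped to the int8 limits, which is the usual trade-off of static over dynamic quantization.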

l30074184 added 5 commits July 17, 2025 09:57
Signed-off-by: l30074184 <[email protected]>
Signed-off-by: l30074184 <[email protected]>
Signed-off-by: l30074184 <[email protected]>
Signed-off-by: l30074184 <[email protected]>
Signed-off-by: l30074184 <[email protected]>

codecov bot commented Jul 17, 2025

Codecov Report

Attention: Patch coverage is 33.33333% with 12 lines in your changes missing coverage. Please review.

Project coverage is 60.40%. Comparing base (c30ddb8) to head (a67bc73).
Report is 144 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| vllm_ascend/quantization/quant_config.py | 33.33% | 6 Missing ⚠️ |
| vllm_ascend/models/deepseek_mtp.py | 42.85% | 4 Missing ⚠️ |
| vllm_ascend/models/deepseek_v2.py | 0.00% | 2 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (33.33%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.
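
As a sanity check on the figures above, the per-file numbers can be turned back into line counts and summed. The sketch below assumes that each file's patch percentage is covered lines divided by changed lines, so the changed-line count can be recovered from the missing count; the `changed_lines` helper is hypothetical and not part of Codecov.

```python
# (patch coverage %, missing lines) per file, copied from the report.
report = {
    "vllm_ascend/quantization/quant_config.py": (33.33, 6),
    "vllm_ascend/models/deepseek_mtp.py": (42.85, 4),
    "vllm_ascend/models/deepseek_v2.py": (0.00, 2),
}


def changed_lines(patch_pct, missing):
    """Recover the number of changed lines from coverage % and missing count."""
    return round(missing / (1 - patch_pct / 100))


total = sum(changed_lines(pct, miss) for pct, miss in report.values())
missing = sum(miss for _, miss in report.values())
patch_coverage = 100 * (total - missing) / total

print(total, missing, f"{patch_coverage:.2f}%")
```

Under that assumption the patch touches 18 coverable lines (9 + 7 + 2), of which 12 are missing and 6 are hit, matching the reported 33.33%.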

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1834       +/-   ##
===========================================
+ Coverage   27.39%   60.40%   +33.00%     
===========================================
  Files          56       72       +16     
  Lines        6191     8117     +1926     
===========================================
+ Hits         1696     4903     +3207     
+ Misses       4495     3214     -1281     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 60.40% <33.33%> (+33.00%) ⬆️ |

@Irving11-BKN force-pushed the main branch 2 times, most recently from 27803cb to a67bc73 on July 18, 2025 08:19
Signed-off-by: l30074184 <[email protected]>