[ascend]Optimize moe #203
Conversation
Pull Request Overview
This PR updates the CI configuration to use a new branch for LMDEPLOY, aligning with the "[ascend]Optimize moe" effort.
- Updated LMDEPLOY_COMMIT_OR_BRANCH from 'main' to 'optimize_moe' in the GitHub workflow.
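The change described above might look like the following in the workflow file (a sketch; the surrounding structure of `.github/workflows/main.yml` is assumed, only the variable name and values come from this PR):

```yaml
# Sketch of the CI change (assumption: surrounding workflow structure).
env:
  LMDEPLOY_COMMIT_OR_BRANCH: optimize_moe   # was: main
```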
Force-pushed from a61c7bb to d738195
Force-pushed from d94378d to df127d6
Force-pushed from 8acb00d to a4a5297
Pull Request Overview
This PR introduces optimizations for the moe module by updating function signatures and configurations. Key changes include:
- Adding new parameters (head_size and head_size_v) to multiple attention-related functions.
- Removing the slicing of past key values in the DeepseekV2Attention_forward function.
- Updating the CI workflow to reference the "optimize_moe" branch and a new Git repository URL.
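The first bullet can be sketched as follows. This is an illustrative signature only, assuming the common convention that `head_size_v` defaults to `head_size` when value heads match the query/key heads; the names and return value are hypothetical, not the actual dlinfer API:

```python
# Hypothetical sketch of threading explicit head sizes through an
# attention wrapper (assumption: names and defaults are illustrative).
from typing import Optional

def prefill_attention(num_heads: int, head_size: int,
                      head_size_v: Optional[int] = None) -> dict:
    """Return the per-head sizes the kernel would be launched with.

    head_size_v falls back to head_size in the common case where the
    value heads have the same width as the query/key heads.
    """
    if head_size_v is None:
        head_size_v = head_size
    return {"qk_head_size": head_size, "v_head_size": head_size_v}
```

With mismatched widths (as in MLA-style attention), the caller passes both sizes explicitly, e.g. `prefill_attention(8, 192, head_size_v=128)`.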
Reviewed Changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| dlinfer/ops/llm.py | Updated function signatures and documentation for attention operations. |
| dlinfer/framework/lmdeploy_ext/dynamo/graph_mode_patch.py | Modified parameter usage in the DeepseekV2Attention_forward function. |
| .github/workflows/main.yml | Adjusted CI environment variables and updated the Git repository URL. |
Comments suppressed due to low confidence (3)
dlinfer/ops/llm.py:132
- [nitpick] Consider rephrasing the docstring to 'head_size (int): The size of each query head' for clarity.
head_size (int): The number of query head size.
dlinfer/framework/lmdeploy_ext/dynamo/graph_mode_patch.py:89
- Removing the slice '[:nope_size]' might change the intended behavior. Please verify that passing the full tensor is correct.
past_key_value[0][..., :nope_size]
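The behavioral question the reviewer raises can be illustrated on a plain nested list standing in for `past_key_value[0]` (an assumption for demonstration; the real code operates on tensors):

```python
# Sketch (assumption): contrast slicing the last dimension to nope_size
# with passing the full tensor through, using a nested list as a stand-in.
nope_size = 4
key_states = [[float(i) for i in range(6)] for _ in range(2)]  # 2 tokens x 6 dims

sliced = [row[:nope_size] for row in key_states]  # old: only the "nope" part
full = key_states                                 # new: whole tensor passed on

assert all(len(row) == nope_size for row in sliced)
assert all(len(row) == 6 for row in full)
```

Whether the downstream kernel expects the truncated or the full width is exactly what the reviewer asks to verify.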
.github/workflows/main.yml:77
- Ensure that the repository URL update from 'InternLM' to 'DeepLink-org' aligns with all relevant CI/CD configurations.
git clone https://github.com/DeepLink-org/lmdeploy.git ${{ env.LMDEPLOY_PATH }}
Force-pushed from fee4ebe to 6a169d8
This reverts commit a4a5297.
No description provided.