
add inductor xpu #16

Open · wants to merge 8 commits into main
Conversation

RUIJIEZHONG66166 (Collaborator):

No description provided.

else
wget -q -e "https_proxy=http://proxy-us.intel.com:912" https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b
source ${HOME}/miniconda3/etc/profile.d/conda.sh 2>&1 >> /dev/null
chuanqi129 (Owner):

This line can be replaced by: source ${HOME}/miniconda3/bin/activate

RUIJIEZHONG66166 (Collaborator, Author):

ok

chuanqi129 (Owner):

This can be applied in other places as well.

fi
else
wget -q -e "https_proxy=http://proxy-us.intel.com:912" https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b
chuanqi129 (Owner):

Have you run into the miniconda package issue like conda/conda#13225 recently? The package https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh (currently the same as Miniconda3-py311_23.9.0-0-Linux-x86_64.sh) seems to have an issue now. I suggest using another Python version, e.g. Miniconda3-py39_23.9.0-0-Linux-x86_64.sh.
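If the pinned-version workaround is needed, it could look like the following sketch (MINICONDA_VER is a hypothetical variable, not part of the current script):

```shell
# Sketch of the suggested workaround: pin the Miniconda installer version
# instead of "latest" (see conda/conda#13225). MINICONDA_VER is assumed.
MINICONDA_VER="${MINICONDA_VER:-py39_23.9.0-0}"
installer="Miniconda3-${MINICONDA_VER}-Linux-x86_64.sh"
url="https://repo.anaconda.com/miniconda/${installer}"
echo "would download: ${url}"
```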

RUIJIEZHONG66166 (Collaborator, Author):

We've never met this issue. We usually install Miniconda3-latest-Linux-x86_64.sh on our machines first.

chuanqi129 (Owner):

OK, please be aware of it; when we hit such an issue we can use a similar workaround.

fi

# set gpu governor
if [[ -z "${USER_PASS}" ]];then USER_PASS="gta";fi
chuanqi129 (Owner):

Don't hardcode the password in this public repo; if we really need such info, let's pass it in via a Jenkins credential.

RUIJIEZHONG66166 (Collaborator, Author):

Set it as a Jenkins parameter.
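A minimal sketch of the parameterized form, assuming the Jenkins job injects USER_PASS (the function name is illustrative, not part of the current script):

```shell
# Sketch (assumed parameter name): fail fast when the Jenkins-provided
# credential is missing, instead of falling back to a hardcoded default.
require_user_pass() {
    if [[ -z "${USER_PASS:-}" ]]; then
        echo "ERROR: USER_PASS must be provided as a Jenkins parameter/credential" >&2
        return 1
    fi
}
```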

chuanqi129 (Owner):

If the CPU mode setting is moved, do we still need this one?

TRITON_CODEGEN_INTEL_XPU_BACKEND=1 python setup.py bdist_wheel
pip install dist/*.whl
source ${HOME}/env.sh
python -c "import triton"
chuanqi129 (Owner):

Maybe we need to move to the parent dir to do the import check, to avoid a path issue: there is a triton folder under python, which can shadow the installed package.
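A small demonstration of the shadowing problem the comment describes, using the stdlib json module as a stand-in for triton (temporary paths; assumes python3 on PATH):

```shell
# Demonstration of the path issue: a local directory that looks like a
# package shadows the installed module when Python runs from that directory.
# Uses stdlib 'json' as a stand-in for 'triton'; paths are temporary.
workdir=$(mktemp -d)
mkdir "${workdir}/json"
touch "${workdir}/json/__init__.py"   # local empty 'json' package shadows the stdlib one
if (cd "${workdir}" && python3 -c "import json; json.dumps" 2>/dev/null); then
    inside="ok"
else
    inside="shadowed"                 # the local empty package has no 'dumps'
fi
if python3 -c "import json; json.dumps" 2>/dev/null; then
    outside="ok"                      # from another directory the real module is found
else
    outside="broken"
fi
echo "inside=${inside} outside=${outside}"
rm -rf "${workdir}"
```

This is why running the `python -c "import triton"` check from the build directory can pick up the source tree instead of the installed wheel.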

wget -q -e use_proxy=no ${ipex_whl}
python -m pip install --force-reinstall $(basename ${ipex_whl})
else
bash ${WORKSPACE}/inductor-tools/scripts/env_prepare.sh
chuanqi129 (Owner):

I don't see any parameters at the beginning of this env_prepare.sh script; please add them. By the way, please also add the oneAPI version as a parameter for this script and for env.sh, to add basekit version control. You can refer to the latest ci/nightly workflows in the xpu backend repo.

@@ -0,0 +1,57 @@
installed_torch_git_version=$(python -c "import torch;print(torch.version.git_version)"|| true)
chuanqi129 (Owner):

Add the parameters here.

chuanqi129 (Owner) left a comment:

Second round of comments; these aim to improve the jobs' flexibility.

'''
}//retry
}//stage
stage('Accuracy-Test') {
chuanqi129 (Owner):

Add a multi-choice parameter SCENARIO to this Groovy file and the Jenkins job; its values can be accuracy and performance. If accuracy is in SCENARIO, run the accuracy test, otherwise bypass this stage. The performance stage is handled similarly to the accuracy part. The default value can be both accuracy and performance.
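One way the gate could be sketched in shell terms (the Groovy stage would do the equivalent; the comma-separated SCENARIO format is an assumption):

```shell
# Sketch of the SCENARIO gate: run the Accuracy-Test stage only when
# "accuracy" is among the selected scenarios. Names are assumptions.
SCENARIO="${SCENARIO:-accuracy,performance}"
if [[ ",${SCENARIO}," == *",accuracy,"* ]]; then
    accuracy_stage="run"
else
    accuracy_stage="skip"
fi
echo "Accuracy-Test: ${accuracy_stage}"
```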

pip install styleFrame scipy pandas
pushd ${WORKSPACE}/pytorch
rm -rf inductor_log
bash inductor_xpu_test.sh huggingface amp_bf16 inference accuracy xpu 0 & \
chuanqi129 (Owner):

Also keep SUITE, DT and MODE as parameters of this Groovy file and the Jenkins job. All of them should be multi-choice parameters; combine their values and run the tests one by one. The default values can match the current hardcoded cmd. For example, if SUITE={huggingface,timm_models}, DT={float32,amp_bf16} and MODE={inference,training}, there will be 8 combinations to test.
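The combination expansion the comment describes can be sketched as follows (values mirror the reviewer's example, not the final job parameters):

```shell
# Sketch: expand the multi-choice SUITE/DT/MODE parameters into the full
# list of test combinations. Space-separated values are an assumption.
SUITE="huggingface timm_models"
DT="float32 amp_bf16"
MODE="inference training"
combos=()
for suite in ${SUITE}; do
    for dt in ${DT}; do
        for mode in ${MODE}; do
            combos+=("${suite}-${dt}-${mode}")
        done
    done
done
echo "${#combos[@]} combinations"   # 2 x 2 x 2 = 8
```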

conda activate ${conda_env}
source ${HOME}/env.sh ${oneapi_ver}

cd ${WORKSPACE}/pytorch/inductor_log/huggingface
chuanqi129 (Owner):

Extract this log-summary part into a standalone script, and keep it flexible for different suite/dt/mode/scenario combinations (low priority). A simple way is for the log-summary script to receive those parameters and maintain the pass criteria for the different tests (using a configuration file or a dict-like structure), e.g. {"huggingface-amp_bf16-inference-accuracy": 44, "huggingface-amp_bf16-training-accuracy": 42}.
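A sketch of the dict-like criteria structure in bash (requires bash 4+ associative arrays; the threshold numbers come from the comment's example):

```shell
# Sketch: pass-count criteria for the standalone log-summary script,
# keyed by "suite-dt-mode-scenario". Numbers are the comment's example.
declare -A expected_pass=(
    ["huggingface-amp_bf16-inference-accuracy"]=44
    ["huggingface-amp_bf16-training-accuracy"]=42
)
key="huggingface-amp_bf16-inference-accuracy"
echo "criterion for ${key}: ${expected_pass[${key}]}"
```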

}//retry
}//stage
stage('Performance-Test') {
println('================================================================')
chuanqi129 (Owner):

Similar request as for the accuracy test stage.

bash inductor_xpu_test.sh huggingface amp_bf16 inference accuracy xpu 0 & \
bash inductor_xpu_test.sh huggingface amp_bf16 training accuracy xpu 1 & \
bash inductor_xpu_test.sh huggingface amp_fp16 inference accuracy xpu 2 & \
bash inductor_xpu_test.sh huggingface amp_fp16 training accuracy xpu 3 & wait
chuanqi129 (Owner):

The right cmd is:

          bash inductor_xpu_test.sh ${SUITE} ${DT} ${MODE} accuracy xpu 0 static 4 0 & \
          bash inductor_xpu_test.sh ${SUITE} ${DT} ${MODE} accuracy xpu 1 static 4 1 & \
          bash inductor_xpu_test.sh ${SUITE} ${DT} ${MODE} accuracy xpu 2 static 4 2 & \
          bash inductor_xpu_test.sh ${SUITE} ${DT} ${MODE} accuracy xpu 3 static 4 3 & wait

It means one combination will be split into 4 sub-tests that run on 4 cards at the same time. Please modify the cmd in the performance part as well. Your previous cmd ran 2 combinations at the same time, with each combination run twice on 2 cards, rather than split into sub-tests.
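Generating the four sharded sub-test commands for one combination could be sketched as below (SUITE/DT/MODE values are placeholders for the job parameters):

```shell
# Sketch: one SUITE/DT/MODE combination sharded as "static 4 <shard>"
# across cards 0-3, matching the cmd in the comment above.
SUITE=huggingface; DT=amp_bf16; MODE=inference
cmds=()
for card in 0 1 2 3; do
    cmds+=("bash inductor_xpu_test.sh ${SUITE} ${DT} ${MODE} accuracy xpu ${card} static 4 ${card}")
done
printf '%s\n' "${cmds[@]}"
```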

chuanqi129 (Owner):

Hi @RUIJIEZHONG66166, any update on those comments?
