Describe the bug
shufflenet_v2_x1_0 (int8) fails with "OSError: exception: access violation writing 0x000064656B636170" when testing PT2E with PyTorch (2.8.0.dev20250525) and Triton (3.4.0+gitae324eea) on BMG Windows.
Error log
F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\mkldnn_lowerings.py:731: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.detach().clone() or sourceTensor.detach().clone().requires_grad_(True), rather than torch.tensor(sourceTensor).
torch.tensor(w_zp_tensor, dtype=torch.int32), name=w_zp.get_name()
Traceback (most recent call last):
File "C:\libohao\pt2e-accuracy\scripts\modelbench\quant\inductor_quant_acc.py", line 250, in <module>
run_model(model, args)
File "C:\libohao\pt2e-accuracy\scripts\modelbench\quant\inductor_quant_acc.py", line 174, in run_model
quant_output = optimized_model(images)
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_dynamo\eval_frame.py", line 372, in __call__
return super().__call__(*args, **kwargs)
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\nn\modules\module.py", line 1767, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\nn\modules\module.py", line 1778, in _call_impl
return forward_call(*args, **kwargs)
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_dynamo\eval_frame.py", line 712, in compile_wrapper
raise e.remove_dynamo_frames() from None # see TORCHDYNAMO_VERBOSE=1
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\compile_fx.py", line 887, in _compile_fx_inner
raise InductorError(e, currentframe()).with_traceback(
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\compile_fx.py", line 871, in _compile_fx_inner
mb_compiled_graph = fx_codegen_and_compile(
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\compile_fx.py", line 1524, in fx_codegen_and_compile
return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\compile_fx.py", line 1402, in codegen_and_compile
compiled_module = graph.compile_to_module()
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\graph.py", line 2284, in compile_to_module
return self._compile_to_module()
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\graph.py", line 2294, in _compile_to_module
mod = self._compile_to_module_lines(wrapper_code)
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\graph.py", line 2358, in _compile_to_module_lines
mod = PyCodeCache.load_by_key_path(
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\codecache.py", line 3153, in load_by_key_path
mod = _reload_python_module(key, path, set_sys_modules=in_toplevel)
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\runtime\compile_tasks.py", line 31, in _reload_python_module
exec(code, mod.__dict__, mod.__dict__)
File "C:\Users\dcai01\AppData\Local\Temp\torchinductor_dcai01\67\c67xlbdjxy32izvkeesdol664bh7mya3ahafihe4toofwxv34e7w.py", line 1485, in <module>
triton_poi_fused_clone_quantize_per_tensor_17 = async_compile.triton('triton_poi_fused_clone_quantize_per_tensor_17', '''
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\async_compile.py", line 400, in triton
kernel.precompile(
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\runtime\triton_heuristics.py", line 410, in precompile
self._make_launchers()
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\runtime\triton_heuristics.py", line 567, in _make_launchers
launchers.append(result.make_launcher())
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\runtime\triton_heuristics.py", line 1526, in make_launcher
binary._init_handles()
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\triton\compiler\compiler.py", line 495, in _init_handles
self.module, self.function, self.n_regs, self.n_spills, self.n_max_threads = driver.active.utils.load_binary(
File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\triton\backends\intel\driver.py", line 209, in load_binary
return self.shared_library.load_binary(args)
torch._inductor.exc.InductorError: OSError: exception: access violation writing 0x0000000000005F53
Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
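A possible triage hint, not a diagnosis: both faulting addresses reported here (0x000064656B636170 in the description and 0x0000000000005F53 in the log) consist of printable ASCII bytes when read in little-endian order. That pattern can suggest that string data ended up being used as a pointer somewhere in the binary-loading path, though nothing in this report confirms it. The helper name `address_as_ascii` below is made up for this illustration.

```python
def address_as_ascii(address: int, width: int = 8) -> str:
    """Read an address as little-endian bytes and keep the printable ASCII ones."""
    raw = address.to_bytes(width, "little")
    return "".join(chr(b) for b in raw if 32 <= b < 127)


# The two addresses quoted in this report.
for addr in (0x000064656B636170, 0x0000000000005F53):
    print(f"{addr:#018x} -> {address_as_ascii(addr)!r}")

# 0x000064656b636170 -> 'packed'
# 0x0000000000005f53 -> 'S_'
```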
Reproducer
# conda env
conda create -n pt2e_27_ww24 python=3.10 -y
conda activate pt2e_27_ww24
pip install torch==2.8.0.dev20250525+xpu torchvision torchaudio --pre --index-url https://download.pytorch.org/whl/nightly/xpu
pip uninstall pytorch-triton-xpu -y
pip install pytorch_triton_xpu-3.4.0+gitae324eea-cp310-cp310-win_amd64.whl --no-deps
git clone https://github.com/pytorch/pytorch.git
cd pytorch
git checkout 3560b8ebe9277e8c25335e35d8c9e0872052b2dc
pip install -r requirements.txt
pip install -r .ci\docker\requirements-ci.txt
git clone -b main https://github.com/chuanqi129/inductor-tools pt2e-accuracy
git clone -b yifeng/pt2e_xpu https://github.com/zxd1997066/benchmark pt2e-performance
cd pt2e-performance
pip install -r requirements.txt
python install.py --continue_on_fail
pip install pyre_extensions
pip install fbgemm-gpu
pip install --no-deps torchmetrics==1.0.3 torchrec
pip install --force-reinstall git+https://github.com/huggingface/transformers@243e186efbf7fb93328dd6b34927a4e8c8f24395
pip install pandas==2.2.3 numpy==1.22.4
# testing
python pt2e-accuracy/scripts/modelbench/quant/inductor_quant_acc.py --device xpu --dataset_dir C:\libohao\imagenet\val --model_list shufflenet_v2_x1_0
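The error message asks for TORCHDYNAMO_VERBOSE=1 and TORCH_LOGS="+dynamo" when reporting. A minimal sketch of rerunning the testing command with both variables set is shown below. The function names `verbose_env` and `rerun_verbose` are hypothetical, and the sketch assumes it is started from the directory that contains pt2e-accuracy.

```python
import os
import subprocess
import sys

# The testing command from this reproducer, split into arguments.
COMMAND = [
    sys.executable,
    "pt2e-accuracy/scripts/modelbench/quant/inductor_quant_acc.py",
    "--device", "xpu",
    "--dataset_dir", r"C:\libohao\imagenet\val",
    "--model_list", "shufflenet_v2_x1_0",
]


def verbose_env(base=None):
    """Return a copy of the environment with the two debug variables from the log set."""
    env = dict(os.environ if base is None else base)
    env["TORCHDYNAMO_VERBOSE"] = "1"
    env["TORCH_LOGS"] = "+dynamo"
    return env


def rerun_verbose():
    """Run the testing command with verbose logging and return its exit code."""
    return subprocess.run(COMMAND, env=verbose_env(), check=False).returncode
```

Calling `rerun_verbose()` should produce the internal stack trace mentioned in the log. Setting the variables in a copied environment keeps the current shell unchanged.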
Environment details
- PyTorch: pip install torch==2.8.0.dev20250525+xpu torchvision torchaudio --pre --index-url https://download.pytorch.org/whl/nightly/xpu
- Triton:
- Machine (ARL):
  - CPU: Intel(R) Core(TM) Ultra 9 285H, 3700 MHz, 16 Core(s), 16 Logical Processor(s)
  - GPU: Intel(R) Arc(TM) 140T GPU(16GB)
  - Driver: 32.0.101.6881
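Since the Triton entry in the environment list is empty, the installed versions could be collected directly from the environment. The sketch below uses only the standard library. The distribution names in `PACKAGES` are assumptions taken from the pip commands in the reproducer, and the function names are hypothetical.

```python
import platform
from importlib import metadata

# Distribution names assumed from the pip commands in the reproducer.
PACKAGES = ("torch", "torchvision", "torchaudio", "pytorch-triton-xpu")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def environment_report():
    """Format Python, OS, and package versions as a list for a bug report."""
    lines = [
        f"- Python: {platform.python_version()}",
        f"- OS: {platform.platform()}",
    ]
    lines += [f"- {name}: {version}" for name, version in installed_versions().items()]
    return "\n".join(lines)


print(environment_report())
```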