
[PT2E-PT2.8][Windows] shufflenet_v2_x1_0 (int8) got "torch._inductor.exc.InductorError: OSError: exception: access violation writing 0x000064656B636170" #4541

Open
@libohao1201

Describe the bug

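Running shufflenet_v2_x1_0 (int8) through PT2E fails with "OSError: exception: access violation writing 0x000064656B636170" during Inductor compilation, when testing with pytorch (2.8.0.dev20250525) and triton (3.4.0+gitae324eea) on BMG Windows. The failure is raised from load_binary in triton\backends\intel\driver.py. Note that the log below reports a different faulting address (0x0000000000005F53) than the one quoted here.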
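
A possible triage aid, added as an observation and not part of the original analysis: both faulting addresses in this report consist only of printable ASCII bytes when read in little-endian order. That may suggest string data is being treated as a pointer somewhere in the binary-loading path, though this is speculation. The helper name address_as_ascii is hypothetical and only illustrates the decoding.

```python
def address_as_ascii(address: int, width: int = 8) -> str:
    """Render a pointer value as little-endian bytes, showing printable ASCII.

    Non-printable bytes (including zero padding) are shown as '.'.
    """
    raw = address.to_bytes(width, byteorder="little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Address quoted in the issue title and description.
print(address_as_ascii(0x000064656B636170))  # packed..
# Address that appears in the error log.
print(address_as_ascii(0x0000000000005F53))  # S_......
```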

Error log

F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\mkldnn_lowerings.py:731: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.detach().clone() or sourceTensor.detach().clone().requires_grad_(True), rather than torch.tensor(sourceTensor).
  torch.tensor(w_zp_tensor, dtype=torch.int32), name=w_zp.get_name()
Traceback (most recent call last):
  File "C:\libohao\pt2e-accuracy\scripts\modelbench\quant\inductor_quant_acc.py", line 250, in <module>
    run_model(model, args)
  File "C:\libohao\pt2e-accuracy\scripts\modelbench\quant\inductor_quant_acc.py", line 174, in run_model
    quant_output = optimized_model(images)
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_dynamo\eval_frame.py", line 372, in __call__
    return super().__call__(*args, **kwargs)
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\nn\modules\module.py", line 1767, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\nn\modules\module.py", line 1778, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_dynamo\eval_frame.py", line 712, in compile_wrapper
    raise e.remove_dynamo_frames() from None  # see TORCHDYNAMO_VERBOSE=1
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\compile_fx.py", line 887, in _compile_fx_inner
    raise InductorError(e, currentframe()).with_traceback(
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\compile_fx.py", line 871, in _compile_fx_inner
    mb_compiled_graph = fx_codegen_and_compile(
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\compile_fx.py", line 1524, in fx_codegen_and_compile
    return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\compile_fx.py", line 1402, in codegen_and_compile
    compiled_module = graph.compile_to_module()
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\graph.py", line 2284, in compile_to_module
    return self._compile_to_module()
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\graph.py", line 2294, in _compile_to_module
    mod = self._compile_to_module_lines(wrapper_code)
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\graph.py", line 2358, in _compile_to_module_lines
    mod = PyCodeCache.load_by_key_path(
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\codecache.py", line 3153, in load_by_key_path
    mod = _reload_python_module(key, path, set_sys_modules=in_toplevel)
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\runtime\compile_tasks.py", line 31, in _reload_python_module
    exec(code, mod.__dict__, mod.__dict__)
  File "C:\Users\dcai01\AppData\Local\Temp\torchinductor_dcai01\67\c67xlbdjxy32izvkeesdol664bh7mya3ahafihe4toofwxv34e7w.py", line 1485, in <module>
    triton_poi_fused_clone_quantize_per_tensor_17 = async_compile.triton('triton_poi_fused_clone_quantize_per_tensor_17', '''
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\async_compile.py", line 400, in triton
    kernel.precompile(
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\runtime\triton_heuristics.py", line 410, in precompile
    self._make_launchers()
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\runtime\triton_heuristics.py", line 567, in _make_launchers
    launchers.append(result.make_launcher())
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\torch\_inductor\runtime\triton_heuristics.py", line 1526, in make_launcher
    binary._init_handles()
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\triton\compiler\compiler.py", line 495, in _init_handles
    self.module, self.function, self.n_regs, self.n_spills, self.n_max_threads = driver.active.utils.load_binary(
  File "F:\miniforge\envs\pt2e_27_ww24\lib\site-packages\triton\backends\intel\driver.py", line 209, in load_binary
    return self.shared_library.load_binary(args)
torch._inductor.exc.InductorError: OSError: exception: access violation writing 0x0000000000005F53

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
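
The hint above names two environment variables for collecting a fuller trace. A minimal sketch of enabling them from Python follows. It assumes they should be set before torch is imported, which is the safe ordering but has not been verified against this nightly build, and that the lines would go at the top of the benchmark script.

```python
import os

# Set both variables before `import torch` so they are visible when the
# Dynamo configuration and logging setup are initialized (assumed ordering).
os.environ["TORCHDYNAMO_VERBOSE"] = "1"   # show the internal stack trace
os.environ["TORCH_LOGS"] = "+dynamo"      # extra Dynamo developer logging

# import torch  # import torch only after the variables are in place
```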

Reproducer

# conda env
conda create -n pt2e_27_ww24 python=3.10 -y
conda activate pt2e_27_ww24
pip install torch==2.8.0.dev20250525+xpu torchvision torchaudio --pre --index-url https://download.pytorch.org/whl/nightly/xpu
pip uninstall pytorch-triton-xpu -y
pip install pytorch_triton_xpu-3.4.0+gitae324eea-cp310-cp310-win_amd64.whl --no-deps

git clone https://github.com/pytorch/pytorch.git
cd pytorch
git checkout 3560b8ebe9277e8c25335e35d8c9e0872052b2dc

pip install -r requirements.txt
pip install -r .ci\docker\requirements-ci.txt

git clone -b main https://github.com/chuanqi129/inductor-tools pt2e-accuracy
git clone -b yifeng/pt2e_xpu https://github.com/zxd1997066/benchmark pt2e-performance

cd pt2e-performance
pip install -r requirements.txt
python install.py --continue_on_fail


pip install pyre_extensions
pip install fbgemm-gpu
pip install --no-deps torchmetrics==1.0.3 torchrec
pip install --force-reinstall git+https://github.com/huggingface/transformers@243e186efbf7fb93328dd6b34927a4e8c8f24395 
pip install pandas==2.2.3 numpy==1.22.4

# testing
python pt2e-accuracy/scripts/modelbench/quant/inductor_quant_acc.py --device xpu --dataset_dir C:\libohao\imagenet\val --model_list shufflenet_v2_x1_0

Environment details
