Support QLinearMatMul on CPU #3117
Signed-off-by: Tung D. Le <[email protected]>
Value yZeroPoint = adaptor.getYZeroPoint();

// Only support integer8 and float32 now.
if (!getElementType(A.getType()).isInteger(8))
Is that integer8 for A, B, and the zero points, and float32 for the scales?
Yes, made it clear now. Thanks!
To my understanding, this rewriting lowers QLinearMatMul to a sequence of ONNX ops. Should such a rewriting be put in the decomposition pass?
It can be in the decomposition pass. However, we want to keep the op as it is when compiling for NNPA (or other accelerators). That requires a way to turn off the decomposition for NNPA (it may be similar to turning off constant propagation patterns), which is currently not available. That is why I decompose it during onnx-to-krnl.
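To make the discussion concrete, here is a minimal standalone sketch of the arithmetic that a QLinearMatMul decomposition has to reproduce, following the ONNX operator definition: subtract the zero points, multiply in a wider integer type, rescale, round, add the output zero point, and saturate. It is not the lowering from this PR. The function name `qlinearMatMulRef`, the row-major 2-D layout, and the per-tensor (scalar) scales and zero points are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference for QLinearMatMul on row-major 2-D int8 inputs with
// scalar scales and zero points. Illustrative only; not the onnx-mlir code.
std::vector<int8_t> qlinearMatMulRef(const std::vector<int8_t> &a, float aScale,
    int8_t aZeroPoint, const std::vector<int8_t> &b, float bScale,
    int8_t bZeroPoint, float yScale, int8_t yZeroPoint, std::size_t m,
    std::size_t k, std::size_t n) {
  assert(a.size() == m * k && b.size() == k * n);
  std::vector<int8_t> y(m * n);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      // Step 1: remove the zero points and accumulate in int32.
      int32_t acc = 0;
      for (std::size_t p = 0; p < k; ++p)
        acc += (int32_t(a[i * k + p]) - aZeroPoint) *
               (int32_t(b[p * n + j]) - bZeroPoint);
      // Step 2: rescale the integer result into the output's quantized domain.
      float scaled = float(acc) * (aScale * bScale / yScale);
      // Step 3: round (nearest-even under the default rounding mode), add the
      // output zero point, and saturate to the int8 range.
      float q = std::nearbyint(scaled) + float(yZeroPoint);
      q = std::min(127.0f, std::max(-128.0f, q));
      y[i * n + j] = static_cast<int8_t>(q);
    }
  }
  return y;
}
```

Each numbered step roughly corresponds to one or more simpler ops (subtract, integer matmul, multiply, round, add, clip), which is why the rewriting could live either in a decomposition pass or in onnx-to-krnl.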
In general, onnx-mlir should provide fine-grained control over graph-level optimizations for different target architectures and accelerators. We can discuss such a mechanism later.
//===----------------------------------------------------------------------===//
// Helper for quantization
//===----------------------------------------------------------------------===//
Value getOrCastToI8(
Should this function be part of a DialectBuilder?
Yes, done. Thanks!
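Only the name of `getOrCastToI8` is visible in this thread, so its exact behavior is not shown here. As a hedged illustration of what an unsigned-to-signed 8-bit conversion for quantized data usually has to preserve, the scalar sketch below shifts both the value and its zero point by 128 so that the dequantized result is unchanged. The helpers `shiftU8ToI8` and `dequantize` are hypothetical names, not onnx-mlir APIs, and the shift-by-128 scheme is an assumption about one common approach.

```cpp
#include <cstdint>

// Hypothetical: map an unsigned 8-bit quantized value onto the signed 8-bit
// range by subtracting 128 (0 -> -128, 255 -> 127).
inline int8_t shiftU8ToI8(uint8_t q) {
  return static_cast<int8_t>(static_cast<int>(q) - 128);
}

// Dequantization as defined for linear quantization: scale * (q - zeroPoint).
// If q and zeroPoint are shifted by the same amount, the result is identical.
inline float dequantize(int q, int zeroPoint, float scale) {
  return scale * static_cast<float>(q - zeroPoint);
}
```

Because the difference `q - zeroPoint` is what enters the computation, applying the same shift to the data and to the zero point leaves the real-valued result untouched.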
LGTM!
LGTM. From a performance perspective, it still makes little sense to run quantized models on CPU compared to the better-supported float32. I still wonder whether a warning would make sense.
@jenkins-droid test this please
Jenkins Linux amd64 Build #16504 [push] null... failed after 1 hr 23 min
Jenkins Linux ppc64le Build #15487 [push] null... aborted after 1 hr 40 min
Jenkins Linux s390x Build #16506 [push] null... aborted after 1 hr 40 min
* Support QLinearMatMul on CPU
Signed-off-by: Tung D. Le <[email protected]>
---------
Signed-off-by: Tung D. Le <[email protected]>
LLVM update 43d71ba (onnx#3086) * update float types, tosa, other misc changes Signed-off-by: Boyana Norris <[email protected]> * fix buildOnnxToTosaPaddingConstOp Signed-off-by: Boyana Norris <[email protected]> * fix lit tests (wip) Signed-off-by: Boyana Norris <[email protected]> * updte doc Signed-off-by: Boyana Norris <[email protected]> * use stablehlo tagged version Signed-off-by: Boyana Norris <[email protected]> * fixed more lit tests Signed-off-by: Boyana Norris <[email protected]> * fix .clang-format Signed-off-by: Boyana Norris <[email protected]> * fix lit (wip) Signed-off-by: Boyana Norris <[email protected]> * revert .clang-format change Signed-off-by: Boyana Norris <[email protected]> * fix lit tests Signed-off-by: Boyana Norris <[email protected]> * fix formatting Signed-off-by: Boyana Norris <[email protected]> * lit tests pass (except jni -- not tested) Signed-off-by: Boyana Norris <[email protected]> * manually fix formatting; can't get clang-format to do it on any of my machines Signed-off-by: Boyana Norris <[email protected]> * revert lit test changes unrelated to update Signed-off-by: Boyana Norris <[email protected]> * update llvm and stablhlo shas, misc minor updates Signed-off-by: Boyana Norris <[email protected]> * remove non-existent passes Signed-off-by: Boyana Norris <[email protected]> * lit updates (wip) Signed-off-by: Tung D. Le <[email protected]> * Bump Upsample to Opset 10 and change the opset versioning to allow to skip over opset versions if a newer, backwards compatible one exists. (onnx#3065) * Bump Upsample to Opset 10 This is a non-functional change, the only difference is that Upsample was marked as deprecated with Opset 10 Signed-off-by: Rickert, Jonas <[email protected]> * Use a map of the available opset versions in onnx to select the node opset to use. Introduces a new built-time generated map that contains all versions of an operation as defined by onnx. To determine the opset version for a node/op: 1. 
Determine the latest valid opset version. This is the newest version in this opset-version-map that is older or equal to the current graph opset. 2. Select the newest version from the versions supported by onnx-mlir that is equal or newer to the latest valid opset version. This allows it to skip over opset versions, that have a newer backwards compatible version. Example: Versions in onnx and supported by onnx-mlir: [3, 5]. Graph opset version to node version: 3 -> 3, 4 -> 3, 5 -> 5 Versions in onnx: [7, 9, 10]. Version 10 is backwards compatible to version 9. Version supported by onnx-mlir: [7, 10]. Graph opset version to node version: 7 -> 7, 8 -> 7, 9 -> 10, 10 -> 10 Signed-off-by: Rickert, Jonas <[email protected]> --------- Signed-off-by: Rickert, Jonas <[email protected]> * Improve scripts (onnx#3089) Signed-off-by: Alexandre Eichenberger <[email protected]> * Bump various ops to opset 21, adding int4/uint4 and 8 bit float support. (onnx#3064) * Add support for TensorProto::UINT4/INT4 Signed-off-by: Rickert, Jonas <[email protected]> * Upgrade onnx.Cast to opset 21 Signed-off-by: Rickert, Jonas <[email protected]> * Bump various ops to opset 21. These are all backwards compatibel version bumps, only adding support for int/uint4. Bumped ops: Flatten Identity If Loop Pad Reshape Scan Shape Size Squeeze Transpose Unsqueeze Signed-off-by: Rickert, Jonas <[email protected]> --------- Signed-off-by: Rickert, Jonas <[email protected]> * Added minimal support to do some timing of OM Runtime functionality (onnx#3095) Signed-off-by: Alexandre Eichenberger <[email protected]> * adding __errno_location call for mvs (onnx#3099) Signed-off-by: Christopher Munoz <[email protected]> * Rewriting pattern to remove WhereOp and EqualOp. (onnx#3094) Remove ONNXWhereOp and ONNXEqualOp into newly created ConcatOp. 
--------- Signed-off-by: Haruki Imai <[email protected]> * Enable NNPA saturation by default and change the option to --nnpa-disable-saturation (onnx#3101) * Enable NNPA saturation by default and change the option to --nnpa-disable-saturation Signed-off-by: Tung D. Le <[email protected]> --------- Signed-off-by: Tung D. Le <[email protected]> * removing weak attribute of errorno (onnx#3103) Signed-off-by: Christopher Munoz <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * Fix the custom build link for docs/Docker.md (onnx#3104) Signed-off-by: JiQiu <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * Python driver for torch model (onnx#3093) * implementation Signed-off-by: Chen Tong <[email protected]> * format Signed-off-by: Chen Tong <[email protected]> * test Signed-off-by: Chen Tong <[email protected]> * py format Signed-off-by: Chen Tong <[email protected]> * torch.compile Signed-off-by: Chen Tong <[email protected]> * refine Signed-off-by: Chen Tong <[email protected]> * add debug Signed-off-by: Chen Tong <[email protected]> * respond Signed-off-by: Chen Tong <[email protected]> * response Signed-off-by: Chen Tong <[email protected]> * format Signed-off-by: Chen Tong <[email protected]> --------- Signed-off-by: Chen Tong <[email protected]> Co-authored-by: Sunny Anand <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * implement (onnx#3108) Signed-off-by: Chen Tong <[email protected]> * Followups for torch model driver (onnx#3106) * simplify Signed-off-by: Chen Tong <[email protected]> * complete Signed-off-by: Chen Tong <[email protected]> * fix Signed-off-by: Chen Tong <[email protected]> * fix Signed-off-by: Chen Tong <[email protected]> --------- Signed-off-by: Chen Tong <[email protected]> * Fix an error in ZHighConstantPropagation for QuantizedStick (onnx#3112) Signed-off-by: Tung D. 
Le <[email protected]> * Add z17 for -march (onnx#3113) * done Signed-off-by: Tong Chen <[email protected]> * convert Signed-off-by: Tong Chen <[email protected]> * fix Signed-off-by: Tong Chen <[email protected]> * format Signed-off-by: Tong Chen <[email protected]> --------- Signed-off-by: Tong Chen <[email protected]> Signed-off-by: Tong Chen <[email protected]> * Decompose Hardswish into simpler ONNX ops (onnx#3107) * Decompose and lower Hardswish Signed-off-by: Kumarappan <[email protected]> * Providing the decomposition as compile time option with krnl dialect lowering as default Signed-off-by: Kumarappan <[email protected]> --------- Signed-off-by: Kumarappan <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * Reorder relu to maxpool optimization pass in ONNX dialect (onnx#3109) * Reorder Relu and maxpool optimization Signed-off-by: Arkar-Hema <[email protected]> * Swap Relu and maxpool only when Relu is not a consumer of conv Signed-off-by: Arkar-Hema <[email protected]> --------- Signed-off-by: Arkar-Hema <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * Move onnx.Constant before the root op when fusing onnx ops (onnx#3119) Signed-off-by: Tung D. Le <[email protected]> * Support QLinearMatMul on CPU (onnx#3117) * Support QLinearMatMul on CPU Signed-off-by: Tung D. Le <[email protected]> --------- Signed-off-by: Tung D. Le <[email protected]> * Update black-format-check.yml (onnx#3118) Signed-off-by: Andreas Fehlner <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * Merge nested concat Ops optimization pass in ONNX dialect (onnx#3111) * Merging nested concat ops Signed-off-by: Arkar-Hema <[email protected]> --------- Signed-off-by: Arkar-Hema <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * Enhance shape inference for ONNX Reshape (onnx#3122) * Add a special case in shape inference for reshape Signed-off-by: Tung D. Le <[email protected]> --------- Signed-off-by: Tung D. 
Le <[email protected]> * update zdnn1.1.2 (onnx#3130) Signed-off-by: Sunny Anand <[email protected]> * Updating supported ops on NNPA md for z17. (onnx#3120) * starting to update new z17 NNPA ops Signed-off-by: Christopher Munoz <[email protected]> --------- Signed-off-by: Christopher Munoz <[email protected]> Co-authored-by: Sunny Anand <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * fix CVE-2025-32434 (onnx#3135) Signed-off-by: Sunny Anand <[email protected]> * Fuse consecutive clips pattern (onnx#3132) * Fuse consecutive clips pattern Signed-off-by: Kumarappan <[email protected]> --------- Signed-off-by: Kumarappan <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * Replace deprecated applyPatternsAndFoldGreedily with applyPatternsGreedily. This functions also folds by default, so it is an NFC Signed-off-by: Rickert, Jonas <[email protected]> * Fix clang-format Signed-off-by: Rickert, Jonas <[email protected]> * Replace bufferization::createOwnershipBasedBufferDeallocationPass with mlir::createConvertBufferizationToMemRefPass Signed-off-by: Rickert, Jonas <[email protected]> * Update onnx-to-tosa reshape lit test Signed-off-by: Rickert, Jonas <[email protected]> * Move gemm_to_fc tests to gemm_to_matmul Signed-off-by: Rickert, Jonas <[email protected]> * Change tosaBuilder::mul function signature to make clear that the shift is an int8 Signed-off-by: Rickert, Jonas <[email protected]> * Disable buffer_loop_hoisting test as it gets completly optimized away Signed-off-by: Rickert, Jonas <[email protected]> * Guard against dynamic dim in result Signed-off-by: Rickert, Jonas <[email protected]> * Use resize operaton input and output type to calculate the border, instead of using the calculated numerator/denominator Signed-off-by: Rickert, Jonas <[email protected]> * Guard against linear interpolation of integer types Signed-off-by: Rickert, Jonas <[email protected]> * Add test for disallowed onnx.Resize on its with linear 
interpolation to tosa Signed-off-by: Rickert, Jonas <[email protected]> * Add 'Pure' annotation to some krnl ops and recreate documentation Signed-off-by: Rickert, Jonas <[email protected]> * Build stablehlo with static libs Signed-off-by: Rickert, Jonas <[email protected]> * Disable memref.prefetch since it does not work with the new bufferization Signed-off-by: Tung D. Le <[email protected]> * Conv add const where the constant is a scalar (onnx#3145) Signed-off-by: Alexandre Eichenberger <[email protected]> * added support for Celu op (onnx#3139) Signed-off-by: logeshwaranmcw <[email protected]> Co-authored-by: Alexandre Eichenberger <[email protected]> * Fix some warnings related to stickification for NNPA (onnx#3147) Signed-off-by: Tung D. Le <[email protected]> * Removing duplicate file (onnx#3146) Signed-off-by: Christopher Munoz <[email protected]> * migrated instance/group normalization from decompose to canonicalize (onnx#3148) Signed-off-by: Alexandre Eichenberger <[email protected]> * Fusion of Matmul add covering the stacked/unstacked/bcast1/bcast23 patterns (onnx#3140) Signed-off-by: Alexandre Eichenberger <[email protected]> * Support --march=native (onnx#3134) * changes Signed-off-by: Chen Tong <[email protected]> * format Signed-off-by: Chen Tong <[email protected]> * linkage Signed-off-by: Chen Tong <[email protected]> * lib Signed-off-by: Chen Tong <[email protected]> --------- Signed-off-by: Chen Tong <[email protected]> * fix another error on s390x Signed-off-by: Tung D. Le <[email protected]> * lower Ub to LLVM since vector.shape_cast is lowered to UB Signed-off-by: Tung D. Le <[email protected]> --------- Signed-off-by: Boyana Norris <[email protected]> Signed-off-by: Tung D. 
Le <[email protected]> Signed-off-by: Rickert, Jonas <[email protected]> Signed-off-by: Alexandre Eichenberger <[email protected]> Signed-off-by: Christopher Munoz <[email protected]> Signed-off-by: Haruki Imai <[email protected]> Signed-off-by: JiQiu <[email protected]> Signed-off-by: Chen Tong <[email protected]> Signed-off-by: Tong Chen <[email protected]> Signed-off-by: Tong Chen <[email protected]> Signed-off-by: Kumarappan <[email protected]> Signed-off-by: Arkar-Hema <[email protected]> Signed-off-by: Andreas Fehlner <[email protected]> Signed-off-by: Sunny Anand <[email protected]> Signed-off-by: logeshwaranmcw <[email protected]> Co-authored-by: Alexandre Eichenberger <[email protected]> Co-authored-by: Jonas Rickert <[email protected]> Co-authored-by: Christopher Munoz <[email protected]> Co-authored-by: Haruki Imai <[email protected]> Co-authored-by: Tung D. Le <[email protected]> Co-authored-by: qjivy <[email protected]> Co-authored-by: Tong Chen <[email protected]> Co-authored-by: Sunny Anand <[email protected]> Co-authored-by: kumarappan-cmyk <[email protected]> Co-authored-by: Arkar-Hema <[email protected]> Co-authored-by: Andreas Fehlner <[email protected]> Co-authored-by: logeshwaranmcw <[email protected]>
AMD changes: Update lowering and tests for onnx->tosa conversions that are not upstream. Partial cherry-pick of f03b287 LLVM update 43d71ba (onnx#3086)
Le <[email protected]> * update zdnn1.1.2 (onnx#3130) Signed-off-by: Sunny Anand <[email protected]> * Updating supported ops on NNPA md for z17. (onnx#3120) * starting to update new z17 NNPA ops Signed-off-by: Christopher Munoz <[email protected]> --------- Signed-off-by: Christopher Munoz <[email protected]> Co-authored-by: Sunny Anand <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * fix CVE-2025-32434 (onnx#3135) Signed-off-by: Sunny Anand <[email protected]> * Fuse consecutive clips pattern (onnx#3132) * Fuse consecutive clips pattern Signed-off-by: Kumarappan <[email protected]> --------- Signed-off-by: Kumarappan <[email protected]> Co-authored-by: Tung D. Le <[email protected]> * Replace deprecated applyPatternsAndFoldGreedily with applyPatternsGreedily. This functions also folds by default, so it is an NFC Signed-off-by: Rickert, Jonas <[email protected]> * Fix clang-format Signed-off-by: Rickert, Jonas <[email protected]> * Replace bufferization::createOwnershipBasedBufferDeallocationPass with mlir::createConvertBufferizationToMemRefPass Signed-off-by: Rickert, Jonas <[email protected]> * Update onnx-to-tosa reshape lit test Signed-off-by: Rickert, Jonas <[email protected]> * Move gemm_to_fc tests to gemm_to_matmul Signed-off-by: Rickert, Jonas <[email protected]> * Change tosaBuilder::mul function signature to make clear that the shift is an int8 Signed-off-by: Rickert, Jonas <[email protected]> * Disable buffer_loop_hoisting test as it gets completly optimized away Signed-off-by: Rickert, Jonas <[email protected]> * Guard against dynamic dim in result Signed-off-by: Rickert, Jonas <[email protected]> * Use resize operaton input and output type to calculate the border, instead of using the calculated numerator/denominator Signed-off-by: Rickert, Jonas <[email protected]> * Guard against linear interpolation of integer types Signed-off-by: Rickert, Jonas <[email protected]> * Add test for disallowed onnx.Resize on its with linear 
interpolation to tosa Signed-off-by: Rickert, Jonas <[email protected]> * Add 'Pure' annotation to some krnl ops and recreate documentation Signed-off-by: Rickert, Jonas <[email protected]> * Build stablehlo with static libs Signed-off-by: Rickert, Jonas <[email protected]> * Disable memref.prefetch since it does not work with the new bufferization Signed-off-by: Tung D. Le <[email protected]> * Conv add const where the constant is a scalar (onnx#3145) Signed-off-by: Alexandre Eichenberger <[email protected]> * added support for Celu op (onnx#3139) Signed-off-by: logeshwaranmcw <[email protected]> Co-authored-by: Alexandre Eichenberger <[email protected]> * Fix some warnings related to stickification for NNPA (onnx#3147) Signed-off-by: Tung D. Le <[email protected]> * Removing duplicate file (onnx#3146) Signed-off-by: Christopher Munoz <[email protected]> * migrated instance/group normalization from decompose to canonicalize (onnx#3148) Signed-off-by: Alexandre Eichenberger <[email protected]> * Fusion of Matmul add covering the stacked/unstacked/bcast1/bcast23 patterns (onnx#3140) Signed-off-by: Alexandre Eichenberger <[email protected]> * Support --march=native (onnx#3134) * changes Signed-off-by: Chen Tong <[email protected]> * format Signed-off-by: Chen Tong <[email protected]> * linkage Signed-off-by: Chen Tong <[email protected]> * lib Signed-off-by: Chen Tong <[email protected]> --------- Signed-off-by: Chen Tong <[email protected]> * fix another error on s390x Signed-off-by: Tung D. Le <[email protected]> * lower Ub to LLVM since vector.shape_cast is lowered to UB Signed-off-by: Tung D. Le <[email protected]> --------- Signed-off-by: Boyana Norris <[email protected]> Signed-off-by: Tung D. 
Le <[email protected]> Signed-off-by: Rickert, Jonas <[email protected]> Signed-off-by: Alexandre Eichenberger <[email protected]> Signed-off-by: Christopher Munoz <[email protected]> Signed-off-by: Haruki Imai <[email protected]> Signed-off-by: JiQiu <[email protected]> Signed-off-by: Chen Tong <[email protected]> Signed-off-by: Tong Chen <[email protected]> Signed-off-by: Tong Chen <[email protected]> Signed-off-by: Kumarappan <[email protected]> Signed-off-by: Arkar-Hema <[email protected]> Signed-off-by: Andreas Fehlner <[email protected]> Signed-off-by: Sunny Anand <[email protected]> Signed-off-by: logeshwaranmcw <[email protected]> Co-authored-by: Alexandre Eichenberger <[email protected]> Co-authored-by: Jonas Rickert <[email protected]> Co-authored-by: Christopher Munoz <[email protected]> Co-authored-by: Haruki Imai <[email protected]> Co-authored-by: Tung D. Le <[email protected]> Co-authored-by: qjivy <[email protected]> Co-authored-by: Tong Chen <[email protected]> Co-authored-by: Sunny Anand <[email protected]> Co-authored-by: kumarappan-cmyk <[email protected]> Co-authored-by: Arkar-Hema <[email protected]> Co-authored-by: Andreas Fehlner <[email protected]> Co-authored-by: logeshwaranmcw <[email protected]> Signed-off-by: Jonas Rickert <[email protected]>
This PR adds support for QLinearMatMul on CPU. In particular, it lowers ONNX.QLinearMatMul to the Krnl dialect by rewriting the operation into an ONNX.MatMul with i32 element type.
Currently, only 8-bit integers (i8, ui8) are supported for A, B and the zero points, and only f32 is supported for the scales. The backend tests for these data types are enabled.
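For readers unfamiliar with the operator, the arithmetic that the rewrite has to reproduce can be sketched as a scalar reference. This is only an illustration based on the ONNX definition of QLinearMatMul, not the op sequence emitted by this PR: it assumes signed i8 data, per-tensor (scalar) scales and zero points, and row-major 2-D matrices. The function name `qlinearMatMulRef` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical reference for QLinearMatMul on int8 data (not the PR's code).
// A is MxK, B is KxN, both row-major; the result is MxN.
std::vector<int8_t> qlinearMatMulRef(const std::vector<int8_t> &a,
    float aScale, int8_t aZeroPoint, const std::vector<int8_t> &b,
    float bScale, int8_t bZeroPoint, float yScale, int8_t yZeroPoint,
    int64_t M, int64_t K, int64_t N) {
  assert(static_cast<int64_t>(a.size()) == M * K && "A must be MxK");
  assert(static_cast<int64_t>(b.size()) == K * N && "B must be KxN");

  std::vector<int8_t> y(M * N);
  // Combined rescaling factor applied to the integer accumulator.
  const float multiplier = aScale * bScale / yScale;

  for (int64_t m = 0; m < M; ++m) {
    for (int64_t n = 0; n < N; ++n) {
      // Accumulate in 32 bits after removing the zero points; this mirrors
      // the "MatMul of i32 element type" mentioned in the description.
      int32_t acc = 0;
      for (int64_t k = 0; k < K; ++k) {
        int32_t av = static_cast<int32_t>(a[m * K + k]) - aZeroPoint;
        int32_t bv = static_cast<int32_t>(b[k * N + n]) - bZeroPoint;
        acc += av * bv;
      }
      // Rescale, round to nearest (ties to even under the default
      // floating-point rounding mode), then add the output zero point.
      float q = std::nearbyint(static_cast<float>(acc) * multiplier) +
                static_cast<float>(yZeroPoint);
      // Saturate to the int8 range before narrowing.
      q = std::min(127.0f, std::max(-128.0f, q));
      y[m * N + n] = static_cast<int8_t>(q);
    }
  }
  return y;
}
```

For example, multiplying a 2x2 matrix by the identity with `aScale == yScale`, `bScale == 1` and all zero points at 0 returns the input unchanged, while a large accumulator such as 100*100 + 100*100 saturates to 127. The ui8 case would follow the same steps with a 0 to 255 clamp.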