Decompose Hardswish into simpler ONNX ops #3107


Merged · 13 commits · Apr 14, 2025

Conversation

kumarappan-cmyk (Contributor)

This pass optimizes the ONNX computation graph by decomposing the HardSwish operation into a sequence of simpler ONNX operations. Since HardSwish is not natively supported in onnx-mlir, this transformation ensures compatibility by replacing HardSwish with fundamental arithmetic operations.

Y = HardSwish(x) = x * max(0, min(1, alpha * x + beta)), where alpha = 1/6 and beta = 0.5.

Signed-off-by: Kumarappan <[email protected]>
@jenkins-droid (Collaborator)

Can one of the admins verify this patch?

@chentong319 (Collaborator)

@jenkins-droid test it please.

// - Multiplies the clamped value with the original input

// Create constant tensor function
Value createConstantTensor(PatternRewriter &rewriter, Location loc, Type elementType, float value) {
@chentong319 (Collaborator), Mar 31, 2025

This function could be a general utility function. Would you be interested in putting it into src/Dialect/ONNX/DialectBuilder?

ArrayAttr(), IntegerAttr(), ArrayAttr(), StringAttr(), ArrayAttr());
}

struct DecomposeHardSwishPattern : public OpRewritePattern<ONNXHardSwishOp> {
Collaborator

It is fine to define the rewriting pattern this way. I am just curious whether it is possible to define this rewriting rule with TableGen.

Collaborator

First of all, you do not have to use .td for this PR.

To do this particular rewriting with TableGen, you may follow the code in src/Dialect/ONNX/Transform/Decompose.*, especially the following rule:

def ClipV6Pattern : Pat<
  (ONNXClipV6Op $x, $maxAttr, $minAttr),
  (ONNXClipV11Op $x,
    (ONNXConstantOpFromDenseAttr (createScalarDenseAttrRank0 $minAttr)),
    (ONNXConstantOpFromDenseAttr (createScalarDenseAttrRank0 $maxAttr)))>;

Here the input of createScalarDenseAttrRank0 is an Attribute, but you need a new function to take a float value. You may create a FloatAttr of the desired constant value for createScalarDenseAttrRank0.

Hopefully, your rule will be something like:

def HardSwishDecompose : Pat<
  (ONNXHardSwishOp:$res $x),
  (ONNXMulOp $x,
    (ONNXMaxOp (ONNXConstantOpFromDenseAttr (createScalarDenseAttrRank0 FloatAttr(0.0))),
      (ONNXMinOp …)))>;

Note that you may need to provide the output type for intermediate results, such as ONNXMaxOp, ONNXAddOp, etc.

Contributor Author

Thanks, I will look into it.

Contributor Author

I can create attributes directly:

def HardSwishAlpha : NativeCodeCall<
  "FloatAttr::get($_builder.getF32Type(), 1.0 / 6.0)">;

But when I have a generic function:

def createFloatAttr : NativeCodeCall<
  "FloatAttr::get($_builder.getF32Type(), %0)">;

I am unable to pass the desired constant value. Do I have to have multiple def entries, or how can I pass a constant value to the NativeCodeCall in TableGen?

Collaborator

In this case, the %0 is not passed to a function call. Instead, it is used by the constructor directly, so it is more like template initialization. You can check the code in src/Dialect/ONNX/ONNXOps/Canonicalize.td:

class HasRankOf<int rank> : Constraint<CPred<
  "mlir::isa<ShapedType>($0.getType()) && "
  "mlir::cast<ShapedType>($0.getType()).hasRank() && "
  "mlir::cast<ShapedType>($0.getType()).getRank() == " # rank>>;

Another approach is to pass the value to your createConstantTensor function.
Again, your current implementation is fine. Just let me know if you want to keep the current implementation.
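A minimal sketch of that template-initialization idea, passing the float as a TableGen string parameter and splicing it with `#` (class name FloatAttrOf is hypothetical; TableGen class parameters support int and string, not float):

```tablegen
class FloatAttrOf<string val> : NativeCodeCall<
  "$_builder.getF32FloatAttr(" # val # ")">;

def HardSwishAlpha : FloatAttrOf<"1.0 / 6.0">;
def HardSwishBeta  : FloatAttrOf<"0.5">;
```

Each def then instantiates the NativeCodeCall with the constant baked into the generated C++ string, sidestepping the %0 limitation.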

Contributor Author

Hi, for a class in the TableGen file I am able to pass the value as int or string, but not as float. And again, in src/Dialect/ONNX/ONNXOps/Canonicalize.td, the Gemm op rewrite has a separate definition for each constant:

def GemmAlpha : NativeCodeCall<"$_builder.getF32FloatAttr(1.0)">;
def GemmBeta : NativeCodeCall<"$_builder.getF32FloatAttr(1.0)">;

I will try what @tungld suggested.

@tungld (Collaborator) commented Apr 3, 2025

I strongly encourage you to lower this operator to Krnl for the best performance, so that all primitive computations are inside the innermost loop, instead of decomposing it into multiple ONNX ops.

A very similar lowering exists for ONNX HardSigmoid, y = max(0, min(1, alpha * x + beta)): https://github.com/onnx/onnx-mlir/blob/main/src/Conversion/ONNXToKrnl/Math/Elementwise.cpp#L619. The code would be very similar.

Signed-off-by: Kumarappan <[email protected]>
@jenkins-droid (Collaborator)

Can one of the admins verify this patch?

@kumarappan-cmyk (Contributor Author)

@chentong319 @tungld
I have added the decomposition in the Krnl dialect. Earlier, with the decomposition pattern in Decompose.cpp, the MLIR was:

affine.for %arg1 = 0 to 10 {
  affine.for %arg2 = 0 to 3 {
    affine.for %arg3 = 0 to 512 {
      affine.for %arg4 = 0 to 512 {
        %4 = affine.load %arg0[%arg1, %arg2, %arg3, %arg4] : memref<10x3x512x512xf32>
        %5 = affine.load %0[] : memref<f32>
        %6 = arith.mulf %4, %5 : f32
        %7 = affine.load %1[] : memref<f32>
        %8 = arith.addf %6, %7 : f32
        %9 = affine.load %2[] : memref<f32>
        %10 = arith.minnumf %8, %9 : f32
        %11 = affine.load %3[] : memref<f32>
        %12 = arith.maxnumf %10, %11 : f32
        %13 = affine.load %arg0[%arg1, %arg2, %arg3, %arg4] : memref<10x3x512x512xf32>
        %14 = arith.mulf %13, %12 : f32
        affine.store %14, %alloc[%arg1, %arg2, %arg3, %arg4] : memref<10x3x512x512xf32>
      }
    }
  }
}

There were multiple loads. As suggested, I moved the conversion to src/Conversion/ONNXToKrnl/Math/Elementwise.cpp. Now the MLIR has fewer loads:

affine.for %arg1 = 0 to 10 {
  affine.for %arg2 = 0 to 3 {
    affine.for %arg3 = 0 to 512 {
      affine.for %arg4 = 0 to 512 {
        %0 = affine.load %arg0[%arg1, %arg2, %arg3, %arg4] : memref<10x3x512x512xf32>
        %1 = arith.mulf %0, %cst_0 : f32
        %2 = arith.addf %1, %cst : f32
        %3 = arith.minnumf %2, %cst_1 : f32
        %4 = arith.maxnumf %3, %cst_2 : f32
        %5 = arith.mulf %0, %4 : f32
        affine.store %5, %alloc[%arg1, %arg2, %arg3, %arg4] : memref<10x3x512x512xf32>
      }
    }
  }
}

@chentong319 (Collaborator)

It is more efficient to lower to Krnl for CPU. I'd like to keep the decomposition for the following reasons:

  1. It supports HardSwish at the ONNX level, which may help other users of onnx-mlir by expressing its semantics with other ONNX ops.
  2. It may expose more optimization opportunities.
  3. I'd like to know whether op fusion in the Krnl lowering will bring the same efficiency.

How about keeping both the decomposition and the Krnl lowering code and choosing between them with an option? You can set lowering to Krnl as the default.

@kumarappan-cmyk (Contributor Author)

@chentong319 I have added a command-line option, decompose-op-in-onnx, to select decomposition in Decompose.cpp.

@tungld (Collaborator) left a comment

Thank you for the changes! They are very clean. Could you add some lit tests for the lowering to Krnl to the following files: Elementwise_with_canonicalize and Elementwise_with_canonicalize_O3?

@@ -116,6 +116,12 @@ Value OnnxBuilder::constantInt64(const ArrayRef<int64_t> intVals) const {
return constant(denseAttr);
}

Value OnnxBuilder::constantFloat32(const ArrayRef<float> floatVals) const {
auto shape = RankedTensorType::get({(int64_t)floatVals.size()}, b().getF32Type());
Collaborator

A small nit: we avoid C-style casts like (dtype) for type casts; use static_cast<int64_t>(value) instead.

"Supported Ops - HardSwish"),
llvm::cl::value_desc("ONNX operation to decompose"),
llvm::cl::location(decomposeOpsInONNX),
llvm::cl::cat(OnnxMlirOptions),
Collaborator

This should be OnnxMlirCommonOptions if you want to use it in both onnx-mlir and onnx-mlir-opt commands.

@@ -1439,6 +1500,15 @@ void onnx_mlir::getDecomposeONNXToONNXPatterns(
patterns.insert<SoftmaxCrossEntropyPattern>(context);
patterns.insert<SumToAddPattern>(context);

if (!onnx_mlir::decomposeOpsInONNX.empty()) {
for (const auto &op : onnx_mlir::decomposeOpsInONNX) {
llvm::outs() << "Decomposing ONNX operation: " << op << "\n";
Collaborator

Please remove this llvm::outs(), or change it to LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << ": Decomposing ONNX operation: " << op << "\n") instead.

@kumarappan-cmyk kumarappan-cmyk requested a review from tungld April 8, 2025 15:12
@tungld (Collaborator) left a comment

LGTM! Thank you very much for the patch!

Just some nits as commented.


template <typename OnnxOpType, typename... Args>
OnnxOpType createTypedOpAndInferShapes(
mlir::Type result_ty, Args &&... args) const;
mlir::Type result_ty, Args &&...args) const;
Collaborator

These changes made clang-format fail. I sometimes get the same problem: the local clang-format suggests these changes, but the Jenkins clang-format check fails. Perhaps there is a difference in clang-format versions.

@@ -95,6 +95,7 @@ OptReport optReport; // onnx-mlir only
bool useOldBufferization; // onnx-mlir only
bool enableTiming; // onnx-mlir only
bool enableBoundCheck; // onnx-mlir only
std::vector<std::string> decomposeOpsInONNX; // onnx-mlir only
Collaborator

Thank you for adding this option! As for the trailing comment, it should say the option is common to both.

@@ -1247,6 +1247,40 @@ func.func private @test_hardsigmoid(%arg0 : tensor<?x10xf32>) -> tensor<*xf32> {

// -----

func.func private @test_hardswish(%arg0: tensor<?x10xf32>) -> tensor<*xf32> {
Collaborator

Thanks for adding these lit tests!

@jenkins-droid (Collaborator)

Can one of the admins verify this patch?

@kumarappan-cmyk kumarappan-cmyk requested a review from tungld April 10, 2025 05:26
@tungld (Collaborator) commented Apr 10, 2025

@jenkins-droid test this please

@tungld (Collaborator) commented Apr 10, 2025

@chentong319 does the newly added option meet your requirement? If so, we can merge the PR.

@chentong319 (Collaborator) left a comment
LGTM!

@Arkar-Hema (Contributor)

Thanks for the review and approval! Some of the tests are still running; once they finish, could you please merge the PR?

@tungld tungld merged commit bd070ea into onnx:main Apr 14, 2025
7 checks passed
@jenkins-droid (Collaborator)

Jenkins Linux s390x Build #16480 [push] Decompose Hardswish into... started at 22:12

@jenkins-droid (Collaborator)

Jenkins Linux amd64 Build #16478 [push] Decompose Hardswish into... started at 21:12

@jenkins-droid (Collaborator)

Jenkins Linux ppc64le Build #15461 [push] Decompose Hardswish into... started at 22:12

@jenkins-droid (Collaborator)

Jenkins Linux amd64 Build #16478 [push] Decompose Hardswish into... passed after 1 hr 30 min

@jenkins-droid (Collaborator)

Jenkins Linux s390x Build #16480 [push] Decompose Hardswish into... passed after 1 hr 54 min

@jenkins-droid (Collaborator)

Jenkins Linux ppc64le Build #15461 [push] Decompose Hardswish into... aborted after 2 hr 38 min

tungld added a commit to brnorris03/onnx-mlir that referenced this pull request May 9, 2025
* Decompose and lower Hardswish

Signed-off-by: Kumarappan <[email protected]>

* Providing the decomposition as compile time option with krnl dialect lowering as default

Signed-off-by: Kumarappan <[email protected]>

---------

Signed-off-by: Kumarappan <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>
jorickert added a commit that referenced this pull request May 19, 2025
* update float types, tosa, other misc changes

Signed-off-by: Boyana Norris <[email protected]>

* fix buildOnnxToTosaPaddingConstOp

Signed-off-by: Boyana Norris <[email protected]>

* fix lit tests (wip)

Signed-off-by: Boyana Norris <[email protected]>

* updte doc

Signed-off-by: Boyana Norris <[email protected]>

* use stablehlo tagged version

Signed-off-by: Boyana Norris <[email protected]>

* fixed more lit tests

Signed-off-by: Boyana Norris <[email protected]>

* fix .clang-format

Signed-off-by: Boyana Norris <[email protected]>

* fix lit (wip)

Signed-off-by: Boyana Norris <[email protected]>

* revert .clang-format change

Signed-off-by: Boyana Norris <[email protected]>

* fix lit tests

Signed-off-by: Boyana Norris <[email protected]>

* fix formatting

Signed-off-by: Boyana Norris <[email protected]>

* lit tests pass (except jni -- not tested)

Signed-off-by: Boyana Norris <[email protected]>

* manually fix formatting; can't get clang-format to do it on any of my machines

Signed-off-by: Boyana Norris <[email protected]>

* revert lit test changes unrelated to update

Signed-off-by: Boyana Norris <[email protected]>

* update llvm and stablhlo shas, misc minor updates

Signed-off-by: Boyana Norris <[email protected]>

* remove non-existent passes

Signed-off-by: Boyana Norris <[email protected]>

* lit updates (wip)

Signed-off-by: Tung D. Le <[email protected]>

* Bump Upsample to Opset 10 and change the opset versioning to allow to skip over opset versions if a newer, backwards compatible one exists. (#3065)

* Bump Upsample to Opset 10

This is a non-functional change, the only difference is that Upsample was marked as deprecated with Opset 10

Signed-off-by: Rickert, Jonas <[email protected]>

* Use a map of the available opset versions in onnx to select the node opset to use.

Introduces a new built-time generated map that contains all versions of an operation as defined by onnx.
To determine the opset version for a node/op:
1.	Determine the latest valid opset version. This is the newest version in this opset-version-map that is older or equal to the current graph opset.
2.	Select the newest version from the versions supported by onnx-mlir that is equal or newer to the latest valid opset version. This allows it to skip over opset versions, that have a newer backwards compatible version.
Example:
	Versions in onnx and supported by onnx-mlir: [3, 5].
	Graph opset version to node version: 3 -> 3, 4 -> 3, 5 -> 5

	Versions in onnx: [7, 9, 10]. Version 10 is backwards compatible to version 9.
	Version supported by onnx-mlir: [7, 10].
	Graph opset version to node version: 7 -> 7, 8 -> 7, 9 -> 10, 10 -> 10

Signed-off-by: Rickert, Jonas <[email protected]>

---------

Signed-off-by: Rickert, Jonas <[email protected]>

* Improve scripts (#3089)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* Bump various ops to opset 21, adding int4/uint4 and 8 bit float support. (#3064)

* Add support for TensorProto::UINT4/INT4

Signed-off-by: Rickert, Jonas <[email protected]>

* Upgrade onnx.Cast to opset 21

Signed-off-by: Rickert, Jonas <[email protected]>

* Bump various ops to opset 21.

These are all backwards compatibel version bumps, only adding support for int/uint4.

Bumped ops:
Flatten
Identity
If
Loop
Pad
Reshape
Scan
Shape
Size
Squeeze
Transpose
Unsqueeze

Signed-off-by: Rickert, Jonas <[email protected]>

---------

Signed-off-by: Rickert, Jonas <[email protected]>

* Added minimal support to do some timing of OM Runtime functionality (#3095)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* adding __errno_location call for mvs (#3099)

Signed-off-by: Christopher Munoz <[email protected]>

* Rewriting pattern to remove WhereOp and EqualOp.  (#3094)

Remove ONNXWhereOp and ONNXEqualOp into newly created ConcatOp.

---------

Signed-off-by: Haruki Imai <[email protected]>

* Enable NNPA saturation by default and change the option to --nnpa-disable-saturation (#3101)

* Enable NNPA saturation by default and change the option to --nnpa-disable-saturation

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Tung D. Le <[email protected]>

* removing weak attribute of errorno (#3103)

Signed-off-by: Christopher Munoz <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Fix the custom build link for docs/Docker.md (#3104)

Signed-off-by: JiQiu <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Python driver for torch model (#3093)

* implementation

Signed-off-by: Chen Tong <[email protected]>

* format

Signed-off-by: Chen Tong <[email protected]>

* test

Signed-off-by: Chen Tong <[email protected]>

* py format

Signed-off-by: Chen Tong <[email protected]>

* torch.compile

Signed-off-by: Chen Tong <[email protected]>

* refine

Signed-off-by: Chen Tong <[email protected]>

* add debug

Signed-off-by: Chen Tong <[email protected]>

* respond

Signed-off-by: Chen Tong <[email protected]>

* response

Signed-off-by: Chen Tong <[email protected]>

* format

Signed-off-by: Chen Tong <[email protected]>

---------

Signed-off-by: Chen Tong <[email protected]>
Co-authored-by: Sunny Anand <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* implement (#3108)

Signed-off-by: Chen Tong <[email protected]>

* Followups for torch model driver (#3106)

* simplify

Signed-off-by: Chen Tong <[email protected]>

* complete

Signed-off-by: Chen Tong <[email protected]>

* fix

Signed-off-by: Chen Tong <[email protected]>

* fix

Signed-off-by: Chen Tong <[email protected]>

---------

Signed-off-by: Chen Tong <[email protected]>

* Fix an error in ZHighConstantPropagation for QuantizedStick (#3112)

Signed-off-by: Tung D. Le <[email protected]>

* Add z17 for -march (#3113)

* done

Signed-off-by: Tong Chen <[email protected]>

* convert

Signed-off-by: Tong Chen <[email protected]>

* fix

Signed-off-by: Tong Chen <[email protected]>

* format

Signed-off-by: Tong Chen <[email protected]>

---------

Signed-off-by: Tong Chen <[email protected]>
Signed-off-by: Tong Chen <[email protected]>

* Decompose Hardswish into simpler ONNX ops (#3107)

* Decompose and lower Hardswish

Signed-off-by: Kumarappan <[email protected]>

* Providing the decomposition as compile time option with krnl dialect lowering as default

Signed-off-by: Kumarappan <[email protected]>

---------

Signed-off-by: Kumarappan <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Reorder relu to maxpool optimization pass in ONNX dialect (#3109)

* Reorder Relu and maxpool optimization

Signed-off-by: Arkar-Hema <[email protected]>

* Swap Relu and maxpool only when Relu is not a consumer of conv

Signed-off-by: Arkar-Hema <[email protected]>

---------

Signed-off-by: Arkar-Hema <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Move onnx.Constant before the root op when fusing onnx ops (#3119)

Signed-off-by: Tung D. Le <[email protected]>

* Support QLinearMatMul on CPU (#3117)

* Support QLinearMatMul on CPU

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Tung D. Le <[email protected]>

* Update black-format-check.yml (#3118)

Signed-off-by: Andreas Fehlner <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Merge nested concat Ops optimization pass in ONNX dialect (#3111)

* Merging nested concat ops

Signed-off-by: Arkar-Hema <[email protected]>

---------

Signed-off-by: Arkar-Hema <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Enhance shape inference for ONNX Reshape (#3122)

* Add a special case in shape inference for reshape

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Tung D. Le <[email protected]>

* update zdnn1.1.2 (#3130)

Signed-off-by: Sunny Anand <[email protected]>

* Updating supported ops on NNPA md for z17.  (#3120)

* starting to update new z17 NNPA ops

Signed-off-by: Christopher Munoz <[email protected]>

---------

Signed-off-by: Christopher Munoz <[email protected]>
Co-authored-by: Sunny Anand <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* fix CVE-2025-32434 (#3135)

Signed-off-by: Sunny Anand <[email protected]>

* Fuse consecutive clips pattern (#3132)

* Fuse consecutive clips pattern

Signed-off-by: Kumarappan <[email protected]>


---------

Signed-off-by: Kumarappan <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Replace deprecated applyPatternsAndFoldGreedily with applyPatternsGreedily. This functions also folds by default, so it is an NFC

Signed-off-by: Rickert, Jonas <[email protected]>

* Fix clang-format

Signed-off-by: Rickert, Jonas <[email protected]>

* Replace bufferization::createOwnershipBasedBufferDeallocationPass with mlir::createConvertBufferizationToMemRefPass

Signed-off-by: Rickert, Jonas <[email protected]>

* Update onnx-to-tosa reshape lit test

Signed-off-by: Rickert, Jonas <[email protected]>

* Move gemm_to_fc tests to gemm_to_matmul

Signed-off-by: Rickert, Jonas <[email protected]>

* Change tosaBuilder::mul function signature to make clear that the shift is an int8

Signed-off-by: Rickert, Jonas <[email protected]>

* Disable buffer_loop_hoisting test as it gets completly optimized away

Signed-off-by: Rickert, Jonas <[email protected]>

* Guard against dynamic dim in result

Signed-off-by: Rickert, Jonas <[email protected]>

* Use resize operaton input and output type to calculate the border, instead of using the calculated numerator/denominator

Signed-off-by: Rickert, Jonas <[email protected]>

* Guard against linear interpolation of integer types

Signed-off-by: Rickert, Jonas <[email protected]>

* Add test for disallowed onnx.Resize on its with linear interpolation to tosa

Signed-off-by: Rickert, Jonas <[email protected]>

* Add 'Pure' annotation to some krnl ops and recreate documentation

Signed-off-by: Rickert, Jonas <[email protected]>

* Build stablehlo with static libs

Signed-off-by: Rickert, Jonas <[email protected]>

* Disable memref.prefetch since it does not work with the new bufferization

Signed-off-by: Tung D. Le <[email protected]>

* Conv add const where the constant is a scalar (#3145)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* added support for Celu op (#3139)

Signed-off-by: logeshwaranmcw <[email protected]>
Co-authored-by: Alexandre Eichenberger <[email protected]>

* Fix some warnings related to stickification for NNPA (#3147)

Signed-off-by: Tung D. Le <[email protected]>

* Removing duplicate file (#3146)

Signed-off-by: Christopher Munoz <[email protected]>

* migrated instance/group normalization from decompose to canonicalize (#3148)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* Fusion of Matmul add covering the stacked/unstacked/bcast1/bcast23 patterns (#3140)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* Support --march=native (#3134)

* changes

Signed-off-by: Chen Tong <[email protected]>

* format

Signed-off-by: Chen Tong <[email protected]>

* linkage

Signed-off-by: Chen Tong <[email protected]>

* lib

Signed-off-by: Chen Tong <[email protected]>

---------

Signed-off-by: Chen Tong <[email protected]>

* fix another error on s390x

Signed-off-by: Tung D. Le <[email protected]>

* lower Ub to LLVM since vector.shape_cast is lowered to UB

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Boyana Norris <[email protected]>
Signed-off-by: Tung D. Le <[email protected]>
Signed-off-by: Rickert, Jonas <[email protected]>
Signed-off-by: Alexandre Eichenberger <[email protected]>
Signed-off-by: Christopher Munoz <[email protected]>
Signed-off-by: Haruki Imai <[email protected]>
Signed-off-by: JiQiu <[email protected]>
Signed-off-by: Chen Tong <[email protected]>
Signed-off-by: Tong Chen <[email protected]>
Signed-off-by: Tong Chen <[email protected]>
Signed-off-by: Kumarappan <[email protected]>
Signed-off-by: Arkar-Hema <[email protected]>
Signed-off-by: Andreas Fehlner <[email protected]>
Signed-off-by: Sunny Anand <[email protected]>
Signed-off-by: logeshwaranmcw <[email protected]>
Co-authored-by: Alexandre Eichenberger <[email protected]>
Co-authored-by: Jonas Rickert <[email protected]>
Co-authored-by: Christopher Munoz <[email protected]>
Co-authored-by: Haruki Imai <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>
Co-authored-by: qjivy <[email protected]>
Co-authored-by: Tong Chen <[email protected]>
Co-authored-by: Sunny Anand <[email protected]>
Co-authored-by: kumarappan-cmyk <[email protected]>
Co-authored-by: Arkar-Hema <[email protected]>
Co-authored-by: Andreas Fehlner <[email protected]>
Co-authored-by: logeshwaranmcw <[email protected]>
jorickert added a commit to Xilinx/onnx-mlir that referenced this pull request Jun 26, 2025
LLVM update 43d71ba (onnx#3086)

* update float types, tosa, other misc changes

Signed-off-by: Boyana Norris <[email protected]>

* fix buildOnnxToTosaPaddingConstOp

Signed-off-by: Boyana Norris <[email protected]>

* fix lit tests (wip)

Signed-off-by: Boyana Norris <[email protected]>

* updte doc

Signed-off-by: Boyana Norris <[email protected]>

* use stablehlo tagged version

Signed-off-by: Boyana Norris <[email protected]>

* fixed more lit tests

Signed-off-by: Boyana Norris <[email protected]>

* fix .clang-format

Signed-off-by: Boyana Norris <[email protected]>

* fix lit (wip)

Signed-off-by: Boyana Norris <[email protected]>

* revert .clang-format change

Signed-off-by: Boyana Norris <[email protected]>

* fix lit tests

Signed-off-by: Boyana Norris <[email protected]>

* fix formatting

Signed-off-by: Boyana Norris <[email protected]>

* lit tests pass (except jni -- not tested)

Signed-off-by: Boyana Norris <[email protected]>

* manually fix formatting; can't get clang-format to do it on any of my machines

Signed-off-by: Boyana Norris <[email protected]>

* revert lit test changes unrelated to update

Signed-off-by: Boyana Norris <[email protected]>

* update llvm and stablhlo shas, misc minor updates

Signed-off-by: Boyana Norris <[email protected]>

* remove non-existent passes

Signed-off-by: Boyana Norris <[email protected]>

* lit updates (wip)

Signed-off-by: Tung D. Le <[email protected]>

* Bump Upsample to Opset 10 and change the opset versioning to allow to skip over opset versions if a newer, backwards compatible one exists. (onnx#3065)

* Bump Upsample to Opset 10

This is a non-functional change, the only difference is that Upsample was marked as deprecated with Opset 10

Signed-off-by: Rickert, Jonas <[email protected]>

* Use a map of the available opset versions in onnx to select the node opset to use.

Introduces a new built-time generated map that contains all versions of an operation as defined by onnx.
To determine the opset version for a node/op:
1.	Determine the latest valid opset version. This is the newest version in this opset-version-map that is older or equal to the current graph opset.
2.	Select the newest version from the versions supported by onnx-mlir that is equal or newer to the latest valid opset version. This allows it to skip over opset versions, that have a newer backwards compatible version.
Example:
	Versions in onnx and supported by onnx-mlir: [3, 5].
	Graph opset version to node version: 3 -> 3, 4 -> 3, 5 -> 5

	Versions in onnx: [7, 9, 10]. Version 10 is backwards compatible to version 9.
	Version supported by onnx-mlir: [7, 10].
	Graph opset version to node version: 7 -> 7, 8 -> 7, 9 -> 10, 10 -> 10

Signed-off-by: Rickert, Jonas <[email protected]>

---------

Signed-off-by: Rickert, Jonas <[email protected]>

* Improve scripts (onnx#3089)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* Bump various ops to opset 21, adding int4/uint4 and 8 bit float support. (onnx#3064)

* Add support for TensorProto::UINT4/INT4

Signed-off-by: Rickert, Jonas <[email protected]>

* Upgrade onnx.Cast to opset 21

Signed-off-by: Rickert, Jonas <[email protected]>

* Bump various ops to opset 21.

These are all backwards compatibel version bumps, only adding support for int/uint4.

Bumped ops:
Flatten
Identity
If
Loop
Pad
Reshape
Scan
Shape
Size
Squeeze
Transpose
Unsqueeze

Signed-off-by: Rickert, Jonas <[email protected]>

---------

Signed-off-by: Rickert, Jonas <[email protected]>

* Added minimal support to do some timing of OM Runtime functionality (onnx#3095)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* adding __errno_location call for mvs (onnx#3099)

Signed-off-by: Christopher Munoz <[email protected]>

* Rewriting pattern to remove WhereOp and EqualOp.  (onnx#3094)

Remove ONNXWhereOp and ONNXEqualOp into newly created ConcatOp.

---------

Signed-off-by: Haruki Imai <[email protected]>

* Enable NNPA saturation by default and change the option to --nnpa-disable-saturation (onnx#3101)

* Enable NNPA saturation by default and change the option to --nnpa-disable-saturation

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Tung D. Le <[email protected]>

* removing weak attribute of errorno (onnx#3103)

Signed-off-by: Christopher Munoz <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Fix the custom build link for docs/Docker.md (onnx#3104)

Signed-off-by: JiQiu <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Python driver for torch model (onnx#3093)

* implementation

Signed-off-by: Chen Tong <[email protected]>

* format

Signed-off-by: Chen Tong <[email protected]>

* test

Signed-off-by: Chen Tong <[email protected]>

* py format

Signed-off-by: Chen Tong <[email protected]>

* torch.compile

Signed-off-by: Chen Tong <[email protected]>

* refine

Signed-off-by: Chen Tong <[email protected]>

* add debug

Signed-off-by: Chen Tong <[email protected]>

* respond

Signed-off-by: Chen Tong <[email protected]>

* response

Signed-off-by: Chen Tong <[email protected]>

* format

Signed-off-by: Chen Tong <[email protected]>

---------

Signed-off-by: Chen Tong <[email protected]>
Co-authored-by: Sunny Anand <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* implement (onnx#3108)

Signed-off-by: Chen Tong <[email protected]>

* Followups for torch model driver (onnx#3106)

* simplify

Signed-off-by: Chen Tong <[email protected]>

* complete

Signed-off-by: Chen Tong <[email protected]>

* fix

Signed-off-by: Chen Tong <[email protected]>

* fix

Signed-off-by: Chen Tong <[email protected]>

---------

Signed-off-by: Chen Tong <[email protected]>

* Fix an error in ZHighConstantPropagation for QuantizedStick (onnx#3112)

Signed-off-by: Tung D. Le <[email protected]>

* Add z17 for -march (onnx#3113)

* done

Signed-off-by: Tong Chen <[email protected]>

* convert

Signed-off-by: Tong Chen <[email protected]>

* fix

Signed-off-by: Tong Chen <[email protected]>

* format

Signed-off-by: Tong Chen <[email protected]>

---------

Signed-off-by: Tong Chen <[email protected]>
Signed-off-by: Tong Chen <[email protected]>

* Decompose Hardswish into simpler ONNX ops (onnx#3107)

* Decompose and lower Hardswish

Signed-off-by: Kumarappan <[email protected]>

* Provide the decomposition as a compile-time option, with krnl dialect lowering as the default

Signed-off-by: Kumarappan <[email protected]>

---------

Signed-off-by: Kumarappan <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>
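
The decomposition named above replaces HardSwish with a chain of elementary ONNX ops (Mul, Add, Min, Max, Mul), following Y = x * max(0, min(1, alpha * x + beta)) with alpha = 1/6 and beta = 0.5. A minimal NumPy sketch of the rewritten computation, checked against the direct definition (function names are illustrative, not from the onnx-mlir code base):

```python
import numpy as np

def hardswish_reference(x):
    # Direct HardSwish definition: x * max(0, min(1, x/6 + 0.5))
    return x * np.clip(x / 6.0 + 0.5, 0.0, 1.0)

def hardswish_decomposed(x, alpha=1.0 / 6.0, beta=0.5):
    # The same computation as the sequence of simpler ops the pass emits.
    t = x * alpha           # Mul with constant alpha
    t = t + beta            # Add with constant beta
    t = np.minimum(t, 1.0)  # Min with constant 1
    t = np.maximum(t, 0.0)  # Max with constant 0
    return x * t            # Mul with the original input

x = np.linspace(-6.0, 6.0, 25)
assert np.allclose(hardswish_decomposed(x), hardswish_reference(x))
```

Both forms agree element-wise, which is why the rewrite preserves semantics while using only natively lowered ops.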

* Reorder relu to maxpool optimization pass in ONNX dialect (onnx#3109)

* Reorder Relu and maxpool optimization

Signed-off-by: Arkar-Hema <[email protected]>

* Swap Relu and maxpool only when Relu is not a consumer of conv

Signed-off-by: Arkar-Hema <[email protected]>

---------

Signed-off-by: Arkar-Hema <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>
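
The legality of the reorder above rests on ReLU being monotonic, so it commutes with the max reduction inside MaxPool: MaxPool(ReLU(x)) == ReLU(MaxPool(x)). A small NumPy sketch with a 1-D, stride-equals-kernel pool (helper names are illustrative):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def maxpool1d(x, k=2):
    # Non-overlapping 1-D max pooling (stride == kernel size), for illustration.
    n = (len(x) // k) * k
    return x[:n].reshape(-1, k).max(axis=1)

x = np.array([-3.0, 1.0, 4.0, -1.0, 5.0, -9.0])
# ReLU commutes with the max reduction, so both orders agree element-wise.
assert np.array_equal(maxpool1d(relu(x)), relu(maxpool1d(x)))
```

Running MaxPool first shrinks the tensor, so applying ReLU afterwards touches fewer elements, which is the point of the swap.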

* Move onnx.Constant before the root op when fusing onnx ops (onnx#3119)

Signed-off-by: Tung D. Le <[email protected]>

* Support QLinearMatMul on CPU (onnx#3117)

* Support QLinearMatMul on CPU

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Tung D. Le <[email protected]>

* Update black-format-check.yml (onnx#3118)

Signed-off-by: Andreas Fehlner <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Merge nested concat Ops optimization pass in ONNX dialect (onnx#3111)

* Merging nested concat ops

Signed-off-by: Arkar-Hema <[email protected]>

---------

Signed-off-by: Arkar-Hema <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>
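
The merge above is valid when both Concat ops use the same axis and the inner result has no other consumers: Concat(Concat(a, b), c) flattens to Concat(a, b, c). A NumPy sketch of the equivalence, under those assumptions:

```python
import numpy as np

a = np.arange(4.0).reshape(2, 2)
b = np.arange(4.0, 8.0).reshape(2, 2)
c = np.arange(8.0, 12.0).reshape(2, 2)

axis = 0  # both Concat ops must share this axis for the merge to apply
nested = np.concatenate([np.concatenate([a, b], axis=axis), c], axis=axis)
merged = np.concatenate([a, b, c], axis=axis)  # single Concat after the pass

assert np.array_equal(nested, merged)
```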

* Enhance shape inference for ONNX Reshape (onnx#3122)

* Add a special case in shape inference for reshape

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Tung D. Le <[email protected]>

* update zdnn1.1.2 (onnx#3130)

Signed-off-by: Sunny Anand <[email protected]>

* Updating supported ops on NNPA md for z17.  (onnx#3120)

* starting to update new z17 NNPA ops

Signed-off-by: Christopher Munoz <[email protected]>

---------

Signed-off-by: Christopher Munoz <[email protected]>
Co-authored-by: Sunny Anand <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* fix CVE-2025-32434 (onnx#3135)

Signed-off-by: Sunny Anand <[email protected]>

* Fuse consecutive clips pattern (onnx#3132)

* Fuse consecutive clips pattern

Signed-off-by: Kumarappan <[email protected]>

---------

Signed-off-by: Kumarappan <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>
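
Two consecutive Clip ops fuse into one whose lower bound is the max of the two mins and whose upper bound is the min of the two maxes; the identity holds whenever the two ranges overlap (the fused min must not exceed the fused max). A NumPy sketch under that assumption (the helper name is illustrative):

```python
import numpy as np

def fused_clip(x, lo1, hi1, lo2, hi2):
    # Clip(Clip(x, lo1, hi1), lo2, hi2) == Clip(x, max(lo1, lo2), min(hi1, hi2)),
    # valid when the ranges overlap: max(lo1, lo2) <= min(hi1, hi2).
    lo, hi = max(lo1, lo2), min(hi1, hi2)
    assert lo <= hi, "fusion precondition: clip ranges must overlap"
    return np.clip(x, lo, hi)

x = np.linspace(-10.0, 10.0, 41)
two_clips = np.clip(np.clip(x, -5.0, 8.0), -2.0, 6.0)
assert np.array_equal(fused_clip(x, -5.0, 8.0, -2.0, 6.0), two_clips)
```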

* Replace deprecated applyPatternsAndFoldGreedily with applyPatternsGreedily. This function also folds by default, so it is an NFC

Signed-off-by: Rickert, Jonas <[email protected]>

* Fix clang-format

Signed-off-by: Rickert, Jonas <[email protected]>

* Replace bufferization::createOwnershipBasedBufferDeallocationPass with mlir::createConvertBufferizationToMemRefPass

Signed-off-by: Rickert, Jonas <[email protected]>

* Update onnx-to-tosa reshape lit test

Signed-off-by: Rickert, Jonas <[email protected]>

* Move gemm_to_fc tests to gemm_to_matmul

Signed-off-by: Rickert, Jonas <[email protected]>

* Change tosaBuilder::mul function signature to make clear that the shift is an int8

Signed-off-by: Rickert, Jonas <[email protected]>

* Disable buffer_loop_hoisting test as it gets completely optimized away

Signed-off-by: Rickert, Jonas <[email protected]>

* Guard against dynamic dim in result

Signed-off-by: Rickert, Jonas <[email protected]>

* Use resize operation input and output types to calculate the border, instead of using the calculated numerator/denominator

Signed-off-by: Rickert, Jonas <[email protected]>

* Guard against linear interpolation of integer types

Signed-off-by: Rickert, Jonas <[email protected]>

* Add test for disallowed onnx.Resize on ints with linear interpolation to tosa

Signed-off-by: Rickert, Jonas <[email protected]>

* Add 'Pure' annotation to some krnl ops and recreate documentation

Signed-off-by: Rickert, Jonas <[email protected]>

* Build stablehlo with static libs

Signed-off-by: Rickert, Jonas <[email protected]>

* Disable memref.prefetch since it does not work with the new bufferization

Signed-off-by: Tung D. Le <[email protected]>

* Conv add const where the constant is a scalar (onnx#3145)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* added support for Celu op (onnx#3139)

Signed-off-by: logeshwaranmcw <[email protected]>
Co-authored-by: Alexandre Eichenberger <[email protected]>

* Fix some warnings related to stickification for NNPA (onnx#3147)

Signed-off-by: Tung D. Le <[email protected]>

* Removing duplicate file (onnx#3146)

Signed-off-by: Christopher Munoz <[email protected]>

* migrated instance/group normalization from decompose to canonicalize (onnx#3148)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* Fusion of Matmul add covering the stacked/unstacked/bcast1/bcast23 patterns (onnx#3140)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* Support --march=native (onnx#3134)

* changes

Signed-off-by: Chen Tong <[email protected]>

* format

Signed-off-by: Chen Tong <[email protected]>

* linkage

Signed-off-by: Chen Tong <[email protected]>

* lib

Signed-off-by: Chen Tong <[email protected]>

---------

Signed-off-by: Chen Tong <[email protected]>

* fix another error on s390x

Signed-off-by: Tung D. Le <[email protected]>

* lower UB to LLVM since vector.shape_cast is lowered to UB

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Boyana Norris <[email protected]>
Signed-off-by: Tung D. Le <[email protected]>
Signed-off-by: Rickert, Jonas <[email protected]>
Signed-off-by: Alexandre Eichenberger <[email protected]>
Signed-off-by: Christopher Munoz <[email protected]>
Signed-off-by: Haruki Imai <[email protected]>
Signed-off-by: JiQiu <[email protected]>
Signed-off-by: Chen Tong <[email protected]>
Signed-off-by: Tong Chen <[email protected]>
Signed-off-by: Tong Chen <[email protected]>
Signed-off-by: Kumarappan <[email protected]>
Signed-off-by: Arkar-Hema <[email protected]>
Signed-off-by: Andreas Fehlner <[email protected]>
Signed-off-by: Sunny Anand <[email protected]>
Signed-off-by: logeshwaranmcw <[email protected]>
Co-authored-by: Alexandre Eichenberger <[email protected]>
Co-authored-by: Jonas Rickert <[email protected]>
Co-authored-by: Christopher Munoz <[email protected]>
Co-authored-by: Haruki Imai <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>
Co-authored-by: qjivy <[email protected]>
Co-authored-by: Tong Chen <[email protected]>
Co-authored-by: Sunny Anand <[email protected]>
Co-authored-by: kumarappan-cmyk <[email protected]>
Co-authored-by: Arkar-Hema <[email protected]>
Co-authored-by: Andreas Fehlner <[email protected]>
Co-authored-by: logeshwaranmcw <[email protected]>
jorickert added a commit to Xilinx/onnx-mlir that referenced this pull request Jul 1, 2025
AMD changes: Update lowering and tests for onnx->tosa conversions that are not upstream

Partial cherry-pick of f03b287

jorickert added a commit to Xilinx/onnx-mlir that referenced this pull request Jul 1, 2025
AMD changes: Update lowering and tests for onnx->tosa conversions that are not upstream

Partial cherry-pick of f03b287

LLVM update 43d71ba (onnx#3086)

* update float types, tosa, other misc changes

Signed-off-by: Boyana Norris <[email protected]>

* fix buildOnnxToTosaPaddingConstOp

Signed-off-by: Boyana Norris <[email protected]>

* fix lit tests (wip)

Signed-off-by: Boyana Norris <[email protected]>

* updte doc

Signed-off-by: Boyana Norris <[email protected]>

* use stablehlo tagged version

Signed-off-by: Boyana Norris <[email protected]>

* fixed more lit tests

Signed-off-by: Boyana Norris <[email protected]>

* fix .clang-format

Signed-off-by: Boyana Norris <[email protected]>

* fix lit (wip)

Signed-off-by: Boyana Norris <[email protected]>

* revert .clang-format change

Signed-off-by: Boyana Norris <[email protected]>

* fix lit tests

Signed-off-by: Boyana Norris <[email protected]>

* fix formatting

Signed-off-by: Boyana Norris <[email protected]>

* lit tests pass (except jni -- not tested)

Signed-off-by: Boyana Norris <[email protected]>

* manually fix formatting; can't get clang-format to do it on any of my machines

Signed-off-by: Boyana Norris <[email protected]>

* revert lit test changes unrelated to update

Signed-off-by: Boyana Norris <[email protected]>

* update llvm and stablhlo shas, misc minor updates

Signed-off-by: Boyana Norris <[email protected]>

* remove non-existent passes

Signed-off-by: Boyana Norris <[email protected]>

* lit updates (wip)

Signed-off-by: Tung D. Le <[email protected]>

* Bump Upsample to Opset 10 and change the opset versioning to allow to skip over opset versions if a newer, backwards compatible one exists. (onnx#3065)

* Bump Upsample to Opset 10

This is a non-functional change, the only difference is that Upsample was marked as deprecated with Opset 10

Signed-off-by: Rickert, Jonas <[email protected]>

* Use a map of the available opset versions in onnx to select the node opset to use.

Introduces a new built-time generated map that contains all versions of an operation as defined by onnx.
To determine the opset version for a node/op:
1.	Determine the latest valid opset version. This is the newest version in this opset-version-map that is older or equal to the current graph opset.
2.	Select the newest version from the versions supported by onnx-mlir that is equal or newer to the latest valid opset version. This allows it to skip over opset versions, that have a newer backwards compatible version.
Example:
	Versions in onnx and supported by onnx-mlir: [3, 5].
	Graph opset version to node version: 3 -> 3, 4 -> 3, 5 -> 5

	Versions in onnx: [7, 9, 10]. Version 10 is backwards compatible to version 9.
	Version supported by onnx-mlir: [7, 10].
	Graph opset version to node version: 7 -> 7, 8 -> 7, 9 -> 10, 10 -> 10

Signed-off-by: Rickert, Jonas <[email protected]>

---------

Signed-off-by: Rickert, Jonas <[email protected]>

* Improve scripts (onnx#3089)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* Bump various ops to opset 21, adding int4/uint4 and 8 bit float support. (onnx#3064)

* Add support for TensorProto::UINT4/INT4

Signed-off-by: Rickert, Jonas <[email protected]>

* Upgrade onnx.Cast to opset 21

Signed-off-by: Rickert, Jonas <[email protected]>

* Bump various ops to opset 21.

These are all backwards compatibel version bumps, only adding support for int/uint4.

Bumped ops:
Flatten
Identity
If
Loop
Pad
Reshape
Scan
Shape
Size
Squeeze
Transpose
Unsqueeze

Signed-off-by: Rickert, Jonas <[email protected]>

---------

Signed-off-by: Rickert, Jonas <[email protected]>

* Added minimal support to do some timing of OM Runtime functionality (onnx#3095)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* adding __errno_location call for mvs (onnx#3099)

Signed-off-by: Christopher Munoz <[email protected]>

* Rewriting pattern to remove WhereOp and EqualOp.  (onnx#3094)

Remove ONNXWhereOp and ONNXEqualOp into newly created ConcatOp.

---------

Signed-off-by: Haruki Imai <[email protected]>

* Enable NNPA saturation by default and change the option to --nnpa-disable-saturation (onnx#3101)

* Enable NNPA saturation by default and change the option to --nnpa-disable-saturation

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Tung D. Le <[email protected]>

* removing weak attribute of errorno (onnx#3103)

Signed-off-by: Christopher Munoz <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Fix the custom build link for docs/Docker.md (onnx#3104)

Signed-off-by: JiQiu <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Python driver for torch model (onnx#3093)

* implementation

Signed-off-by: Chen Tong <[email protected]>

* format

Signed-off-by: Chen Tong <[email protected]>

* test

Signed-off-by: Chen Tong <[email protected]>

* py format

Signed-off-by: Chen Tong <[email protected]>

* torch.compile

Signed-off-by: Chen Tong <[email protected]>

* refine

Signed-off-by: Chen Tong <[email protected]>

* add debug

Signed-off-by: Chen Tong <[email protected]>

* respond

Signed-off-by: Chen Tong <[email protected]>

* response

Signed-off-by: Chen Tong <[email protected]>

* format

Signed-off-by: Chen Tong <[email protected]>

---------

Signed-off-by: Chen Tong <[email protected]>
Co-authored-by: Sunny Anand <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* implement (onnx#3108)

Signed-off-by: Chen Tong <[email protected]>

* Followups for torch model driver (onnx#3106)

* simplify

Signed-off-by: Chen Tong <[email protected]>

* complete

Signed-off-by: Chen Tong <[email protected]>

* fix

Signed-off-by: Chen Tong <[email protected]>

* fix

Signed-off-by: Chen Tong <[email protected]>

---------

Signed-off-by: Chen Tong <[email protected]>

* Fix an error in ZHighConstantPropagation for QuantizedStick (onnx#3112)

Signed-off-by: Tung D. Le <[email protected]>

* Add z17 for -march (onnx#3113)

* done

Signed-off-by: Tong Chen <[email protected]>

* convert

Signed-off-by: Tong Chen <[email protected]>

* fix

Signed-off-by: Tong Chen <[email protected]>

* format

Signed-off-by: Tong Chen <[email protected]>

---------

Signed-off-by: Tong Chen <[email protected]>
Signed-off-by: Tong Chen <[email protected]>

* Decompose Hardswish into simpler ONNX ops (onnx#3107)

* Decompose and lower Hardswish

Signed-off-by: Kumarappan <[email protected]>

* Providing the decomposition as compile time option with krnl dialect lowering as default

Signed-off-by: Kumarappan <[email protected]>

---------

Signed-off-by: Kumarappan <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Reorder relu to maxpool optimization pass in ONNX dialect (onnx#3109)

* Reorder Relu and maxpool optimization

Signed-off-by: Arkar-Hema <[email protected]>

* Swap Relu and maxpool only when Relu is not a consumer of conv

Signed-off-by: Arkar-Hema <[email protected]>

---------

Signed-off-by: Arkar-Hema <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Move onnx.Constant before the root op when fusing onnx ops (onnx#3119)

Signed-off-by: Tung D. Le <[email protected]>

* Support QLinearMatMul on CPU (onnx#3117)

* Support QLinearMatMul on CPU

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Tung D. Le <[email protected]>

* Update black-format-check.yml (onnx#3118)

Signed-off-by: Andreas Fehlner <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Merge nested concat Ops optimization pass in ONNX dialect (onnx#3111)

* Merging nested concat ops

Signed-off-by: Arkar-Hema <[email protected]>

---------

Signed-off-by: Arkar-Hema <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Enhance shape inference for ONNX Reshape (onnx#3122)

* Add a special case in shape inference for reshape

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Tung D. Le <[email protected]>

* update zdnn1.1.2 (onnx#3130)

Signed-off-by: Sunny Anand <[email protected]>

* Updating supported ops on NNPA md for z17.  (onnx#3120)

* starting to update new z17 NNPA ops

Signed-off-by: Christopher Munoz <[email protected]>

---------

Signed-off-by: Christopher Munoz <[email protected]>
Co-authored-by: Sunny Anand <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* fix CVE-2025-32434 (onnx#3135)

Signed-off-by: Sunny Anand <[email protected]>

* Fuse consecutive clips pattern (onnx#3132)

* Fuse consecutive clips pattern

Signed-off-by: Kumarappan <[email protected]>

---------

Signed-off-by: Kumarappan <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>

* Replace deprecated applyPatternsAndFoldGreedily with applyPatternsGreedily. This functions also folds by default, so it is an NFC

Signed-off-by: Rickert, Jonas <[email protected]>

* Fix clang-format

Signed-off-by: Rickert, Jonas <[email protected]>

* Replace bufferization::createOwnershipBasedBufferDeallocationPass with mlir::createConvertBufferizationToMemRefPass

Signed-off-by: Rickert, Jonas <[email protected]>

* Update onnx-to-tosa reshape lit test

Signed-off-by: Rickert, Jonas <[email protected]>

* Move gemm_to_fc tests to gemm_to_matmul

Signed-off-by: Rickert, Jonas <[email protected]>

* Change tosaBuilder::mul function signature to make clear that the shift is an int8

Signed-off-by: Rickert, Jonas <[email protected]>

* Disable buffer_loop_hoisting test as it gets completly optimized away

Signed-off-by: Rickert, Jonas <[email protected]>

* Guard against dynamic dim in result

Signed-off-by: Rickert, Jonas <[email protected]>

* Use resize operaton input and output type to calculate the border, instead of using the calculated numerator/denominator

Signed-off-by: Rickert, Jonas <[email protected]>

* Guard against linear interpolation of integer types

Signed-off-by: Rickert, Jonas <[email protected]>

* Add test for disallowed onnx.Resize on its with linear interpolation to tosa

Signed-off-by: Rickert, Jonas <[email protected]>

* Add 'Pure' annotation to some krnl ops and recreate documentation

Signed-off-by: Rickert, Jonas <[email protected]>

* Build stablehlo with static libs

Signed-off-by: Rickert, Jonas <[email protected]>

* Disable memref.prefetch since it does not work with the new bufferization

Signed-off-by: Tung D. Le <[email protected]>

* Conv add const where the constant is a scalar (onnx#3145)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* added support for Celu op (onnx#3139)

Signed-off-by: logeshwaranmcw <[email protected]>
Co-authored-by: Alexandre Eichenberger <[email protected]>

* Fix some warnings related to stickification for NNPA (onnx#3147)

Signed-off-by: Tung D. Le <[email protected]>

* Removing duplicate file (onnx#3146)

Signed-off-by: Christopher Munoz <[email protected]>

* migrated instance/group normalization from decompose to canonicalize (onnx#3148)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* Fusion of Matmul add covering the stacked/unstacked/bcast1/bcast23 patterns (onnx#3140)

Signed-off-by: Alexandre Eichenberger <[email protected]>

* Support --march=native (onnx#3134)

* changes

Signed-off-by: Chen Tong <[email protected]>

* format

Signed-off-by: Chen Tong <[email protected]>

* linkage

Signed-off-by: Chen Tong <[email protected]>

* lib

Signed-off-by: Chen Tong <[email protected]>

---------

Signed-off-by: Chen Tong <[email protected]>

* fix another error on s390x

Signed-off-by: Tung D. Le <[email protected]>

* lower UB to LLVM since vector.shape_cast is lowered to UB

Signed-off-by: Tung D. Le <[email protected]>

---------

Signed-off-by: Boyana Norris <[email protected]>
Signed-off-by: Tung D. Le <[email protected]>
Signed-off-by: Rickert, Jonas <[email protected]>
Signed-off-by: Alexandre Eichenberger <[email protected]>
Signed-off-by: Christopher Munoz <[email protected]>
Signed-off-by: Haruki Imai <[email protected]>
Signed-off-by: JiQiu <[email protected]>
Signed-off-by: Chen Tong <[email protected]>
Signed-off-by: Tong Chen <[email protected]>
Signed-off-by: Tong Chen <[email protected]>
Signed-off-by: Kumarappan <[email protected]>
Signed-off-by: Arkar-Hema <[email protected]>
Signed-off-by: Andreas Fehlner <[email protected]>
Signed-off-by: Sunny Anand <[email protected]>
Signed-off-by: logeshwaranmcw <[email protected]>
Co-authored-by: Alexandre Eichenberger <[email protected]>
Co-authored-by: Jonas Rickert <[email protected]>
Co-authored-by: Christopher Munoz <[email protected]>
Co-authored-by: Haruki Imai <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>
Co-authored-by: qjivy <[email protected]>
Co-authored-by: Tong Chen <[email protected]>
Co-authored-by: Sunny Anand <[email protected]>
Co-authored-by: kumarappan-cmyk <[email protected]>
Co-authored-by: Arkar-Hema <[email protected]>
Co-authored-by: Andreas Fehlner <[email protected]>
Co-authored-by: logeshwaranmcw <[email protected]>
Signed-off-by: Jonas Rickert <[email protected]>