[AArch64] Change cost of (s|z)ext <4 x i8> -> <4 x i32> to 2. #141029
Increase the cost of sext/zext <4 x i8> -> <4 x i32> to 2 to match codegen, which currently uses 2 instructions (commonly ushll.8h, ushll.4s for the zext case). Example lowering: https://llvm.godbolt.org/z/58TEoxh4v
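To illustrate how a conversion cost table of this kind is consulted, here is a minimal, self-contained sketch. It is not the LLVM implementation: `Op`, `VT`, `ConvCostEntry`, and `lookupCastCost` are hypothetical stand-ins for the ISD opcodes, MVT types, and lookup machinery that `getCastInstrCost` uses, and the table holds only a few entries with the costs shown in this PR's diff.

```cpp
#include <array>
#include <optional>

// Hypothetical stand-ins for ISD opcodes and MVT value types.
enum class Op { SignExtend, ZeroExtend };
enum class VT { v4i8, v4i16, v4i32, v8i8, v8i32 };

// One row: (opcode, destination type, source type) -> cost.
struct ConvCostEntry {
  Op Opcode;
  VT Dst;
  VT Src;
  unsigned Cost;
};

// Excerpt-style table. The two v4i8 -> v4i32 rows mirror the entries
// added by this PR; the v8i8 -> v8i32 rows mirror neighbouring context lines.
constexpr std::array<ConvCostEntry, 4> ConversionTbl = {{
    {Op::SignExtend, VT::v4i32, VT::v4i8, 2},
    {Op::ZeroExtend, VT::v4i32, VT::v4i8, 2},
    {Op::SignExtend, VT::v8i32, VT::v8i8, 3},
    {Op::ZeroExtend, VT::v8i32, VT::v8i8, 3},
}};

// Linear search for an exact (opcode, dst, src) match.
// Returns std::nullopt when the table has no matching row.
std::optional<unsigned> lookupCastCost(Op Opcode, VT Dst, VT Src) {
  for (const ConvCostEntry &E : ConversionTbl)
    if (E.Opcode == Opcode && E.Dst == Dst && E.Src == Src)
      return E.Cost;
  return std::nullopt;
}
```

In this sketch, a query for a zero or sign extend from `v4i8` to `v4i32` finds an explicit row with cost 2, while a query with no matching row returns `std::nullopt` so that a caller could fall back to some default.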
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-aarch64
Author: Florian Hahn (fhahn)
Changes
Increase the cost for zext <4 x i8> -> <4 x i32> to 2 to match codegen, which currently uses 2 instructions (commonly ushll.8h, ushll.4s). Example lowering: https://llvm.godbolt.org/z/58TEoxh4v
Full diff: https://github.com/llvm/llvm-project/pull/141029.diff
6 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 3f10da23b3494..43d03de2f46cf 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -3161,6 +3161,8 @@ InstructionCost AArch64TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
{ISD::ZERO_EXTEND, MVT::v4i64, MVT::v4i16, 3},
{ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i32, 2},
{ISD::ZERO_EXTEND, MVT::v4i64, MVT::v4i32, 2},
+ {ISD::SIGN_EXTEND, MVT::v4i32, MVT::v4i8, 2},
+ {ISD::ZERO_EXTEND, MVT::v4i32, MVT::v4i8, 2},
{ISD::SIGN_EXTEND, MVT::v8i32, MVT::v8i8, 3},
{ISD::ZERO_EXTEND, MVT::v8i32, MVT::v8i8, 3},
{ISD::SIGN_EXTEND, MVT::v8i32, MVT::v8i16, 2},
diff --git a/llvm/test/Analysis/CostModel/AArch64/arith-widening.ll b/llvm/test/Analysis/CostModel/AArch64/arith-widening.ll
index 7e1588f427be4..d1b93667c5506 100644
--- a/llvm/test/Analysis/CostModel/AArch64/arith-widening.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/arith-widening.ll
@@ -293,15 +293,15 @@ define void @extaddv4(<4 x i8> %i8, <4 x i16> %i16, <4 x i32> %i32, <4 x i64> %i
; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl1_8_16 = zext <4 x i8> %i8 to <4 x i16>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl2_8_16 = zext <4 x i8> %i8 to <4 x i16>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %azl_8_16 = add <4 x i16> %zl1_8_16, %zl2_8_16
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %sw_8_32 = sext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %sw_8_32 = sext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %asw_8_32 = add <4 x i32> %i32, %sw_8_32
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %sl1_8_32 = sext <4 x i8> %i8 to <4 x i32>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %sl2_8_32 = sext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %sl1_8_32 = sext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %sl2_8_32 = sext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %asl_8_32 = add <4 x i32> %sl1_8_32, %sl2_8_32
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %zw_8_32 = zext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %zw_8_32 = zext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %azw_8_32 = add <4 x i32> %i32, %zw_8_32
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl1_8_32 = zext <4 x i8> %i8 to <4 x i32>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl2_8_32 = zext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %zl1_8_32 = zext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %zl2_8_32 = zext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %azl_8_32 = add <4 x i32> %zl1_8_32, %zl2_8_32
; CHECK-NEXT: Cost Model: Found costs of RThru:3 CodeSize:1 Lat:1 SizeLat:1 for: %sw_8_64 = sext <4 x i8> %i8 to <4 x i64>
; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %asw_8_64 = add <4 x i64> %i64, %sw_8_64
@@ -988,15 +988,15 @@ define void @extsubv4(<4 x i8> %i8, <4 x i16> %i16, <4 x i32> %i32, <4 x i64> %i
; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl1_8_16 = zext <4 x i8> %i8 to <4 x i16>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl2_8_16 = zext <4 x i8> %i8 to <4 x i16>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %azl_8_16 = sub <4 x i16> %zl1_8_16, %zl2_8_16
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %sw_8_32 = sext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %sw_8_32 = sext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %asw_8_32 = sub <4 x i32> %i32, %sw_8_32
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %sl1_8_32 = sext <4 x i8> %i8 to <4 x i32>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %sl2_8_32 = sext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %sl1_8_32 = sext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %sl2_8_32 = sext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %asl_8_32 = sub <4 x i32> %sl1_8_32, %sl2_8_32
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %zw_8_32 = zext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %zw_8_32 = zext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %azw_8_32 = sub <4 x i32> %i32, %zw_8_32
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl1_8_32 = zext <4 x i8> %i8 to <4 x i32>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl2_8_32 = zext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %zl1_8_32 = zext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %zl2_8_32 = zext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %azl_8_32 = sub <4 x i32> %zl1_8_32, %zl2_8_32
; CHECK-NEXT: Cost Model: Found costs of RThru:3 CodeSize:1 Lat:1 SizeLat:1 for: %sw_8_64 = sext <4 x i8> %i8 to <4 x i64>
; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %asw_8_64 = sub <4 x i64> %i64, %sw_8_64
@@ -1683,15 +1683,15 @@ define void @extmulv4(<4 x i8> %i8, <4 x i16> %i16, <4 x i32> %i32, <4 x i64> %i
; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl1_8_16 = zext <4 x i8> %i8 to <4 x i16>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl2_8_16 = zext <4 x i8> %i8 to <4 x i16>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %azl_8_16 = mul <4 x i16> %zl1_8_16, %zl2_8_16
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %sw_8_32 = sext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %sw_8_32 = sext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %asw_8_32 = mul <4 x i32> %i32, %sw_8_32
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %sl1_8_32 = sext <4 x i8> %i8 to <4 x i32>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %sl2_8_32 = sext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %sl1_8_32 = sext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %sl2_8_32 = sext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %asl_8_32 = mul <4 x i32> %sl1_8_32, %sl2_8_32
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %zw_8_32 = zext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %zw_8_32 = zext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %azw_8_32 = mul <4 x i32> %i32, %zw_8_32
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl1_8_32 = zext <4 x i8> %i8 to <4 x i32>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %zl2_8_32 = zext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %zl1_8_32 = zext <4 x i8> %i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %zl2_8_32 = zext <4 x i8> %i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %azl_8_32 = mul <4 x i32> %zl1_8_32, %zl2_8_32
; CHECK-NEXT: Cost Model: Found costs of RThru:3 CodeSize:1 Lat:1 SizeLat:1 for: %sw_8_64 = sext <4 x i8> %i8 to <4 x i64>
; CHECK-NEXT: Cost Model: Found costs of RThru:28 CodeSize:1 Lat:1 SizeLat:1 for: %asw_8_64 = mul <4 x i64> %i64, %sw_8_64
diff --git a/llvm/test/Analysis/CostModel/AArch64/cast.ll b/llvm/test/Analysis/CostModel/AArch64/cast.ll
index 38bd98ffd343f..618d71b6c125d 100644
--- a/llvm/test/Analysis/CostModel/AArch64/cast.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/cast.ll
@@ -41,8 +41,8 @@ define void @ext() {
; CHECK-NEXT: Cost Model: Found costs of 1 for: %z2i32i64 = zext <2 x i32> undef to <2 x i64>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %s4i8i16 = sext <4 x i8> undef to <4 x i16>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %z4i8i16 = zext <4 x i8> undef to <4 x i16>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of RThru:3 CodeSize:1 Lat:1 SizeLat:1 for: %s4i8i64 = sext <4 x i8> undef to <4 x i64>
; CHECK-NEXT: Cost Model: Found costs of RThru:3 CodeSize:1 Lat:1 SizeLat:1 for: %z4i8i64 = zext <4 x i8> undef to <4 x i64>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %s4i16i32 = sext <4 x i16> undef to <4 x i32>
@@ -883,8 +883,8 @@ define i32 @load_extends() {
; CHECK-NEXT: Cost Model: Found costs of 0 for: %r11 = zext i32 %loadi32 to i64
; CHECK-NEXT: Cost Model: Found costs of 1 for: %v0 = sext <8 x i8> %loadv8i8 to <8 x i16>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %v1 = zext <8 x i8> %loadv8i8 to <8 x i16>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
-; CHECK-NEXT: Cost Model: Found costs of 1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
+; CHECK-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %v4 = sext <2 x i8> %loadv2i8 to <2 x i64>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %v5 = zext <2 x i8> %loadv2i8 to <2 x i64>
; CHECK-NEXT: Cost Model: Found costs of 1 for: %v6 = sext <4 x i16> %loadv4i16 to <4 x i32>
diff --git a/llvm/test/Analysis/CostModel/AArch64/free-widening-casts.ll b/llvm/test/Analysis/CostModel/AArch64/free-widening-casts.ll
index 7595abc71a9d8..1d1a5169691cd 100644
--- a/llvm/test/Analysis/CostModel/AArch64/free-widening-casts.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/free-widening-casts.ll
@@ -582,7 +582,7 @@ define <8 x i16> @neg_dissimilar_operand_kind_0(<8 x i8> %a, <8 x i8> %b) {
}
; COST-LABEL: neg_dissimilar_operand_kind_1
-; COST-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %tmp0 = zext <4 x i8> %a to <4 x i32>
+; COST-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %tmp0 = zext <4 x i8> %a to <4 x i32>
; COST-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %tmp1 = zext <4 x i16> %b to <4 x i32>
define <4 x i32> @neg_dissimilar_operand_kind_1(<4 x i8> %a, <4 x i16> %b) {
%tmp0 = zext <4 x i8> %a to <4 x i32>
diff --git a/llvm/test/Analysis/CostModel/AArch64/sve-cast.ll b/llvm/test/Analysis/CostModel/AArch64/sve-cast.ll
index cfb130eb5ec32..9c114ca85bd3b 100644
--- a/llvm/test/Analysis/CostModel/AArch64/sve-cast.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/sve-cast.ll
@@ -42,8 +42,8 @@ define void @ext() {
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %z2i32i64 = zext <2 x i32> undef to <2 x i64>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %s4i8i16 = sext <4 x i8> undef to <4 x i16>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %z4i8i16 = zext <4 x i8> undef to <4 x i16>
-; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
-; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
+; CHECK-SVE-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
+; CHECK-SVE-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
; CHECK-SVE-NEXT: Cost Model: Found costs of RThru:3 CodeSize:1 Lat:1 SizeLat:1 for: %s4i8i64 = sext <4 x i8> undef to <4 x i64>
; CHECK-SVE-NEXT: Cost Model: Found costs of RThru:3 CodeSize:1 Lat:1 SizeLat:1 for: %z4i8i64 = zext <4 x i8> undef to <4 x i64>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %s4i16i32 = sext <4 x i16> undef to <4 x i32>
@@ -184,8 +184,8 @@ define void @ext() {
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %z2i32i64 = zext <2 x i32> undef to <2 x i64>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %s4i8i16 = sext <4 x i8> undef to <4 x i16>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %z4i8i16 = zext <4 x i8> undef to <4 x i16>
-; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
-; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
+; FIXED-MIN-256-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
+; FIXED-MIN-256-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %s4i8i64 = sext <4 x i8> undef to <4 x i64>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %z4i8i64 = zext <4 x i8> undef to <4 x i64>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %s4i16i32 = sext <4 x i16> undef to <4 x i32>
@@ -255,8 +255,8 @@ define void @ext() {
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %z2i32i64 = zext <2 x i32> undef to <2 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %s4i8i16 = sext <4 x i8> undef to <4 x i16>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %z4i8i16 = zext <4 x i8> undef to <4 x i16>
-; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
-; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
+; FIXED-MIN-2048-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
+; FIXED-MIN-2048-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %s4i8i64 = sext <4 x i8> undef to <4 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %z4i8i64 = zext <4 x i8> undef to <4 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %s4i16i32 = sext <4 x i16> undef to <4 x i32>
@@ -1809,8 +1809,8 @@ define i32 @load_extends() #0 {
; CHECK-SVE-NEXT: Cost Model: Found costs of 0 for: %r11 = zext i32 %loadi32 to i64
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v0 = sext <8 x i8> %loadv8i8 to <8 x i16>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v1 = zext <8 x i8> %loadv8i8 to <8 x i16>
-; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
-; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
+; CHECK-SVE-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
+; CHECK-SVE-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v4 = sext <2 x i8> %loadv2i8 to <2 x i64>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v5 = zext <2 x i8> %loadv2i8 to <2 x i64>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v6 = sext <4 x i16> %loadv4i16 to <4 x i32>
@@ -1899,8 +1899,8 @@ define i32 @load_extends() #0 {
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 0 for: %r11 = zext i32 %loadi32 to i64
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v0 = sext <8 x i8> %loadv8i8 to <8 x i16>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v1 = zext <8 x i8> %loadv8i8 to <8 x i16>
-; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
-; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
+; FIXED-MIN-256-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
+; FIXED-MIN-256-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v4 = sext <2 x i8> %loadv2i8 to <2 x i64>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v5 = zext <2 x i8> %loadv2i8 to <2 x i64>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v6 = sext <4 x i16> %loadv4i16 to <4 x i32>
@@ -1944,8 +1944,8 @@ define i32 @load_extends() #0 {
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 0 for: %r11 = zext i32 %loadi32 to i64
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v0 = sext <8 x i8> %loadv8i8 to <8 x i16>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v1 = zext <8 x i8> %loadv8i8 to <8 x i16>
-; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
-; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
+; FIXED-MIN-2048-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
+; FIXED-MIN-2048-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v4 = sext <2 x i8> %loadv2i8 to <2 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v5 = zext <2 x i8> %loadv2i8 to <2 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v6 = sext <4 x i16> %loadv4i16 to <4 x i32>
diff --git a/llvm/test/Transforms/SLPVectorizer/AArch64/vecreduceadd.ll b/llvm/test/Transforms/SLPVectorizer/AArch64/vecreduceadd.ll
index 36826eb6681c8..4ef5221e93147 100644
--- a/llvm/test/Transforms/SLPVectorizer/AArch64/vecreduceadd.ll
+++ b/llvm/test/Transforms/SLPVectorizer/AArch64/vecreduceadd.ll
@@ -883,7 +883,7 @@ entry:
; COST-LABEL: Function: mla_v4i8_i32
-; COST: Cost: '-6'
+; COST: Cost: '-4'
define i32 @mla_v4i8_i32(ptr %x, ptr %y) "target-features"="+dotprod" {
; CHECK-LABEL: @mla_v4i8_i32(
; CHECK-NEXT: entry:
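
The SLP cost change for mla_v4i8_i32 (from -6 to -4) is consistent with a simple reading: if the vectorized tree contains two <4 x i8> -> <4 x i32> extends, one per loaded operand, each going from cost 1 to cost 2, the total rises by 2. That extend count is an assumption, not something stated in the diff, and `adjustedTreeCost` is a hypothetical helper used only for the arithmetic.

```cpp
// Hypothetical helper: shift a tree cost by the per-extend cost delta.
// Assumes NumExtends extends appear in the vectorized tree and that
// nothing else in the cost computation changed.
constexpr int adjustedTreeCost(int OldTreeCost, int NumExtends,
                               int OldExtCost, int NewExtCost) {
  return OldTreeCost + NumExtends * (NewExtCost - OldExtCost);
}

// -6 + 2 * (2 - 1) == -4, matching the updated COST line under the
// two-extend assumption.
static_assert(adjustedTreeCost(-6, 2, 1, 2) == -4,
              "two extends at +1 each move the tree cost from -6 to -4");
```

The cost stays negative, so under this reading vectorization is still considered profitable for this function, just by a smaller margin.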
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %z4i8i16 = zext <4 x i8> undef to <4 x i16>
-; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
-; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
+; FIXED-MIN-2048-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %s4i8i32 = sext <4 x i8> undef to <4 x i32>
+; FIXED-MIN-2048-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %z4i8i32 = zext <4 x i8> undef to <4 x i32>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %s4i8i64 = sext <4 x i8> undef to <4 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %z4i8i64 = zext <4 x i8> undef to <4 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %s4i16i32 = sext <4 x i16> undef to <4 x i32>
@@ -1809,8 +1809,8 @@ define i32 @load_extends() #0 {
; CHECK-SVE-NEXT: Cost Model: Found costs of 0 for: %r11 = zext i32 %loadi32 to i64
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v0 = sext <8 x i8> %loadv8i8 to <8 x i16>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v1 = zext <8 x i8> %loadv8i8 to <8 x i16>
-; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
-; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
+; CHECK-SVE-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
+; CHECK-SVE-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v4 = sext <2 x i8> %loadv2i8 to <2 x i64>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v5 = zext <2 x i8> %loadv2i8 to <2 x i64>
; CHECK-SVE-NEXT: Cost Model: Found costs of 1 for: %v6 = sext <4 x i16> %loadv4i16 to <4 x i32>
@@ -1899,8 +1899,8 @@ define i32 @load_extends() #0 {
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 0 for: %r11 = zext i32 %loadi32 to i64
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v0 = sext <8 x i8> %loadv8i8 to <8 x i16>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v1 = zext <8 x i8> %loadv8i8 to <8 x i16>
-; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
-; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
+; FIXED-MIN-256-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
+; FIXED-MIN-256-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v4 = sext <2 x i8> %loadv2i8 to <2 x i64>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v5 = zext <2 x i8> %loadv2i8 to <2 x i64>
; FIXED-MIN-256-NEXT: Cost Model: Found costs of 1 for: %v6 = sext <4 x i16> %loadv4i16 to <4 x i32>
@@ -1944,8 +1944,8 @@ define i32 @load_extends() #0 {
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 0 for: %r11 = zext i32 %loadi32 to i64
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v0 = sext <8 x i8> %loadv8i8 to <8 x i16>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v1 = zext <8 x i8> %loadv8i8 to <8 x i16>
-; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
-; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
+; FIXED-MIN-2048-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v2 = sext <4 x i8> %loadv4i8 to <4 x i32>
+; FIXED-MIN-2048-NEXT: Cost Model: Found costs of RThru:2 CodeSize:1 Lat:1 SizeLat:1 for: %v3 = zext <4 x i8> %loadv4i8 to <4 x i32>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v4 = sext <2 x i8> %loadv2i8 to <2 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v5 = zext <2 x i8> %loadv2i8 to <2 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found costs of 1 for: %v6 = sext <4 x i16> %loadv4i16 to <4 x i32>
diff --git a/llvm/test/Transforms/SLPVectorizer/AArch64/vecreduceadd.ll b/llvm/test/Transforms/SLPVectorizer/AArch64/vecreduceadd.ll
index 36826eb6681c8..4ef5221e93147 100644
--- a/llvm/test/Transforms/SLPVectorizer/AArch64/vecreduceadd.ll
+++ b/llvm/test/Transforms/SLPVectorizer/AArch64/vecreduceadd.ll
@@ -883,7 +883,7 @@ entry:
; COST-LABEL: Function: mla_v4i8_i32
-; COST: Cost: '-6'
+; COST: Cost: '-4'
define i32 @mla_v4i8_i32(ptr %x, ptr %y) "target-features"="+dotprod" {
; CHECK-LABEL: @mla_v4i8_i32(
; CHECK-NEXT: entry:
One thing to note separately: the costs from the custom tables are only used for RThru costs. The other cost kinds default to 1, which is not really accurate either, but that is the same for all entries in the lookup tables.
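To illustrate the point about the lookup tables, here is a minimal toy model of that behavior. It is a sketch and not LLVM's actual API: `ConvEntry`, `ConvTable`, `lookup`, and `castCost` are hypothetical names, and the fallback of 1 for conversions without a table entry is an assumption made only to keep the example small. The two `v4i8` rows mirror the entries added by this patch; the `v8i8` rows are their neighbors in the diff.

```cpp
#include <cstddef>

// Hypothetical toy types; these are NOT the LLVM ISD/MVT definitions.
enum class Op { SExt, ZExt };
enum class VT { v4i8, v4i16, v4i32, v8i8, v8i32 };
enum class CostKind { RecipThroughput, CodeSize, Latency, SizeAndLatency };

struct ConvEntry {
  Op op;
  VT dst;
  VT src;
  unsigned cost;
};

// A small slice of a conversion cost table.
static const ConvEntry ConvTable[] = {
    {Op::SExt, VT::v4i32, VT::v4i8, 2}, // added by this patch
    {Op::ZExt, VT::v4i32, VT::v4i8, 2}, // added by this patch
    {Op::SExt, VT::v8i32, VT::v8i8, 3},
    {Op::ZExt, VT::v8i32, VT::v8i8, 3},
};

// Linear search for a matching (op, dst, src) row; nullptr if none exists.
const ConvEntry *lookup(Op op, VT dst, VT src) {
  for (std::size_t i = 0; i < sizeof(ConvTable) / sizeof(ConvTable[0]); ++i) {
    const ConvEntry &e = ConvTable[i];
    if (e.op == op && e.dst == dst && e.src == src)
      return &e;
  }
  return nullptr;
}

// The table value is used only for reciprocal throughput; every other cost
// kind reports 1, as described in the comment above.
unsigned castCost(Op op, VT dst, VT src, CostKind kind) {
  const ConvEntry *entry = lookup(op, dst, src);
  if (!entry)
    return 1; // Assumption for this sketch: no entry means cost 1.
  return kind == CostKind::RecipThroughput ? entry->cost : 1;
}
```

Under this model, `castCost(Op::ZExt, VT::v4i32, VT::v4i8, CostKind::RecipThroughput)` yields 2 while the same query with `CostKind::CodeSize` yields 1, which matches the `RThru:2 CodeSize:1 Lat:1 SizeLat:1` lines in the updated tests.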
Do you have any details about what this improves for you? I am not against it, it sounds sensible, but if you look at the cost of the load it is already 2, as it includes the cost of
I think it might have run into problems with unrolling when I tried it. I may give it another go; I had a big stack of patches for them that I didn't do much with.