pre-commit: PR137604 #2311

dtcxzyw · 2025-04-28T14:22:16Z

Link: llvm/llvm-project#137604
Requested by: @nikic

dtcxzyw · 2025-04-28T14:43:20Z

Diff mode

runner: ariselab-64c-v2
baseline: llvm/llvm-project@642453c
patch: llvm/llvm-project#137604
sha256: ed4f34acb2319a18d33aa63966428f43c53c3e5fe1d814c2ad639a386519a5b9
commit: 41c2ac9

8 files changed, 31583 insertions(+), 31514 deletions(-)

Improvements:
  loop-simplifycfg.NumLoopBlocksDeleted 5524 -> 5529 +0.09%
  loop-simplifycfg.NumLoopExitsDeleted 2685 -> 2686 +0.04%
  simple-loop-unswitch.NumCostMultiplierSkipped 13498 -> 13500 +0.01%
  loop-simplifycfg.NumTerminatorsFolded 8102 -> 8103 +0.01%
  loop-instsimplify.NumSimplified 164729 -> 164732 +0.00%
  simple-loop-unswitch.NumBranches 88643 -> 88644 +0.00%
  loop-rotate.NumInstrsDuplicated 2739669 -> 2739686 +0.00%
  loop-rotate.NumRotated 1047694 -> 1047698 +0.00%
  local.NumRemoved 4616585 -> 4616596 +0.00%
  correlated-value-propagation.NumPhis 1111931 -> 1111933 +0.00%
Regressions:
  correlated-value-propagation.NumCmpIntr 32 -> 26 -18.75%
  loop-rotate.NumInstrsHoisted 887 -> 886 -0.11%
  correlated-value-propagation.NumMinMax 13313 -> 13303 -0.08%
  instcombine.NegatorMaxDepthVisited 17873 -> 17872 -0.01%
  instcombine.NegatorMaxTotalValuesVisited 56168 -> 56167 -0.00%
  jump-threading.NumThreads 2370222 -> 2370216 -0.00%
  instcombine.NumSunkInst 2879758 -> 2879752 -0.00%
  gvn.NumGVNSimpl 3957132 -> 3957131 -0.00%
  instcombine.NumCombined 105540114 -> 105540097 -0.00%
  gvn.NumGVNInstr 8148373 -> 8148372 -0.00%
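
The last column of each row appears to be the relative change from the baseline count to the patched count, rounded to two decimals. A minimal sketch of that calculation follows; the function name `percent_change` is hypothetical and is not part of the benchmark tooling.

```python
def percent_change(old: int, new: int) -> str:
    """Relative change from old to new, formatted like the report's last column."""
    delta = (new - old) / old * 100.0
    # The "+" flag keeps the sign on positive values; tiny negative
    # changes round to "-0.00", matching rows such as gvn.NumGVNInstr.
    return f"{delta:+.2f}%"


print(percent_change(32, 26))        # correlated-value-propagation.NumCmpIntr
print(percent_change(5524, 5529))    # loop-simplifycfg.NumLoopBlocksDeleted
print(percent_change(56168, 56167))  # instcombine.NegatorMaxTotalValuesVisited
```

Because the change is relative, a counter with a small baseline (32) can show a large percentage for a difference of only six, while counters in the millions show 0.00% for a comparable absolute difference.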

8 4 bench/eastl/optimized/TestBitset.ll
8 7 bench/hyperscan/optimized/buildstate.ll
2 1 bench/influxdb-rs/optimized/4bpmt5749p4g145g.ll
226 208 bench/openjdk/optimized/hb-face.ll
201 183 bench/openjdk/optimized/hb-ot-font.ll
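
The per-file rows above look like `git diff --numstat` output (lines added, lines deleted, path). That reading is an assumption, since the report does not label the columns. Under that assumption, a short sketch that parses the rows and totals them; `parse_numstat` is a hypothetical helper name.

```python
REPORT = """\
8 4 bench/eastl/optimized/TestBitset.ll
8 7 bench/hyperscan/optimized/buildstate.ll
2 1 bench/influxdb-rs/optimized/4bpmt5749p4g145g.ll
226 208 bench/openjdk/optimized/hb-face.ll
201 183 bench/openjdk/optimized/hb-ot-font.ll
"""


def parse_numstat(text: str) -> list[tuple[int, int, str]]:
    """Parse 'added deleted path' rows (assumed column meaning)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split(maxsplit=2)
        rows.append((int(added), int(deleted), path))
    return rows


rows = parse_numstat(REPORT)
total_added = sum(added for added, _, _ in rows)
total_deleted = sum(deleted for _, deleted, _ in rows)
print(len(rows), total_added, total_deleted)  # 5 445 403
```

The five listed files account for far fewer lines than the "8 files changed, 31583 insertions(+), 31514 deletions(-)" total, so the listing seems to cover only part of the diff.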

github-actions · 2025-04-28T14:44:37Z

Summary of Major Changes

Introduction of llvm.ucmp.i32.i32 and llvm.umax.i64 Calls:
- In five files (TestBitset.ll, buildstate.ll, 4bpmt5749p4g145g.ll, hb-face.ll, and hb-ot-font.ll), new calls to the intrinsics @llvm.ucmp.i32.i32 and @llvm.umax.i64 appear. llvm.ucmp performs an unsigned three-way comparison (returning -1, 0, or 1), and llvm.umax returns the unsigned maximum of its two operands. The diff alone does not show whether the patch newly forms these calls or whether existing calls are no longer folded away; the drops in correlated-value-propagation.NumCmpIntr and NumMinMax in the statistics above may be related.
Phi Node Adjustments:
- Several phi nodes have been updated in various basic blocks (e.g., %for.body2160, %for.body2434, %_ZNK2OT4cmap13find_subtableEjj.exit). The changes involve renaming or reordering operands in the phi nodes, which may reflect adjustments to loop structures or control flow optimizations.
Replaced Conditional Logic with Intrinsics:
- Some icmp samesign ugt comparisons have been replaced with calls to llvm.ucmp.i32.i32. For example, in hb-face.ll, %67 = icmp samesign ugt i32 %66, 5 is replaced by %69 = call noundef i32 @llvm.ucmp.i32.i32(i32 5, i32 %66). Note the swapped operands: ucmp(5, %66) is negative exactly when %66 is unsigned-greater than 5, so the original predicate can still be recovered from the three-way result.
Memory Access Adjustments:
- In hb-face.ll and hb-ot-font.ll, some loads and the operations on their results have changed. For instance, loads such as %87 = load i8, ptr %86, align 1 are replaced with loads at adjusted offsets followed by different operations on the loaded values. This may be a side effect of the control-flow changes rather than a deliberate change to access patterns; the diff does not establish any effect on performance.
Control Flow Simplifications:
- Several basic blocks have been restructured, such as merging or splitting blocks (e.g., _ZL14_hb_cmp_methodIN2OT14EncodingRecordEKS1_JEEiPKvS4_DpT1_.exit.i.i.i.i.i.i.thread). Additionally, some branches have been simplified or replaced with direct calls to intrinsics, reducing the complexity of the control flow graph.
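
To make the intrinsic semantics mentioned in this list concrete, here is a small Python model of llvm.ucmp and llvm.umax on fixed-width unsigned integers. It is an illustrative sketch only: the helper names `ucmp`, `umax`, and `ugt` are hypothetical, and the code models the documented result values rather than anything the patch itself does.

```python
def _to_unsigned(value: int, bits: int) -> int:
    """Reinterpret an integer as an unsigned value of the given bit width."""
    return value & ((1 << bits) - 1)


def ucmp(a: int, b: int, bits: int = 32) -> int:
    """Model of llvm.ucmp: -1 if a <u b, 0 if a == b, 1 if a >u b."""
    a, b = _to_unsigned(a, bits), _to_unsigned(b, bits)
    return (a > b) - (a < b)


def umax(a: int, b: int, bits: int = 64) -> int:
    """Model of llvm.umax: the larger operand under unsigned comparison."""
    a, b = _to_unsigned(a, bits), _to_unsigned(b, bits)
    return a if a > b else b


def ugt(a: int, b: int, bits: int = 32) -> bool:
    """Model of 'icmp ugt a, b'."""
    return _to_unsigned(a, bits) > _to_unsigned(b, bits)


# The hb-face.ll example compares %66 against 5 with the operands swapped:
# 'icmp ugt %66, 5' holds exactly when ucmp(5, %66) is negative.
for x in (0, 4, 5, 6, 2**31 - 1):
    assert ugt(x, 5) == (ucmp(5, x) < 0)
```

The sample values are all non-negative as i32, which matches the samesign flag on the original icmp (both operands are asserted to have the same sign, and the constant 5 is non-negative).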

High-Level Overview

The diff primarily shows LLVM intrinsics (llvm.ucmp.i32.i32 and llvm.umax.i64) appearing in place of existing conditional logic, together with adjustments to phi nodes, memory accesses, and control-flow structure. The statistics are mixed: several loop-pass counters rise slightly, while correlated-value-propagation.NumCmpIntr falls from 32 to 26 and NumMinMax falls from 13313 to 13303. Whether the changes improve the generated code is not established by this summary.

model: qwen-plus-latest
CompletionUsage(completion_tokens=648, prompt_tokens=36250, total_tokens=36898, completion_tokens_details=None, prompt_tokens_details=None)

pre-commit: PR137604

b5635d7

github-actions bot mentioned this pull request Apr 28, 2025

Task submission #1312

Open

github-actions bot added 2 commits April 28, 2025 14:43

pre-commit: Update

649e2f9

pre-commit: Remap

41c2ac9
