Skip to content

Jump table annotations for Linux #112606

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ardbiesheuvel
Copy link
Contributor

This patch adds jump table annotations to ELF objects so that indirect branches can be correlated with their associated jump tables by static analysis tooling operating on [partially linked] object files.

The jump table part is straight-forward:

	.section	.rodata,"a",@progbits
	.p2align	2, 0x0
.LJTI0_0:
	.long	.LBB0_1-.LJTI0_0
	.long	.LBB0_2-.LJTI0_0
	.long	.LBB0_3-.LJTI0_0
	.long	.LBB0_4-.LJTI0_0
	.long	.LBB0_5-.LJTI0_0
	.long	.LBB0_5-.LJTI0_0
.Ljt_end0:
.set $JTI0_0, .LJTI0_0
	.type	$JTI0_0,@object
	.size	$JTI0_0, .Ljt_end0-.LJTI0_0

The handling at the call site is a bit trickier. I ended up reusing the existing jump table debug info support, producing something like

	leaq	.LJTI0_0(%rip), %rcx
	movslq	(%rcx,%rax,4), %rax
	addq	%rcx, %rax
	.reloc .Ljtp0, BFD_RELOC_NONE, .LJTI0_0
.Ljtp0:
	jmpq	*%rax

which works in most cases. However, when the jump destination gets spilled to the stack, we may end up with something like

   1457c:       4a 8b 04 ed 00 00 00    mov    0x0(,%r13,8),%rax
   14583:       00 
                        14580: R_X86_64_32S     .rodata+0x88
   14584:       48 89 04 24             mov    %rax,(%rsp)

   ...
   14602:       48 8b 04 24             mov    (%rsp),%rax
                        14602: R_X86_64_NONE    .rodata+0x88
   14606:       ff e0                   jmp    *%rax

where the relocation is no longer attached to the jmp instruction but to the preceding mov instruction.

Any hints how to do this properly would be much appreciated.

@llvmbot llvmbot added backend:ARM llvm:SelectionDAG SelectionDAGISel as well labels Oct 16, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 16, 2024

@llvm/pr-subscribers-backend-arm

@llvm/pr-subscribers-llvm-selectiondag

Author: Ard Biesheuvel (ardbiesheuvel)

Changes

This patch adds jump table annotations to ELF objects so that indirect branches can be correlated with their associated jump tables by static analysis tooling operating on [partially linked] object files.

The jump table part is straight-forward:

	.section	.rodata,"a",@<!-- -->progbits
	.p2align	2, 0x0
.LJTI0_0:
	.long	.LBB0_1-.LJTI0_0
	.long	.LBB0_2-.LJTI0_0
	.long	.LBB0_3-.LJTI0_0
	.long	.LBB0_4-.LJTI0_0
	.long	.LBB0_5-.LJTI0_0
	.long	.LBB0_5-.LJTI0_0
.Ljt_end0:
.set $JTI0_0, .LJTI0_0
	.type	$JTI0_0,@<!-- -->object
	.size	$JTI0_0, .Ljt_end0-.LJTI0_0

The handling at the call site is a bit trickier. I ended up reusing the existing jump table debug info support, producing something like

	leaq	.LJTI0_0(%rip), %rcx
	movslq	(%rcx,%rax,4), %rax
	addq	%rcx, %rax
	.reloc .Ljtp0, BFD_RELOC_NONE, .LJTI0_0
.Ljtp0:
	jmpq	*%rax

which works in most cases. However, when the jump destination gets spilled to the stack, we may end up with something like

   1457c:       4a 8b 04 ed 00 00 00    mov    0x0(,%r13,8),%rax
   14583:       00 
                        14580: R_X86_64_32S     .rodata+0x88
   14584:       48 89 04 24             mov    %rax,(%rsp)

   ...
   14602:       48 8b 04 24             mov    (%rsp),%rax
                        14602: R_X86_64_NONE    .rodata+0x88
   14606:       ff e0                   jmp    *%rax

where the relocation is no longer attached to the jmp instruction but to the preceding mov instruction.

Any hints how to do this properly would be much appreciated.


Full diff: https://github.com/llvm/llvm-project/pull/112606.diff

5 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/AsmPrinter.h (+4)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (+43-2)
  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+4-2)
  • (modified) llvm/lib/Target/ARM/Thumb2InstrInfo.cpp (+4-1)
  • (added) llvm/test/CodeGen/ARM/stack-guard-nomovt.ll (+32)
diff --git a/llvm/include/llvm/CodeGen/AsmPrinter.h b/llvm/include/llvm/CodeGen/AsmPrinter.h
index c9a88d7b1c015c..fabe5bc226037d 100644
--- a/llvm/include/llvm/CodeGen/AsmPrinter.h
+++ b/llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -453,6 +453,10 @@ class AsmPrinter : public MachineFunctionPass {
   /// function to the current output stream.
   virtual void emitJumpTableInfo();
 
+  /// Emit jump table annotations correlating each table with its associated
+  /// indirect branch instruction.
+  virtual void emitJumpTableAnnotation(const MachineFunction &MF, const MachineInstr &MI);
+
   /// Emit the specified global variable to the .s file.
   virtual void emitGlobalVariable(const GlobalVariable *GV);
 
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index db7adfd3b21e5f..32f1733383e331 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -162,6 +162,10 @@ static cl::opt<bool> EmitJumpTableSizesSection(
     cl::desc("Emit a section containing jump table addresses and sizes"),
     cl::Hidden, cl::init(false));
 
+static cl::opt<bool> AnnotateJumpTables("annotate-jump-tables",
+                                        cl::desc("Annotate jump tables"),
+                                        cl::Hidden, cl::init(false));
+
 STATISTIC(EmittedInsts, "Number of machine instrs printed");
 
 char AsmPrinter::ID = 0;
@@ -1528,6 +1532,25 @@ void AsmPrinter::emitPseudoProbe(const MachineInstr &MI) {
   }
 }
 
+void AsmPrinter::emitJumpTableAnnotation(const MachineFunction &MF,
+                                         const MachineInstr &MI) {
+  if (!AnnotateJumpTables || !TM.getTargetTriple().isOSBinFormatELF())
+    return;
+
+  MCSymbol *JTISymbol = GetJTISymbol(MI.getOperand(0).getImm());
+  MCSymbol *ProvenanceLabel = OutContext.createTempSymbol("jtp");
+
+  const MCExpr *OffsetExpr =
+      MCSymbolRefExpr::create(ProvenanceLabel, OutContext);
+  const MCExpr *JTISymbolExpr =
+      MCSymbolRefExpr::create(JTISymbol, OutContext);
+
+  OutStreamer->emitRelocDirective(*OffsetExpr, "BFD_RELOC_NONE",
+                                  JTISymbolExpr, SMLoc(),
+                                  *OutContext.getSubtargetInfo());
+  OutStreamer->emitLabel(ProvenanceLabel);
+}
+
 void AsmPrinter::emitStackSizeSection(const MachineFunction &MF) {
   if (!MF.getTarget().Options.EmitStackSizeSection)
     return;
@@ -1836,8 +1859,7 @@ void AsmPrinter::emitFunctionBody() {
         OutStreamer->emitRawComment("MEMBARRIER");
         break;
       case TargetOpcode::JUMP_TABLE_DEBUG_INFO:
-        // This instruction is only used to note jump table debug info, it's
-        // purely meta information.
+        emitJumpTableAnnotation(*MF, MI);
         break;
       case TargetOpcode::INIT_UNDEF:
         // This is only used to influence register allocation behavior, no
@@ -2792,6 +2814,25 @@ void AsmPrinter::emitJumpTableInfo() {
     // label differences will be evaluated at write time.
     for (const MachineBasicBlock *MBB : JTBBs)
       emitJumpTableEntry(MJTI, MBB, JTI);
+
+    if (AnnotateJumpTables && TM.getTargetTriple().isOSBinFormatELF()) {
+      // Create a temp symbol for the end of the jump table.
+      MCSymbol *JTIEndSymbol = createTempSymbol("jt_end");
+      OutStreamer->emitLabel(JTIEndSymbol);
+
+      const MCExpr *JTISymbolExpr =
+          MCSymbolRefExpr::create(JTISymbol, OutContext);
+
+      MCSymbol *JTISymbolForSize = OutContext.getOrCreateSymbol(
+          "$JTI" + Twine(MF->getFunctionNumber()) + "_" + Twine(JTI));
+      OutStreamer->emitAssignment(JTISymbolForSize, JTISymbolExpr);
+      OutStreamer->emitSymbolAttribute(JTISymbolForSize, MCSA_ELF_TypeObject);
+
+      const MCExpr *SizeExp = MCBinaryExpr::createSub(
+          MCSymbolRefExpr::create(JTIEndSymbol, OutContext), JTISymbolExpr,
+          OutContext);
+      OutStreamer->emitELFSize(JTISymbolForSize, SizeExp);
+    }
   }
 
   if (EmitJumpTableSizesSection)
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 95937886280685..e1be41a3111afb 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -478,8 +478,10 @@ SDValue TargetLowering::expandIndirectJTBranch(const SDLoc &dl, SDValue Value,
                                                SDValue Addr, int JTI,
                                                SelectionDAG &DAG) const {
   SDValue Chain = Value;
-  // Jump table debug info is only needed if CodeView is enabled.
-  if (DAG.getTarget().getTargetTriple().isOSBinFormatCOFF()) {
+  const auto &Triple = DAG.getTarget().getTargetTriple();
+  // Jump table debug info is only needed if CodeView is enabled,
+  // or when adding jump table annotations to ELF objects.
+  if (Triple.isOSBinFormatCOFF() || Triple.isOSBinFormatELF()) {
     Chain = DAG.getJumpTableDebugInfo(JTI, Chain, dl);
   }
   return DAG.getNode(ISD::BRIND, dl, MVT::Other, Chain, Addr);
diff --git a/llvm/lib/Target/ARM/Thumb2InstrInfo.cpp b/llvm/lib/Target/ARM/Thumb2InstrInfo.cpp
index d1e07b6703a5e6..27f86389a3856a 100644
--- a/llvm/lib/Target/ARM/Thumb2InstrInfo.cpp
+++ b/llvm/lib/Target/ARM/Thumb2InstrInfo.cpp
@@ -264,8 +264,11 @@ void Thumb2InstrInfo::expandLoadStackGuard(
   }
 
   const auto *GV = cast<GlobalValue>((*MI->memoperands_begin())->getValue());
-  if (MF.getSubtarget<ARMSubtarget>().isTargetELF() && !GV->isDSOLocal())
+  const ARMSubtarget &Subtarget = MF.getSubtarget<ARMSubtarget>();
+  if (Subtarget.isTargetELF() && !GV->isDSOLocal())
     expandLoadStackGuardBase(MI, ARM::t2LDRLIT_ga_pcrel, ARM::t2LDRi12);
+  else if (!Subtarget.useMovt())
+    expandLoadStackGuardBase(MI, ARM::tLDRLIT_ga_abs, ARM::t2LDRi12);
   else if (MF.getTarget().isPositionIndependent())
     expandLoadStackGuardBase(MI, ARM::t2MOV_ga_pcrel, ARM::t2LDRi12);
   else
diff --git a/llvm/test/CodeGen/ARM/stack-guard-nomovt.ll b/llvm/test/CodeGen/ARM/stack-guard-nomovt.ll
new file mode 100644
index 00000000000000..6802dabfda87a6
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/stack-guard-nomovt.ll
@@ -0,0 +1,32 @@
+; RUN: llc -relocation-model=static -mattr=+no-movt < %s | FileCheck %s
+
+target triple = "thumbv7a-linux-gnueabi"
+
+define i32 @test1() #0 {
+; CHECK-LABEL: test1:
+; CHECK:       @ %bb.0:
+; CHECK-NEXT:    push	{r7, lr}
+; CHECK-NEXT:    sub.w	sp, sp, #1032
+; CHECK-NEXT:    ldr	r0, .LCPI0_0
+; CHECK-NEXT:    ldr	r0, [r0]
+; CHECK-NEXT:    str.w	r0, [sp, #1028]
+; CHECK-NEXT:    add	r0, sp, #4
+; CHECK-NEXT:    bl	foo
+; CHECK-NEXT:    ldr.w	r0, [sp, #1028]
+; CHECK-NEXT:    ldr	r1, .LCPI0_0
+; CHECK-NEXT:    ldr	r1, [r1]
+; CHECK-NEXT:    cmp	r1, r0
+; CHECK-NEXT:    ittt	eq
+; CHECK-NEXT:    moveq	r0, #0
+; CHECK-NEXT:    addeq.w	sp, sp, #1032
+; CHECK-NEXT:    popeq	{r7, pc}
+; CHECK-NEXT:  .LBB0_1:
+; CHECK-NEXT:    bl __stack_chk_fail
+  %a1 = alloca [256 x i32], align 4
+  call void @foo(ptr %a1) #3
+  ret i32 0
+}
+
+declare void @foo(ptr)
+
+attributes #0 = { nounwind sspstrong }

…ables

The Linux kernel build system performs static analysis on the ELF
objects to infer whether indirect jumps are truly function pointer
dereferences, or calls via jump tables where the set of possible
destinations is limited and decided at compile-time.

When generating position dependent x86 code for the small code model,
this is usually straight-forward, as the address of the jump table is
encoded as an immediate in the instruction, e.g.,

  jmpq   *jump_table(, %reg, 8)

and each entry in the table represents the absolute address of a jump
destination.

However, when switching to PIC codegen, or building for load-store
architectures, this usually becomes something like

  leaq   jump_table(%rip), %reg0
  movlsq (%reg0, %reg1, 4), %reg1
  addq   %reg0, %reg1
  jmpq   *%reg1

or on arm64

  adrp   xM, jump_table
  add    xM, :lo12:jump_table
  ldrsw  wN, [xM, xN, lsl llvm#2]
  add    xN, xN, xM
  br     xN

where there is no obvious correlation between the location of the jump
table and the indirect branch instruction, and where the start of each
jump table has to be known to dereference the 32-bit relative references
correctly, as they are relative to the start of the table rather than
relative to each individual entry.

Make the tooling's job easier by:
- emitting an ELF symbol that covers the jump table, so that its size
  can be discovered;
- emitting a BFD_RELOC_NONE allocation that links the symbol to the
  indirect branch instruction where the effective jump destination is
  consumed.

Signed-off-by: Ard Biesheuvel <[email protected]>
@ardbiesheuvel ardbiesheuvel force-pushed the jump-table-annotations branch from 20cf930 to 0ebcd5d Compare October 16, 2024 19:54
Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 5000c688bf9dad3ed5ec98cf427b3c5160e6e74c 0ebcd5d857a0dadf4f1df766bac7ecfb602d58c7 --extensions h,cpp -- llvm/include/llvm/CodeGen/AsmPrinter.h llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
View the diff from clang-format here.
diff --git a/llvm/include/llvm/CodeGen/AsmPrinter.h b/llvm/include/llvm/CodeGen/AsmPrinter.h
index fabe5bc226..7c95cf35b9 100644
--- a/llvm/include/llvm/CodeGen/AsmPrinter.h
+++ b/llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -455,7 +455,8 @@ public:
 
   /// Emit jump table annotations correlating each table with its associated
   /// indirect branch instruction.
-  virtual void emitJumpTableAnnotation(const MachineFunction &MF, const MachineInstr &MI);
+  virtual void emitJumpTableAnnotation(const MachineFunction &MF,
+                                       const MachineInstr &MI);
 
   /// Emit the specified global variable to the .s file.
   virtual void emitGlobalVariable(const GlobalVariable *GV);
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 4563ed98a4..f14b843219 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1542,12 +1542,10 @@ void AsmPrinter::emitJumpTableAnnotation(const MachineFunction &MF,
 
   const MCExpr *OffsetExpr =
       MCSymbolRefExpr::create(ProvenanceLabel, OutContext);
-  const MCExpr *JTISymbolExpr =
-      MCSymbolRefExpr::create(JTISymbol, OutContext);
+  const MCExpr *JTISymbolExpr = MCSymbolRefExpr::create(JTISymbol, OutContext);
 
-  OutStreamer->emitRelocDirective(*OffsetExpr, "BFD_RELOC_NONE",
-                                  JTISymbolExpr, SMLoc(),
-                                  *OutContext.getSubtargetInfo());
+  OutStreamer->emitRelocDirective(*OffsetExpr, "BFD_RELOC_NONE", JTISymbolExpr,
+                                  SMLoc(), *OutContext.getSubtargetInfo());
   OutStreamer->emitLabel(ProvenanceLabel);
 }
 

@nickdesaulniers
Copy link
Member

  • @maksfb @aaupov IIRC Bolt had to bend over backwards to identify indirect jumps via jump tables. They probably have a vested interest in this approach of leaving behind relocations that can be consumed.

Any hints how to do this properly would be much appreciated.

So it's going to be register allocation that decides to spill values to the stack. Perhaps the register allocators need to look for the existence of this and update the resulting spills.

cc @MaskRay @jyknight @efriedma-quic

Comment on lines +165 to +167
static cl::opt<bool> AnnotateJumpTables("annotate-jump-tables",
cl::desc("Annotate jump tables"),
cl::Hidden, cl::init(false));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cl::opt is not great because it usually means that the command line flag that would trigger this needs to also be passed to the linker when performing LTO. Instead, I think a better approach (though it's more code in LLVM) is to have a module level metadata to signal this codegen behavior, and have the front end emit that when the command line flag is set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I'll look into that. Suggestions welcome on useful examples.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fc018eb was exactly this. Perhaps we can add a new -f flag to clang for this?

(That way we have -fannotate-jump-tables/-fno-annotate-jump-tables for the compiler, rather than -mllvm -annotate-jump-tables for the compiler (and the linker for LTO)).

@@ -453,6 +453,10 @@ class AsmPrinter : public MachineFunctionPass {
/// function to the current output stream.
virtual void emitJumpTableInfo();

/// Emit jump table annotations correlating each table with its associated
/// indirect branch instruction.
virtual void emitJumpTableAnnotation(const MachineFunction &MF, const MachineInstr &MI);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need to be virtual if you're only providing one definition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a rhetorical question?

Copy link
Member

@nickdesaulniers nickdesaulniers Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes (sorry, I could have phrased it differently).

As an example, if you look at the below method declaration for emitGlobalVariable, you'll find a definition for it in the base class AsmPrinter, but overrides are provided in various derived classes (AMDGPUAsmPrinter, ARMAsmPrinter, NVPTXAsmPrinter, PPCAsmPrinter, and WebAssemblyAsmPrinter).

Since you're not changing the functionality for any of the derived classes (i.e. no overrides) and only providing a definition for the base class, then you should not declare this method as virtual. That should avoid needing to traverse the vtable to call this method.

@efriedma-quic
Copy link
Collaborator

https://reviews.llvm.org/D149367 did something very similar on Windows; the way it ended up working is that it emits a pseudo-instruction to record the relationship between the jump and the corresponding jump table.

(See also #101962, which is sort of related, but not really the same thing.)

@efriedma-quic
Copy link
Collaborator

Oh, this is actually using the same infrastructure as D149367. In that case, not sure why you're running into trouble if the jump destination is spilled; I'm pretty sure the code was designed to handle that case correctly. CC @dpaoliello

@maksfb
Copy link
Contributor

maksfb commented Oct 16, 2024

Cool. This will be useful to BOLT and other tools. cc @jpoimboe

@dpaoliello
Copy link
Contributor

where the relocation is no longer attached to the jmp instruction but to the preceding mov instruction

Yeah, this is one of the problems with pseudo instructions: weird things can happen when instructions are inserted or combined. Ideally this would be stored as metadata on the jump, but that's not really a thing in LLVM.

You either need to teach the code that is adding the mov to place it before the pseudo instruction (which will still be fragile to other changes), or make the code looking for the pseudo instruction walk backwards from the jump to see if there are any in the current block (which is the approach I took).

@ardbiesheuvel
Copy link
Contributor Author

You either need to teach the code that is adding the mov to place it before the pseudo instruction (which will still be fragile to other changes), or make the code looking for the pseudo instruction walk backwards from the jump to see if there are any in the current block (which is the approach I took).

So in this particular case, there is the .reloc that gets emitted along with the jmp - there may be more than one of those sharing the same jump table so it belongs with the JUMP_TABLE_DEBUG_INFO op. However, the label used in the .reloc could be emitted separately, as long as the name can be constructed unambiguously.

Would that make it feasible to tweak the BR_JT to BRIND lowering (and further) to remember its BR_JT origin, and always emit a local label unconditionally? Then, whether or not that label is ever referenced is based on whether the annotations are emitted when printing the ASM, and where the .reloc ends up with respect to the jump is no longer important.

Just throwing this out as an idea - I am a total noob when it comes to LLVM hacking so I will gratefully take all the help and advice I can get.

@nickdesaulniers
Copy link
Member

Possibly related: #102411

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants