
Commit 7af2b51

[AArch64][v8.5A] Omit BTI for non-addr-taken static fns on Linux (#134669)
This is a conditional revert of cca40aa, which made LLVM's branch-target-enforcement mode generate BTI at the start of _every_ function, even in the case where the function has internal linkage and its address is never taken for use in an indirect call.

The rationale was that it might turn out at link time that a direct call to the function spanned a larger distance than the range of a BL instruction (say, if the translation unit generated multiple code sections and the linker put them a very long way apart). Then the linker might insert a long-branch thunk using an indirect call instruction.

SYSVABI64 has now clarified that in this situation the static linker may not assume that the target function is safe to call indirectly. If it needs to use this strategy, it's responsible for also generating a 'landing pad' near the target function, with a BTI followed by a direct branch, and using that as the target of the long-distance indirect call.

ARM-software/abi-aa@606ce44

LLD complies with this spec as of commit 098b0d1.

So if we're compiling in a mode that respects SYSVABI64, such as targeting Linux, it's safe to leave out the BTI at the start of a function with internal linkage if we can prove that its address isn't either used in an indirect call in _this_ translation unit or passed out of the object.

Therefore, this patch goes back to the behavior before cca40aa, leaving out BTIs in functions that can't be called indirectly, but only if the target triple is Linux. (I wasn't able to find a more precise query for "is this a SYSVABI64-compliant platform?", but Linux certainly is, and this check at least fails in the safe direction: if in doubt, we put in all the BTIs that might be necessary.)
1 parent 1997073 commit 7af2b51
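The long-branch scenario in the commit message can be illustrated with a toy model. This is a minimal sketch, not code from LLVM or LLD: `blCanReach`, `planCall` and `CallPath` are hypothetical names. The only architectural fact it relies on is that BL encodes a signed 26-bit word offset, giving a reach of roughly +/-128 MiB. Real linkers have more thunk strategies than the two shown here.

```cpp
#include <cassert>
#include <cstdint>

// BL holds a signed 26-bit immediate scaled by 4, so it can reach
// byte offsets in [-2^27, 2^27) from the call site.
constexpr std::int64_t kBLReachBytes = std::int64_t{1} << 27;

// Hypothetical helper: can a direct BL at `callSite` reach `target`?
constexpr bool blCanReach(std::int64_t callSite, std::int64_t target) {
  return (target - callSite) >= -kBLReachBytes &&
         (target - callSite) < kBLReachBytes;
}

// Simplified view of the two outcomes discussed in the commit message.
enum class CallPath {
  DirectBL,          // in range: no indirect branch involved
  ThunkToLandingPad  // out of range: indirect branch, aimed at a
                     // linker-generated "BTI + direct branch" pad
};

// Hypothetical model of a SYSVABI64-compliant linker's choice.
constexpr CallPath planCall(std::int64_t callSite, std::int64_t target) {
  return blCanReach(callSite, target) ? CallPath::DirectBL
                                      : CallPath::ThunkToLandingPad;
}

static_assert(planCall(0, 4096) == CallPath::DirectBL,
              "nearby targets are called directly");
static_assert(planCall(0, std::int64_t{1} << 28) == CallPath::ThunkToLandingPad,
              "far targets go through a thunk and landing pad");
```

In this model the target function itself never receives an indirect branch, which is why the compiler can leave out its entry BTI.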

2 files changed, +25 -11 lines changed

llvm/lib/Target/AArch64/AArch64BranchTargets.cpp

+14-6
@@ -65,6 +65,7 @@ bool AArch64BranchTargets::runOnMachineFunction(MachineFunction &MF) {
   LLVM_DEBUG(
       dbgs() << "********** AArch64 Branch Targets **********\n"
              << "********** Function: " << MF.getName() << '\n');
+  const Function &F = MF.getFunction();

   // LLVM does not consider basic blocks which are the targets of jump tables
   // to be address-taken (the address can't escape anywhere else), but they are
@@ -78,16 +79,23 @@ bool AArch64BranchTargets::runOnMachineFunction(MachineFunction &MF) {
   bool HasWinCFI = MF.hasWinCFI();
   for (MachineBasicBlock &MBB : MF) {
     bool CouldCall = false, CouldJump = false;
-    // Even in cases where a function has internal linkage and is only called
-    // directly in its translation unit, it can still be called indirectly if
-    // the linker decides to add a thunk to it for whatever reason (say, for
-    // example, if it is finally placed far from its call site and a BL is not
-    // long-range enough). PLT entries and tail-calls use BR, but when they are
+    // If the function is address-taken or externally-visible, it could be
+    // indirectly called. PLT entries and tail-calls use BR, but when they are
     // are in guarded pages should all use x16 or x17 to hold the called
     // address, so we don't need to set CouldJump here. BR instructions in
     // non-guarded pages (which might be non-BTI-aware code) are allowed to
     // branch to a "BTI c" using any register.
-    if (&MBB == &*MF.begin())
+    //
+    // For SysV targets, this is enough, because SYSVABI64 says that if the
+    // static linker later wants to use an indirect branch instruction in a
+    // long-branch thunk, it's also responsible for adding a 'landing pad' with
+    // a BTI, and pointing the indirect branch at that. However, at present
+    // this guarantee only holds for targets complying with SYSVABI64, so for
+    // other targets we must assume that `CouldCall` is _always_ true due to
+    // the risk of long-branch thunks at link time.
+    if (&MBB == &*MF.begin() &&
+        (!MF.getSubtarget<AArch64Subtarget>().isTargetLinux() ||
+         (F.hasAddressTaken() || !F.hasLocalLinkage())))
       CouldCall = true;

     // If the block itself is address-taken, it could be indirectly branched
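
The new condition on the entry block can be read as a three-input predicate. The following is a standalone sketch of that logic only, assuming a hypothetical `FnInfo` struct whose fields stand in for the results of `isTargetLinux()`, `hasAddressTaken()` and `hasLocalLinkage()`. It does not use LLVM headers and is not the pass itself.

```cpp
#include <cassert>

// Hypothetical stand-in for the three facts the pass queries.
struct FnInfo {
  bool targetIsLinux;    // models AArch64Subtarget::isTargetLinux()
  bool hasAddressTaken;  // models Function::hasAddressTaken()
  bool hasLocalLinkage;  // models Function::hasLocalLinkage()
};

// Mirrors the condition in the diff: should the entry block be treated
// as a possible indirect-call target (CouldCall = true)?
constexpr bool entryCouldBeCalled(const FnInfo &F) {
  return !F.targetIsLinux || F.hasAddressTaken || !F.hasLocalLinkage;
}

// Linux, internal linkage, address never taken: the entry BTI is omitted.
static_assert(!entryCouldBeCalled({true, false, true}),
              "static, non-address-taken function on Linux needs no BTI");
// Same function on a non-Linux target: the BTI is kept to be safe.
static_assert(entryCouldBeCalled({false, false, true}),
              "non-Linux targets always keep the entry BTI");
// Address taken or externally visible: the BTI is kept on every target.
static_assert(entryCouldBeCalled({true, true, true}), "address-taken");
static_assert(entryCouldBeCalled({true, false, false}), "external linkage");
```

Only one of the eight input combinations yields `false`, which matches the commit's intent that the check fails in the safe direction.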

llvm/test/CodeGen/AArch64/patchable-function-entry-bti.ll

+11-5
@@ -1,4 +1,5 @@
-; RUN: llc -mtriple=aarch64 -aarch64-min-jump-table-entries=4 %s -o - | FileCheck %s
+; RUN: llc -mtriple=aarch64-linux-gnu -aarch64-min-jump-table-entries=4 %s -o - | FileCheck %s --check-prefixes=CHECK,SYSV
+; RUN: llc -mtriple=aarch64-none-elf -aarch64-min-jump-table-entries=4 %s -o - | FileCheck %s --check-prefixes=CHECK,NONSYSV

 define void @f0() "patchable-function-entry"="0" "branch-target-enforcement" {
 ; CHECK-LABEL: f0:
@@ -48,20 +49,25 @@ define void @f2_1() "patchable-function-entry"="1" "patchable-function-prefix"="
 }

 ;; -fpatchable-function-entry=1 -mbranch-protection=bti
-;; We add BTI c even when the function has internal linkage
+;; For SysV compliant targets, we don't add BTI (or create the .Lpatch0 symbol)
+;; because the function has internal linkage and isn't address-taken. For
+;; non-SysV targets, we do add the BTI, because outside SYSVABI64 there's no
+;; spec preventing the static linker from using an indirect call instruction in
+;; a long-branch thunk inserted at link time.
 define internal void @f1i(i64 %v) "patchable-function-entry"="1" "branch-target-enforcement" {
 ; CHECK-LABEL: f1i:
 ; CHECK-NEXT: .Lfunc_begin3:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: hint #34
-; CHECK-NEXT: .Lpatch1:
+; NONSYSV-NEXT: hint #34
+; NONSYSV-NEXT: .Lpatch1:
 ; CHECK-NEXT: nop
 ;; Other basic blocks have BTI, but they don't affect our decision to not create .Lpatch0
 ; CHECK: .LBB{{.+}} // %sw.bb1
 ; CHECK-NEXT: hint #36
 ; CHECK: .section __patchable_function_entries,"awo",@progbits,f1i{{$}}
 ; CHECK-NEXT: .p2align 3
-; CHECK-NEXT: .xword .Lpatch1
+; NONSYSV-NEXT: .xword .Lpatch1
+; SYSV-NEXT: .xword .Lfunc_begin3
 entry:
   switch i64 %v, label %sw.bb0 [
     i64 1, label %sw.bb1
