Skip to content

[AArch64][v8.5A] Omit BTI for non-addr-taken static fns on Linux #134669

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 8, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 14 additions & 6 deletions llvm/lib/Target/AArch64/AArch64BranchTargets.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@ bool AArch64BranchTargets::runOnMachineFunction(MachineFunction &MF) {
LLVM_DEBUG(
dbgs() << "********** AArch64 Branch Targets **********\n"
<< "********** Function: " << MF.getName() << '\n');
const Function &F = MF.getFunction();

// LLVM does not consider basic blocks which are the targets of jump tables
// to be address-taken (the address can't escape anywhere else), but they are
Expand All @@ -78,16 +79,23 @@ bool AArch64BranchTargets::runOnMachineFunction(MachineFunction &MF) {
bool HasWinCFI = MF.hasWinCFI();
for (MachineBasicBlock &MBB : MF) {
bool CouldCall = false, CouldJump = false;
// Even in cases where a function has internal linkage and is only called
// directly in its translation unit, it can still be called indirectly if
// the linker decides to add a thunk to it for whatever reason (say, for
// example, if it is finally placed far from its call site and a BL is not
// long-range enough). PLT entries and tail-calls use BR, but when they are
// If the function is address-taken or externally-visible, it could be
// indirectly called. PLT entries and tail-calls use BR, but when they are
// are in guarded pages should all use x16 or x17 to hold the called
// address, so we don't need to set CouldJump here. BR instructions in
// non-guarded pages (which might be non-BTI-aware code) are allowed to
// branch to a "BTI c" using any register.
if (&MBB == &*MF.begin())
//
// For SysV targets, this is enough, because SYSVABI64 says that if the
// static linker later wants to use an indirect branch instruction in a
// long-branch thunk, it's also responsible for adding a 'landing pad' with
// a BTI, and pointing the indirect branch at that. However, at present
// this guarantee only holds for targets complying with SYSVABI64, so for
// other targets we must assume that `CouldCall` is _always_ true due to
// the risk of long-branch thunks at link time.
if (&MBB == &*MF.begin() &&
(!MF.getSubtarget<AArch64Subtarget>().isTargetLinux() ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use isOSBinFormatELF(). We could omit the landing pad for all ELF platforms.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about that because there's no ABI guarantee that it will work for all ELF platforms -- just because lld behaves in the right way, doesn't mean it's specified to work that way for all linkers.

So I've landed the patch as it is for the moment, which at least improves things in some situations. I'm happy to come back and make that generalisation if ARM-software/abi-aa#321 is fixed, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this applies to at least all ELF platforms minus bare-metal... For example, OpenBSD would like this as well. If Fuchsia deploys BTI, they could benefit from this as they only use very new lld.

But now that you have landed this, we could wait for bare-metal ABI to be changed before generalizing this...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as I mentioned in the commit message, I didn't find any existing LLVM query function that would return true for Linux, BSD, and other things expected to use SysV style shared libraries.

I could write one, and let other people extend it to include more OSes as necessary. But if we're going to require this in AAELF64, then it's probably a waste of effort to do anything that complicated -- simpler to wait for that ABI PR to land and then go with your original suggestion of just testing for ELF.

(F.hasAddressTaken() || !F.hasLocalLinkage())))
CouldCall = true;

// If the block itself is address-taken, it could be indirectly branched
Expand Down
16 changes: 11 additions & 5 deletions llvm/test/CodeGen/AArch64/patchable-function-entry-bti.ll
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
; RUN: llc -mtriple=aarch64 -aarch64-min-jump-table-entries=4 %s -o - | FileCheck %s
; RUN: llc -mtriple=aarch64-linux-gnu -aarch64-min-jump-table-entries=4 %s -o - | FileCheck %s --check-prefixes=CHECK,SYSV
; RUN: llc -mtriple=aarch64-none-elf -aarch64-min-jump-table-entries=4 %s -o - | FileCheck %s --check-prefixes=CHECK,NONSYSV

define void @f0() "patchable-function-entry"="0" "branch-target-enforcement" {
; CHECK-LABEL: f0:
Expand Down Expand Up @@ -48,20 +49,25 @@ define void @f2_1() "patchable-function-entry"="1" "patchable-function-prefix"="
}

;; -fpatchable-function-entry=1 -mbranch-protection=bti
;; We add BTI c even when the function has internal linkage
;; For SysV compliant targets, we don't add BTI (or create the .Lpatch0 symbol)
;; because the function has internal linkage and isn't address-taken. For
;; non-SysV targets, we do add the BTI, because outside SYSVABI64 there's no
;; spec preventing the static linker from using an indirect call instruction in
;; a long-branch thunk inserted at link time.
define internal void @f1i(i64 %v) "patchable-function-entry"="1" "branch-target-enforcement" {
; CHECK-LABEL: f1i:
; CHECK-NEXT: .Lfunc_begin3:
; CHECK: // %bb.0:
; CHECK-NEXT: hint #34
; CHECK-NEXT: .Lpatch1:
; NONSYSV-NEXT: hint #34
; NONSYSV-NEXT: .Lpatch1:
; CHECK-NEXT: nop
;; Other basic blocks have BTI, but they don't affect our decision to not create .Lpatch0
; CHECK: .LBB{{.+}} // %sw.bb1
; CHECK-NEXT: hint #36
; CHECK: .section __patchable_function_entries,"awo",@progbits,f1i{{$}}
; CHECK-NEXT: .p2align 3
; CHECK-NEXT: .xword .Lpatch1
; NONSYSV-NEXT: .xword .Lpatch1
; SYSV-NEXT: .xword .Lfunc_begin3
entry:
switch i64 %v, label %sw.bb0 [
i64 1, label %sw.bb1
Expand Down