-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[AArch64,ELF] Allow implicit $d/$x at section beginning #99718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64,ELF] Allow implicit $d/$x at section beginning #99718
Conversation
Created using spr 1.3.5-bogner
@llvm/pr-subscribers-lld @llvm/pr-subscribers-backend-aarch64 Author: Fangrui Song (MaskRay) ChangesThe start state of a new section is
This new behavior decreases the .symtab size significantly:
This scheme works as long as the user can rule out some error scenarios:
The above mix-and-match scenarios aren't an issue at all for a Full diff: https://github.com/llvm/llvm-project/pull/99718.diff 5 Files Affected:
diff --git a/llvm/include/llvm/MC/MCTargetOptions.h b/llvm/include/llvm/MC/MCTargetOptions.h
index 899299fd15246..d16cb38fa0ae2 100644
--- a/llvm/include/llvm/MC/MCTargetOptions.h
+++ b/llvm/include/llvm/MC/MCTargetOptions.h
@@ -64,6 +64,8 @@ class MCTargetOptions {
// Use CREL relocation format for ELF.
bool Crel = false;
+ bool OptimizeMappingSymbols = false;
+
// If true, prefer R_X86_64_[REX_]GOTPCRELX to R_X86_64_GOTPCREL on x86-64
// ELF.
bool X86RelaxRelocations = true;
diff --git a/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h b/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
index 9d592446f3ba7..f4b4e2cbcb450 100644
--- a/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
+++ b/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
@@ -53,6 +53,8 @@ bool getSaveTempLabels();
bool getCrel();
+bool getOptimizeMappingSymbols();
+
bool getX86RelaxRelocations();
bool getX86Sse2Avx();
diff --git a/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp b/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
index 813b1194b47cb..f3dcf499eb9f9 100644
--- a/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
+++ b/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
@@ -48,6 +48,7 @@ MCOPT(bool, NoDeprecatedWarn)
MCOPT(bool, NoTypeCheck)
MCOPT(bool, SaveTempLabels)
MCOPT(bool, Crel)
+MCOPT(bool, OptimizeMappingSymbols)
MCOPT(bool, X86RelaxRelocations)
MCOPT(bool, X86Sse2Avx)
MCOPT(std::string, ABIName)
@@ -134,6 +135,11 @@ llvm::mc::RegisterMCTargetOptionsFlags::RegisterMCTargetOptionsFlags() {
cl::desc("Use CREL relocation format for ELF"));
MCBINDOPT(Crel);
+ static cl::opt<bool> OptimizeMappingSymbols(
+ "optimize-mapping-symbols",
+ cl::desc("Allow mapping symbol at section beginning to be implicit"));
+ MCBINDOPT(OptimizeMappingSymbols);
+
static cl::opt<bool> X86RelaxRelocations(
"x86-relax-relocations",
cl::desc(
@@ -174,6 +180,7 @@ MCTargetOptions llvm::mc::InitMCTargetOptionsFromFlags() {
Options.MCNoTypeCheck = getNoTypeCheck();
Options.MCSaveTempLabels = getSaveTempLabels();
Options.Crel = getCrel();
+ Options.OptimizeMappingSymbols = getOptimizeMappingSymbols();
Options.X86RelaxRelocations = getX86RelaxRelocations();
Options.X86Sse2Avx = getX86Sse2Avx();
Options.EmitDwarfUnwind = getEmitDwarfUnwind();
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64ELFStreamer.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64ELFStreamer.cpp
index bfeca4bd5a92d..e79a17395a2bc 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64ELFStreamer.cpp
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64ELFStreamer.cpp
@@ -32,8 +32,10 @@
#include "llvm/MC/MCStreamer.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/MC/MCSymbolELF.h"
+#include "llvm/MC/MCTargetOptions.h"
#include "llvm/MC/MCWinCOFFStreamer.h"
#include "llvm/Support/Casting.h"
+#include "llvm/Support/CommandLine.h"
#include "llvm/Support/FormattedStream.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/TargetParser/Triple.h"
@@ -176,19 +178,29 @@ void AArch64TargetAsmStreamer::emitInst(uint32_t Inst) {
/// by MachO. Beware!
class AArch64ELFStreamer : public MCELFStreamer {
public:
+ friend AArch64TargetELFStreamer;
AArch64ELFStreamer(MCContext &Context, std::unique_ptr<MCAsmBackend> TAB,
std::unique_ptr<MCObjectWriter> OW,
std::unique_ptr<MCCodeEmitter> Emitter)
: MCELFStreamer(Context, std::move(TAB), std::move(OW),
std::move(Emitter)),
- MappingSymbolCounter(0), LastEMS(EMS_None) {}
+ MappingSymbolCounter(0), LastEMS(EMS_None) {
+ auto *TO = getContext().getTargetOptions();
+ OptimizeMappingSymbols = TO && TO->OptimizeMappingSymbols;
+ }
void changeSection(MCSection *Section, uint32_t Subsection = 0) override {
- // We have to keep track of the mapping symbol state of any sections we
- // use. Each one should start off as EMS_None, which is provided as the
- // default constructor by DenseMap::lookup.
+ // We have to keep track of the mapping symbol state of any sections we use.
+ // The initial state is EMS_A64 for text sections and EMS_Data for the
+ // others.
LastMappingSymbols[getCurrentSection().first] = LastEMS;
- LastEMS = LastMappingSymbols.lookup(Section);
+ auto It = LastMappingSymbols.find(Section);
+ if (It != LastMappingSymbols.end())
+ LastEMS = It->second;
+ else if (OptimizeMappingSymbols)
+ LastEMS = Section->isText() ? EMS_A64 : EMS_Data;
+ else
+ LastEMS = EMS_None;
MCELFStreamer::changeSection(Section, Subsection);
}
@@ -280,6 +292,7 @@ class AArch64ELFStreamer : public MCELFStreamer {
}
int64_t MappingSymbolCounter;
+ bool OptimizeMappingSymbols;
DenseMap<const MCSection *, ElfMappingSymbol> LastMappingSymbols;
ElfMappingSymbol LastEMS;
@@ -304,6 +317,21 @@ void AArch64TargetELFStreamer::finish() {
AArch64ELFStreamer &S = getStreamer();
MCContext &Ctx = S.getContext();
auto &Asm = S.getAssembler();
+
+ // If OptimizeMappingSymbols is specified, ensure that text sections end with
+ // the A64 state while non-text sections end with the data state. When
+ // sections are combined by the linker, the subsequent section will start with
+ // the right state.
+ if (S.OptimizeMappingSymbols) {
+ for (MCSection &Sec : Asm) {
+ S.switchSection(&Sec);
+ if (Sec.isText())
+ S.emitA64MappingSymbol();
+ else
+ S.emitDataMappingSymbol();
+ }
+ }
+
MCSectionELF *MemtagSec = nullptr;
for (const MCSymbol &Symbol : Asm.symbols()) {
const auto &Sym = cast<MCSymbolELF>(Symbol);
diff --git a/llvm/test/MC/AArch64/mapping-across-sections.s b/llvm/test/MC/AArch64/mapping-across-sections.s
index 6bb5a8811b57d..49151326caf6c 100644
--- a/llvm/test/MC/AArch64/mapping-across-sections.s
+++ b/llvm/test/MC/AArch64/mapping-across-sections.s
@@ -1,4 +1,5 @@
// RUN: llvm-mc -triple=aarch64 -filetype=obj %s | llvm-objdump -t - | FileCheck %s
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -optimize-mapping-symbols %s | llvm-objdump -t - | FileCheck %s --check-prefix=CHECK1
.section .text1,"ax"
add w0, w0, w0
@@ -38,3 +39,12 @@ add w0, w0, w0
// CHECK-NEXT: 0000000000000004 l .rodata 0000000000000000 $x.7
// CHECK-NEXT: 0000000000000000 l .comment 0000000000000000 $d.8
// CHECK-NOT: {{.}}
+
+// CHECK1: SYMBOL TABLE:
+// CHECK1-NEXT: 0000000000000004 l .text 0000000000000000 $d.0
+// CHECK1-NEXT: 0000000000000008 l .text 0000000000000000 $x.1
+// CHECK1-NEXT: 000000000000000c l .text 0000000000000000 $d.2
+// CHECK1-NEXT: 0000000000000004 l .rodata 0000000000000000 $x.3
+// CHECK1-NEXT: 0000000000000010 l .text 0000000000000000 $x.4
+// CHECK1-NEXT: 0000000000000008 l .rodata 0000000000000000 $d.5
+// CHECK-NOT: {{.}}
|
The new option "optimize-mapping-symbols" is for llc/llvm-mc/etc. It's desirable to coordinate with the GNU binutils project on the naming of this option. It's worth noting that the |
Created using spr 1.3.5-bogner
Created using spr 1.3.5-bogner
Created using spr 1.3.5-bogner
As suggested by a binutils global maintainer, Jan Beulich, While binutils aarch64 maintainer Richard hasn't yet endorsed the solution, I believe there's significant value in having this option This feature is for LLVM 20, not for 19.x release, so that I will have more time to convince Richard. |
🥳🥳 |
Ping:) |
Apologies for the delay in responding. I'm uncomfortable with symbols defined at the end of the section as these fall through into the following section. I have seen that cause trouble before with 0 sized sections as contradictory mapping symbols at the same address are not well defined by the ABI. While it is true that the most likely implementation of a linker would order mapping symbols such that a trailing mapping symbol would get sorted before a mapping symbol at offset 0 in the next section; I'm not sure we can guarantee it, for example a non-stable sort of the symbols by address could break that. I also know of at least one linker that tries to optimize mapping symbols by removing redundant mapping symbols that will give an error message if it spots two contradictory mapping symbols at the same address. I think it would be better to just omit the trailing $x mapping symbol as I expect it would cause more problems than it will solve. As Going back to the ABI perhaps it would be worth looking at symbols rather than sections. Essentially can we make a symbol of type STT_FUNC with bit 0 clear (thinking of Morello, which uses 1 for C64 instructions) an implicit $x and a symbol of type STT_OBJECT an implicit $d, with all other symbols being neutral? This would solve the trailing $d problem for compiled code and well written assembly code (people are lazy about setting the type of symbols). |
Thank you for your feedback. You suggest a scheme where the trailing $x mapping symbol is omitted to prevent potential contradictory mapping symbols at the same address.
This requires a smart linker that inserts extra mapping symbols. I believe it's very difficult for Clang to emit trailing data in a text section. While I'm unfamiliar with Morello, omitting a starting label (often the function name) is possible in hand-written assembly. This disqualifies approaches that rely on start label Is it feasible to have a cl::opt (not -Wa,--foo) to have the ending mapping symbol behavior? I understand the preference for a limited number of schemes, but I anticipate that we may end up with three distinct options as users have different needs: |
Morello definitely couldn't use this option https://github.com/ARM-software/abi-aa/blob/main/aaelf64-morello/aaelf64-morello.rst#mapping-symbols . Having said that Morello is not part of upstream LLVM and is an architecture experiment. Yes I don't think there will be a single option that is going to work for everyone.
On reflection I think putting the trailing $x in is probably the least worst optimised solution as I expect it is the least likely to cause problems. Is there a strong objection to optimising this at link time? There would be some run-time overhead to scan for redundant $x and removing them, but this would always give the same answer as the current default. |
Thanks for the table:)
True. (As aforementioned, GNU assembler's AArch32/AArch64 implementations omit mapping symbols in some cases, therefore not covering all cases
Thank you for acknowledging the benefits of the ending mapping symbol:)
Yes, to maitain relocatable file size and preserve the current linker performance. My preliminary analysis revealed that making lld smarter will cause significant performance overhead
Tangentially, Chrome is now exploring CREL. Perhaps Android will follow up as well. Reduced relocatable file size is particularly important for large projects.
|
Some thoughts after thinking about this on the weekend and talking to some colleagues. Apologies for the length of the response, a brief summary:
In more detail. The pragmatist in me says that this is a useful (possibly llvm only [1]) option for people with large builds to enable an alternative mapping symbol definition than the ABI and accept the consequences of that as well as the benefits. The idealist in me says that if this propagates out into binary distributions across multiple projects then we may have incompatibilities with GNU and while there are no plans to add one, it may rule out something like a second instruction set (like Morello or Thumb [2]). A colleague working on Bolt mentioned that there could be a case for more mapping symbols to help binary analysis, this implies that there could be more mapping symbol density options than just optimize and default. He also mentioned that recording the mapping symbol level choice in the ELF file (mechanism TBD) might be a way around the future compatibility problem, and could extend to the additional mapping symbol cases. Does recording the use of the alternative mapping symbol mechanism appeal at all? I'm still nervous about adding trailing I also don't think we can rely on the dumb linker collation order putting the trailing
If As an aside, I'm surprised the linker removal of redundant mapping symbols took so long. It looks like we already stable_partition the static symbol table in Another possibility that I think would work in the majority of AArch64 cases without costing too much extra link time, which I think could be done before the stable_partition (to speed that up) is to see if each OutputSection always has the same mapping symbol type. For example if the |
That was my first thought, too. As far as I can see, this option is essentially inventing a nonstandard variation on the Arm ELF spec, which is not fully compatible with the standard one, as the PR description mentions:
Currently, as far as I can see, the only way that a user can rule out this error scenario is to use their own memory and self-discipline to keep track of which objects (or which libraries, or which whole projects) they're building using this nonstandard flag. Surely it would be better to mark object files to indicate explicitly that they follow this nonstandard ABI. Then at the very least a linker can detect a mixture of the two and complain, even if it can't automatically compensate. That way the user doesn't have to do it all by self-discipline. |
Thanks for the comprehensive summary. I concur with your analysis, including the subsequent points. I believe some users, like mine, are willing to accept the consequences.
Yes. -fsanitize=function and
Agreed, this is a potential downside similar to --symbol-ordering-file and --call-graph-profile-sort. Fortunately, this risk seems manageable for some users. First, text sections with trailing data are extremely rare.
I have developed this hack https://github.com/MaskRay/llvm-project/tree/lld-mapsym , 5% slowdown. diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index c27ab2b67dc2..d8e4bc976016 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -2172,4 +2172,9 @@ void SymbolTableBaseSection::sortSymTabSymbols() {
getParent()->info = numLocals + 1;
+ for (auto &s : symbols)
+ s.va = s.sym->getVA();
+ std::stable_sort(symbols.begin(), e,
+ [](auto &l, auto &r) { return l.va < r.va; });
+
// We want to group the local symbols by file. For that we rebuild the local
// part of the symbols vector. We do not need to care about the STT_FILE
@@ -2190,5 +2195,5 @@ void SymbolTableBaseSection::addSymbol(Symbol *b) {
// Adding a local symbol to a .dynsym is a bug.
assert(this->type != SHT_DYNSYM || !b->isLocal());
- symbols.push_back({b, strTabSec.addString(b->getName(), false)});
+ symbols.push_back({b, strTabSec.addString(b->getName(), false), 0});
}
@@ -2266,5 +2271,5 @@ template <class ELFT> void SymbolTableSection<ELFT>::writeTo(uint8_t *buf) {
if (isDefinedHere) {
eSym->st_shndx = shndx;
- eSym->st_value = sym->getVA();
+ eSym->st_value = ent.va ? ent.va : sym->getVA();
// Copy symbol size if it is a defined symbol. st_size is not
// significant for undefined symbols, so whether copying it or not is up
@@ -2492,5 +2497,5 @@ void GnuHashTableSection::addSymbols(SmallVectorImpl<SymbolTableEntry> &v) {
v.erase(mid, v.end());
for (const Entry &ent : symbols)
- v.push_back({ent.sym, ent.strTabOffset});
+ v.push_back({ent.sym, ent.strTabOffset, 0});
}
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index d4169e1e1aca..c7e385119c03 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -642,4 +642,5 @@ struct SymbolTableEntry {
Symbol *sym;
size_t strTabOffset;
+ uint64_t va;
};
This approach (deleting extra mapping symbols) is for the default case. Bailing out could hinder a significant size optimization for executables with minor data-in-code elements.
Alternatively, removing leading
The first choice has been discussed above. Technically a linker could delete The second choice "omit leading $x, no trailing $x" requires a smart linker that builds a output-section-to-symbol map ( |
You're absolutely right. The nonstandard option is invaluable for those with full control over their toolchain, no reliance on weird relocatable files, and a strong focus on minimizing both relocatable and executable sizes.
While marking object files could be beneficial, it's essential to avoid overly restrictive compatibility checks. False positives could hinder development. For example, the following code assembled with a traditional assembler could get marked to be incompatible with
(This pattern is extremely rare, and can be ruled out with certain control over the toolchain, rendering a marker unnecessary.) But pure code should not trigger incompatiblity warning.
|
I don't think we need to use an object marking to do a compatibility check, while that is possible I don't think it is necessary in the majority of the cases. What it does do is give us the chance to identify objects with the alternative marking scheme and unambiguously know that an executable section with no mapping symbols is either data or AArch64 code. This could help tools if the difference became important. While I don't want to speculate too much over how to mark an object, I'm wondering if it is something you're willing to entertain? As an aside; Arm's original ELF spec for its proprietary toolchain prior to the ABI in 2004 used to define a couple of ELF header flags:
These were rejected when the AArch32 ABI was published as specific to Arm's proprietary toolchain. |
To give you a summary of my thoughts so far:
|
Created using spr 1.3.5-bogner
Renamed to "implicit-mapsyms"
Ack.
Looks good.
I agree with your point about discouraging its use where portability is crucial. Does the last sentence read as: the AArch64 ABI might make changes to break backwards compatibility. But with what? As is, the proposed I agree that users of the opt-in option should well understand the risks.
Looks good.
Thanks for the suggestion. Improved the message at
Looks good. Recording the mapping symbol scheme might depend on
Yes, I'm open to exploring that once the base mechanism is available.
Thanks for the context. |
Just a quick comment to say thanks for the updates. I ran out of time to look today, will try and take a look tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are comments on the implementation/test only.
DenseMap<MCSection *, MCSymbol *> EndMappingSym; | ||
for (MCSection &Sec : Asm) { | ||
S.switchSection(&Sec); | ||
if (S.LastEMS == (Sec.isText() ? AArch64ELFStreamer::EMS_Data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can only happen if someone creates a named section and forgets to put "x" in the flags. It is a bit disappointing that addition of an instruction doesn't implicitly set the "x" flag on a section. It looks like both GNU as and LLVM don't do this though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GNU assembler doesn't warn about instructions in data sections and LLVM follow suits: .section foo,"a"; nop
. The code is written to respect the section flags (instead of potential user intention) and add ending $d
if needed.
I'd hope that instructions in data sections lead to a warning.
DenseMap<MCSection *, size_t> Cnt; | ||
Syms.truncate(NumSyms); | ||
for (const MCSymbol *Sym : Syms) | ||
if (Sym->isInSection()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we have a count of symbols for every section. For a large object with lots of sections (coming from say LTO with -ffunction-sections) then if we expect trailing symbols to be rare, then it could be better to test if the section has a trailing mapping symbol early, and only retain counts for the sections that we are adding to.
May not be worth it for the extra complexity though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having 2 dense maps DenseMap<MCSection *, MCSymbol *> EndMappingSym; DenseMap<MCSection *, size_t> Cnt;
was wasteful. I have changed the code to use one single map.
The performance benefit should be negligible as we cannot avoid the Syms
iteration.
@@ -1,5 +1,10 @@ | |||
// RUN: llvm-mc -triple=aarch64 -filetype=obj %s | llvm-objdump -t - | FileCheck %s --match-full-lines | |||
// RUN: llvm-mc -triple=aarch64 -filetype=obj -implicit-mapsyms %s | llvm-objdump -t - | FileCheck %s --check-prefix=CHECK1 --match-full-lines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a comment on this test. I think it would be worth a test, probably in lld, or possibly via yaml2obj to test what the disassembler does when it gets two mapping symbols at the same address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added lld/test/ELF/aarch64-mapsyms-implicit.s
If another ISA like Thumb is to Arm is added to AArch64 [*] then we wouldn't be able to tell which instruction set to start with for an executable section. There are likely some implicit assumptions we can make if we're sensible, such as defaulting to AArch64 if there's no mapping symbol, and if the other ISA is used it needs a mapping symbol. I suspect that llvm tools could be made to work, however it could get messy with GNU and other tools. [*] Morello's hybrid mode does this. A subset of the CHERI instructions can be used in EM_AARCH64 after some transitioning like Arm to Thumb. The majority of experimentation with Morello has been in pure capability mode which is a separate machine from AARCH64.
I'm expecting a build attributes implementation in GNU fairly shortly for GCS assuming it can hit early enough for the GCC release cycle. It may have to slip to the next GCC release if it doesn't make it in time though. I'm hoping that we can get a LLVM implementation started for LLVM 20. |
Created using spr 1.3.5-bogner
Created clangDriver patch: #104542 (I think this patch is very meaningful and we should not make it blocked by the build attribute addition.) |
Apologies for the delay in responding, I'm aiming to take a look tomorrow, as well as the other ones in the series. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates.
Although possibly not for this patch. It looks llvm-objdump (and gnu objdump) look to reorder $x after $d when the symbols are at the same address. This looks like worse case behaviour.
I think we could address this by looking at llvm-objdump, or we could have lld filter out the clashes.
The other tool that I thought could be affected by this is Bolt. It uses mapping symbols to determine the location of constant islands. It looks to do this with CodeOffsets and DataOffsets (some references below). Are your large programs likely to be post-processed by Bolt?
Some references:
https://github.com/llvm/llvm-project/blob/main/bolt/include/bolt/Core/BinaryFunction.h#L1972
https://github.com/llvm/llvm-project/blob/main/bolt/lib/Core/BinaryEmitter.cpp#L539
https://github.com/llvm/llvm-project/blob/main/bolt/lib/Core/BinaryFunction.cpp#L1049
Apologies to keep bringing these up. What I want to avoid is people finding other llvm tools subtly broken. I don't think that these are a blocker for implicit mapsyms, but it should influence the level of warning we give for the option. We may want to call out Bolt specifically as a tool that could be affected for example.
I think the code here is good for llvm-mc. I don't have any more comments on this patch.
I've got a last couple of questions:
- Do you want to update llvm-objdump as whatever collation order the linker uses, it is going to order
$d
before$x
? I think that could be done in different patch. - If Bolt could get confused by two symbols at the same address I think we'll need to call that out in the clang driver help text. Do you think it would be worth an lld patch to remove (or make both symbols the same) when there is a clash? That would make Bolt work for a LLVM based toolchain.
# CHECK-EMPTY: | ||
# CHECK-NEXT: <$d>: | ||
# CHECK-NEXT: <$x>: | ||
# CHECK-NEXT: udf #42 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This points at an issue in llvm-objdump. It could be sorting symbols at the same address by name.
Looking at llvm-readelf --symbols
for 0x201134 I can see $x before $d as we would expect, however llvm-objdump has put the $d before $x.
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000210124 0 NOTYPE LOCAL DEFAULT 1 $d
2: 0000000000210128 0 NOTYPE LOCAL DEFAULT 1 $x
3: 0000000000210130 0 NOTYPE LOCAL DEFAULT 2 $d
4: 0000000000210134 0 NOTYPE LOCAL DEFAULT 2 $x
5: 0000000000210134 0 NOTYPE LOCAL DEFAULT 2 $d
6: 0000000000210138 0 NOTYPE LOCAL DEFAULT 2 $x
7: 0000000000210120 0 NOTYPE GLOBAL DEFAULT 1 _start
This is unfortunate as it means every trailing $x
is going to result in an incorrect disassembly when there's a leading $d
.
Just a quick note that I'll be out of office from Friday 23rd. Will be back on Tuesday 27th so I may be a bit slow to respond. |
Yes, the -fsanitize=function and -fsanitize=kcfi cases that many people don't use.
Yes. The test intends to demonstrate the weakness.
No, not post-processed by BOLT.
Yes, I will take a stab.
I want to change Currently, we could only rely on the Enjoy your vacation! I'll take a flight on Friday and will be slow to respond for about 3 weeks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the feedback. I don't have any more feedback on this specific patch.
Hopefully we'll be able to improve llvm-objdump and possibly lld to avoid the corner cases.
Thanks for letting us know you'll be slow to respond over the next few weeks.
-Wa,-mmapsyms=implicit enables the alternative mapping symbol scheme discussed at #99718. While not conforming to the current aaelf64 ABI, the option is invaluable for those with full control over their toolchain, no reliance on weird relocatable files, and a strong focus on minimizing both relocatable and executable sizes. The option is discouraged when portability of the relocatable objects is a concern. https://maskray.me/blog/2024-07-21-mapping-symbols-rethinking-for-efficiency elaborates the risk. Pull Request: #104542
The start state of a new section is `EMS_None`, often leading to a $d/$x at offset 0. Introduce a MCTargetOption/cl::opt "implicit-mapsyms" to allow an alternative behavior (ARM-software/abi-aa#274): * Set the start state to `EMS_Data` or `EMS_A64`. * For text sections, add an ending $x only if the final data is not instructions. * For non-text sections, add an ending $d only if the final data is not data commands. ``` .section .text.1,"ax" nop // emit $d .long 42 // emit $x .section .text.2,"ax" nop ``` This new behavior decreases the .symtab size significantly: ``` % bloaty a64-2/bin/clang -- a64-0/bin/clang FILE SIZE VM SIZE -------------- -------------- -5.4% -1.13Mi [ = ] 0 .strtab -50.9% -4.09Mi [ = ] 0 .symtab -4.0% -5.22Mi [ = ] 0 TOTAL ``` --- This scheme works as long as the user can rule out some error scenarios: * .text.1 assembled using the traditional behavior is combined with .text.2 using the new behavior * A linker script combining non-text sections and text sections. The lack of mapping symbols in the non-text sections could make them treated as code, unless the linker inserts extra mapping symbols. The above mix-and-match scenarios aren't an issue at all for a significant portion of users. A text section may start with data commands in rare cases (e.g. -fsanitize=function) that many users don't care about. When combing `(.text.0; .word 0)` and `(.text.1; .word 0)`, the ending $x of .text.0 and the initial $d of .text.1 may have the same address. If both sections reside in the same file, ensure the ending symbol comes before the initial $d of .text.1, so that a dumb linker respecting the symbol order will place the ending $x before the initial $d. Disassemblers using stable sort will see both symbols at the same address, and the second will win. When section ordering mechanisms (e.g. --symbol-ordering-file, --call-graph-profile-sort, `.text : { second.o(.text) first.o(.text) }`) are involved, the initial data in a text section following a text section with trailing data could be misidentified as code, but the issue is local and the risk could be acceptable. Pull Request: llvm#99718
-Wa,-mmapsyms=implicit enables the alternative mapping symbol scheme discussed at llvm#99718. While not conforming to the current aaelf64 ABI, the option is invaluable for those with full control over their toolchain, no reliance on weird relocatable files, and a strong focus on minimizing both relocatable and executable sizes. The option is discouraged when portability of the relocatable objects is a concern. https://maskray.me/blog/2024-07-21-mapping-symbols-rethinking-for-efficiency elaborates the risk. Pull Request: llvm#104542
The start state of a new section is
EMS_None
, often leading to a$d/$x at offset 0. Introduce a MCTargetOption/cl::opt
"implicit-mapsyms" to allow an alternative behavior
(ARM-software/abi-aa#274):
EMS_Data
orEMS_A64
.This new behavior decreases the .symtab size significantly:
This scheme works as long as the user can rule out some error scenarios:
lack of mapping symbols in the non-text sections could make them
treated as code, unless the linker inserts extra mapping symbols.
The above mix-and-match scenarios aren't an issue at all for a
significant portion of users.
A text section may start with data commands in rare cases (e.g.
-fsanitize=function) that many users don't care about. When combing
(.text.0; .word 0)
and(.text.1; .word 0)
, the ending $x of .text.0and the initial $d of .text.1 may have the same address. If both
sections reside in the same file, ensure the ending symbol comes before
the initial $d of .text.1, so that a dumb linker respecting the symbol
order will place the ending $x before the initial $d.
Disassemblers using stable sort will see both symbols at the same
address, and the second will win.
When section ordering mechanisms (e.g. --symbol-ordering-file,
--call-graph-profile-sort,
.text : { second.o(.text) first.o(.text) }
)are involved, the initial data in a text section following a text
section with trailing data could be misidentified as code, but the issue
is local and the risk could be acceptable.