[LLVM] LTO usage errors incorrectly reported to the user as crashes/bugs #140953

Open
bd1976bris opened this issue May 21, 2025 · 3 comments
Labels
LTO Link time optimization (regular/full LTO or ThinLTO)

Comments

@bd1976bris
Collaborator

bd1976bris commented May 21, 2025

Usage errors in LTOBackend.cpp are incorrectly reported to users as crashes/bugs.

For example, with the official LLVM 19 toolchain binaries, specifying an overly long cache directory name produces the following output on my Windows 11 machine:

> "lld-link.exe" msvc.bc /lldltocache:<too long dir name> /entry:main /out:a.out
LLVM ERROR: can't create cache directory <too long dir name>: invalid argument

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Exception Code: 0xC000001D
 #0 0x00007ff6b392cf36 xmlLinkGetData (c:\x\clang+llvm-19.1.0-x86_64-pc-windows-msvc\bin\lld-link.exe+0x180cf36)
 #1 0x00007ff6b38622d2 xmlLinkGetData (c:\x\clang+llvm-19.1.0-x86_64-pc-windows-msvc\bin\lld-link.exe+0x17422d2)
<...>

This kind of reporting is not appropriate for usage errors and recently confused one of Sony’s customers, who submitted a bug report (as instructed by the toolchain) after specifying an invalid LTO cache directory.

As another example, see the COFF LLD test lto-cache-errors.ll and note the use of not --crash in this test.

This issue is more problematic on Windows than on Linux. On Windows, even reportFatalUsageError, which might be assumed to fix this issue, produces a message asking users to report a bug.
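
To make the intended distinction concrete, here is a minimal, self-contained sketch of the reporting policy this issue asks for. It does not use LLVM's headers: `ErrorKind`, `FatalDiagnostic` and `makeFatalDiagnostic` are hypothetical names invented for illustration, and only the message strings are taken from the output quoted above.

```cpp
#include <cassert>
#include <string>

// Hypothetical names for illustration only; this is not LLVM's API.
enum class ErrorKind { Usage, Internal };

struct FatalDiagnostic {
  std::string Text;      // text that would be written to stderr
  bool AsksForBugReport; // true only for internal errors
};

// Build the message for a fatal error. A usage error (for example, an
// invalid LTO cache directory) gets only the plain error line; an internal
// error additionally gets the bug-report prompt.
inline FatalDiagnostic makeFatalDiagnostic(ErrorKind Kind,
                                           const std::string &Reason) {
  assert(!Reason.empty() && "a fatal error needs a reason");
  FatalDiagnostic D;
  D.Text = "LLVM ERROR: " + Reason + "\n";
  D.AsksForBugReport = (Kind == ErrorKind::Internal);
  if (D.AsksForBugReport)
    D.Text += "PLEASE submit a bug report to "
              "https://github.com/llvm/llvm-project/issues/ "
              "and include the crash backtrace.\n";
  return D;
}
```

Under this policy, a cache directory failure classified as `ErrorKind::Usage` would print only the `LLVM ERROR:` line, so the user is never told to file a bug for their own mistake.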

Internal tracker: TOOLCHAIN-17744

@bd1976bris bd1976bris self-assigned this May 21, 2025
bd1976bris added a commit to bd1976bris/llvm-project that referenced this issue May 21, 2025
LLVM’s `reportFatalUsageError`, introduced recently, may emit a
stack trace and bug report prompt due to the `PrettyStackTrace`
signal handler (initialized via `InitLLVM`). On Windows, this
occurs when `sys::RunInterruptHandlers()` is invoked from
`reportFatalUsageError`.

This behavior is misleading for usage errors. For example, one
of Sony’s customers reported a bug after specifying an invalid
LTO cache directory - a clear usage error - because the
toolchain output included a stack trace and instructions to file
a bug report.

This patch suppresses `PrettyStackTrace` output for usage errors
by passing a flag to the signal handlers, indicating when the
error should not trigger crash-style diagnostics.

To test this, LTO cache directory errors have been updated to
use `reportFatalUsageError` and the existing Linux-specific test
has been replaced with a Windows-only test case that
additionally verifies no crash-style diagnostics are emitted.

LLVM Issue: llvm#140953
Internal Tracker: TOOLCHAIN-17744
bd1976bris added a commit to bd1976bris/llvm-project that referenced this issue May 21, 2025
Usage errors in `LTOBackend.cpp` were previously, misleadingly,
reported as internal crashes.

This patch updates `LTOBackend.cpp` to use
`reportFatalUsageError` for reporting usage-related issues.

LLVM Issue: llvm#140953
Internal Tracker: TOOLCHAIN-17744
@bd1976bris
Collaborator Author

I have opened the following two PRs to address this issue:

  1. Report usage errors in LTOBackend.cpp using reportFatalUsageError: [LLVM] Use reportFatalUsageError for LTO usage errors #140955
  2. Don't emit crash-style error messages for usage errors on Windows: [LLVM][Windows] Elide PrettyStackTrace output for usage errors #140956

@bd1976bris
Collaborator Author

Just noting that on Linux, LLVM once called the signal handlers for report_fatal_error, but this changed in 288999b.

bd1976bris added a commit that referenced this issue May 21, 2025
Usage errors in `LTOBackend.cpp` were previously, misleadingly, reported
as internal crashes.

This PR updates `LTOBackend.cpp` to use `reportFatalUsageError` for
reporting usage-related issues.

LLVM Issue: #140953
Internal Tracker: TOOLCHAIN-17744
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this issue May 21, 2025
…140955)

Usage errors in `LTOBackend.cpp` were previously, misleadingly, reported
as internal crashes.

This PR updates `LTOBackend.cpp` to use `reportFatalUsageError` for
reporting usage-related issues.

LLVM Issue: llvm/llvm-project#140953
Internal Tracker: TOOLCHAIN-17744
bd1976bris added a commit to bd1976bris/llvm-project that referenced this issue May 21, 2025
On Windows, LLVM’s `reportFatalUsageError` may emit a stack trace and
bug report prompt due to the `PrettyStackTrace` signal handler,
initialized via `InitLLVM`. This occurs when
`sys::RunInterruptHandlers()` is called from `reportFatalUsageError`.

This behavior is misleading for usage errors. For example, one
of Sony’s customers filed a bug after specifying an invalid LTO
cache directory - a clear usage error - because the toolchain
output included a stack trace and instructions to report a bug.

This patch suppresses `PrettyStackTrace` output for usage errors by
adding a flag to `sys::RunInterruptHandlers()` to indicate whether
signal handlers should be executed.

To test this, the existing Linux-specific LTO cache directory test has
been replaced with a Windows-only test case that also verifies no
crash-style diagnostics are emitted.

LLVM Issue: llvm#140953
Internal Tracker: TOOLCHAIN-17744
@dtcxzyw dtcxzyw added LTO Link time optimization (regular/full LTO or ThinLTO) and removed new issue labels May 22, 2025
bd1976bris added a commit to bd1976bris/llvm-project that referenced this issue May 22, 2025
On Windows, LLVM’s `reportFatalUsageError` (llvm#138251) may emit a
bug report prompt due to the `PrettyStackTrace` signal handler,
initialized via `InitLLVM`. This occurs when `RunInterruptHandlers()`
is called from `reportFatalUsageError`.

This behavior is misleading for usage errors. For example, one
of Sony’s customers filed a bug after specifying an invalid LTO
cache directory - a clear usage error - because the toolchain
output included instructions to report a bug.

This patch suppresses `PrettyStackTrace` output for usage errors by
adding a flag to `sys::RunInterruptHandlers()` to indicate whether
signal handlers should be executed.

To test this, I have modified the invalid LTO pipeline errors to call
`reportFatalUsageError`, and I have updated the existing LLD test to
additionally verify that no bug report message has been emitted.

LLVM Issue: llvm#140953
Internal Tracker: TOOLCHAIN-17744
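
The flag-based approach described in the commit message above can be sketched in isolation as follows. This is a standalone illustration, not the actual patch: `HandlerRegistry`, `addCleanup`, `addCrashReporter` and the `ExecuteSignalHandlers` parameter are hypothetical names, and the assumption that cleanup callbacks still run while crash reporters are skipped is mine rather than something the thread states.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a registry of interrupt-time callbacks.
// It is not LLVM's sys::RunInterruptHandlers implementation.
class HandlerRegistry {
public:
  // Callbacks that must always run, e.g. removing temporary files.
  void addCleanup(std::function<void()> F) {
    assert(F && "cleanup callback must be callable");
    Cleanups.push_back(std::move(F));
  }

  // Callbacks that print crash-style output (backtrace, bug-report prompt).
  void addCrashReporter(std::function<void()> F) {
    assert(F && "crash reporter must be callable");
    CrashReporters.push_back(std::move(F));
  }

  // Cleanups always run (an assumption of this sketch); crash reporters run
  // only when the caller asks for them. A usage-error path would pass false.
  void runInterruptHandlers(bool ExecuteSignalHandlers) {
    for (auto &F : Cleanups)
      F();
    if (!ExecuteSignalHandlers)
      return;
    for (auto &F : CrashReporters)
      F();
  }

private:
  std::vector<std::function<void()>> Cleanups;
  std::vector<std::function<void()>> CrashReporters;
};
```

In this sketch, a usage-error reporter would call `runInterruptHandlers(false)` before exiting, while a genuine crash path would pass `true` and keep the backtrace and bug-report prompt.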
@bd1976bris
Collaborator Author

#140955 was reverted. Digging into why the lto-cache-errors.ll test fails non-deterministically, I suspect that reportFatalUsageError can crash if multiple threads call it simultaneously (the ThinLTO cache dir is used from a multi-threaded context). I have observed this on Linux. It may also apply on Windows.
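
As an illustration of the suspected hazard, the sketch below shows one generic way to ensure that only the first of several threads proceeds to report a fatal error. It is not the fix adopted in LLVM, and the thread does not say what the root cause is: `FatalErrorGate`, `tryReport` and `countReporters` are hypothetical names, and the worker threads merely simulate ThinLTO backends that all hit the same bad cache directory.

```cpp
#include <atomic>
#include <string>
#include <thread>
#include <vector>

// Hypothetical gate that lets exactly one thread report a fatal error.
class FatalErrorGate {
public:
  // Returns true for the first caller only. Later callers get false and
  // should not print anything or start tearing the process down.
  bool tryReport(const std::string &Reason) {
    if (Claimed.exchange(true))
      return false; // another thread is already reporting
    Message = Reason;
    return true;
  }

  // Safe to read once all reporting threads have been joined.
  const std::string &message() const { return Message; }

private:
  std::atomic<bool> Claimed{false};
  std::string Message;
};

// Simulate NumThreads workers failing at the same time and count how many
// of them were allowed to report.
inline int countReporters(FatalErrorGate &Gate, unsigned NumThreads) {
  std::atomic<int> Winners{0};
  std::vector<std::thread> Workers;
  for (unsigned I = 0; I < NumThreads; ++I)
    Workers.emplace_back([&Gate, &Winners] {
      if (Gate.tryReport("can't create cache directory"))
        Winners.fetch_add(1);
    });
  for (auto &W : Workers)
    W.join();
  return Winners.load();
}
```

However many workers fail at once, only one of them wins the gate, so the error message and any exit handling happen once instead of racing.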

@bd1976bris bd1976bris removed their assignment May 22, 2025