[CMake] Add a cache file for building a highly-optimized LLVM toolchain #117802
The goal of these cache files is to provide a toolchain that:
1. Relies on only LLVM components (as much as possible).
2. Is highly optimized.
These cache files will produce a full toolchain (clang/compiler-rt/libcxx/lld) where clang is built with LTO, PGO, and BOLT optimizations, and is statically linked with an LTO optimized libc++ and compiler-rt.
I would eventually like to use these as the basis for the release builds.
@llvm/pr-subscribers-clang
Author: Tom Stellard (tstellar)
Full diff: https://github.com/llvm/llvm-project/pull/117802.diff
5 Files Affected:
diff --git a/clang/cmake/caches/llvm-toolchain/stage1.cmake b/clang/cmake/caches/llvm-toolchain/stage1.cmake
new file mode 100644
index 00000000000000..13a0a7871dcd26
--- /dev/null
+++ b/clang/cmake/caches/llvm-toolchain/stage1.cmake
@@ -0,0 +1,13 @@
+# Stage 1:
+# * Build the prerequisites for stage 2.
+# * We will be building an LTO optimized libcxx in stage 2, so we need to
+# build clang and lld.
+
+
+set(CMAKE_BUILD_TYPE Release CACHE STRING "")
+set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
+set(LLVM_ENABLE_PROJECTS "clang;lld" CACHE STRING "")
+
+set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")
+set(CLANG_BOOTSTRAP_CMAKE_ARGS -C ${CMAKE_CURRENT_LIST_DIR}/stage2.cmake CACHE BOOL "")
+set(CLANG_BOOTSTRAP_TARGETS stage3-check-all stage3-distribution stage3-install-distribution stage3-clang stage3-clang-bolt CACHE BOOL "")
diff --git a/clang/cmake/caches/llvm-toolchain/stage2.cmake b/clang/cmake/caches/llvm-toolchain/stage2.cmake
new file mode 100644
index 00000000000000..87ac7899b205ef
--- /dev/null
+++ b/clang/cmake/caches/llvm-toolchain/stage2.cmake
@@ -0,0 +1,20 @@
+# Stage 2:
+# * Build an LTO optimized libcxx, so we can statically link it into stage 3
+# clang.
+# * Stage 3 will be PGO optimized, so we need to build clang, lld, and
+# compiler-rt in stage 2.
+
+
+set(CMAKE_BUILD_TYPE Release CACHE STRING "")
+set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
+set(LLVM_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" CACHE STRING "" FORCE)
+set(LLVM_ENABLE_PROJECTS "clang;lld" CACHE STRING "" FORCE)
+
+set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")
+set(CLANG_BOOTSTRAP_CMAKE_ARGS -C ${CMAKE_CURRENT_LIST_DIR}/stage3-instrumented.cmake CACHE BOOL "")
+set(BOOTSTRAP_LLVM_BUILD_INSTRUMENTED IR CACHE BOOL "")
+set(CLANG_BOOTSTRAP_TARGETS stage3-check-all stage3-distribution stage3-install-distribution stage3-clang stage3-clang-bolt CACHE BOOL "")
+set(LIBCXX_STATICALLY_LINK_ABI_IN_STATIC_LIBRARY ON CACHE STRING "")
+set(RUNTIMES_CMAKE_ARGS "-DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON" CACHE STRING "")
+set(LLVM_ENABLE_LLD ON CACHE STRING "")
+#set(CLANG_DEFAULT_RTLIB compiler-rt CACHE STRING "")
diff --git a/clang/cmake/caches/llvm-toolchain/stage3-instrumented.cmake b/clang/cmake/caches/llvm-toolchain/stage3-instrumented.cmake
new file mode 100644
index 00000000000000..4570a6c758ccda
--- /dev/null
+++ b/clang/cmake/caches/llvm-toolchain/stage3-instrumented.cmake
@@ -0,0 +1,10 @@
+# Stage 3 instrumented:
+# * Build an instrumented clang, so we can generate profile data for stage 3.
+
+
+set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")
+set(CLANG_BOOTSTRAP_CMAKE_ARGS -C ${CMAKE_CURRENT_LIST_DIR}/stage3.cmake CACHE BOOL "")
+set(CLANG_BOOTSTRAP_TARGETS clang check-all distribution install-distribution clang-bolt CACHE BOOL "")
+set(CLANG_BOLT OFF CACHE STRING "")
+
+include(${CMAKE_CURRENT_LIST_DIR}/stage3.cmake)
diff --git a/clang/cmake/caches/llvm-toolchain/stage3.cmake b/clang/cmake/caches/llvm-toolchain/stage3.cmake
new file mode 100644
index 00000000000000..a9050a64d8f633
--- /dev/null
+++ b/clang/cmake/caches/llvm-toolchain/stage3.cmake
@@ -0,0 +1,16 @@
+# Stage 3:
+# * This is the final stage.
+# * The goal is to have a clang that is LTO, PGO, and bolt optimized and also
+# statically linked to libcxx and compiler-rt.
+
+set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
+set(LLVM_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" CACHE STRING "" FORCE)
+set(LLVM_ENABLE_PROJECTS "clang;lld;bolt;" CACHE STRING "" FORCE)
+set(LLVM_ENABLE_LLD ON CACHE STRING "")
+set(LLVM_ENABLE_LTO THIN CACHE STRING "")
+set(LLVM_ENABLE_LIBCXX ON CACHE STRING "")
+set(LLVM_STATIC_LINK_CXX_STDLIB ON CACHE STRING "")
+set(CLANG_BOLT "INSTRUMENT" CACHE STRING "")
+set(CMAKE_EXE_LINKER_FLAGS "-Wl,--emit-relocs,-znow -rtlib=compiler-rt --unwindlib=libunwind -static-libgcc" CACHE STRING "")
+set(CMAKE_SHARED_LINKER_FLAGS "-rtlib=compiler-rt --unwindlib=libunwind -static-libgcc" CACHE STRING "")
+set(LLVM_DISTRIBUTION_COMPONENTS "clang;lld;runtimes;clang-resource-headers" CACHE STRING "")
diff --git a/clang/cmake/caches/llvm-toolchain/usage.rst b/clang/cmake/caches/llvm-toolchain/usage.rst
new file mode 100644
index 00000000000000..82b6d678669643
--- /dev/null
+++ b/clang/cmake/caches/llvm-toolchain/usage.rst
@@ -0,0 +1,20 @@
+# LLVM Toolchain Build
+
+This directory contains cache files for building a complete LLVM-based toolchain.
+The resulting clang build will be LTO, PGO, and BOLT optimized and statically
+linked against libc++ and compiler-rt.
+
+The build is done in 4 stages:
+
+* Stage 1: Build clang/lld.
+* Stage 2: Build an LTO optimized libc++ with Stage 1 clang/lld.
+* Stage 3 Instrumented: Build clang with instrumentation in order to generate
+ profile data for PGO.
+* Stage 3: Build clang with LTO, PGO, and BOLT optimizations and statically link
+ with stage2 libc++ and compiler-rt.
+
+## Usage
+
+..
+ cmake -S llvm -B build -C clang/cmake/caches/llvm-toolchain/stage1.cmake
+ ninja stage3-install-distribution
The runtimes build already builds the toolchain and then builds the runtimes with the just-built toolchain. Can we build the LTO-optimized libc++ as part of stage1 and have stage2 be what the current stage3 is, to avoid any redundancy? I really like the direction here otherwise.
That's exactly what we do in Fuchsia, we build runtimes in stage 1 and then use them in stage 2.
I've updated the PR to use 3 stages instead of 4, and it now depends on #119896.
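As a rough illustration of the restructuring discussed above, a stage 1 cache that lets the runtimes build produce the LTO-optimized libc++ with the just-built clang might look like the fragment below. This is a sketch assembled from settings that appear elsewhere in this PR, not the contents of the updated file; the exact set of options in the revised PR may differ.

```cmake
# Hypothetical stage 1 cache for a 3-stage layout (sketch only).
set(CMAKE_BUILD_TYPE Release CACHE STRING "")
set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "")
set(LLVM_ENABLE_PROJECTS "clang;lld" CACHE STRING "")

# The runtimes build compiles these with the clang built in this same stage,
# so no separate stage is needed just to get an LTO-optimized libc++.
set(LLVM_ENABLE_RUNTIMES "compiler-rt;libcxx;libcxxabi;libunwind" CACHE STRING "")
set(LIBCXX_STATICALLY_LINK_ABI_IN_STATIC_LIBRARY ON CACHE BOOL "")
set(RUNTIMES_CMAKE_ARGS "-DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON" CACHE STRING "")
```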
This looks generally good to me, though @petrhosek has more relevant experience here so his review would also be useful.
set(BOOTSTRAP_LLVM_BUILD_INSTRUMENTED IR CACHE BOOL "")
set(CLANG_BOOTSTRAP_TARGETS stage2-check-all stage2-distribution stage2-install-distribution stage2-clang stage2-clang-bolt CACHE BOOL "")
set(LIBCXX_STATICALLY_LINK_ABI_IN_STATIC_LIBRARY ON CACHE STRING "")
set(RUNTIMES_CMAKE_ARGS "-DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON" CACHE STRING "")
Does this use FullLTO or ThinLTO? We're explicitly building the binaries with ThinLTO, so I'd assumed libc++ would also need to be ThinLTO to be able to be statically linked into them.
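One way to take the guesswork out of the LTO mode would be to pass the ThinLTO flag to the runtimes build explicitly instead of relying on whatever CMAKE_INTERPROCEDURAL_OPTIMIZATION selects. The fragment below is an untested sketch of that alternative and is not part of this PR; it assumes the runtimes are compiled with clang, which accepts -flto=thin.

```cmake
# Hypothetical alternative (not from the PR): request ThinLTO explicitly for
# the runtimes so the libc++ objects match the ThinLTO mode used for clang.
set(RUNTIMES_CMAKE_ARGS
    "-DCMAKE_C_FLAGS=-flto=thin;-DCMAKE_CXX_FLAGS=-flto=thin"
    CACHE STRING "")
```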
set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")
set(CLANG_BOOTSTRAP_CMAKE_ARGS -C ${CMAKE_CURRENT_LIST_DIR}/stage2-instrumented.cmake CACHE BOOL "")
set(BOOTSTRAP_LLVM_BUILD_INSTRUMENTED IR CACHE BOOL "")
Is it possible to set this in stage2-instrumented instead?
No, it's not. This needs to be set in the stage1 config: https://github.com/llvm/llvm-project/blob/main/clang/CMakeLists.txt#L597
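To make the placement concrete: per the reply above, the instrumentation switch is read while the first stage is being configured, so it sits next to the other bootstrap settings in the stage 1 cache rather than in the file handed to the next stage. A minimal sketch using the file names from the updated PR (the STRING cache types are an editorial choice here, since the values are not booleans):

```cmake
# Stage 1 cache (sketch): the bootstrap wiring and the instrumentation
# request live together, because stage 1's configure step consumes both.
set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")
set(BOOTSTRAP_LLVM_BUILD_INSTRUMENTED IR CACHE STRING "")
set(CLANG_BOOTSTRAP_CMAKE_ARGS
    -C ${CMAKE_CURRENT_LIST_DIR}/stage2-instrumented.cmake
    CACHE STRING "")
```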
set(LLVM_STATIC_LINK_CXX_STDLIB ON CACHE STRING "")
set(CLANG_BOLT "INSTRUMENT" CACHE STRING "")
set(CMAKE_EXE_LINKER_FLAGS "-Wl,--emit-relocs,-znow -rtlib=compiler-rt --unwindlib=libunwind -static-libgcc" CACHE STRING "")
set(CMAKE_SHARED_LINKER_FLAGS "-rtlib=compiler-rt --unwindlib=libunwind -static-libgcc" CACHE STRING "")
I've had some issues in the past (that I can't remember the details of) if I didn't also set CMAKE_MODULE_LINKER_FLAGS.
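A sketch of what that suggestion could look like, mirroring the shared-library flags quoted above. CMAKE_MODULE_LINKER_FLAGS is the standard CMake variable applied when linking MODULE libraries (loadable plugins); reusing the same flag string for it is an assumption here, not something confirmed in the thread.

```cmake
# Apply the same runtime-library selection to MODULE (plugin) links as to
# shared libraries, so all link types agree on compiler-rt and libunwind.
set(CMAKE_MODULE_LINKER_FLAGS
    "-rtlib=compiler-rt --unwindlib=libunwind -static-libgcc"
    CACHE STRING "")
```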
Co-authored-by: Shoaib Meenai <[email protected]>
These cache files seem to be largely the same as PGO.cmake, PGO-stage2.cmake, and PGO-stage2-instrumented.cmake; the only difference appears to be the inclusion of BOLT. Do we really need separate cache files for that? Could we just extend the existing files?
The main difference is that this llvm-toolchain cache statically links both libc++ and the compiler-rt builtins into the stage2 compiler. With the PGO.cmake file, you end up with a compiler that dynamically links gcc's libstdc++ and libgcc. The goal is to create a toolchain that relies on LLVM components as much as possible and has binary artifacts that are easy to redistribute to a wide variety of host platforms (i.e. because they have minimal dynamic library dependencies).