Support for debugging Linux kernel with Compact Type Format (CTF) #495
CTF is a lightweight format for representing C types. While there is an older format which seems to be currently in use by some BSDs, this pull request refers to the CTF specified and actively maintained in the binutils project, aka CTF v3. The format can be generated by GCC, linked and deduplicated by GNU ld, and consumed by GDB, as well as DTrace for Linux. This pull request adds the ability for drgn to use CTF as a type and object finder for the Linux kernel. (The object finder relies on the existence of a symbol finder, e.g. from kallsyms.)
CTF is currently used by DTrace on Oracle Linux. It is packaged as part of UEK kernels in a file named `/lib/modules/$(uname -r)/kernel/vmlinux.ctfa`. This is an out-of-tree patch carried by UEK, but we're working to upstream it. The `vmlinux.ctfa` file is a CTF "archive", which contains a 1-level hierarchy of "dicts" (dictionaries). Each leaf dictionary represents either the kernel ("vmlinux") or a kernel module. They all inherit from the root dict ("shared_ctf"), which contains deduplicated type definitions that can be shared among all dicts.

CTF is not widely used in userspace, but there is no reason it could not be. Typically, CTF information is stored in the `.ctf` section of an ELF file. It's generated by GCC with `-gctf`. Since CTF is generally intended for runtime debugging, userspace CTF relies on the dynamic symbol table, and so the linker typically only includes types for the functions and variables which are exported in this symbol table. Userspace programs might either use ld's `--export-dynamic` option to include all symbols in the dynamic symbol table, or else they may use ld's `--ctf-variables` option to retain non-exported type names, and rely on another source of information (e.g. the `.gnu_debugdata` section) to contain the addresses.

The PR as-is supports kernel CTF only, though it can support simple userspace programs if you force it to. We use that provisional support for the test suite.
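To make the archive layout described earlier more concrete, here is a toy Python model of it. This is only a sketch: it does not use libctf or any drgn API. `ToyDict`, `default_ctf_archive_path`, and the type names placed in each dict are hypothetical and chosen purely for illustration. It shows a root dict holding shared types, leaf dicts that fall back to the root on lookup, and the archive path built from the running kernel's release.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, Optional


def default_ctf_archive_path(release: Optional[str] = None) -> str:
    """Build the UEK archive path /lib/modules/<release>/kernel/vmlinux.ctfa."""
    if release is None:
        release = os.uname().release  # same value as $(uname -r)
    return f"/lib/modules/{release}/kernel/vmlinux.ctfa"


@dataclass
class ToyDict:
    """Hypothetical stand-in for a CTF dict: a name, some types, an optional parent."""

    name: str
    types: Dict[str, str] = field(default_factory=dict)
    parent: Optional["ToyDict"] = None

    def lookup(self, type_name: str) -> Optional[str]:
        # Look in this dict first, then fall back to the shared parent.
        if type_name in self.types:
            return self.types[type_name]
        if self.parent is not None:
            return self.parent.lookup(type_name)
        return None


# Root dict: deduplicated definitions that every leaf can share.
shared = ToyDict("shared_ctf", {"struct list_head": "shared definition"})

# One leaf per kernel or module; each inherits from the root (1-level hierarchy).
# Which types land in which dict here is made up for the example.
archive = {
    "vmlinux": ToyDict("vmlinux", {"struct task_struct": "vmlinux definition"}, shared),
    "ext4": ToyDict("ext4", {"struct ext4_inode": "ext4 definition"}, shared),
}
```

In this model, `archive["ext4"].lookup("struct list_head")` resolves through the parent, while a lookup for a type that only exists in another leaf returns `None`, because leaves inherit from the root and not from each other.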
You can find the CTFv3 specification here, and the best documentation for the API is found in ctf-api.h.
I'd like to use this pull request as a first step to see whether the high-level approach is reasonable. I'm adding two functions to the `_drgn` module which attach either a CTF archive, or an out-of-tree kernel module, to a `Program`. These are of course private APIs, and I've shared two examples of how they could be used: via the `load_ctf()` helper function, or a drgn plugin, thanks to the new API.