RFC Support dumps without VMCOREINFO #470
addrxlat can detect where the kernel text is actually mapped, and vmlinux (debuginfo) gives us the non-randomized address of _text, so we can calculate the KASLR offset of the kernel text. The search for the _text symbol is a hack: it only looks at the range of ELF load segments and takes the lowest address, since _text is typically at the beginning. We then load the ELF file with the right bias applied. This lets us load debuginfo properly and read the values we need from the vmcore image itself without resorting to VMCOREINFO.
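The arithmetic behind this commit message can be sketched as follows. This is only an illustration, not the code from the PR: `struct load_segment`, `lowest_load_vaddr` and `kaslr_offset` are hypothetical names, and the segment list is assumed to have been extracted from the vmlinux program headers already.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one PT_LOAD segment of vmlinux. */
struct load_segment {
	uint64_t vaddr; /* link-time (non-randomized) virtual address */
	uint64_t memsz; /* size of the segment in memory */
};

/*
 * Approximate the link-time address of _text by the lowest PT_LOAD
 * address, as the commit message describes.
 * Returns UINT64_MAX if there are no segments.
 */
uint64_t lowest_load_vaddr(const struct load_segment *segs, size_t n)
{
	uint64_t lowest = UINT64_MAX;

	for (size_t i = 0; i < n; i++) {
		if (segs[i].vaddr < lowest)
			lowest = segs[i].vaddr;
	}
	return lowest;
}

/*
 * KASLR offset (the ELF load bias): the address where the kernel text
 * is actually mapped minus the address it was linked at.
 */
uint64_t kaslr_offset(uint64_t ktext_mapped,
		      const struct load_segment *segs, size_t n)
{
	return ktext_mapped - lowest_load_vaddr(segs, n);
}
```

For example, with text linked at 0xffffffff81000000 and found mapped at 0xffffffff8e000000, the offset is 0x0d000000. The heuristic is only as good as the assumption that the lowest load segment holds _text.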
…REINFO" This reverts commit 2bd861f. New versions of libkdumpfile can parse QEMU notes with QEMUCPUState, and that information can substitute for VMCOREINFO for guests known to run Linux, so we can accept QEMU dumps like any other ELF dumps again.
An ELF core without VMCOREINFO can still be a kernel vmcore (e.g. a QEMU dump), so try loading it with libkdumpfile (versions with commit 4d5814c ("x86_64: map QEMU CPU state ELF notes to register attributes")).
Example:
DRGN_USE_LIBKDUMPFILE_FOR_ELF=1 drgn -c ./qemudump -s ./vmlinux
Passing userspace ELF cores to drgn may not work as expected with this changeset. Kernel dumps without VMCOREINFO that are loaded without libkdumpfile are not supported.
The alternative derivation of mapping parameters with addrxlat and debuginfo may leave some formerly mandatory VMCOREINFO fields uninitialized.
If we don't know the version from VMCOREINFO, try any of the vmlinux images. This allows opening dumps without VMCOREINFO (e.g. QEMU ELF).
So this is a really interesting strategy. Thanks for this PR!

I guess libkdumpfile is getting the kernel text location by inferring some things from the page tables, in the cases where the vmcoreinfo and other metadata is missing. At least, that's what I'm getting from the following code:

I am a big fan of anything that allows better support for vmcores where the VMCOREINFO is missing or not easy to find. That said, we would probably need to take a careful look at areas that expect the values of

In these cases, I've usually found that the issue is not so much that the VMCOREINFO is totally unavailable. It's always there in kernel memory. It's just that QEMU (or some other core dump producer) doesn't know about it. It's still buried in the kernel memory, and thus it's still present in the vmcore. But with a libkdumpfile program like this one, it's possible to search the vmcore page by page and find the data that is most likely the vmcoreinfo note. Then, you can provide that when creating the program, or with the

So I wonder if we can combine the benefit of both approaches? Yours identifies the kernel text location, and if you have the vmlinux file handy, then you can use that to derive the KASLR offset. From that, you can get the address of

I have a few questions for you:
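The page-by-page search mentioned in this comment could look roughly like the sketch below. It is not the program the comment refers to. It assumes that memory has already been read into a buffer, that pages are 4 KiB, and that the vmcoreinfo data sits at the start of a page and begins with the `OSRELEASE=` key; `find_vmcoreinfo_candidate` and `SCAN_PAGE_SIZE` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: 4 KiB pages. */
#define SCAN_PAGE_SIZE 4096

/*
 * Scan a memory image one page at a time and return the offset of the
 * first page that begins with "OSRELEASE=" (assumed here to be the first
 * key of the vmcoreinfo text). Returns -1 if no page matches.
 */
long find_vmcoreinfo_candidate(const unsigned char *buf, size_t len)
{
	static const char marker[] = "OSRELEASE=";
	const size_t marker_len = sizeof(marker) - 1;

	for (size_t off = 0; off + marker_len <= len; off += SCAN_PAGE_SIZE) {
		if (memcmp(buf + off, marker, marker_len) == 0)
			return (long)off;
	}
	return -1;
}
```

A match is only a candidate: a real tool would still want to validate the rest of the page (for example, that it consists of `KEY=value` lines) before treating it as VMCOREINFO.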
Currently, it only works with ELF (which was my recent use case) but it shouldn't be that hard to utilize the same for kdump vmcores.
if (range->meth == ADDRXLAT_SYS_METH_KTEXT) {
	prog->ktext_mapped = addr;
	found = true;
	drgn_log_debug(prog, "addrxlat found ktext at %x", addr);
I see a warning in openscanhub:
Error: COMPILER_WARNING: [#def1]
python-drgn-0.0.30+57.g5dce6d2f-build/drgn-0.0.30+57.g5dce6d2f/libdrgn/kdump.c:13: included_from: Included from here.
python-drgn-0.0.30+57.g5dce6d2f-build/drgn-0.0.30+57.g5dce6d2f/libdrgn/kdump.c: scope_hint: In function 'drgn_find_ktext'
python-drgn-0.0.30+57.g5dce6d2f-build/drgn-0.0.30+57.g5dce6d2f/libdrgn/kdump.c:160:46: warning[-Wformat=]: format '%x' expects argument of type 'unsigned int', but argument 5 has type 'addrxlat_addr_t' {aka 'long unsigned int'}
# 160 | drgn_log_debug(prog, "addrxlat found ktext at %x", addr);
# | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~
# | |
# | addrxlat_addr_t {aka long unsigned int}
python-drgn-0.0.30+57.g5dce6d2f-build/drgn-0.0.30+57.g5dce6d2f/libdrgn/log.h:46:70: note: in definition of macro 'drgn_log'
# 46 | #define drgn_log(level, prog, ...) drgn_error_log(level, prog, NULL, __VA_ARGS__)
# | ^~~~~~~~~~~
python-drgn-0.0.30+57.g5dce6d2f-build/drgn-0.0.30+57.g5dce6d2f/libdrgn/kdump.c:160:25: note: in expansion of macro 'drgn_log_debug'
# 160 | drgn_log_debug(prog, "addrxlat found ktext at %x", addr);
# | ^~~~~~~~~~~~~~
python-drgn-0.0.30+57.g5dce6d2f-build/drgn-0.0.30+57.g5dce6d2f/libdrgn/kdump.c:160:72: note: format string is defined here
# 160 | drgn_log_debug(prog, "addrxlat found ktext at %x", addr);
# | ~^
# | |
# | unsigned int
# | %lx
# 158| prog->ktext_mapped = addr;
# 159| found = true;
# 160|-> drgn_log_debug(prog, "addrxlat found ktext at %x", addr);
# 161| break;
# 162|
Shall this be fixed before merging?
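One portable way to silence this kind of `-Wformat` warning is to cast the value to a fixed-width type and use the matching `<inttypes.h>` macro instead of hard-coding `%x` or `%lx`. The sketch below is an illustration rather than the fix applied in the PR: `example_addr_t` is a stand-in for `addrxlat_addr_t`, and `format_ktext_msg` is a hypothetical helper that uses `snprintf` in place of `drgn_log_debug`.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for addrxlat_addr_t (the log shows it as 'long unsigned int'). */
typedef unsigned long example_addr_t;

/*
 * Format the debug message with a specifier that matches the argument.
 * Casting to uint64_t and using PRIx64 stays correct even if the
 * underlying type of the address typedef differs between platforms.
 */
int format_ktext_msg(char *buf, size_t size, example_addr_t addr)
{
	return snprintf(buf, size, "addrxlat found ktext at %" PRIx64,
			(uint64_t)addr);
}
```

The compiler's suggested `%lx` also works on this build, but it ties the format string to one particular definition of the typedef.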
> Shall this be fixed before merging?
Much more needs to be fixed in this branch 😊 Let me mark it a draft.
Motivation
For instance, kernel dumps in the ELF format from the QEMU hypervisor may lack VMCOREINFO.
Before
After
Summary