-
Notifications
You must be signed in to change notification settings - Fork 180
drgn: add runq command #513
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: commands
Are you sure you want to change the base?
Conversation
…gnatures We will need these for upcoming commits, and the standard library doesn't provide them. There is a (currently unmerged) pull request to add them that we can base ours on. Signed-off-by: Omar Sandoval <[email protected]>
The default argparse.HelpFormatter wraps an entire string as one paragraph. argparse.RawDescriptionHelpFormatter and argparse.RawTextHelpFormatter use the given string verbatim. For upcoming commands, we want a middle ground: multiple paragraphs that can each be wrapped, as well as blocks of text that are not wrapped (for tables and example output, for example). Add a HelpFormatter subclass that does this with a reStructuredText-inspired syntax: This is a paragraph, which will be wrapped. |This is a line block. |It will not be wrapped. P.S. Lib/argparse.py in CPython currently says about HelpFormatter: Only the name of this class is considered a public API. All the methods provided by the class are considered an implementation detail. But we don't have much of a choice, so we do the best we can with unit tests. Signed-off-by: Omar Sandoval <[email protected]>
An upcoming change for automatically documenting commands will need this. Signed-off-by: Omar Sandoval <[email protected]>
Co-authored-by: Stephen Brennan <[email protected]>
This command is a straightforward wrapper around already existing helpers, so it serves as a good first example for future commands. Signed-off-by: Omar Sandoval <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intend to come back and do an actual review, but I just wanted to remark on the copyright header now since it's a quick and easy item.
drgn/commands/_builtin/crash/runq.py
Outdated
# This file contains code adapted from: | ||
# https://github.com/oracle-samples/drgn-tools (Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, we (Oracle Linux engineers) are already approved to submit code to drgn under its existing license. Even if you didn't originally author the drgn-tools code, you're still authorized to submit it under the DCO and our legal approval. So these lines aren't really relevant to drgn's licensing.
The only benefit to retaining them is to let readers know that there is a permissively licensed version of this code which they could use under more lenient terms than the LGPL. But as time goes on and this code evolves, that will be less and less true. Especially given that we're likely to switch to drgn's version, send fixes upstream, and eventually get rid of our own copy.
I'm not against keeping these lines. But if you added them for the purpose of covering yourself against an unknown legal requirement, then I would say you're safe to drop them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good to know. I added them to cite the sources. Let me drop them then:)
Signed-off-by: Richard Li <[email protected]>
557f137
to
089995d
Compare
def task_thread_info(task: Object) -> Object: | ||
""" | ||
Return a task's ``thread_info`` | ||
|
||
This is an equivalent to the kernel function / inline / macro | ||
``task_thread_info()``, but it must cover a wide variety of versions and | ||
configurations. | ||
|
||
:param task: Object of type ``struct task_struct *`` | ||
:returns: The ``struct thread_info *`` for this task | ||
""" | ||
if has_member(task, "thread_info"): | ||
return task.thread_info.address_of_() | ||
return cast("struct thread_info *", task.stack) | ||
|
||
|
||
def task_cpu(task: Object) -> int: | ||
""" | ||
Return the CPU on which a task is running. | ||
|
||
This is an equivalent to the kernel function ``task_cpu()``, but it covers | ||
a wide variety of variations in kernel version and configuration. It would | ||
be a bit impractical to spell out all the variants, but essentially, if | ||
there's a "cpu" field in ``struct task_struct``, then we can just use that. | ||
Otherwise, we need to get it from the ``thread_info``. | ||
|
||
:param task: Object of type ``struct task_struct *`` | ||
:retruns: The cpu as a Python int | ||
""" | ||
if has_member(task, "cpu"): | ||
return task.cpu.value_() | ||
return task_thread_info(task).cpu.value_() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of these are now built-in to drgn, in drgn.helpers.linux.sched
, so we can drop these and use them directly.
def has_member(obj: Object, name: str) -> bool: | ||
""" | ||
Return true if a given object has a member with the given name. | ||
:param obj: Drgn object to check | ||
:param name: string member name to check | ||
:returns: whether the object has a member by that name | ||
""" | ||
try: | ||
obj.member_(name) | ||
return True | ||
except LookupError: | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a drgn-tools way of doing things which isn't necessarily common in drgn code itself. Normally, upstream drgn code will just try to access a member, and then handle the AttributeError
by trying the alternative name.
We found that a lot of developers weren't necessarily comfortable with that approach and the has_member()
approach made them a bit more comfortable. But I think for contributing to drgn, it's best to stick with drgn's way of doing things, and not use has_member()
.
Regardless, if we were to include has_member()
in drgn, it would likely not belong in this module.
return task.cpu.value_() | ||
return task_thread_info(task).cpu.value_() | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a more general comment about the various helpers below.
The idea for these crash commands is that they'll mostly just be calling drgn helpers (from drgn.helpers.linux
) and then formatting the output. Thus, I'm thinking these would be good guidelines:
- Anything that does logic for the command should be a helper in the appropriate location. (EG many below would belong in
drgn.helpers.sched
) - Helpers generally should not be printing formatted info, instead they should be returning info to the caller. For cases where they are smoothing over incompatibilities between kernel versions, they can return a NamedTuple that gets built differently on different kernel versions. A good example of this is
drgn.helpers.printk.get_printk_records()
, which is implemented quite differently on different kernel versions, yet returns the same record structure to the caller, who can then format it and print it however they would like. - Whatever remains in the
drgn.commands._builtin.crash
package should really only deal with printing out the data, as well as formatting the--drgn
code output. Since none of that code is intended to be consumed externally, the remaining function names should probably all be prefixed by an underscore, so they're known to be private.
Omar may want to clarify or update this, since it's just my understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all spot on, no notes 👍
def _parse_cpus_arg(cpus_arg: str, max_cpu: int) -> Set[int]: | ||
""" | ||
Parse argument to -c for cpu restriction (e.g. '1,3,5-8'). | ||
|
||
:param cpus_arg: str | ||
:param max_cpu: int | ||
:returns: a set of specified cpus | ||
""" | ||
cpus = set() | ||
for part in cpus_arg.split(","): | ||
part = part.strip() | ||
if "-" in part: | ||
start, end = part.split("-") | ||
for cpu in range(int(start), int(end) + 1): | ||
if 0 <= cpu < max_cpu: | ||
cpus.add(cpu) | ||
elif part: | ||
cpu = int(part) | ||
if 0 <= cpu < max_cpu: | ||
cpus.add(cpu) | ||
return cpus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, this is a useful helper on its own. It should probably belong in drgn.helpers.common.format
or something like that, and could be renamed so that it doesn't specifically mention CPUs, just ranges of integers.
print(" [no tasks queued]") | ||
|
||
|
||
def timestamp_str(ns: int) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny thought, but it may be a better API to represent the task_lastrun2now()
as a datetime.timedelta()
.
And totally unrelatedly, Python has a bulit-in function divmod()
, which returns both parts of the integer division. I guess it might be more efficient, but I think it just helps shorten and make it a bit more readable. EG:
>>> value = 123456
>>> mins, secs = divmod(value, 60)
>>> hours, mins = divmod(mins, 60)
>>> days, hours = divmod(hours, 24)
>>> days, hours, mins, secs
(1, 10, 17, 36)
@crash_command( | ||
description="Display the tasks on the run queues of each cpu.", | ||
arguments=( | ||
argument("-t", action="store_true", dest="show_timestamps"), | ||
argument("-T", action="store_true", dest="show_lag"), | ||
argument("-m", action="store_true", dest="pretty_runtime"), | ||
argument("-g", action="store_true", dest="group"), | ||
argument("-c", type=str, default="", dest="cpus"), | ||
drgn_argument, | ||
), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From playing around with the runq
command, it seems to me that a lot of these arguments are (thankfully) mutually exclusive, and crash just doesn't document that's the case. The only option that can be combined with the others is -c
.
- Each of
-t | -T | -m
are separate modes that behave similarly. They will iterate over the runqueues and print out some sort of timestamp information in various formats. They cannot be combined. - The alternative mode is to print the contents (i.e. tasks) in each runqueue, with
-g
optionally modifying the way they are printed.
It seems fair to me that we may implement each mode in a separate function, to simplify things. And we could make the mutual exclusion explicit here:
@crash_command( | |
description="Display the tasks on the run queues of each cpu.", | |
arguments=( | |
argument("-t", action="store_true", dest="show_timestamps"), | |
argument("-T", action="store_true", dest="show_lag"), | |
argument("-m", action="store_true", dest="pretty_runtime"), | |
argument("-g", action="store_true", dest="group"), | |
argument("-c", type=str, default="", dest="cpus"), | |
drgn_argument, | |
), | |
) | |
@crash_command( | |
description="Display the tasks on the run queues of each cpu.", | |
arguments=( | |
mutually_exclusive_group( | |
argument("-t", action="store_true", dest="show_timestamps"), | |
argument("-T", action="store_true", dest="show_lag"), | |
argument("-m", action="store_true", dest="pretty_runtime"), | |
argument("-g", action="store_true", dest="group"), | |
), | |
argument("-c", type=str, default="", dest="cpus"), | |
drgn_argument, | |
), | |
) |
return "%d %02d:%02d:%02d.%03d" % (days, hours, mins, secs, ms) | ||
|
||
|
||
def run_queue(prog: Program, args: argparse.Namespace) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing missing here is an implementation of the --drgn
argument. The more code we refactor into reusable drgn helpers, the easier it becomes to print the equivalent drgn code. In fact, I think starting from the --drgn
output may be a really good way to think about how you want to refactor the code into helpers. For instance, here's some pseudocode I've made up, which we may output for runq --drgn -c 1,3-5
:
from drgn.helpers.common.parse import range_list
from drgn.helpers.linux.percpu import per_cpu
from drgn.helpers.linux.sched import task_lastrun2now
for cpu in range_list("1,3-5"):
rq = per_cpu(prog["runqueues"], cpu)
runtime_ns = task_lastrun2now(rq.curr)
Starting from there, gives a good idea what helpers we'd like to have, how they'd be used, and where they should be imported from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again totally agree with the general advice. Your example does raise a small question: should --drgn
output use the arguments given in its example output (range_list("1,3-5")
) or use placeholders (range_list(cpu_list)
)? I opted for the latter for the mount
command to make the generated code more general, but an argument could be made for the former, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a slight preference for inserting the arguments into the script, so that ideally the output could be redirected to a file and directly sourced or run via the drgn command without modification.
But then again, substituting an Object into the text wouldn't really make sense... You could output a constructor call (Object(prog, "struct foo *", value=0xdeadbeef)
) but that doesn't seem like it would be very helpful - it wouldn't at all be applicable to other programs or situations. So considering the complications of doing this consistently, I think you're right - using a placeholder variable name seems like the best way forward. Maybe there should be a comment at the beginning, containing all the placeholders that the user should define?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe rather than a comment, we could do this, which makes it clear what's happening but also makes it easy to reassign or change:
cpu_ranges = "1,3-5"
for cpu in range_list(cpu_ranges):
...
And for mount:
task = Object(prog, "struct task_struct *", 0xdeadbeef)
# or
task = find_task(prog, 1)
...
This way things are still easy to source and run, but all the variable declarations based on input parameters are placed at the top, so they're easy to see and edit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I loved this suggestion, so I modified mount
to do this, and I'm also making the crash command test cases exec()
the output of --drgn
by default to enforce that the output is always valid and runnable.
prio_array = ( | ||
hex(runqueue.rt.active.address_ - 16) if runqueue.rt.active.address_ else 0 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we subtract 16 here? And why wouldn't runqueue.rt.active.address_of_()
work here?
@brenns10 Thanks a lot for the detailed comments! Thera are a lot to digest. I will go over them one by one. |
44dc4c1
to
eb2fab3
Compare
No description provided.