Add Task Submission / Return Hooks #1287
❌ Changes requested. Reviewed everything up to 2e31305 in 3 minutes and 1 second
More details
- Looked at 647 lines of code in 9 files
- Skipped 0 files when reviewing
- Skipped posting 19 drafted comments based on config settings
1. hamilton/lifecycle/api.py:376
- Draft comment:
This class is functionally identical to BasePreTaskSubmission - it just adds an unnecessary layer of indirection. Please use BasePreTaskSubmission directly instead.
- class BasePreTaskSubmission (base.py)
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
The TaskSubmissionHook class:
- Renames pre_task_submission to run_before_task_submission
- Provides a concrete implementation of pre_task_submission that delegates to run_before_task_submission
- Adds docstrings
- Otherwise has identical functionality to BasePreTaskSubmission
This pattern is consistent with other hooks in the file that provide a more user-friendly API over the base classes.
The comment ignores that this pattern of concrete delegation methods with renamed abstract methods is used consistently throughout the codebase for all hooks. The pattern likely serves a purpose in providing a more intuitive API.
The consistent pattern across the codebase suggests this is an intentional design choice to provide a more user-friendly API layer over the base hooks, not unnecessary indirection.
The comment should be deleted as it mischaracterizes an intentional API design pattern as unnecessary indirection.
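The delegation pattern discussed above - a concrete framework-facing method forwarding to a renamed, user-facing abstract method - can be sketched generically. All names and signatures here are illustrative, not Hamilton's exact API:

```python
import abc


class BasePreTaskSubmission(abc.ABC):
    """Low-level lifecycle base class (illustrative stand-in)."""

    @abc.abstractmethod
    def pre_task_submission(self, *, run_id: str, task_id: str):
        """Called by the framework just before a task is submitted."""


class TaskSubmissionHook(BasePreTaskSubmission, abc.ABC):
    """User-facing adapter: delegates the framework hook to a friendlier name."""

    def pre_task_submission(self, *, run_id: str, task_id: str):
        # Concrete implementation that forwards to the renamed abstract method
        return self.run_before_task_submission(run_id=run_id, task_id=task_id)

    @abc.abstractmethod
    def run_before_task_submission(self, *, run_id: str, task_id: str):
        """Users override this method instead of the base-class hook."""


class PrintingHook(TaskSubmissionHook):
    """Example user hook built on the adapter."""

    def run_before_task_submission(self, *, run_id: str, task_id: str):
        return f"submitting {task_id} (run {run_id})"


hook = PrintingHook()
print(hook.pre_task_submission(run_id="r1", task_id="t1"))
# → submitting t1 (run r1)
```

The indirection is what lets the framework call one stable method name while users implement a more descriptive one.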
2. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:85
- Draft comment:
Avoid using hardcoded magic numbers (here 14) for the expected number of hook calls. Consider adding inline comments or deriving the expected count from the task structure so tests are less brittle. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
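The suggestion above - deriving expected hook-call counts from the task structure instead of hardcoding a literal like 14 - might look like this sketch. The task structure, node counts, and `simulate_run` helper are all made up for illustration:

```python
# Hypothetical task structure the test would execute; expected counts
# derive from this structure rather than from a bare magic number.
TASKS = ["load", "transform", "save"]
NODES_PER_TASK = {"load": 2, "transform": 10, "save": 2}

# 2 + 10 + 2 == 14, but the arithmetic is now self-documenting
EXPECTED_PRE_NODE_EXECUTE_CALLS = sum(NODES_PER_TASK[t] for t in TASKS)


def simulate_run():
    """Stub mimicking one pre_node_execute call per node of each task."""
    return [
        f"pre_node_execute:{task}:{i}"
        for task in TASKS
        for i in range(NODES_PER_TASK[task])
    ]


def test_pre_node_execute_call_count():
    calls = simulate_run()  # stand-in for a real run with a tracking adapter
    assert len(calls) == EXPECTED_PRE_NODE_EXECUTE_CALLS


test_pre_node_execute_call_count()
```

If the DAG changes, the expectation changes with the structure instead of silently drifting from a stale literal.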
3. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:121
- Draft comment:
The run_id length assertion compares against the length of a TODO string. Replace it with a meaningful fixed threshold or check for valid UUID format. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
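A UUID-format check like the one suggested can be done with the standard library, for example:

```python
import uuid


def is_valid_uuid(value: str) -> bool:
    """Return True if value parses as a UUID (any version)."""
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False


# A run_id assertion can then check format rather than string length:
run_id = str(uuid.uuid4())
assert is_valid_uuid(run_id)
assert not is_valid_uuid("not-a-uuid")
```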
4. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:211
- Draft comment:
Magic arithmetic for expected hook call counts (2 + 5 * 2 + 1 + 1) is unclear. Consider declaring expected counts as clearly named constants or adding clarifying comments. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:448
- Draft comment:
In test_multi_hook, the expected hookCounts dictionary uses many literal numbers. Document the reasoning behind each count to improve test readability and maintainability. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:250
- Draft comment:
The use of counters to check spawning_task_ids and purposes is fine, but consider adding inline comments explaining how the expected distribution (e.g. {"expand-parallel_over": 5, None: 5}) corresponds to the execution plan. - Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50%
7. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:80
- Draft comment:
Add inline comments explaining how the expected count of 14 for 'pre_node_execute' is derived from the DAG structure. This will aid future maintainers in understanding the arithmetic behind the hook call counts. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:211
- Draft comment:
The final result is asserted to be 426; please add a clarifying comment explaining the arithmetic behind this expected output so that future readers can follow the transformation logic. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:249
- Draft comment:
Document the rationale behind the expected counts for spawning task IDs and purpose counters (e.g., why 5 for 'expand-parallel_over' and 3 for EXECUTE_SINGLE). A brief comment here would improve clarity. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
While documentation could help explain the test's expectations, this is a test file and the assertions are verifying implementation details. The values come from the specific execution flow being tested - they're not arbitrary magic numbers but rather the expected behavior of the system. The test name and context make it clear what's being tested. Adding comments about implementation details could make the tests more brittle if the underlying implementation changes.
The comment raises a valid point that these numbers may seem magical to someone unfamiliar with the system. Understanding why we expect these specific counts could help maintainers debug test failures.
However, tests should be self-documenting through clear names and structure. These counts are implementation details that may change if the execution flow changes. Adding documentation about specific counts makes tests more fragile and harder to maintain.
The comment should be deleted. The test is clear about what it's testing through its name and structure. Documenting specific assertion values would create maintenance burden and make tests more brittle.
10. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:242
- Draft comment:
The test for the post task expand hook asserts that the parameters are exactly ["0", "1", "2", "3", "4"]. Please confirm that it is intentional for these values to be strings rather than integers, and add a comment if the conversion is deliberate. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
11. tests/lifecycle/test_lifecycle_adapters_end_to_end_task_based.py:444
- Draft comment:
In the 'test_multi_hook' test, the hook count assertion uses a Counter over function names. Consider refactoring this check or adding detailed inline comments that document the expected call counts for each hook for enhanced clarity and future maintainability. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
12. tests/lifecycle/lifecycle_adapters_for_testing.py:54
- Draft comment:
The ExtendToTrackCalls wrapper iterates over lifecycle hook and method attributes to wrap them for call tracking. It would be helpful to add inline documentation here describing the wrapping mechanism and its intended behavior for future maintainers. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
13. hamilton/execution/executors.py:98
- Draft comment:
Typo: In the docstring for base_execute_task, 'diciontary' should be corrected to 'dictionary'. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
14. hamilton/execution/executors.py:102
- Draft comment:
Typo: In the comment within base_execute_task (around line 102), consider changing "its an implementation detail" to "it's an implementation detail" for clarity. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
15. hamilton/execution/executors.py:74
- Draft comment:
Typo: In the _modify_callable function definition, consider renaming the parameter 'callabl' to 'callable' for clarity, unless this naming is intentional to avoid conflict with the built-in 'callable' function. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
16. hamilton/execution/grouping.py:203
- Draft comment:
Typo found: 'Fina' should be 'Find' in the docstring. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
17. hamilton/execution/grouping.py:233
- Draft comment:
Typo found: 'thie' should be 'this' in the comment. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
18. hamilton/execution/grouping.py:393
- Draft comment:
Typo found: 'guarenteed' should be 'guaranteed' in the comment. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
19. hamilton/lifecycle/api.py:767
- Draft comment:
Typo in the docstring for run_after_task_grouping: consider removing 'the' to change 'information about the which tasks were created' to 'information about which tasks were created'. - Reason this comment was not posted:
Comment looked like it was already resolved.
Ahh - I used
Regarding the failing doc test - possibly related to #1288
What's the impact here on the Hamilton Tracker? Should we implement these new functions for it?
I am not that familiar with the
Offhand, do you know if the UI has any issues tracking out-of-process execution? I see that
Edit: Sorry, that's probably not super helpful.
👍
Yes, that's in the client that sends the events - need to go one level deeper in the tracker to the client it uses.
To chime in -- I don't think this should impact the Hamilton tracker. That said, we may want to consider adding more to it to leverage these -- we could get a more fine-grained view / expose more descriptive states.
Cool. Does it make sense to provide some implementation of these hooks in this PR too? e.g. some example using it?
@skrawcz - I was actually planning to create a second PR with a dedicated logging adapter (based on a side conversation with @elijahbenizzy) using these new hooks. I can add that here if you'd like.
Cool. Yeah I say having an implementation would be good because having something implemented helps prove / ground the API :) It can be in a PR off of this branch or together.
So, the logging adapter is a bit more expansive than I originally thought. We will probably want to have a discussion. I will open a separate PR based off this branch - hopefully this
Edit: Changed ETA. Sorry! 😞
Looking good -- a few minor comments
Summary of the latest changes (I have been sitting on them)...
Hmm ... tests are failing because of
Looks good! A few small changes but I think it's almost there.
Thanks! Sorry this took a bit to review. Looks good, just change the hook's name and ping me, I'll merge.
@elijahbenizzy I changed the name of the hook and merged in main (probably should have rebased).
Current pre/post task execution hooks may run on out-of-process executors, potentially causing issues with certain logging and stateful adapters. This PR adds additional hooks that run before a task is submitted to an executor and after a task future is resolved - both on the main process.
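For illustration, a hook pair along these lines could track in-flight tasks entirely on the main process. This is a sketch with invented names mirroring the hooks described here, not Hamilton's exact API:

```python
import threading


class InFlightTaskTracker:
    """Sketch: counts tasks between submission and resolution.

    Because both hooks fire on the main process, the tracker can hold
    mutable state (a set and a lock) without cross-process concerns.
    Method names are illustrative only.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self.in_flight: set[str] = set()

    def run_before_task_submission(self, *, task_id: str, run_id: str):
        # Fires on the main process, before the task reaches the executor
        with self._lock:
            self.in_flight.add(task_id)

    def run_after_task_resolution(self, *, task_id: str, run_id: str, error=None):
        # Fires on the main process, after the task future resolves
        with self._lock:
            self.in_flight.discard(task_id)


tracker = InFlightTaskTracker()
tracker.run_before_task_submission(task_id="t1", run_id="r1")
assert tracker.in_flight == {"t1"}
tracker.run_after_task_resolution(task_id="t1", run_id="r1")
assert tracker.in_flight == set()
```

An adapter like this would be unreliable with the existing pre/post task execution hooks, since those may fire in a worker process with its own copy of the state.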
Changes

I added task submission / resolution hooks (`BasePreTaskSubmission` and `BasePostTaskResolution`) and adapters (`TaskSubmissionHook` and `TaskResolutionHook`) to the codebase, and updated the main task execution function `run_graph_to_completion` to call the new hooks. In support of this I made `TaskImplementation` hashable via its `task_id` in order to avoid a parallel structure - @elijahbenizzy perhaps in the future this could be used for a queue-based task processor?

How I tested this

I added individual tests for both `TaskSubmissionHook` and `TaskResolutionHook` and added them to `test_multi_hook`.

Notes

- `TaskSubmissionHook` and `TaskResolutionHook`, as well as some other task execution hooks, have been updated. I think we could add some more details - maybe a flow chart of when the individual hooks are called? I can add that if the current changes seem reasonable.
- `TaskResolutionHook` currently has an optional `error` attribute. I am on the fence as to whether or not this is needed. My thinking is that since `TaskFuture` essentially wraps `concurrent.futures.Future` we could forward the `exception` call. Thoughts? Currently `error` is always `None`.

Checklist