-
Notifications
You must be signed in to change notification settings - Fork 955
[CHASM] Pure task processing - GetPureTasks, ExecutePureTasks #7701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
chasm/tree.go
Outdated
if task.PhysicalTaskStatus != physicalTaskStatusCreated { | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm why? We only create physical task for the first pure task, so it's expected that some tasks will have TaskStatusNone.
chasm/tree.go
Outdated
|
||
// Validate the task. If the task is invalid, skip it for processing (it'll be | ||
// removed when the transaction closes). | ||
ok, err := node.validateComponentTask(validateContext, task) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I don't think we can do validation of tasks first and then run all validated them. The execution of one task may invalidates another task. so we have to validate t1 -> execute t1 -> validate t2 -> execute t2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will fix that. I think moving the validate check into ExecutePureTask
probably makes more sense.
chasm/tree.go
Outdated
} | ||
|
||
// Component value must be prepared for validation to work. | ||
if err := node.prepareComponentValue(validateContext); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: we can probably reuse node.Component() with an empty ComponentRef(). Then we only need to implement access rule in that method and task processing side can benefit from it as well.
chasm/tree.go
Outdated
)) | ||
} | ||
|
||
// TODO - validateComponentTask also calls deserializeTask, should share a cached value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in that case, can we refactor validateComponentTask to take in a deserialized task?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, will do.
chasm/tree.go
Outdated
} | ||
|
||
// Ensure this node's component value is hydrated before execution. | ||
ctx := NewContext(context.Background(), n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the timer queue processor should pass in a base context.Context.
chasm/tree.go
Outdated
|
||
// Ensure this node's component value is hydrated before execution. | ||
ctx := NewContext(context.Background(), n) | ||
if err := n.prepareComponentValue(ctx); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm the n
here is always root right? But the task maybe for a child component?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, child tasks are called on the proper child Node receiver now.
chasm/tree.go
Outdated
|
||
validateContext := NewContext(context.Background(), n) | ||
for _, task := range componentAttr.GetPureTasks() { | ||
if task.ScheduledTime.AsTime().After(deadline) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's use ms as the precision for doing comparison. queues.IsTimeExpired() has some more details.
Also caller can't simply use time.Now() as input, since there might be clock skew, which may cause the timer task to be executed before the scheduled time recorded in the state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to truncate to the same precision as queues.
Also caller can't simply use time.Now() as input, since there might be clock skew, which may cause the timer task to be executed before the scheduled time recorded in the state.
The other PR with the caller is using t.Now()
from the timer queue struct's ShardContext, as do other methods, and I'll update it to make use of IsTimeExpired
for its physical task comparison.
//nolint:revive // type cast result is unchecked | ||
return result[0].Interface().(error) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an interesting question here. We are basically assuming for all the tasks processed, they will be invalidated at the end of the transaction.
If the validator implementation has a bug and doesn't invalid the task and the task status of that pure task happens to be Created, then we will won't generate a new physical task and the execution will get stuck.
If we blindly flip task status to be None, then we will have a infinite loop here.
It seems to be the best way is to Validate again after running the task and if the task is still validate return an internal error and DLQ the task.
Or we somehow mark the task as executed in-memory and upon close transaction, where we are already doing task Validating all the tasks, detect this case and return an error.
Not a super important issue but want to bring awareness here. Please help create a task to track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Task to follow up created, OSS-4272
What changed?
GetPureTasks
,ExecutePureTask
to the CHASM Tree (Node
). These will be invoked from the timer queue executors in history service during processing of physical pure tasks.Why?
GetPureTasks
analog, as they keep a directRef
to the component they're targetting, and their physical tasks are 1:1 with logical tasks.How did you test it?