Skip to content

[CHASM] timer queue pure task processing logic #7702

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: chasm_tree_pure_tasks
Choose a base branch
from

Conversation

lina-temporal
Copy link
Contributor

What changed?

  • Added physical pure task processing logic to the timer queue executors.

Why?

How did you test it?

  • Pending; PR is in draft

Potential risks

Documentation

Is hotfix candidate?

@lina-temporal lina-temporal requested review from alexshtin and yycptt May 2, 2025 02:33
Comment on lines 146 to 165
// We haven't done any work, return without committing.
if processedTimers == 0 {
return nil, nil
}

if t.config.EnableUpdateWorkflowModeIgnoreCurrent() {
return nil, wfContext.UpdateWorkflowExecutionAsPassive(ctx, t.shardContext)
}

// TODO: remove following code once EnableUpdateWorkflowModeIgnoreCurrent config is deprecated.
if mutableState.GetExecutionState().State == enumsspb.WORKFLOW_EXECUTION_STATE_COMPLETED {
// Can't use UpdateWorkflowExecutionAsPassive since it updates the current run,
// and we are operating on a closed workflow.
return nil, wfContext.SubmitClosedWorkflowSnapshot(
ctx,
t.shardContext,
historyi.TransactionPolicyPassive,
)
}
return nil, wfContext.UpdateWorkflowExecutionAsPassive(ctx, t.shardContext)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

standby executor doesn't need this part I think.


tree := ms.ChasmTree()
if tree == nil {
return 0, consts.ErrStaleReference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: shall we return an internal error here? I think this is not expected?

return 0, fmt.Errorf("failed type assertion for ChasmTree")
}

pureTasks, err := tree.GetPureTasks(t.Now())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented in the previous PR, t.Now() is not good enough. Check queues.IsTimeExpired().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. I address skew between DB and physical task queue within the timer task queue executor, and then also truncate times against the reference time within the CHASM tree.

// If this line of code is reached, the task's Validate() function returned no error, which indicates
// that it is still expected to run. Return ErrTaskRetry to wait the machine to transition on the active
// cluster.
return consts.ErrTaskRetry
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slightly concerned that the verification may never completes if new valid tasks are keep getting generated.

We probably want to add VT to the physical pure tasks and only execute/verify those tasks generated <= the VT recorded in the physical task. Can you help create a follow up task on this? Not high priority and we can address it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created OSS-4273 for follow-up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants