-
Notifications
You must be signed in to change notification settings - Fork 5.9k
br: enable parallel restore #58724
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
br: enable parallel restore #58724
Conversation
Skipping CI for Draft Pull Request. |
Hi @Tristan1900. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #58724 +/- ##
================================================
+ Coverage 73.1437% 74.6872% +1.5435%
================================================
Files 1726 1728 +2
Lines 478033 484124 +6091
================================================
+ Hits 349651 361579 +11928
+ Misses 106948 100423 -6525
- Partials 21434 22122 +688
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
c089404
to
13e1419
Compare
d6875ff
to
d33d597
Compare
/retest |
@Tristan1900: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
973c93a
to
2854159
Compare
/retest |
@Tristan1900: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 14 out of 21 changed files in this pull request and generated no comments.
Files not reviewed (7)
- br/pkg/registry/BUILD.bazel: Language not supported
- br/pkg/restore/BUILD.bazel: Language not supported
- br/pkg/task/BUILD.bazel: Language not supported
- br/tests/br_parallel_restore/run.sh: Language not supported
- br/tests/br_pitr/run.sh: Language not supported
- br/tests/br_restore_checkpoint/run.sh: Language not supported
- br/tests/run_group_br_tests.sh: Language not supported
Comments suppressed due to low confidence (1)
br/pkg/utils/schema.go:34
- The comment contains a typo ('wheterh'); it should be corrected to 'whether'.
// IsTemplateSysDB checks wheterh the dbname is temporary system database(__TiDB_BR_Temporary_mysql or __TiDB_BR_Temporary_sys).
@@ -0,0 +1,523 @@ | |||
// Copyright 2025 PingCAP, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to add unit tests for the package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me see what I can do, it's calling executor so not sure unit test is a good choice and is testing what we want
use testKit
br/pkg/registry/registration.go
Outdated
filter_strings(255), | ||
start_ts, | ||
restored_ts, | ||
upstream_cluster_id, | ||
with_sys_table, | ||
cmd(255) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for cmd
column, maybe it is better to use enum type?
for filter_strings
, I think 255 is not long enough for type Text.
ERROR 1062 (23000): Duplicate entry '1234567890123456789012345678901234567890123456789012345678901234' for key 't1.unique_registration_params'
Do you actually want KEY
instead of UNIQUE KEY
?
AND upstream_cluster_id = %%? | ||
AND with_sys_table = %%? | ||
AND cmd = %%? | ||
ORDER BY id DESC LIMIT 1` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need limit 1
because there must be only one row in the final result.
br/pkg/registry/registration.go
Outdated
} | ||
// task exists but was either created by another process or has been running | ||
// check if it's in running state | ||
runningRows, _, err := execCtx.ExecRestrictedSQL( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to select again? runningRows[0]
must be the same as rows[0]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yes you are right
|
||
// ResumeOrCreateRegistration first looks for an existing registration with the given parameters. | ||
// If found and paused, it tries to resume it. Otherwise, it creates a new registration. | ||
func (r *Registry) ResumeOrCreateRegistration(ctx context.Context, info RegistrationInfo) (uint64, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrap by
r.se.ExecuteInternal("begin pessimistic;")
...
r.se.ExecuteInternal("commit/rollback;")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my initial attempt but had to fall back to current version of using multiple transaction, maybe we can discuss offline cuz based on my observation ExecRestrictedSQL
doesn't work with nested transaction, it appears to be a single transaction for admin purpose. But maybe I'm wrong.
if hasCheckpointPersisted(c, cfg) { | ||
log.Info("pausing restore task from registry", | ||
zap.Uint64("restoreId", cfg.RestoreID), zap.Error(restoreError)) | ||
if err := restoreRegistry.PauseTask(c, cfg.RestoreID); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if BR is killed because OOM? BR might not have chance to pause the task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call, let me think about that
br/pkg/registry/registration.go
Outdated
return 0, errors.Annotatef(err, "failed to create new registration") | ||
} | ||
|
||
// check if a row with our parameters exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe after introduce a txn (begin ... commit), BR does not need to check again.
Signed-off-by: Wenqi Mou <[email protected]>
Signed-off-by: Wenqi Mou <[email protected]>
Signed-off-by: Wenqi Mou <[email protected]>
Signed-off-by: Wenqi Mou <[email protected]>
Signed-off-by: Wenqi Mou <[email protected]>
Signed-off-by: Wenqi Mou <[email protected]>
Signed-off-by: Wenqi Mou <[email protected]>
cfed383
to
9d64c1d
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. |
Signed-off-by: Wenqi Mou <[email protected]>
9d64c1d
to
ad82903
Compare
@Tristan1900: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #58725
Problem Summary:
What changed and how does it work?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.