Set the database ID annotations on replicated policies #165
This is now ready for review.
@@ -280,6 +284,7 @@ e2e-stop-instrumented:
 
 .PHONY: e2e-test-coverage
 e2e-test-coverage: E2E_TEST_ARGS = --json-report=report_e2e.json --output-dir=.
+e2e-test-coverage: E2E_TEST_CODE_ARGS = --compliance-api-port=8385
What is the difference between 8385 and 8384?
If the Kind cluster is running, it exposes port 8384, so that port can't be reused when running e2e-test-coverage,
since the controller runs outside of the cluster.
@@ -35,6 +37,35 @@ var (
 	ErrInvalidLabelValue = errors.New("unexpected format of label value")
 )
 
+type GuttedObject struct {
FYI: I had to move GuttedObject to prevent an import loop.
Does anything reset the cached IDs when the database connection changes? It seems like a migration might lead to those IDs being changed, but maybe I missed something.
controllers/complianceeventsapi/complianceeventsapi_controller.go
log.Error(
	err,
	"Failed to get the database ID of the parent policy",
	"namespace", replicatedPolicy.Namespace,
	"name", replicatedPolicy.Name,
)
I was trying to work out what the rest of the function does after this, because I was wondering whether it could just queue up the request and then return early, which might be cleaner and faster.
It seems like it skips trying to connect to the database again and just uses whatever it already has in its cache. But when the cache is missing something, it deletes that annotation from the policy. So in a case where the cache is empty because the controller just restarted, and it (for some reason) can't connect to the database yet, it seems like it removes all those annotations? Is that right, and is that fully intentional?
I was originally thinking you can't guarantee accuracy if the DB is down and there is no cache entry, but you're right, we can verify the replicated policy didn't change based on the DB unique constraints and reuse the ID from the replicated policy.
I just added this.
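The fallback described in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the function name resolveParentID, the int32 ID type, the map-based cache, and the unchanged flag (standing in for whatever check confirms that the fields covered by the database unique constraints did not change) are all hypothetical. Only the annotation name comes from the PR description.

```go
package main

import (
	"fmt"
	"strconv"
)

// Annotation name taken from the PR description.
const parentDBIDAnnotation = "policy.open-cluster-management.io/parent-policy-compliance-db-id"

// resolveParentID is a hypothetical helper for the case where the database is
// unreachable. It prefers the in-memory cache and otherwise reuses the ID that
// is already set on the replicated policy, but only when the caller has
// confirmed (unchanged == true) that the identifying fields did not change.
func resolveParentID(
	cache map[string]int32, key string, annotations map[string]string, unchanged bool,
) (int32, bool) {
	if id, ok := cache[key]; ok {
		return id, true
	}

	if !unchanged {
		// The existing annotation may describe an older version of the policy.
		return 0, false
	}

	raw, ok := annotations[parentDBIDAnnotation]
	if !ok {
		return 0, false
	}

	id, err := strconv.ParseInt(raw, 10, 32)
	if err != nil || id <= 0 {
		return 0, false
	}

	return int32(id), true
}

func main() {
	cache := map[string]int32{} // empty, as after a controller restart
	annotations := map[string]string{parentDBIDAnnotation: "7"}

	id, ok := resolveParentID(cache, "some-key", annotations, true)
	fmt.Println(id, ok)
}
```

With this shape, a restart followed by a database outage keeps the existing annotation instead of removing it, and the annotation is only dropped when it cannot be trusted.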
@JustinKuli good catch! I'm now clearing the cache after a database migration.
When the compliance events API is enabled, the replicated policies will contain the annotation policy.open-cluster-management.io/parent-policy-compliance-db-id and each of its policy-templates entries will have the policy.open-cluster-management.io/policy-compliance-db-id annotation.
As part of this, resilience to database connection losses and changes was added. This is done by monitoring the database connection. If the database connection is down, the replicated policy controller will queue up reconcile requests to add the database-specific annotations if the answer isn't already cached. Once the database connection is restored, the queued-up reconciles will be triggered.
Note that now every time a database connection changes, a database migration is run. This is an idempotent action, and it's there to catch the case where the database server has been swapped out or restored to an older backup which may not have the latest database schema.
Relates: https://issues.redhat.com/browse/ACM-6889
Signed-off-by: mprahl <[email protected]>
@@ -332,8 +335,77 @@ func (p *ParentPolicy) GetOrCreate(ctx context.Context, db *sql.DB) error {
 	return getOrCreate(ctx, db, p)
 }
 
-func (p ParentPolicy) key() string {
-	return fmt.Sprintf("%s;%v;%v;%v", p.Name, p.Categories, p.Controls, p.Standards)
+func (p ParentPolicy) Key() string {
Is this key used as the parent policy's ID in the database? I can't find where this key
is inserted into the database as an ID. If it isn't, could we add resourceVersion
to it and use that as an ID? ParentPolicy doesn't need to save status in the database, right?
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: mprahl, yiraeChristineKim
Note that the second commit enables the compliance API to be exposed from the KinD cluster.
When the compliance events API is enabled, the replicated policies will
contain the annotation
policy.open-cluster-management.io/parent-policy-compliance-db-id and
each of its policy-templates entries will have the
policy.open-cluster-management.io/policy-compliance-db-id annotation.
As part of this, resilience to database connection losses and changes
was added. This is done by monitoring the database connection. If the
database connection is down, the replicated policy controller will queue
up reconcile requests to add the database specific annotations if the
answer isn't already cached. Once the database connection is restored,
the queued up reconciles will be triggered.
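
The queueing behavior described above can be pictured with a small sketch. It is a hedged illustration, not the controller's actual implementation: the request and pendingQueue types and their methods are hypothetical names, and a real controller would hand the drained requests to its reconcile machinery rather than print them.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// request identifies a replicated policy that still needs its database ID
// annotations. The type and field names are hypothetical.
type request struct {
	Namespace string
	Name      string
}

// pendingQueue remembers reconcile requests that could not be completed while
// the database was unreachable. Duplicate requests are collapsed.
type pendingQueue struct {
	mu      sync.Mutex
	pending map[request]struct{}
}

func newPendingQueue() *pendingQueue {
	return &pendingQueue{pending: map[request]struct{}{}}
}

// Add records a request to retry once the connection is restored.
func (q *pendingQueue) Add(r request) {
	q.mu.Lock()
	defer q.mu.Unlock()

	q.pending[r] = struct{}{}
}

// Drain returns every pending request in a stable order and empties the queue.
func (q *pendingQueue) Drain() []request {
	q.mu.Lock()
	defer q.mu.Unlock()

	out := make([]request, 0, len(q.pending))
	for r := range q.pending {
		out = append(out, r)
	}

	q.pending = map[request]struct{}{}

	sort.Slice(out, func(i, j int) bool {
		if out[i].Namespace != out[j].Namespace {
			return out[i].Namespace < out[j].Namespace
		}

		return out[i].Name < out[j].Name
	})

	return out
}

func main() {
	q := newPendingQueue()

	// Database is down and the IDs are not cached: remember the policies.
	q.Add(request{Namespace: "cluster1", Name: "policy-a"})
	q.Add(request{Namespace: "cluster1", Name: "policy-a"}) // duplicate, collapsed
	q.Add(request{Namespace: "cluster2", Name: "policy-b"})

	// Database is back: trigger one reconcile per remembered policy.
	for _, r := range q.Drain() {
		fmt.Printf("reconcile %s/%s\n", r.Namespace, r.Name)
	}
}
```

Storing the requests in a set rather than a list keeps a flapping connection from piling up repeated reconciles for the same policy.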
Note that now every time a database connection changes, a database
migration is run. This is an idempotent action and it's to catch the
case where the database server has been swapped out or restored to an
older backup which may not have the latest database schema.
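
To illustrate why rerunning the migration on every connection change is safe, here is a minimal in-memory sketch. It assumes a version-numbered migration list, which is a common approach but not something this PR confirms; the database struct, the migration names, and onConnectionChange are hypothetical. The cache clearing mirrors the review reply about clearing cached IDs after a migration.

```go
package main

import "fmt"

// database is an in-memory stand-in for the real database server.
type database struct {
	schemaVersion int
	applied       []string
}

// migrations is a hypothetical ordered list of schema changes.
var migrations = []string{
	"create parent_policies",
	"create policies",
	"create compliance_events",
}

// migrate applies only the migrations newer than the current schema version,
// so running it again on an up-to-date database changes nothing.
func migrate(db *database) int {
	applied := 0

	for v := db.schemaVersion; v < len(migrations); v++ {
		db.applied = append(db.applied, migrations[v])
		db.schemaVersion = v + 1
		applied++
	}

	return applied
}

// onConnectionChange reruns the migration and drops every cached ID, because
// a swapped or restored server may have assigned different IDs.
func onConnectionChange(db *database, cache map[string]int32) int {
	applied := migrate(db)

	for k := range cache {
		delete(cache, k)
	}

	return applied
}

func main() {
	cache := map[string]int32{"some-parent-policy-key": 4}

	fresh := &database{}
	fmt.Println(onConnectionChange(fresh, cache)) // all migrations applied
	fmt.Println(onConnectionChange(fresh, cache)) // nothing left to apply

	// A server restored from an older backup only receives what it lacks.
	restored := &database{schemaVersion: 1}
	fmt.Println(onConnectionChange(restored, cache))
}
```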
Relates:
https://issues.redhat.com/browse/ACM-6889