Skip to content
This repository was archived by the owner on Jul 11, 2023. It is now read-only.

Add MeshConfig upgrade when upgrading a mesh #4386

Merged

Conversation

trstringer
Copy link
Contributor

Description:

This fixes issue #4371. Prior to this fix if there was a change to the envoy sidecar image then this would not get reflected or upgraded in the mesh config. Upon further testing we cannot just patch that in because if there are missing required keys (as indicated by the CRD) then the patch itself will fail (see the Testing done section below on the missing required key error when attempting to patch).

This PR adds the functionality for a complete upgrade of the mesh config by doing two things:

  1. Compare CRD required fields with the current MeshConfig. If there are missing required keys then patch them in.
  2. Compare the envoy sidecar image. If the default has been upgraded between the old and the new version and the user has not modified the default in their mesh config, then upgrade this value.

Testing done:

Integration tests and manual testing.

Prior to this change:

 $ osm version
Version: v0.9.2; Commit: 7df9badece2f251515a2bc5e5ac79438b0c58812; Date: 2021-08-19-21:02

 $ osm install
OSM installed successfully in namespace [osm-system] with mesh name [osm]

 $ git log --oneline -1
8dccabbf (HEAD -> main, upstream/main) injector: rename iptables chains for clarity (#4379)

 $ go run ./cmd/cli mesh upgrade
OSM successfully upgraded mesh [osm] in namespace [osm-system]

Even though the mesh was upgraded, the envoy sidecar image remains the old version in the mesh config:

 $ kubectl get configmap -n osm-system preset-mesh-config -o jsonpath='{.data.preset-mesh-config\.json}' | jq -r '.sidecar.envoyImage'
envoyproxy/envoy-alpine@sha256:6502a637c6c5fba4d03d0672d878d12da4bcc7a0d0fb3f1d506982dde0039abd

 $ kubectl get meshconfig -n osm-system osm-mesh-config -o jsonpath='{.spec.sidecar.envoyImage}'
envoyproxy/envoy-alpine:v1.18.3

If you try to patch that it errors because of a missing key:

 $ kubectl patch meshconfig -n osm-system osm-mesh-config --type=json --patch='[{"op":"replace","path":"/spec/sidecar/envoyImage","value":"envoyproxy/envoy-alpine@sha256:6502a637c6c5fba4d03d0672d878d12da4bcc7a0d0fb3f1d506982dde0039abd"}]'
The MeshConfig "osm-mesh-config" is invalid: spec.certificate.certKeyBitSize: Required value

With this PR:

 $ kubectl get meshconfig -n osm-system osm-mesh-config -o jsonpath='{.spec.sidecar.envoyImage}'
envoyproxy/envoy-alpine@sha256:6502a637c6c5fba4d03d0672d878d12da4bcc7a0d0fb3f1d506982dde0039abd

 $ kubectl get meshconfig -n osm-system osm-mesh-config -o jsonpath='{.spec.certificate.certKeyBitSize}'
2048

Affected area:

Functional Area
Upgrade [X]

Please answer the following questions with yes/no.

  1. Does this change contain code from or inspired by another project?

No.

  1. Is this a breaking change?

No.

@trstringer trstringer force-pushed the thstring/config-upgrade-fix branch 2 times, most recently from a6bb189 to 2d37b9e Compare December 3, 2021 00:41
@trstringer trstringer marked this pull request as draft December 3, 2021 16:08
@nojnhuh nojnhuh linked an issue Dec 3, 2021 that may be closed by this pull request
@nojnhuh
Copy link
Contributor

nojnhuh commented Dec 3, 2021

I know this is still WIP so I haven't taken a close look yet, but a few initial thoughts:

It seems like the simplest solution here would be to default the image name/tag in the injector and leave it empty in the MeshConfig. Then the upgraded injector would implicitly start using its updated default. This was the behavior until #4131. Of the reasons @shashankram describes there, I think the biggest problem with using a default was that the default would be used if there was some error reading the value from the MeshConfig. I'd argue we should fail the injection altogether in that case. Otherwise, I don't see any issues with using a default if no value is specified in the MeshConfig.

Going forward, adding new required fields to the CRD should result in an appropriate version bump for the API (v1alpha1 -> v1alpha2) and we should either have a conversion webhook to handle both versions or force the user to update their MeshConfigs to the new API themselves. That would handle any new required fields, so working around that here doesn't seem worth it to me unless we need this behavior for upgrading v0.9 to v0.11.

@shashankram
Copy link
Member

I know this is still WIP so I haven't taken a close look yet, but a few initial thoughts:

It seems like the simplest solution here would be to default the image name/tag in the injector and leave it empty in the MeshConfig. Then the upgraded injector would implicitly start using its updated default. This was the behavior until #4131. Of the reasons @shashankram describes there, I think the biggest problem with using a default was that the default would be used if there was some error reading the value from the MeshConfig. I'd argue we should fail the injection altogether in that case. Otherwise, I don't see any issues with using a default if no value is specified in the MeshConfig.

Going forward, adding new required fields to the CRD should result in an appropriate version bump for the API (v1alpha1 -> v1alpha2) and we should either have a conversion webhook to handle both versions or force the user to update their MeshConfigs to the new API themselves. That would handle any new required fields, so working around that here doesn't seem worth it to me unless we need this behavior for upgrading v0.9 to v0.11.

The reason defaults are not to be encoded in the binary is because the image digests are not known. We should avoid going back to encoding defaults in the Go code if possible, and always infer it through the MeshConfig or other means.

@nojnhuh
Copy link
Contributor

nojnhuh commented Dec 6, 2021

The reason defaults are not to be encoded in the binary is because the image digests are not known. We should avoid going back to encoding defaults in the Go code if possible, and always infer it through the MeshConfig or other means.

The default doesn't necessarily have to be a constant in Go. We could pass it on the command line or as an environment variable when the injector starts so it can still be defined once in values.yaml like it is now.

@shashankram
Copy link
Member

The reason defaults are not to be encoded in the binary is because the image digests are not known. We should avoid going back to encoding defaults in the Go code if possible, and always infer it through the MeshConfig or other means.

The default doesn't necessarily have to be a constant in Go. We could pass it on the command line or as an environment variable when the injector starts so it can still be defined once in values.yaml like it is now.

That would work. Just want to be mindful about not hardcoding image defaults in the Go code.

ability to specify the images through environment variables

Signed-off-by: Thomas Stringer <[email protected]>
@trstringer trstringer force-pushed the thstring/config-upgrade-fix branch from 51bce67 to 9546904 Compare December 8, 2021 13:26
@trstringer
Copy link
Contributor Author

This was a complete rewrite of the original PR, so the first message in this thread is no longer completely relevant.

From the feedback, I refactored this to use environment variables if there is no setting in the MeshConfig for the following images: envoy image, envoy windows image, and the init container image.

The default was also removed from the mesh config CRD, as that would auto-populate the MeshConfig resource with the defaults.

This does not solve the problem with missing required parameters (in the original case, certKeyBitSize) but as @nojnhuh mentioned this is more of a process issue with incrementing the resource version and dealing with the changes through a conversion webhook.

@trstringer trstringer marked this pull request as ready for review December 8, 2021 13:36
@codecov-commenter
Copy link

Codecov Report

Merging #4386 (5da4e39) into main (8dccabb) will increase coverage by 0.05%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4386      +/-   ##
==========================================
+ Coverage   69.27%   69.32%   +0.05%     
==========================================
  Files         212      212              
  Lines       14455    14476      +21     
==========================================
+ Hits        10013    10035      +22     
+ Misses       4390     4389       -1     
  Partials       52       52              
Flag Coverage Δ
unittests 69.32% <100.00%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/configurator/methods.go 73.38% <100.00%> (+2.85%) ⬆️
cmd/cli/namespace_add.go 83.63% <0.00%> (-0.53%) ⬇️
pkg/ticker/ticker.go 87.17% <0.00%> (+3.84%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8dccabb...5da4e39. Read the comment docs.

@trstringer trstringer requested a review from nojnhuh December 8, 2021 20:46
@nojnhuh nojnhuh merged commit 98aef6d into openservicemesh:main Dec 9, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upgrading OSM does not upgrade default values in the MeshConfig
4 participants