Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Feat integrate spiffe #1663

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

TessaIO
Copy link
Member

@TessaIO TessaIO commented Apr 14, 2024

Resolves #1186

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 14, 2024
Copy link

netlify bot commented Apr 14, 2024

Deploy Preview for kubernetes-sigs-nfd ready!

Name Link
🔨 Latest commit 9a67b7b
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-nfd/deploys/67e92c285fb2100008293afb
😎 Deploy Preview https://deploy-preview-1663--kubernetes-sigs-nfd.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch 4 times, most recently from b60483c to 2917fe1 Compare April 14, 2024 20:45
@TessaIO
Copy link
Member Author

TessaIO commented Apr 15, 2024

/cc marquiz

@k8s-ci-robot k8s-ci-robot requested a review from marquiz April 15, 2024 06:59
@ArangoGutierrez
Copy link
Contributor

Let's close #1434 ;)

Copy link
Contributor

@ArangoGutierrez ArangoGutierrez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some first comments

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 19, 2024
@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch 2 times, most recently from c433995 to 5e5183d Compare April 22, 2024 16:11
@TessaIO TessaIO requested a review from ArangoGutierrez April 22, 2024 16:12
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 22, 2024
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 26, 2024
@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch from 5e5183d to 4e5af99 Compare April 28, 2024 09:48
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 28, 2024
@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch 2 times, most recently from 65e5b03 to aee2686 Compare April 28, 2024 09:52
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 30, 2024
@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch from aee2686 to 520dd72 Compare May 12, 2024 19:27
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels May 12, 2024
@marquiz
Copy link
Contributor

marquiz commented May 24, 2024

/milestone v0.17

@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch from a73437c to ff883ec Compare February 23, 2025 12:10
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: TessaIO
Once this PR has been reviewed and has the lgtm label, please ask for approval from arangogutierrez. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 23, 2025
@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch 5 times, most recently from e16d13c to fad8248 Compare February 23, 2025 12:56
@marquiz marquiz mentioned this pull request Mar 3, 2025
@ArangoGutierrez ArangoGutierrez requested a review from Copilot March 3, 2025 09:31
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Overview

This PR integrates SPIFFE support to sign and verify NodeFeature objects, addressing issue #1186. Key changes include:

  • Adding SPIFFE configuration options in the Helm chart and deployment templates.
  • Implementing SPIFFE client logic and cryptographic signing/verification functions (with accompanying tests) in pkg/utils/spiffe.
  • Updating nfd-worker and nfd-master to sign and verify NodeFeature objects based on SPIFFE.

Reviewed Changes

File Description
deployment/helm/node-feature-discovery/values.yaml Adds SPIFFE configuration options
pkg/utils/spiffe/spiffe.go Implements SPIFFE signing and verification functions
pkg/utils/spiffe/spiffe_test.go Provides tests for the SPIFFE implementation
docs/reference/* Updates command line references to include SPIFFE flags
pkg/nfd-worker/nfd-worker.go Integrates SPIFFE signing into the worker’s CRD creation/update
pkg/nfd-master/nfd-master.go Integrates SPIFFE verification in node feature reconciliation
cmd/nfd-worker/main.go & cmd/nfd-master/main.go Adds CLI flags for enabling SPIFFE

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.

@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch from fad8248 to ca92e5a Compare March 9, 2025 18:40
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 9, 2025
@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch from ca92e5a to b1d33c2 Compare March 9, 2025 18:41
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 9, 2025
@TessaIO
Copy link
Member Author

TessaIO commented Mar 9, 2025

@ArangoGutierrez @marquiz any other comments about implementation? (I will start working on docs once we agree on the implementation)

@marquiz
Copy link
Contributor

marquiz commented Mar 12, 2025

any other comments about implementation? (I will start working on docs once we agree on the implementation)

Thank you @TessaIO for the update. The implementation generally looks feasible to me.

The only problem I see, found in my testing, is that now the NF object changes every 60s (or whatever the worker sleepInterval is set to). The data hash does not change but the signature changes -> NF is updated -> trickles down to nfd-master getting update from every node every 60s.

I think this needs to be solved. We should cache the hash (and the signature?) in nfd-worker and not re-sign if it hasn't changed, with forced re-signing every 24 hours or smth. We could also address this in a separate PR.

Any thoughts or comments @TessaIO @ArangoGutierrez?

@TessaIO TessaIO changed the title Feat integrate spiffe [WIP] Feat integrate spiffe Mar 23, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 23, 2025
@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch 2 times, most recently from 29d0a91 to 73e31a8 Compare March 23, 2025 13:58
}

dataHash := sha256.Sum256([]byte(stringifyData))
if signature, ok := hash_signature_cache[string(dataHash[:])]; ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the caching I started to think that the signing certificates have a TTL, and thus the signature also becomes expired at some point, right(?) What is the default ttl?

I started to wonder that it might be necessary for the nfd-worker to also (every time) validate that the signature is still valid. If not -> re-sign the data.

Comments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's a good point. I was able to reproduce it by reducing the TTL to 1m, and I found that the master wasn't able to verify the signature. I think it's necessary to pass the TTL to the worker so that each time it invalidates the cache, it re-signs again.

Comment on lines +164 to +165
path: /run/spire/agent-sockets/api.sock
type: Socket
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I need come take back my earlier comment about the mounts 🤦‍♂️ If the spire-agent pod is deleted and re-created (for whatever reason) the nfd pods will permanently lose the spire sock (as the underlying socket changes), if I'm not mistaken. Thus a possible fix would be (and ditto nfd-worker).

            path: /run/spire/agent-sockets
            type: Directory

Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. api.sock is a symlink to spire-agent.sock, so if spire-agent gets restarted or re-created we still have a symlink to the same file name that I think kubelet updates that periodically, right?

@TessaIO TessaIO force-pushed the feat-integrate-spiffe branch from 73e31a8 to 9a67b7b Compare March 30, 2025 11:33
@k8s-ci-robot
Copy link
Contributor

@TessaIO: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-node-feature-discovery-build-image-cross-generic 9a67b7b link true /test pull-node-feature-discovery-build-image-cross-generic

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

}

dataHash := sha256.Sum256([]byte(stringifyData))
if len(latestSignature) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd still need to check the hash too, right? Only return the cached signature if the cached hash matches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Spiffe support
5 participants