Skip to content

[no-relnote] Add e2e test for firmware path traversal #1169

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 3, 2025

Conversation

ArangoGutierrez
Copy link
Collaborator

@ArangoGutierrez ArangoGutierrez commented Jun 30, 2025

A container image could be crafted with a symbolic link in
/lib/firmware/nvidia/ that points to a location outside of the
container's root filesystem. When running such a container with the
NVIDIA Container Toolkit, this could potentially lead to files being
created on the host filesystem.

This change adds an end-to-end test to ensure that the toolkit is not
vulnerable to this kind of path traversal attack.

The test case covers two scenarios:

  • When using CDI, which is the default, the test ensures that the
    container runs without error and no files are created on the host.
  • When using the legacy nvidia-container-runtime-hook, the test
    ensures that the operation fails with a specific path error,
    preventing any file creation on the host.

@ArangoGutierrez ArangoGutierrez self-assigned this Jun 30, 2025
Copilot

This comment was marked as outdated.

@coveralls
Copy link

coveralls commented Jun 30, 2025

Pull Request Test Coverage Report for Build 16054009949

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 33.117%

Totals Coverage Status
Change from base Build 16051860676: 0.0%
Covered Lines: 4381
Relevant Lines: 13229

💛 - Coveralls

@ArangoGutierrez ArangoGutierrez force-pushed the b/5366608 branch 2 times, most recently from 80b5ea1 to 13ca3e3 Compare June 30, 2025 13:04
@ArangoGutierrez ArangoGutierrez force-pushed the b/5366608 branch 7 times, most recently from 8778e11 to f47a6be Compare July 1, 2025 07:45
@ArangoGutierrez ArangoGutierrez requested review from elezar and Copilot July 1, 2025 08:00
Copilot

This comment was marked as outdated.

Copilot

This comment was marked as outdated.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR updates the test runner to propagate stderr on failures and adds an end-to-end test validating firmware file creation under CDI vs. legacy runtimes.

  • Propagate actual stderr from runner.Run when a script fails
  • Add new E2E tests to ensure CDI injection prevents host-side firmware file creation
  • Import fmt to support formatted commands in tests

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
tests/e2e/runner.go Return stderr.String() as the second value on errors
tests/e2e/nvidia-container-toolkit_test.go Introduce firmware-file-creation tests with setup and cleanup
Comments suppressed due to low confidence (2)

tests/e2e/nvidia-container-toolkit_test.go:307

  • This ignores the err return from ls; adding _, _, err := runner.Run(...) followed by Expect(err).ToNot(HaveOccurred()) will ensure any unexpected failures are caught.
			output, _, _ := runner.Run(fmt.Sprintf("ls -A %s", outputDir))

tests/e2e/nvidia-container-toolkit_test.go:277

  • [nitpick] The variable dockerBuildDockerfile holds the Dockerfile content; renaming it to something like dockerfileContent could improve clarity.
FROM ubuntu

@ArangoGutierrez ArangoGutierrez requested a review from elezar July 2, 2025 14:56
@ArangoGutierrez ArangoGutierrez changed the title [no-relnote] Add test to check CDI injection by default [no-relnote] Add test to check we don't leak firmware Jul 3, 2025
@ArangoGutierrez ArangoGutierrez changed the title [no-relnote] Add test to check we don't leak firmware [no-relnote] Add e2e test for firmware path traversal Jul 3, 2025
output, _, err := runner.Run("docker run --rm --runtime=nvidia --gpus=all firmware-test")
Expect(err).ToNot(HaveOccurred())
Expect(output).To(BeEmpty())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our other CDI tests we use:

Suggested change
output, _, err := runner.Run("docker run --rm --runtime=nvidia --gpus=runtime.nvidia.com/gpu=all firmware-test")

It("should fail when using the nvidia-container-runtime-hook", Label("legacy"), func(ctx context.Context) {
_, stderr, err := runner.Run("docker run --rm --runtime=runc --gpus=all firmware-test")
Expect(err).To(HaveOccurred())
Expect(stderr).To(ContainSubstring("nvidia-container-cli.real: mount error: path error:"))
Copy link
Member

@elezar elezar Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a test for the nvidia-container-runtime too:

Suggested change
Expect(stderr).To(ContainSubstring("nvidia-container-cli.real: mount error: path error:"))
It("should not fail when using the nvidia-container-runtime-hook", func(ctx context.Context) {
_, _, err := runner.Run("docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all firmware-test")
Expect(err).ToNot(HaveOccurred())
output, _, _ := runner.Run(fmt.Sprintf("ls -A %s", outputDir))
Expect(output).To(BeEmpty())
}
It("should fail when using the nvidia-container-runtime-hook", Label("legacy"), func(ctx context.Context) {

Comment on lines +347 to 354
})
})
Copy link
Member

@elezar elezar Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to move the check for empty output folders to an AfterEach so that we don't have to repeat it for every test (and also don't forget to add it to new tests).

Expect(output).To(BeEmpty())
})

It("should not fail when using the nvidia-container-runtime-hook", func(ctx context.Context) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might have been a typo in my suggestion:

Suggested change
It("should not fail when using the nvidia-container-runtime-hook", func(ctx context.Context) {
It("should not fail when using the nvidia-container-runtime", func(ctx context.Context) {

Comment on lines 326 to 328
output, _, _ := runner.Run(fmt.Sprintf("ls -A %s", outputDir))
Expect(output).To(BeEmpty())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(the string concatenation is a nit)

Suggested change
output, _, _ := runner.Run(fmt.Sprintf("ls -A %s", outputDir))
Expect(output).To(BeEmpty())
output, _, err := runner.Run("ls -A " + outputDir)
Expect(err).ToNot(HaveOccurred())
Expect(output).To(BeEmpty())

A container image could be crafted with a symbolic link in
/lib/firmware/nvidia/ that points to a location outside of the
container's root filesystem. When running such a container with the
NVIDIA Container Toolkit, this could potentially lead to files being
created on the host filesystem.

This change adds an end-to-end test to ensure that the toolkit is not
vulnerable to this kind of path traversal attack.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
@ArangoGutierrez ArangoGutierrez merged commit 413643b into NVIDIA:main Jul 3, 2025
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants