Commit 40e8799

Merge pull request #5923 from crazy-max/run-device-docs
dockerfile: run device docs
2 parents 368e03c + 4da8760 commit 40e8799

1 file changed

+83
-1
lines changed

frontend/dockerfile/docs/reference.md

Lines changed: 83 additions & 1 deletion
@@ -689,7 +689,8 @@ EOF
689689
The available `[OPTIONS]` for the `RUN` instruction are:
690690

691691
| Option | Minimum Dockerfile version |
692-
| ------------------------------- | -------------------------- |
692+
|---------------------------------|----------------------------|
693+
| [`--device`](#run---device) | 1.14-labs |
693694
| [`--mount`](#run---mount) | 1.2 |
694695
| [`--network`](#run---network) | 1.3 |
695696
| [`--security`](#run---security) | 1.1.2-labs |
@@ -707,6 +708,87 @@ guide](https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practi
707708

708709
The cache for `RUN` instructions can be invalidated by [`ADD`](#add) and [`COPY`](#copy) instructions.
709710

711+
### RUN --device
712+
713+
> [!NOTE]
714+
> Not yet available in stable syntax; use the [`docker/dockerfile:1-labs`](#syntax)
715+
> version. It also needs BuildKit 0.20.0 or later.
716+
717+
```dockerfile
RUN --device=name[,required]
```
720+
721+
`RUN --device` allows a build to request [CDI devices](https://github.com/moby/buildkit/blob/master/docs/cdi.md)
722+
to be available to the build step.
723+
724+
The device `name` is provided by the CDI specification registered in BuildKit.
725+
726+
In the following example, multiple devices are registered in the CDI
727+
specification for the `vendor1.com/device` vendor.
728+
729+
```yaml
cdiVersion: "0.6.0"
kind: "vendor1.com/device"
devices:
  - name: foo
    containerEdits:
      env:
        - FOO=injected
  - name: bar
    annotations:
      org.mobyproject.buildkit.device.class: class1
    containerEdits:
      env:
        - BAR=injected
  - name: baz
    annotations:
      org.mobyproject.buildkit.device.class: class1
    containerEdits:
      env:
        - BAZ=injected
  - name: qux
    annotations:
      org.mobyproject.buildkit.device.class: class2
    containerEdits:
      env:
        - QUX=injected
```
756+
757+
The device name format is flexible and accepts various patterns to support
758+
multiple device configurations:
759+
760+
* `vendor1.com/device`: request the first device found for this vendor
761+
* `vendor1.com/device=foo`: request a specific device
762+
* `vendor1.com/device=*`: request all devices for this vendor
763+
* `class1`: request devices whose `org.mobyproject.buildkit.device.class` annotation is `class1`
764+
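The patterns above can be sketched against the `vendor1.com/device` specification from the earlier example. This is a minimal illustration under stated assumptions, not a tested build: the `busybox` base image and the `env | grep` checks are added here for demonstration, and the bare vendor form is assumed to resolve to `foo` as the first device found.

```dockerfile
# syntax=docker/dockerfile:1-labs

# Assumption: busybox is used only because it ships `env` and `grep`.
FROM busybox

# Bare vendor name: first device found (assumed to be `foo` here).
RUN --device=vendor1.com/device \
    env | grep ^FOO=

# Vendor plus device name: one specific device.
RUN --device=vendor1.com/device=bar \
    env | grep ^BAR=

# Wildcard: all devices registered for the vendor.
RUN --device=vendor1.com/device=* \
    env | grep -E '^(FOO|BAR|BAZ|QUX)='

# Class name: devices annotated with class1 (`bar` and `baz` in the spec).
RUN --device=class1 \
    env | grep -E '^(BAR|BAZ)='
```

Each `grep` exits non-zero when nothing matches, so a step would fail if none of the variables it checks for had been injected.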
765+
#### Example: CUDA-Powered LLaMA Inference
766+
767+
In this example we use the `--device` flag to run `llama.cpp` inference using
768+
an NVIDIA GPU device through CDI:
769+
770+
```dockerfile
# syntax=docker/dockerfile:1-labs

FROM scratch AS model
ADD https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q4_K_M.gguf /model.gguf

FROM scratch AS prompt
COPY <<EOF prompt.txt
Q: Generate a list of 10 unique biggest countries by population in JSON with their estimated population in 1900 and 2024. Answer only newline formatted JSON with keys "country", "population_1900", "population_2024" with 10 items.
A:
[
  {

EOF

FROM ghcr.io/ggml-org/llama.cpp:full-cuda-b5124
RUN --device=nvidia.com/gpu=all \
    --mount=from=model,target=/models \
    --mount=from=prompt,target=/tmp \
    ./llama-cli -m /models/model.gguf -no-cnv -ngl 99 -f /tmp/prompt.txt
```
791+
710792
### RUN --mount
711793

712794
```dockerfile
