The available `[OPTIONS]` for the `RUN` instruction are:

| Option                          | Minimum Dockerfile version |
-| ------------------------------- | -------------------------- |
+| ------------------------------- | -------------------------- |
+| [`--device`](#run---device)     | 1.14-labs                  |
| [`--mount`](#run---mount)       | 1.2                        |
| [`--network`](#run---network)   | 1.3                        |
| [`--security`](#run---security) | 1.1.2-labs                 |
@@ -707,6 +708,87 @@ guide](https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practi

The cache for `RUN` instructions can be invalidated by [`ADD`](#add) and [`COPY`](#copy) instructions.

+### RUN --device
+
+> [!NOTE]
+> Not yet available in stable syntax. Use the [`docker/dockerfile:1-labs`](#syntax)
+> version. It also needs BuildKit 0.20.0 or later.
+
+```dockerfile
+RUN --device=name,[required]
+```
+
+`RUN --device` allows a build to request [CDI devices](https://github.com/moby/buildkit/blob/master/docs/cdi.md)
+to be available to the build step.
+
+The device `name` is provided by the CDI specification registered in BuildKit.
+
+In the following example, multiple devices are registered in the CDI
+specification for the `vendor1.com/device` vendor.
+
+```yaml
+cdiVersion: "0.6.0"
+kind: "vendor1.com/device"
+devices:
+  - name: foo
+    containerEdits:
+      env:
+        - FOO=injected
+  - name: bar
+    annotations:
+      org.mobyproject.buildkit.device.class: class1
+    containerEdits:
+      env:
+        - BAR=injected
+  - name: baz
+    annotations:
+      org.mobyproject.buildkit.device.class: class1
+    containerEdits:
+      env:
+        - BAZ=injected
+  - name: qux
+    annotations:
+      org.mobyproject.buildkit.device.class: class2
+    containerEdits:
+      env:
+        - QUX=injected
+```
+
+The device name format is flexible and accepts various patterns to support
+multiple device configurations:
+
+* `vendor1.com/device`: request the first device found for this vendor
+* `vendor1.com/device=foo`: request a specific device
+* `vendor1.com/device=*`: request all devices for this vendor
+* `class1`: request devices by `org.mobyproject.buildkit.device.class` annotation
+
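The four name patterns above can be sketched as separate build steps. This is a minimal illustration, not part of the change itself: it assumes the `vendor1.com/device` CDI specification shown earlier is registered in BuildKit, that `foo` is the first device found for the vendor, and it uses a `busybox` base image purely as a convenient shell. Each step checks for the environment variables that the spec's `containerEdits` would inject.

```dockerfile
# syntax=docker/dockerfile:1-labs

# busybox is an assumption; any image with a shell, env and grep would do.
FROM busybox

# Vendor only: request the first device found for the vendor
# (assumed here to be `foo`, the first entry in the spec).
RUN --device=vendor1.com/device \
    env | grep ^FOO=

# Vendor and device name: request one specific device.
RUN --device=vendor1.com/device=foo \
    env | grep ^FOO=

# Wildcard: request all devices for the vendor.
RUN --device=vendor1.com/device=* \
    env | grep -E '^(FOO|BAR|BAZ|QUX)='

# Class annotation: request devices annotated `class1` (bar and baz in the spec).
RUN --device=class1 \
    env | grep -E '^(BAR|BAZ)='
```

Because `grep` exits with a non-zero status when nothing matches, a step fails if none of the expected variables were injected, which makes these steps a quick check that the requested devices were attached.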
+#### Example: CUDA-Powered LLaMA Inference
+
+In this example we use the `--device` flag to run `llama.cpp` inference using
+an NVIDIA GPU device through CDI:
+
+```dockerfile
+# syntax=docker/dockerfile:1-labs
+
+FROM scratch AS model
+ADD https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q4_K_M.gguf /model.gguf
+
+FROM scratch AS prompt
+COPY <<EOF prompt.txt
+Q: Generate a list of 10 unique biggest countries by population in JSON with their estimated population in 1900 and 2024. Answer only newline formatted JSON with keys "country", "population_1900", "population_2024" with 10 items.
+A:
+[
+{
+
+EOF
+
+FROM ghcr.io/ggml-org/llama.cpp:full-cuda-b5124
+RUN --device=nvidia.com/gpu=all \
+    --mount=from=model,target=/models \
+    --mount=from=prompt,target=/tmp \
+    ./llama-cli -m /models/model.gguf -no-cnv -ngl 99 -f /tmp/prompt.txt
+```
+
### RUN --mount

```dockerfile