Downloading the base model and dataset and mounting them into the Docker container demonstrates a standard fine-tuning flow. You can skip this step for a quick start; in that case, the fine-tuning code will download the needed files automatically.
However, we recommend handling them manually: depending on your environment, the automatic download can be blocked by restricted Internet access, Hugging Face authentication, and so on, and the manual method also lets you fine-tune in a custom way (with a different base model and dataset).
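For the manual route, a minimal sketch (the model name, dataset name, image tag, and mount points below are illustrative placeholders, not the exact values the fine-tuning scripts expect):

```bash
# Download a base model and a dataset ahead of time (names are examples).
pip install -U "huggingface_hub[cli]"
huggingface-cli download meta-llama/Llama-2-7b-hf --local-dir ./models/Llama-2-7b-hf
huggingface-cli download databricks/databricks-dolly-15k \
  --repo-type dataset --local-dir ./data/dolly-15k

# Mount both into the fine-tuning container so nothing is downloaded inside it.
docker run -itd \
  -v "$(pwd)/models:/models" \
  -v "$(pwd)/data:/data" \
  your-finetuning-image:latest   # placeholder image tag
```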
docker/llm/inference-cpp/README.md (+9 -3)
@@ -13,12 +13,18 @@
#### Setting Docker on Windows
You need to enable `--net=host`; follow [this guide](https://docs.docker.com/network/drivers/host/#docker-desktop) so that you can easily access the service running in the Docker container. The [v6.1x kernel version WSL](https://learn.microsoft.com/en-us/community/content/wsl-user-msft-kernel-v6#1---building-the-microsoft-linux-kernel-v61x) is recommended; otherwise, you may encounter a blocking issue before the model is loaded to the GPU.
-### Pull the latest image
+### Build the Image
+
+To build the `ipex-llm-inference-cpp-xpu` Docker image, use the following command:
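For example (a sketch, assuming the Dockerfile sits in the `docker/llm/inference-cpp` directory shown above; the proxy build args are optional and only needed behind a proxy):

```bash
cd docker/llm/inference-cpp
docker build \
  --build-arg http_proxy=$HTTP_PROXY \
  --build-arg https_proxy=$HTTPS_PROXY \
  --rm --no-cache \
  -t intelanalytics/ipex-llm-inference-cpp-xpu:latest .
```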
To map the `xpu` into the container, specify `--device=/dev/dri` when booting the container. Select the device you are running (device type: Max, Flex, Arc, or iGPU), and change `/path/to/models` to mount your models. `bench_model` is used for quick benchmarking; if you want to benchmark, make sure that model is under `/path/to/models`.
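A sketch of such a launch (the container name, and the idea that `bench_model` is passed as an environment variable naming a GGUF file under the mounted directory, are assumptions):

```bash
export MODEL_PATH=/path/to/models
# Container name and bench_model semantics are assumptions for illustration.
docker run -itd \
  --net=host \
  --device=/dev/dri \
  -v "$MODEL_PATH":/models \
  -e bench_model="llama-2-7b-chat.Q4_0.gguf" \
  --name=ipex-llm-inference-cpp-xpu-container \
  intelanalytics/ipex-llm-inference-cpp-xpu:latest
```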
docs/mddocs/DockerGuides/docker_cpp_xpu_quickstart.md (+9 -3)
@@ -16,10 +16,16 @@
You need to enable `--net=host`; follow [this guide](https://docs.docker.com/network/drivers/host/#docker-desktop) so that you can easily access the service running in the Docker container. The [v6.1x kernel version WSL](https://learn.microsoft.com/en-us/community/content/wsl-user-msft-kernel-v6#1---building-the-microsoft-linux-kernel-v61x) is recommended; otherwise, you may encounter a blocking issue before the model is loaded to the GPU.
-### Pull the latest image
+### Build the Image
+
+To build the `ipex-llm-inference-cpp-xpu` Docker image, use the following command:
Start the `ipex-llm-xpu` Docker container. Choose one of the following commands to start the container:
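On Linux, for instance, a sketch like the following; for Windows/WSL you would typically also pass `--privileged` and mount `/usr/lib/wsl`, where the WSL GPU driver libraries live (treat both details as assumptions and check the full guide):

```bash
docker run -itd \
  --net=host \
  --device=/dev/dri \
  -v /path/to/models:/models \
  --name=ipex-llm-inference-cpp-xpu-container \
  intelanalytics/ipex-llm-inference-cpp-xpu:latest
```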
@@ -180,4 +185,4 @@ What is AI? [/INST]
<</SYS>>
What is AI? [/INST] Artificial intelligence (AI) is the broader field of research and development aimed at creating machines that can perform tasks that typically require human intelligence,
docs/mddocs/DockerGuides/vllm_cpu_docker_quickstart.md (+8 -4)
@@ -6,13 +6,17 @@ This guide demonstrates how to run `vLLM` serving with `ipex-llm` on Intel CPU v
Follow the instructions in this [guide](https://www.docker.com/get-started/) to install Docker on Linux.
-## Pull the latest image
-
-*Note: For running vLLM serving on Intel CPUs, you can currently use either the `intelanalytics/ipex-llm-serving-cpu:latest` or `intelanalytics/ipex-llm-serving-vllm-cpu:latest` Docker image.*
+## Build the Image
+
+To build the `ipex-llm-serving-cpu` Docker image, use the following command:
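A sketch (the Dockerfile location is an assumption based on the repo layout implied by the other guides):

```bash
cd docker/llm/serving/cpu/docker   # assumed Dockerfile location
docker build --rm --no-cache \
  -t intelanalytics/ipex-llm-serving-cpu:latest .
```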
docs/mddocs/DockerGuides/vllm_docker_quickstart.md (+8 -5)
@@ -6,13 +6,16 @@ This guide demonstrates how to run `vLLM` serving with `IPEX-LLM` on Intel GPUs
Follow the instructions in this [guide](./docker_windows_gpu.md#linux) to install Docker on Linux.
-## Pull the latest image
-
-*Note: For running vLLM serving on Intel GPUs, you can currently use either the `intelanalytics/ipex-llm-serving-xpu:latest` or `intelanalytics/ipex-llm-serving-vllm-xpu:latest` Docker image.*
+## Build the Image
+
+To build the `ipex-llm-serving-xpu` Docker image, use the following command:
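Likewise for the GPU image, a sketch (Dockerfile location assumed):

```bash
cd docker/llm/serving/xpu/docker   # assumed Dockerfile location
docker build --rm --no-cache \
  -t intelanalytics/ipex-llm-serving-xpu:latest .
```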