
Commit 9a39429

BLD: Docker image (#855)
1 parent 3ea928b commit 9a39429

File tree

5 files changed: +167 -0 lines changed


.dockerignore

+7
@@ -0,0 +1,7 @@
doc/
.idea/
.github/
build/
xinference.egg-info/
xinference/web/ui/build/
xinference/web/ui/node_modules/

.github/workflows/docker-cd.yaml

+64
@@ -0,0 +1,64 @@
name: Xinference CD for DockerHub

on:
  schedule:
    - cron: '0 18 * * *'
  push:
    tags:
      - '*'
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [ "3.10" ]
    steps:
      - name: Check out code
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
          submodules: recursive

      - name: Log in to Docker Hub
        uses: docker/login-action@v1
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_PASSWORD }}

      - name: Build and push Docker image
        shell: bash
        if: ${{ github.repository == 'xorbitsai/inference' }}
        env:
          DOCKER_ORG: ${{ secrets.DOCKERHUB_USERNAME }}
          PY_VERSION: ${{ matrix.python-version }}
        run: |
          if [[ "$GITHUB_REF" =~ ^"refs/tags/" ]]; then
            export GIT_TAG=$(echo "$GITHUB_REF" | sed -e "s/refs\/tags\///g")
          else
            export GIT_BRANCH=$(echo "$GITHUB_REF" | sed -e "s/refs\/heads\///g")
          fi

          if [[ -n "$GIT_TAG" ]]; then
            BRANCHES="$GIT_TAG"
            echo "Will handle tag $BRANCHES"
          else
            MAINBRANCH=$(git rev-parse --abbrev-ref HEAD)
            BRANCHES="$MAINBRANCH"
          fi

          for branch in $BRANCHES; do
            if [[ -n "$GIT_TAG" ]]; then
              export IMAGE_TAG="$GIT_TAG"
            else
              git checkout $branch
              export IMAGE_TAG="nightly-$branch"
            fi
            docker build -t "$DOCKER_ORG/xinference:${IMAGE_TAG}" --progress=plain -f xinference/deploy/docker/Dockerfile .
            docker push "$DOCKER_ORG/xinference:${IMAGE_TAG}"
          done
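
Since the workflow also declares workflow_dispatch, a build can be triggered manually as well, for example with the GitHub CLI (a sketch; assumes sufficient permissions on the repository):

    gh workflow run docker-cd.yaml --repo xorbitsai/inference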

doc/source/getting_started/index.rst

+1
@@ -11,5 +11,6 @@ Getting Started
    installation
    using_xinference
    logging
+   using_docker_image
    troubleshooting
    environments
doc/source/getting_started/using_docker_image.rst

+69
@@ -0,0 +1,69 @@
.. _using_docker_image:

=======================
Xinference Docker Image
=======================

Xinference provides official Docker images on Docker Hub.


Prerequisites
=============
* The image can only run in an environment with GPUs and CUDA installed, because Xinference in the image relies on cuBLAS for acceleration.
* CUDA must be installed on the host machine; you can verify this by running the ``nvidia-smi`` command successfully (see the quick check below).
* The CUDA version inside the image is ``12.1``. Ideally, the CUDA version on the host machine should match it; if it does not, make sure the host version stays between ``11.8`` and ``12.2``.
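
As a quick sanity check, you can confirm that both the host driver and Docker's GPU runtime work (a sketch; the CUDA base image tag used in the second command is illustrative):

.. code-block:: bash

   # Should print the driver version and a CUDA version table:
   nvidia-smi

   # Should print a similar table from inside a container:
   docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi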


Docker Image
============
The official Xinference images are available on Docker Hub in the ``xprobe/xinference`` repository. Two kinds of image tags are published:

* ``nightly-main``: built daily from the `GitHub main branch <https://github.com/xorbitsai/inference>`_; stability is generally not guaranteed.
* ``v<release version>``: built each time a Xinference release is published; typically more stable.

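For example, pulling each kind of image looks like this (``v<release version>`` is a placeholder for an actual release tag):

.. code-block:: bash

   docker pull xprobe/xinference:nightly-main
   docker pull xprobe/xinference:v<release version>
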

Dockerfile for custom build
===========================
If you need to build the Xinference image to your own requirements, the Dockerfile source is located at `xinference/deploy/docker/Dockerfile <https://github.com/xorbitsai/inference/tree/main/xinference/deploy/docker/Dockerfile>`_ for reference.
Make sure you are in the top-level directory of Xinference when using this Dockerfile. For example:

.. code-block:: bash

   git clone https://github.com/xorbitsai/inference.git
   cd inference
   docker build --progress=plain -t test -f xinference/deploy/docker/Dockerfile .
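
The Dockerfile also exposes a ``PIP_INDEX`` build argument, so a build that installs the Python dependencies from a different package index might look like this (a sketch; substitute the URL of your preferred mirror):

.. code-block:: bash

   docker build --progress=plain -t test \
       --build-arg PIP_INDEX=https://pypi.org/simple \
       -f xinference/deploy/docker/Dockerfile .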


Image usage
===========
You can start Xinference in the container as follows, mapping port 9997 in the container to port 9998 on the host, enabling debug logging, and disabling vLLM:

.. code-block:: bash

   docker run -e XINFERENCE_DISABLE_VLLM=1 -p 9998:9997 --gpus all xprobe/xinference:v<your_version> xinference-local -H 0.0.0.0 --log-level debug


.. warning::
   * The ``--gpus`` option is essential and cannot be omitted: as mentioned earlier, the image requires a host machine with a GPU, and errors will occur otherwise.
   * The ``-H 0.0.0.0`` parameter after the ``xinference-local`` command cannot be omitted either; otherwise, the host machine may not be able to access the port inside the container.
   * You can add multiple ``-e`` options to set multiple environment variables, as in the sketch below.
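
For instance, a run that sets more than one environment variable might look like this (a sketch; both variables appear elsewhere on this page, and the values are illustrative):

.. code-block:: bash

   docker run \
       -e XINFERENCE_DISABLE_VLLM=1 \
       -e XINFERENCE_HOME=/data/xinference \
       -p 9998:9997 --gpus all \
       xprobe/xinference:v<your_version> \
       xinference-local -H 0.0.0.0 --log-level debug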

Of course, if you prefer, you can also enter the docker container manually and start Xinference in any way you like.
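
One way to do that is to override the container command with a shell (a sketch; the Dockerfile defines no entrypoint, so the trailing ``bash`` replaces the default command):

.. code-block:: bash

   docker run -it -p 9998:9997 --gpus all xprobe/xinference:v<your_version> bash
   # ...then, inside the container:
   xinference-local -H 0.0.0.0 --log-level debug
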

Mount your volume for loading and saving models
===============================================
The image does not contain any model files by default; models are downloaded into the container at runtime.
Typically, you will want to mount a directory on the host machine into the docker container, so that Xinference downloads models onto it and they can be reused across runs.
To do this, specify a volume when running the Docker image and configure an environment variable for Xinference:

.. code-block:: bash

   docker run -v </on/your/host>:</on/the/container> -e XINFERENCE_HOME=</on/the/container> -p 9998:9997 --gpus all xprobe/xinference:v<your_version> xinference-local -H 0.0.0.0

The command above mounts the specified host directory into the container and sets the ``XINFERENCE_HOME`` environment variable to point to that directory inside the container.
This way, all downloaded model files are stored in the directory you specified on the host machine.
They are not lost when the Docker container stops, and on the next run the existing models can be used directly without repeated downloads.
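
A filled-in version of that command might look like this (the host path ``/data/xinference`` and container path ``/root/models`` are illustrative):

.. code-block:: bash

   docker run \
       -v /data/xinference:/root/models \
       -e XINFERENCE_HOME=/root/models \
       -p 9998:9997 --gpus all \
       xprobe/xinference:v<your_version> \
       xinference-local -H 0.0.0.0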

xinference/deploy/docker/Dockerfile

+26
@@ -0,0 +1,26 @@
FROM pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel

COPY . /opt/inference

# Node.js (installed via nvm) is required to build the web UI.
ENV NVM_DIR /usr/local/nvm
ENV NODE_VERSION 14.21.1

RUN apt-get -y update \
  && apt install -y curl procps \
  && mkdir -p $NVM_DIR \
  && curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash \
  && . $NVM_DIR/nvm.sh \
  && nvm install $NODE_VERSION \
  && nvm alias default $NODE_VERSION \
  && nvm use default \
  && apt-get -yq clean

ENV PATH $NVM_DIR/versions/node/v$NODE_VERSION/bin:$PATH

# Build chatglm-cpp and llama-cpp-python with cuBLAS enabled, build the
# web UI, then install Xinference with all optional dependencies.
ARG PIP_INDEX=https://pypi.org/simple
RUN python -m pip install --upgrade -i "$PIP_INDEX" pip && \
    CMAKE_ARGS="-DGGML_CUBLAS=ON" pip install -i "$PIP_INDEX" -U chatglm-cpp && \
    CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install -i "$PIP_INDEX" -U llama-cpp-python && \
    cd /opt/inference && \
    python setup.py build_web && \
    pip install -i "$PIP_INDEX" ".[all]"
