Commit 01b9294

docs: update s390x docs
Signed-off-by: Aaron Teo <[email protected]>
1 parent 2b4892e commit 01b9294

2 files changed

+32
-11
lines changed

docs/build-s390x.md

Lines changed: 28 additions & 11 deletions
@@ -28,8 +28,9 @@ cmake --build build --config Release -j $(nproc)
 ```
 
 **Notes**:
-- For faster repeated compilation, install [ccache](https://ccache.dev/)
-- By default, VXE/VXE2 is enabled. To disable it (not recommended):
+
+- For faster repeated compilation, install [ccache](https://ccache.dev/)
+- By default, VXE/VXE2 is enabled. To disable it (not recommended):
 
 ```bash
 cmake -S . -B build \
@@ -41,18 +42,29 @@ cmake --build build --config Release -j $(nproc)
 cmake --build build --config Release -j $(nproc)
 ```
 
-- For debug builds:
+- By default, NNPA is enabled when available. To disable it (not recommended):
+
+```bash
+cmake -S . -B build \
+-DCMAKE_BUILD_TYPE=Release \
+-DGGML_BLAS=ON \
+-DGGML_BLAS_VENDOR=OpenBLAS \
+-DGGML_NNPA=OFF
+
+cmake --build build --config Release -j $(nproc)
+```
+
+- For debug builds:
 
 ```bash
 cmake -S . -B build \
 -DCMAKE_BUILD_TYPE=Debug \
 -DGGML_BLAS=ON \
 -DGGML_BLAS_VENDOR=OpenBLAS
-
 cmake --build build --config Debug -j $(nproc)
 ```
 
-- For static builds, add `-DBUILD_SHARED_LIBS=OFF`:
+- For static builds, add `-DBUILD_SHARED_LIBS=OFF`:
 
 ```bash
 cmake -S . -B build \
@@ -101,27 +113,33 @@ All models need to be converted to Big-Endian. You can achieve this in three cas
 ```
 
 For example,
+
 ```bash
 python3 gguf-py/gguf/scripts/gguf_convert_endian.py granite-3.3-2b-instruct-le.f16.gguf BIG
 mv granite-3.3-2b-instruct-le.f16.gguf granite-3.3-2b-instruct-be.f16.gguf
 ```
 
 **Notes:**
+
 - The GGUF endian conversion script may not support all data types at the moment and may fail for some models/quantizations. When that happens, please try manually converting the safetensors model to GGUF Big-Endian via Step 2.
 
 ## IBM Accelerators
 
 ### 1. SIMD Acceleration
 
-Only available in IBM z15 or later system with the `-DGGML_VXE=ON` (turned on by default) compile flag. No hardware acceleration is possible with llama.cpp with older systems, such as IBM z14 or EC13. In such systems, the APIs can still run but will use a scalar implementation.
+Only available in IBM z15 or later system with the `-DGGML_VXE=ON` (turned on by default) compile flag. No hardware acceleration is possible with llama.cpp with older systems, such as IBM z14/arch12. In such systems, the APIs can still run but will use a scalar implementation.
+
+### 2. NNPA Vector Intrinsics Acceleration
 
-### 2. zDNN Accelerator
+Only available in IBM z16 or later system with the `-DGGML_NNPA=ON` (turned on when available) compile flag. No hardware acceleration is possible with llama.cpp with older systems, such as IBM z15/arch13. In such systems, the APIs can still run but will use a scalar implementation.
 
-*Only available in IBM z16 or later system. No direction at the moment.*
+### 3. zDNN Accelerator
 
-### 3. Spyre Accelerator
+_Only available in IBM z16 or later system. No direction at the moment._
 
-*No direction at the moment.*
+### 4. Spyre Accelerator
+
+_No direction at the moment._
 
 ## Performance Tuning
 
@@ -154,4 +172,3 @@ IBM VXE/VXE2 SIMD acceleration depends on the BLAS implementation. It is strongl
 2. **Other Questions**
 
 Please reach out directly to [[email protected]](mailto:[email protected]).
-

docs/build.md

Lines changed: 4 additions & 0 deletions
@@ -557,6 +557,10 @@ ninja
 
 To read documentation for how to build on Android, [click here](./android.md)
 
+## IBM Z & LinuxONE
+
+To read documentation for how to build on IBM Z & LinuxONE, [click here](./build-s390x.md)
+
 ## Notes about GPU-accelerated backends
 
 The GPU may still be used to accelerate some parts of the computation even when using the `-ngl 0` option. You can fully disable GPU acceleration by using `--device none`.
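
Since the updated docs tie `-DGGML_VXE` to z15 or later and `-DGGML_NNPA` to z16 or later, it can help to check what a host reports before building. The following is a minimal sketch and not part of this commit. It assumes the Linux kernel lists `vxe` and `nnpa` in the `features` line of `/proc/cpuinfo` on s390x; the helper names `has_cpu_feature` and `suggest_flags` are hypothetical. The build already enables these paths by default (VXE) or when available (NNPA), so the output is only a sanity check.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a feature flag appears in the "features"
# line of a cpuinfo-style file (default: /proc/cpuinfo).
# Assumption: s390x kernels report "vxe" and "nnpa" there.
has_cpu_feature() {
    feature="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # -w matches whole words only, so "vxe" does not match "vxe2".
    grep -m1 '^features' "$cpuinfo" 2>/dev/null | grep -qw -- "$feature"
}

# Hypothetical helper: print the CMake flags from the docs that match
# what the host (or the given cpuinfo-style file) reports.
suggest_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if has_cpu_feature vxe "$cpuinfo"; then
        flags="-DGGML_VXE=ON"
    else
        flags="-DGGML_VXE=OFF"
    fi
    if has_cpu_feature nnpa "$cpuinfo"; then
        flags="$flags -DGGML_NNPA=ON"
    else
        flags="$flags -DGGML_NNPA=OFF"
    fi
    echo "$flags"
}

# Example: print the suggestion for the current host.
# suggest_flags
```

Taking the cpuinfo path as an optional argument keeps the check testable on non-s390x machines by pointing it at a sample file.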
