
Commit 04dc552

add instruction to merge large model gguf file. Also update example model url

1 parent bd52241 commit 04dc552

File tree

1 file changed: +9 additions, -2 deletions


README.md

Lines changed: 9 additions & 2 deletions
@@ -22,12 +22,19 @@ npm install jsonrepair

 3. Download a compatible Rubra GGUF model:

    For example:
    ```
-   wget https://huggingface.co/rubra-ai/Llama-3-8b-function-calling-alpha-v1.gguf/resolve/main/Llama-3-8b-function-calling-alpha-v1.gguf
+   wget https://huggingface.co/rubra-ai/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/rubra-meta-llama-3-8b-instruct.Q6_K.gguf
    ```

+   **Info**
+   For large multi-part model files, such as [rubra-meta-llama-3-70b-instruct_Q6_K-0000*-of-00003.gguf](https://huggingface.co/rubra-ai/Meta-Llama-3-70B-Instruct-GGUF/tree/main), merge them with the following command before proceeding to the next step:
+   ```
+   ./llama-gguf-split --merge rubra-meta-llama-3-70b-instruct_Q6_K-00001-of-00003.gguf rubra-meta-llama-3-70b-instruct_Q6_K.gguf
+   ```
+   This merges the multi-part model files into a single GGUF file, `rubra-meta-llama-3-70b-instruct_Q6_K.gguf`.
+
 4. Start the OpenAI-compatible server:

    ```
-   ./llama-server -ngl 37 -m Llama-3-8b-function-calling-alpha-v1.gguf --port 1234 --host 0.0.0.0 -c 8000 --chat-template llama3
+   ./llama-server -ngl 37 -m rubra-meta-llama-3-8b-instruct.Q6_K.gguf --port 1234 --host 0.0.0.0 -c 8000 --chat-template llama3
    ```

 5. Test the server, ensure it is available:
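One way to check the server from step 4 — a minimal sketch, assuming it is listening on `localhost:1234` as configured above. `llama-server` exposes the OpenAI-compatible chat endpoint at `/v1/chat/completions`; the `model` field below is a placeholder, since the server answers with whichever model it loaded:

```shell
# OpenAI-style chat-completions request body (model name is illustrative)
BODY='{"model":"rubra-meta-llama-3-8b-instruct","messages":[{"role":"user","content":"Hello"}]}'

# Send it to the server started in step 4; --max-time bounds the wait
# in case the model is still loading, and the fallback message keeps the
# check non-fatal if the server is not up yet.
curl -s --max-time 10 http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$BODY" || echo "server not reachable yet"
```

A successful response is a JSON object with a `choices` array containing the assistant's reply.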
