@@ -37,19 +37,19 @@ git clone https://huggingface.co/openai/clip-vit-large-patch14-336
2. Install the required Python packages:

```sh
- pip install -r tools/llava/requirements.txt
+ pip install -r tools/mtmd/requirements.txt
```

3. Use `llava_surgery.py` to split the LLaVA model into its LLaMA and multimodal projector constituents:

```sh
- python ./tools/llava/llava_surgery.py -m ../llava-v1.5-7b
+ python ./tools/mtmd/llava_surgery.py -m ../llava-v1.5-7b
```

4. Use `convert_image_encoder_to_gguf.py` to convert the LLaVA image encoder to GGUF:

```sh
- python ./tools/llava/convert_image_encoder_to_gguf.py -m ../clip-vit-large-patch14-336 --llava-projector ../llava-v1.5-7b/llava.projector --output-dir ../llava-v1.5-7b
+ python ./tools/mtmd/convert_image_encoder_to_gguf.py -m ../clip-vit-large-patch14-336 --llava-projector ../llava-v1.5-7b/llava.projector --output-dir ../llava-v1.5-7b
```

5. Use `examples/convert_legacy_llama.py` to convert the LLaMA part of LLaVA to GGUF:
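The command for step 5 falls outside this hunk; a plausible invocation, following the pattern of the commands above (the `--skip-unknown` flag and exact arguments are assumptions, not shown in this diff), would be:

```sh
# --skip-unknown (assumed flag) tells the converter to ignore non-LLaMA tensors
# such as the projector weights left behind by llava_surgery.py
python ./examples/convert_legacy_llama.py ../llava-v1.5-7b --skip-unknown
```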
@@ -69,12 +69,12 @@ git clone https://huggingface.co/liuhaotian/llava-v1.6-vicuna-7b
2) Install the required Python packages:

```sh
- pip install -r tools/llava/requirements.txt
+ pip install -r tools/mtmd/requirements.txt
```

3) Use `llava_surgery_v2.py`, which also supports llava-1.5 variants in both pytorch and safetensor formats:
```console
- python tools/llava/llava_surgery_v2.py -C -m ../llava-v1.6-vicuna-7b/
+ python tools/mtmd/llava_surgery_v2.py -C -m ../llava-v1.6-vicuna-7b/
```
- you will find a llava.projector and a llava.clip file in your model directory
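Between this hunk and the next, the README assembles a `vit` directory from the surgery outputs. A sketch of that step, inferred from the file names above and the `curl` line in the following hunk header (the destination file names are assumptions):

```console
mkdir vit
# rename llava.clip so the image-encoder converter can load it as a regular
# pytorch checkpoint (assumed layout)
cp ../llava-v1.6-vicuna-7b/llava.clip vit/pytorch_model.bin
cp ../llava-v1.6-vicuna-7b/llava.projector vit/
# fetch a fitting ViT config; the URL appears in the hunk header below,
# the output path vit/config.json is an assumption
curl -s -q https://huggingface.co/cmp-nct/llava-1.6-gguf/raw/main/config_vit.json -o vit/config.json
```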
@@ -88,7 +88,7 @@ curl -s -q https://huggingface.co/cmp-nct/llava-1.6-gguf/raw/main/config_vit.jso
5) Create the visual gguf model:
```console
- python ./tools/llava/convert_image_encoder_to_gguf.py -m vit --llava-projector vit/llava.projector --output-dir vit --clip-model-is-vision
+ python ./tools/mtmd/convert_image_encoder_to_gguf.py -m vit --llava-projector vit/llava.projector --output-dir vit --clip-model-is-vision
```
- This is similar to llava-1.5; the difference is that we tell the encoder that we are working with the pure vision model part of CLIP