
Add the huggingface token parameter, and modify the file path in llama.cpp repo #761

Merged (1 commit) on Aug 19, 2024

Conversation

@melodyliu1986 (Contributor) commented on Aug 15, 2024

I want to use the mistralai/Mistral-7B-Instruct-v0.2 model, but found there are no GGUF files on Hugging Face, so I decided to use the ./convert_models functions to convert the model. In doing so I found a few issues:

  1. 401 Client Error
huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-66b1862a-1bc229376e7f3f4020a3c951;60195d59-03d1-4f26-b3ce-d3b04c2fe2b4)
Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/36d7e540e651b68dac59394d9c3381651df7fb01/.gitattributes

So I added an optional HF_TOKEN=<YOUR_HF_TOKEN_ID> parameter in the code. Users downloading a public model need no token; users downloading a gated or private model must supply their Hugging Face token.
Impacted files: README.md, download_huggingface.py, run.sh
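The optional-token behaviour in download_huggingface.py can be sketched roughly like this (a minimal sketch: `build_download_kwargs` is a hypothetical helper name; the actual script passes the token on to huggingface_hub's `snapshot_download`):

```python
import os

# Hypothetical helper illustrating the optional-token behaviour: the
# returned dict is meant to be splatted into
# huggingface_hub.snapshot_download(**kwargs).
def build_download_kwargs(repo_id, token=None):
    kwargs = {"repo_id": repo_id}
    if token:  # attach a token only when one was supplied (gated/private repos)
        kwargs["token"] = token
    return kwargs

# run.sh passes HF_TOKEN through the container environment via `-e HF_TOKEN=...`
kwargs = build_download_kwargs(
    "mistralai/Mistral-7B-Instruct-v0.2",
    token=os.environ.get("HF_TOKEN"),
)
```

With no HF_TOKEN in the environment the call stays anonymous, so public models keep working exactly as before.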

  2. No convert.py or quantize files under llama.cpp
python: can't open file '/opt/app-root/src/converter/llama.cpp/convert.py': [Errno 2] No such file or directory
run.sh: line 23: llama.cpp/quantize: No such file or directory

If we go to https://github.com/ggerganov/llama.cpp.git, we can see that convert.py has been deprecated and moved to examples/convert_legacy_llama.py. I was not sure whether I should just keep the line "python llama.cpp/convert-hf-to-gguf.py /opt/app-root/src/converter/converted_models/$hf_model_url", so I simply replaced convert.py with the correct path, and did the same for llama.cpp/quantize.

Impacted file: run.sh
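A rough sketch of the corrected paths in run.sh (the exact script and binary locations depend on which llama.cpp revision is checked out in the image, so treat the paths below as assumptions rather than the committed change):

```shell
#!/bin/bash
# Hypothetical excerpt of run.sh after the path fix.
CONVERTER_DIR=/opt/app-root/src/converter

# convert.py is deprecated upstream; the legacy converter now lives under
# examples/ (assumed path for this llama.cpp revision).
CONVERT_SCRIPT="$CONVERTER_DIR/llama.cpp/examples/convert_legacy_llama.py"

# The quantize binary also moved/was renamed between llama.cpp revisions;
# adjust to wherever the build places it in the image.
QUANTIZE_BIN="$CONVERTER_DIR/llama.cpp/llama-quantize"

echo "would run: python $CONVERT_SCRIPT $CONVERTER_DIR/converted_models/$HF_MODEL_URL"
```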

  3. No image name was specified in the README.md

So I added "converter" to the "podman run" command.

  4. Added the Hugging Face token parameter to the web UI.
    Impacted file: ui.py
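The UI change can be sketched as a small helper that only forwards the token when the user actually typed one (`token_env` is a hypothetical helper name for illustration; the real ui.py wires the field's value into the converter container's environment):

```python
# Hypothetical helper: turn an optional token field from the web UI into
# environment variables for the converter container. An empty or
# whitespace-only field means "public model, no token".
def token_env(model_url, token=""):
    env = {"HF_MODEL_URL": model_url}
    token = token.strip()
    if token:
        env["HF_TOKEN"] = token
    return env
```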

Here is my testing after the modification:

$ podman run -it --rm -v models:/converter/converted_models -e HF_MODEL_URL=mistralai/Mistral-7B-Instruct-v0.2 -e HF_TOKEN=*** -e QUANTIZATION=Q4_K_M -e KEEP_ORIGINAL_MODEL="False" localhost/converter

README.md: 100%|███████████████████████████████████████████████████████████████████████████████████| 5.47k/5.47k [00:00<00:00, 21.9MB/s]
.gitattributes: 100%|██████████████████████████████████████████████████████████████████████████████| 1.52k/1.52k [00:00<00:00, 8.79MB/s]
model.safetensors.index.json: 100%|█████████████████████████████████████████████████████████████████| 25.1k/25.1k [00:00<00:00, 357kB/s]
config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████| 596/596 [00:00<00:00, 3.67MB/s]
generation_config.json: 100%|███████████████████████████████████████████████████████████████████████████| 111/111 [00:00<00:00, 621kB/s]
pytorch_model.bin.index.json: 100%|████████████████████████████████████████████████████████████████| 23.9k/23.9k [00:00<00:00, 72.1MB/s]
special_tokens_map.json: 100%|█████████████████████████████████████████████████████████████████████████| 414/414 [00:00<00:00, 6.70MB/s]
tokenizer.model: 100%|████████████████████████████████████████████████████████████████████████████████| 493k/493k [00:00<00:00, 861kB/s]
tokenizer_config.json: 100%|███████████████████████████████████████████████████████████████████████| 2.10k/2.10k [00:00<00:00, 12.7MB/s]
tokenizer.json: 100%|███████████████████████████████████████████████████████████████████████████████| 1.80M/1.80M [00:02<00:00, 630kB/s]
model-00001-of-00003.safetensors: 100%|████████████████████████████████████████████████████████████| 4.94G/4.94G [52:42<00:00, 1.56MB/s]
model-00003-of-00003.safetensors: 100%|██████████████████████████████████████████████████████████| 4.54G/4.54G [1:01:03<00:00, 1.24MB/s]
pytorch_model-00001-of-00003.bin: 100%|██████████████████████████████████████████████████████████| 4.94G/4.94G [1:05:53<00:00, 1.25MB/s]
pytorch_model-00002-of-00003.bin: 100%|██████████████████████████████████████████████████████████| 5.00G/5.00G [1:06:22<00:00, 1.26MB/s]
model-00002-of-00003.safetensors: 100%|██████████████████████████████████████████████████████████| 5.00G/5.00G [1:07:19<00:00, 1.24MB/s]
pytorch_model-00003-of-00003.bin: 100%|██████████████████████████████████████████████████████████| 5.06G/5.06G [1:07:36<00:00, 1.25MB/s]
Fetching 16 files: 100%|█████████
INFO:convert:Loading model file /opt/app-root/src/converter/converted_models/mistralai/Mistral-7B-Instruct-v0.2/model-00001-of-00003.safetensorsmodel-00002-of-00003.bin:  99%|██████████████████████████████████████████████████████████▍| 4.95G/5.00G [5:50:49<03:48, 222kB/s]
INFO:convert:Loading model file /opt/app-root/src/converter/converted_models/mistralai/Mistral-7B-Instruct-v0.2/model-00001-of-00003.safetensorsmodel-00002-of-00003.bin: 100%|███████████████████████████████████████████████████████████| 5.00G/5.00G [5:54:12<00:00, 229kB/s]
....
INFO:convert:Loading model file /opt/app-root/src/converter/converted_models/mistralai/Mistral-7B-Instruct-v0.2/model-00002-of-00003.safetensors
INFO:convert:Loading model file /opt/app-root/src/converter/converted_models/mistralai/Mistral-7B-Instruct-v0.2/model-00003-of-00003.safetensors
INFO:convert:params = Params(n_vocab=32000, n_embd=4096, n_layer=32, n_ctx=32768, n_ff=14336, n_head=32, n_head_kv=8, n_experts=None, n_experts_used=None, f_norm_eps=1e-05, rope_scaling_type=None, f_rope_freq_base=1000000.0, f_rope_scale=None, n_ctx_orig=None, rope_finetuned=None, ftype=None, path_model=PosixPath('/opt/app-root/src/converter/converted_models/mistralai/Mistral-7B-Instruct-v0.2'))
INFO:convert:Loaded vocab file PosixPath('/opt/app-root/src/converter/converted_models/mistralai/Mistral-7B-Instruct-v0.2/tokenizer.model'), type 'spm'
INFO:convert:model parameters count : (7241732096, 7241732096, 0) (7.2B)
INFO:convert:Vocab info: <SentencePieceVocab with 32000 base tokens and 0 added tokens>
INFO:convert:Special vocab info: <SpecialVocab with 0 merges, special tokens {'bos': 1, 'eos': 2, 'unk': 0}, add special tokens {'bos': True, 'eos': False}>
INFO:convert:Writing /opt/app-root/src/converter/converted_models/mistralai/Mistral-7B-Instruct-v0.2/Mistral-7B-Instruct-v0.2-F32.gguf, format 0
......
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:/opt/app-root/src/converter/converted_models/mistralai/Mistral-7B-Instruct-v0.2/Mistral-7B-Instruct-v0.2-F32.gguf: n_tensors = 291, total_size = 29.0G
INFO:convert:[  1/291] Writing tensor token_embd.weight                      | size  32000 x   4096  | type F32  | T+   0
INFO:convert:[  2/291] Writing tensor blk.0.attn_norm.weight                 | size   4096           | type F32  | T+   1
INFO:convert:[  3/291] Writing tensor blk.0.ffn_down.weight                  | size   4096 x  14336  | type F32  | T+   1
INFO:convert:[  4/291] Writing tensor blk.0.ffn_gate.weight                  | size  14336 x   4096  | type F32  | T+   1
....

Here is the web UI test with a public model (no token needed):
Screenshot at 2024-08-14 17-35-34

@melodyliu1986 (Contributor, Author) commented:

@rhatdan @MichaelClifford

I made some mistakes in the previous PR #741, so I closed it. Is there any way to delete #741?

I made changes according to your comments; please review again.

@rhatdan (Member) commented on Aug 19, 2024

LGTM

@MichaelClifford (Collaborator) commented:

Great! Thanks for making those changes @melodyliu1986 Will re-review now.

@MichaelClifford (Collaborator) left a review comment:


LGTM

@MichaelClifford MichaelClifford merged commit e82e739 into containers:main Aug 19, 2024
2 checks passed