The project cannot run #11

Open

jauhgia opened this issue Mar 3, 2025 · 0 comments

jauhgia commented Mar 3, 2025

Here is my error message:

python app.py
/data2/home/xxx/miniconda3/envs/botchat/lib/python3.11/site-packages/auto_gptq/nn_modules/triton_utils/kernels.py:410: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @custom_fwd
/data2/home/xxx/miniconda3/envs/botchat/lib/python3.11/site-packages/auto_gptq/nn_modules/triton_utils/kernels.py:418: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
  @custom_bwd
/data2/home/xxx/miniconda3/envs/botchat/lib/python3.11/site-packages/auto_gptq/nn_modules/triton_utils/kernels.py:461: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @custom_fwd(cast_inputs=torch.float16)
CUDA extension not installed.
CUDA extension not installed.
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████| 3/3 [00:00<00:00,  4.38it/s]
Some weights of the model checkpoint at Qwen/Qwen-7B-Chat-Int4 were not used when initializing QWenLMHeadModel: ['transformer.h.0.mlp.w1.bias', 'transformer.h.31.attn.c_proj.bias', 'transformer.h.6.mlp.w1.bias', 'transformer.h.25.mlp.w1.bias', 'transformer.h.14.attn.c_proj.bias', 'transformer.h.7.mlp.w2.bias', 'transformer.h.13.mlp.w2.bias', 'transformer.h.22.mlp.w2.bias', 'transformer.h.21.mlp.w1.bias', 'transformer.h.27.mlp.c_proj.bias', 'transformer.h.16.mlp.w2.bias', 'transformer.h.23.mlp.c_proj.bias', 'transformer.h.12.mlp.w1.bias', 'transformer.h.30.attn.c_proj.bias', 'transformer.h.11.mlp.c_proj.bias', 'transformer.h.12.attn.c_proj.bias', 'transformer.h.31.mlp.c_proj.bias', 'transformer.h.7.attn.c_proj.bias', 'transformer.h.17.mlp.c_proj.bias', 'transformer.h.21.mlp.c_proj.bias', 'transformer.h.26.mlp.w1.bias', 'transformer.h.8.mlp.w1.bias', 'transformer.h.3.mlp.w2.bias', 'transformer.h.9.attn.c_proj.bias', 'transformer.h.11.mlp.w2.bias', 'transformer.h.29.mlp.c_proj.bias', 'transformer.h.7.mlp.w1.bias', 'transformer.h.3.mlp.c_proj.bias', 'transformer.h.1.mlp.c_proj.bias', 'transformer.h.16.attn.c_proj.bias', 'transformer.h.15.attn.c_proj.bias', 'transformer.h.22.mlp.w1.bias', 'transformer.h.17.mlp.w2.bias', 'transformer.h.25.mlp.w2.bias', 'transformer.h.27.mlp.w1.bias', 'transformer.h.0.mlp.w2.bias', 'transformer.h.1.attn.c_proj.bias', 'transformer.h.2.mlp.w1.bias', 'transformer.h.19.mlp.w2.bias', 'transformer.h.18.mlp.c_proj.bias', 'transformer.h.16.mlp.c_proj.bias', 'transformer.h.23.mlp.w1.bias', 'transformer.h.20.mlp.w1.bias', 'transformer.h.24.mlp.c_proj.bias', 'transformer.h.13.mlp.w1.bias', 'transformer.h.25.mlp.c_proj.bias', 'transformer.h.3.mlp.w1.bias', 'transformer.h.12.mlp.w2.bias', 'transformer.h.18.attn.c_proj.bias', 'transformer.h.6.attn.c_proj.bias', 'transformer.h.2.attn.c_proj.bias', 'transformer.h.0.attn.c_proj.bias', 'transformer.h.9.mlp.w1.bias', 'transformer.h.2.mlp.w2.bias', 'transformer.h.13.mlp.c_proj.bias', 'transformer.h.19.attn.c_proj.bias', 'transformer.h.4.mlp.w2.bias', 'transformer.h.18.mlp.w1.bias', 'transformer.h.16.mlp.w1.bias', 'transformer.h.20.attn.c_proj.bias', 'transformer.h.1.mlp.w1.bias', 'transformer.h.31.mlp.w1.bias', 'transformer.h.12.mlp.c_proj.bias', 'transformer.h.27.attn.c_proj.bias', 'transformer.h.4.attn.c_proj.bias', 'transformer.h.5.mlp.w1.bias', 'transformer.h.17.attn.c_proj.bias', 'transformer.h.15.mlp.w2.bias', 'transformer.h.19.mlp.c_proj.bias', 'transformer.h.14.mlp.w1.bias', 'transformer.h.20.mlp.w2.bias', 'transformer.h.5.attn.c_proj.bias', 'transformer.h.10.mlp.w1.bias', 'transformer.h.29.mlp.w2.bias', 'transformer.h.11.attn.c_proj.bias', 'transformer.h.8.mlp.c_proj.bias', 'transformer.h.28.mlp.c_proj.bias', 'transformer.h.9.mlp.c_proj.bias', 'transformer.h.23.attn.c_proj.bias', 'transformer.h.25.attn.c_proj.bias', 'transformer.h.22.mlp.c_proj.bias', 'transformer.h.4.mlp.w1.bias', 'transformer.h.24.mlp.w2.bias', 'transformer.h.6.mlp.w2.bias', 'transformer.h.8.attn.c_proj.bias', 'transformer.h.5.mlp.w2.bias', 'transformer.h.2.mlp.c_proj.bias', 'transformer.h.4.mlp.c_proj.bias', 'transformer.h.21.mlp.w2.bias', 'transformer.h.14.mlp.w2.bias', 'transformer.h.17.mlp.w1.bias', 'transformer.h.24.attn.c_proj.bias', 'transformer.h.28.attn.c_proj.bias', 'transformer.h.20.mlp.c_proj.bias', 'transformer.h.30.mlp.w1.bias', 'transformer.h.18.mlp.w2.bias', 'transformer.h.10.attn.c_proj.bias', 'transformer.h.28.mlp.w1.bias', 'transformer.h.23.mlp.w2.bias', 'transformer.h.6.mlp.c_proj.bias', 'transformer.h.9.mlp.w2.bias', 
'transformer.h.10.mlp.c_proj.bias', 'transformer.h.11.mlp.w1.bias', 'transformer.h.13.attn.c_proj.bias', 'transformer.h.29.mlp.w1.bias', 'transformer.h.1.mlp.w2.bias', 'transformer.h.3.attn.c_proj.bias', 'transformer.h.22.attn.c_proj.bias', 'transformer.h.28.mlp.w2.bias', 'transformer.h.30.mlp.c_proj.bias', 'transformer.h.31.mlp.w2.bias', 'transformer.h.14.mlp.c_proj.bias', 'transformer.h.26.mlp.c_proj.bias', 'transformer.h.24.mlp.w1.bias', 'transformer.h.7.mlp.c_proj.bias', 'transformer.h.30.mlp.w2.bias', 'transformer.h.15.mlp.c_proj.bias', 'transformer.h.5.mlp.c_proj.bias', 'transformer.h.27.mlp.w2.bias', 'transformer.h.21.attn.c_proj.bias', 'transformer.h.10.mlp.w2.bias', 'transformer.h.8.mlp.w2.bias', 'transformer.h.26.mlp.w2.bias', 'transformer.h.0.mlp.c_proj.bias', 'transformer.h.26.attn.c_proj.bias', 'transformer.h.19.mlp.w1.bias', 'transformer.h.15.mlp.w1.bias', 'transformer.h.29.attn.c_proj.bias']
- This IS expected if you are initializing QWenLMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing QWenLMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Found modules on cpu/disk. Using Exllama/Exllamav2 backend requires all the modules to be on GPU. Setting `disable_exllama=True`
Traceback (most recent call last):
  File "/data2/home/xxx/project_one/BotChat-main/app.py", line 22, in <module>
    'chatglm2-6b-int4': HFChatModel('/data2/home/xxx/project_one/BotChat-main/THUDM/chatglm2-6b-int4', system_prompt=default_system_prompt),
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/home/xxx/project_one/BotChat-main/botchat/chat_api/hf_chat.py", line 97, in __init__
    model = LoadModel.from_pretrained(model_path, trust_remote_code=True, device_map='cpu')
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/home/xxx/miniconda3/envs/botchat/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 567, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.chatglm2-6b-int4.configuration_chatglm.ChatGLMConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, DiffLlamaConfig, ElectraConfig, Emu3Config, ErnieConfig, FalconConfig, FalconMambaConfig, FuyuConfig, GemmaConfig, Gemma2Config, GitConfig, GlmConfig, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeSharedConfig, HeliumConfig, JambaConfig, JetMoeConfig, LlamaConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MllamaConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, PhimoeConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, ZambaConfig, Zamba2Config, QWenConfig.
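
For reference, here is a minimal sketch of the call path the traceback points at (my own snippet, not taken from the BotChat code): hf_chat.py's LoadModel resolves to AutoModelForCausalLM here according to the error text, and with transformers 4.49.0 loading the chatglm2-6b-int4 checkpoint through that Auto class raises the same ValueError, since ChatGLMConfig is not a recognised configuration class for AutoModelForCausalLM.

```python
# Minimal reproduction sketch of the failing call (my own snippet, not BotChat code).
# The traceback shows hf_chat.py calling LoadModel.from_pretrained(...), and the error
# text identifies the Auto class as AutoModelForCausalLM.
from transformers import AutoModelForCausalLM

model_path = "/data2/home/xxx/project_one/BotChat-main/THUDM/chatglm2-6b-int4"

# With transformers 4.49.0 this raises:
#   ValueError: Unrecognized configuration class <... ChatGLMConfig ...>
#   for this kind of AutoModel: AutoModelForCausalLM.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    device_map="cpu",
)
```

For comparison, the THUDM/chatglm2-6b model card loads this checkpoint via AutoModel.from_pretrained(..., trust_remote_code=True) rather than AutoModelForCausalLM; I have not verified whether switching the Auto class in hf_chat.py is enough to get past this error.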

Here is my system version:

NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.6 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Here is the version information of my Python libraries:

accelerate                    1.4.0
aiofiles                      23.2.1
aiohappyeyeballs              2.4.6
aiohttp                       3.11.13
aiosignal                     1.3.2
annotated-types               0.7.0
anyio                         4.8.0
asttokens                     2.0.5
attrs                         25.1.0
auto_gptq                     0.7.1
bitsandbytes                  0.45.3
BotChat                       0.1.0       /data2/home/xxx/project_one/BotChat-main
certifi                       2025.1.31
charset-normalizer            3.4.1
click                         8.1.8
comm                          0.2.1
contourpy                     1.3.1
cpm-kernels                   1.0.11
cycler                        0.12.1
datasets                      3.3.2
debugpy                       1.8.11
decorator                     5.1.1
dill                          0.3.8
distro                        1.9.0
einops                        0.8.1
et_xmlfile                    2.0.0
executing                     0.8.3
fastapi                       0.115.8
ffmpy                         0.5.0
filelock                      3.17.0
fonttools                     4.56.0
frozenlist                    1.5.0
fsspec                        2024.12.0
gekko                         1.2.1
gradio                        5.19.0
gradio_client                 1.7.2
h11                           0.14.0
httpcore                      1.0.7
httpx                         0.28.1
huggingface-hub               0.29.1
idna                          3.10
ipykernel                     6.29.5
ipython                       8.30.0
jedi                          0.19.2
Jinja2                        3.1.5
jiter                         0.8.2
jupyter_client                8.6.3
jupyter_core                  5.7.2
kiwisolver                    1.4.8
markdown-it-py                3.0.0
MarkupSafe                    2.1.5
matplotlib                    3.10.1
matplotlib-inline             0.1.6
mdurl                         0.1.2
mpmath                        1.3.0
multidict                     6.1.0
multiprocess                  0.70.16
nest-asyncio                  1.6.0
networkx                      3.4.2
numpy                         2.2.3
nvidia-cublas-cu12            12.4.5.8
nvidia-cuda-cupti-cu12        12.4.127
nvidia-cuda-nvrtc-cu12        12.4.127
nvidia-cuda-runtime-cu12      12.4.127
nvidia-cudnn-cu12             9.1.0.70
nvidia-cufft-cu12             11.2.1.3
nvidia-curand-cu12            10.3.5.147
nvidia-cusolver-cu12          11.6.1.9
nvidia-cusparse-cu12          12.3.1.170
nvidia-cusparselt-cu12        0.6.2
nvidia-nccl-cu12              2.21.5
nvidia-nvjitlink-cu12         12.4.127
nvidia-nvtx-cu12              12.4.127
openai                        1.64.0
openpyxl                      3.1.5
optimum                       1.24.0
orjson                        3.10.15
packaging                     24.2
pandas                        2.2.3
parso                         0.8.4
peft                          0.14.0
pexpect                       4.8.0
pillow                        11.1.0
pip                           25.0
platformdirs                  3.10.0
prompt-toolkit                3.0.43
propcache                     0.3.0
psutil                        5.9.0
ptyprocess                    0.7.0
pure-eval                     0.2.2
pyarrow                       19.0.1
pydantic                      2.10.6
pydantic_core                 2.27.2
pydub                         0.25.1
Pygments                      2.15.1
pyparsing                     3.2.1
python-dateutil               2.9.0.post0
python-multipart              0.0.20
pytz                          2025.1
PyYAML                        6.0.2
pyzmq                         26.2.0
regex                         2024.11.6
requests                      2.32.3
rich                          13.9.4
rouge                         1.0.1
ruff                          0.9.7
safehttpx                     0.1.6
safetensors                   0.5.3
seaborn                       0.13.2
semantic-version              2.10.0
sentencepiece                 0.2.0
setuptools                    75.8.0
shellingham                   1.5.4
six                           1.16.0
sniffio                       1.3.1
stack-data                    0.2.0
starlette                     0.45.3
sympy                         1.13.1
tabulate                      0.9.0
termcolor                     2.5.0
tiktoken                      0.9.0
tokenizers                    0.21.0
tomlkit                       0.13.2
torch                         2.6.0
tornado                       6.4.2
tqdm                          4.67.1
traitlets                     5.14.3
transformers                  4.49.0
transformers-stream-generator 0.0.5
triton                        3.2.0
typer                         0.15.1
typing_extensions             4.12.2
tzdata                        2025.1
urllib3                       2.3.0
uvicorn                       0.34.0
wcwidth                       0.2.5
websockets                    15.0
wheel                         0.45.1
xxhash                        3.5.0
yarl                          1.18.3


Here is my CUDA version:

CUDA Version: 12.1
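
For reference, a quick check one could run in the same environment to see which CUDA build PyTorch itself reports; this snippet is my own sketch and is not part of the original app or log.

```python
# Environment check (sketch): prints the torch build and whether a CUDA device is visible.
# Relevant because the log above prints "CUDA extension not installed." and
# "Found modules on cpu/disk ... Setting `disable_exllama=True`".
import torch

print(torch.__version__)          # 2.6.0 according to the pip list above
print(torch.version.cuda)         # CUDA version the installed torch wheel targets
print(torch.cuda.is_available())  # whether a CUDA device can actually be used
```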
