Here is my error message:

```
$ python app.py
/data2/home/xxx/miniconda3/envs/botchat/lib/python3.11/site-packages/auto_gptq/nn_modules/triton_utils/kernels.py:410: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @custom_fwd
/data2/home/xxx/miniconda3/envs/botchat/lib/python3.11/site-packages/auto_gptq/nn_modules/triton_utils/kernels.py:418: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
  @custom_bwd
/data2/home/xxx/miniconda3/envs/botchat/lib/python3.11/site-packages/auto_gptq/nn_modules/triton_utils/kernels.py:461: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @custom_fwd(cast_inputs=torch.float16)
CUDA extension not installed.
CUDA extension not installed.
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
`loss_type=None` was set in the config but it is unrecognised. Using the default loss: `ForCausalLMLoss`.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████| 3/3 [00:00<00:00, 4.38it/s]
Some weights of the model checkpoint at Qwen/Qwen-7B-Chat-Int4 were not used when initializing QWenLMHeadModel: ['transformer.h.0.mlp.w1.bias', 'transformer.h.31.attn.c_proj.bias', 'transformer.h.6.mlp.w1.bias', 'transformer.h.25.mlp.w1.bias', 'transformer.h.14.attn.c_proj.bias', 'transformer.h.7.mlp.w2.bias', 'transformer.h.13.mlp.w2.bias', 'transformer.h.22.mlp.w2.bias', 'transformer.h.21.mlp.w1.bias', 'transformer.h.27.mlp.c_proj.bias', 'transformer.h.16.mlp.w2.bias', 'transformer.h.23.mlp.c_proj.bias', 'transformer.h.12.mlp.w1.bias', 'transformer.h.30.attn.c_proj.bias', 'transformer.h.11.mlp.c_proj.bias', 'transformer.h.12.attn.c_proj.bias', 'transformer.h.31.mlp.c_proj.bias', 'transformer.h.7.attn.c_proj.bias', 'transformer.h.17.mlp.c_proj.bias', 'transformer.h.21.mlp.c_proj.bias', 'transformer.h.26.mlp.w1.bias', 'transformer.h.8.mlp.w1.bias', 'transformer.h.3.mlp.w2.bias', 'transformer.h.9.attn.c_proj.bias', 'transformer.h.11.mlp.w2.bias', 'transformer.h.29.mlp.c_proj.bias', 'transformer.h.7.mlp.w1.bias', 'transformer.h.3.mlp.c_proj.bias', 'transformer.h.1.mlp.c_proj.bias', 'transformer.h.16.attn.c_proj.bias', 'transformer.h.15.attn.c_proj.bias', 'transformer.h.22.mlp.w1.bias', 'transformer.h.17.mlp.w2.bias', 'transformer.h.25.mlp.w2.bias', 'transformer.h.27.mlp.w1.bias', 'transformer.h.0.mlp.w2.bias', 'transformer.h.1.attn.c_proj.bias', 'transformer.h.2.mlp.w1.bias', 'transformer.h.19.mlp.w2.bias', 'transformer.h.18.mlp.c_proj.bias', 'transformer.h.16.mlp.c_proj.bias', 'transformer.h.23.mlp.w1.bias', 'transformer.h.20.mlp.w1.bias', 'transformer.h.24.mlp.c_proj.bias', 'transformer.h.13.mlp.w1.bias', 'transformer.h.25.mlp.c_proj.bias', 'transformer.h.3.mlp.w1.bias', 'transformer.h.12.mlp.w2.bias', 'transformer.h.18.attn.c_proj.bias', 'transformer.h.6.attn.c_proj.bias', 'transformer.h.2.attn.c_proj.bias', 'transformer.h.0.attn.c_proj.bias', 'transformer.h.9.mlp.w1.bias', 'transformer.h.2.mlp.w2.bias', 'transformer.h.13.mlp.c_proj.bias', 'transformer.h.19.attn.c_proj.bias', 'transformer.h.4.mlp.w2.bias', 'transformer.h.18.mlp.w1.bias', 'transformer.h.16.mlp.w1.bias', 'transformer.h.20.attn.c_proj.bias', 'transformer.h.1.mlp.w1.bias', 'transformer.h.31.mlp.w1.bias', 'transformer.h.12.mlp.c_proj.bias', 'transformer.h.27.attn.c_proj.bias', 'transformer.h.4.attn.c_proj.bias', 'transformer.h.5.mlp.w1.bias', 'transformer.h.17.attn.c_proj.bias', 'transformer.h.15.mlp.w2.bias', 'transformer.h.19.mlp.c_proj.bias', 'transformer.h.14.mlp.w1.bias', 'transformer.h.20.mlp.w2.bias', 'transformer.h.5.attn.c_proj.bias', 'transformer.h.10.mlp.w1.bias', 'transformer.h.29.mlp.w2.bias', 'transformer.h.11.attn.c_proj.bias', 'transformer.h.8.mlp.c_proj.bias', 'transformer.h.28.mlp.c_proj.bias', 'transformer.h.9.mlp.c_proj.bias', 'transformer.h.23.attn.c_proj.bias', 'transformer.h.25.attn.c_proj.bias', 'transformer.h.22.mlp.c_proj.bias', 'transformer.h.4.mlp.w1.bias', 'transformer.h.24.mlp.w2.bias', 'transformer.h.6.mlp.w2.bias', 'transformer.h.8.attn.c_proj.bias', 'transformer.h.5.mlp.w2.bias', 'transformer.h.2.mlp.c_proj.bias', 'transformer.h.4.mlp.c_proj.bias', 'transformer.h.21.mlp.w2.bias', 'transformer.h.14.mlp.w2.bias', 'transformer.h.17.mlp.w1.bias', 'transformer.h.24.attn.c_proj.bias', 'transformer.h.28.attn.c_proj.bias', 'transformer.h.20.mlp.c_proj.bias', 'transformer.h.30.mlp.w1.bias', 'transformer.h.18.mlp.w2.bias', 'transformer.h.10.attn.c_proj.bias', 'transformer.h.28.mlp.w1.bias', 'transformer.h.23.mlp.w2.bias', 'transformer.h.6.mlp.c_proj.bias', 'transformer.h.9.mlp.w2.bias', 'transformer.h.10.mlp.c_proj.bias', 'transformer.h.11.mlp.w1.bias', 'transformer.h.13.attn.c_proj.bias', 'transformer.h.29.mlp.w1.bias', 'transformer.h.1.mlp.w2.bias', 'transformer.h.3.attn.c_proj.bias', 'transformer.h.22.attn.c_proj.bias', 'transformer.h.28.mlp.w2.bias', 'transformer.h.30.mlp.c_proj.bias', 'transformer.h.31.mlp.w2.bias', 'transformer.h.14.mlp.c_proj.bias', 'transformer.h.26.mlp.c_proj.bias', 'transformer.h.24.mlp.w1.bias', 'transformer.h.7.mlp.c_proj.bias', 'transformer.h.30.mlp.w2.bias', 'transformer.h.15.mlp.c_proj.bias', 'transformer.h.5.mlp.c_proj.bias', 'transformer.h.27.mlp.w2.bias', 'transformer.h.21.attn.c_proj.bias', 'transformer.h.10.mlp.w2.bias', 'transformer.h.8.mlp.w2.bias', 'transformer.h.26.mlp.w2.bias', 'transformer.h.0.mlp.c_proj.bias', 'transformer.h.26.attn.c_proj.bias', 'transformer.h.19.mlp.w1.bias', 'transformer.h.15.mlp.w1.bias', 'transformer.h.29.attn.c_proj.bias']
- This IS expected if you are initializing QWenLMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing QWenLMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Found modules on cpu/disk. Using Exllama/Exllamav2 backend requires all the modules to be on GPU. Setting `disable_exllama=True`
Traceback (most recent call last):
  File "/data2/home/xxx/project_one/BotChat-main/app.py", line 22, in <module>
    'chatglm2-6b-int4': HFChatModel('/data2/home/xxx/project_one/BotChat-main/THUDM/chatglm2-6b-int4', system_prompt=default_system_prompt),
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/home/xxx/project_one/BotChat-main/botchat/chat_api/hf_chat.py", line 97, in __init__
    model = LoadModel.from_pretrained(model_path, trust_remote_code=True, device_map='cpu')
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/home/xxx/miniconda3/envs/botchat/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 567, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.chatglm2-6b-int4.configuration_chatglm.ChatGLMConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, DiffLlamaConfig, ElectraConfig, Emu3Config, ErnieConfig, FalconConfig, FalconMambaConfig, FuyuConfig, GemmaConfig, Gemma2Config, GitConfig, GlmConfig, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeSharedConfig, HeliumConfig, JambaConfig, JetMoeConfig, LlamaConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MllamaConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, PhimoeConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, ZambaConfig, Zamba2Config, QWenConfig.
```
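For what it's worth, the `ValueError` comes from `AutoModelForCausalLM` (the `LoadModel` alias in hf_chat.py), which does not recognize `ChatGLMConfig`: the ChatGLM checkpoints map their remote code to the generic `AutoModel` class rather than to `AutoModelForCausalLM`, which is also how the chatglm2-6b model card loads them. A minimal sketch of that workaround, where only the local path is copied from the traceback above and the rest is my assumption, not BotChat's actual code:

```python
from transformers import AutoModel, AutoTokenizer

# Path taken from the traceback above.
model_path = '/data2/home/xxx/project_one/BotChat-main/THUDM/chatglm2-6b-int4'

# chatglm2's config maps its remote code to AutoModel, not to
# AutoModelForCausalLM, so the generic causal-LM auto class raises
# "Unrecognized configuration class ... ChatGLMConfig".
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True, device_map='cpu')
```

Even with this change, the checkpoint's bundled modeling code was written against a much older transformers release than the 4.49.0 listed below, so a version mismatch there may surface as the next failure.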
Here is my system version:

```
NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.6 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
```
Here is the version information of my Python libraries:

```
accelerate                     1.4.0
aiofiles                       23.2.1
aiohappyeyeballs               2.4.6
aiohttp                        3.11.13
aiosignal                      1.3.2
annotated-types                0.7.0
anyio                          4.8.0
asttokens                      2.0.5
attrs                          25.1.0
auto_gptq                      0.7.1
bitsandbytes                   0.45.3
BotChat                        0.1.0        /data2/home/xxx/project_one/BotChat-main
certifi                        2025.1.31
charset-normalizer             3.4.1
click                          8.1.8
comm                           0.2.1
contourpy                      1.3.1
cpm-kernels                    1.0.11
cycler                         0.12.1
datasets                       3.3.2
debugpy                        1.8.11
decorator                      5.1.1
dill                           0.3.8
distro                         1.9.0
einops                         0.8.1
et_xmlfile                     2.0.0
executing                      0.8.3
fastapi                        0.115.8
ffmpy                          0.5.0
filelock                       3.17.0
fonttools                      4.56.0
frozenlist                     1.5.0
fsspec                         2024.12.0
gekko                          1.2.1
gradio                         5.19.0
gradio_client                  1.7.2
h11                            0.14.0
httpcore                       1.0.7
httpx                          0.28.1
huggingface-hub                0.29.1
idna                           3.10
ipykernel                      6.29.5
ipython                        8.30.0
jedi                           0.19.2
Jinja2                         3.1.5
jiter                          0.8.2
jupyter_client                 8.6.3
jupyter_core                   5.7.2
kiwisolver                     1.4.8
markdown-it-py                 3.0.0
MarkupSafe                     2.1.5
matplotlib                     3.10.1
matplotlib-inline              0.1.6
mdurl                          0.1.2
mpmath                         1.3.0
multidict                      6.1.0
multiprocess                   0.70.16
nest-asyncio                   1.6.0
networkx                       3.4.2
numpy                          2.2.3
nvidia-cublas-cu12             12.4.5.8
nvidia-cuda-cupti-cu12         12.4.127
nvidia-cuda-nvrtc-cu12         12.4.127
nvidia-cuda-runtime-cu12       12.4.127
nvidia-cudnn-cu12              9.1.0.70
nvidia-cufft-cu12              11.2.1.3
nvidia-curand-cu12             10.3.5.147
nvidia-cusolver-cu12           11.6.1.9
nvidia-cusparse-cu12           12.3.1.170
nvidia-cusparselt-cu12         0.6.2
nvidia-nccl-cu12               2.21.5
nvidia-nvjitlink-cu12          12.4.127
nvidia-nvtx-cu12               12.4.127
openai                         1.64.0
openpyxl                       3.1.5
optimum                        1.24.0
orjson                         3.10.15
packaging                      24.2
pandas                         2.2.3
parso                          0.8.4
peft                           0.14.0
pexpect                        4.8.0
pillow                         11.1.0
pip                            25.0
platformdirs                   3.10.0
prompt-toolkit                 3.0.43
propcache                      0.3.0
psutil                         5.9.0
ptyprocess                     0.7.0
pure-eval                      0.2.2
pyarrow                        19.0.1
pydantic                       2.10.6
pydantic_core                  2.27.2
pydub                          0.25.1
Pygments                       2.15.1
pyparsing                      3.2.1
python-dateutil                2.9.0.post0
python-multipart               0.0.20
pytz                           2025.1
PyYAML                         6.0.2
pyzmq                          26.2.0
regex                          2024.11.6
requests                       2.32.3
rich                           13.9.4
rouge                          1.0.1
ruff                           0.9.7
safehttpx                      0.1.6
safetensors                    0.5.3
seaborn                        0.13.2
semantic-version               2.10.0
sentencepiece                  0.2.0
setuptools                     75.8.0
shellingham                    1.5.4
six                            1.16.0
sniffio                        1.3.1
stack-data                     0.2.0
starlette                      0.45.3
sympy                          1.13.1
tabulate                       0.9.0
termcolor                      2.5.0
tiktoken                       0.9.0
tokenizers                     0.21.0
tomlkit                        0.13.2
torch                          2.6.0
tornado                        6.4.2
tqdm                           4.67.1
traitlets                      5.14.3
transformers                   4.49.0
transformers-stream-generator  0.0.5
triton                         3.2.0
typer                          0.15.1
typing_extensions              4.12.2
tzdata                         2025.1
urllib3                        2.3.0
uvicorn                        0.34.0
wcwidth                        0.2.5
websockets                     15.0
wheel                          0.45.1
xxhash                         3.5.0
yarl                           1.18.3
```

CUDA Version: 12.1