
After rewriting the functions in FunAudioLLM-APP/voice_chat/app.py as API endpoints, memory and disk usage max out and the machine freezes #9

Open
@JV-X

Description


Hi, I set up the FunAudioLLM-APP environment under WSL2 on Windows 11 and successfully ran the app.py file under voice_chat. I then copied the text-to-speech and speech-to-text functions out of app.py into a server.py and tried to rewrite them as FastAPI or Flask endpoints, and ran into the following problems:

  1. Resource usage: memory usage climbs to 100%; once memory is full, disk usage also maxes out, after which VS Code loses its connection to WSL and the machine becomes very slow to respond. GPU usage does not rise at all during this, even though I have set "CUDA_VISIBLE_DEVICES" = "0".
  2. I added some debug output: at line 52 of FunAudioLLM-APP/cosyvoice/cosyvoice/cli/frontend.py I inserted a traceback.print_stack() call (see the sketch after this list), then ran my script. I found that this method is called twice: the first call is the normal execution, but the second one comes from a process that gets started somewhere and runs to this point.
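
The instrumentation itself is just a stack dump at the top of CosyVoiceFrontEnd.__init__; a minimal sketch of what I added (the constructor signature here is only illustrative, not the exact upstream code):

import traceback

class CosyVoiceFrontEnd:
    def __init__(self, *args, **kwargs):
        # dump the full call stack every time the frontend is constructed,
        # so the unexpected second construction can be traced back to its caller
        traceback.print_stack()
        # ... original __init__ body continues here ...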

The log:

(funaudio) hygx@hygx:~/code/funaudiollm-app/FunAudioLLM-APP/voice_chat$ python server.py
2025-02-18 10:24:57,118 - modelscope - INFO - PyTorch version 2.0.1+cu118 Found.
2025-02-18 10:24:57,118 - modelscope - INFO - Loading ast index from /home/hygx/.cache/modelscope/ast_indexer
2025-02-18 10:24:57,264 - modelscope - INFO - Loading done! Current index file version is 1.15.0, with md5 20c375857ea53f3bb9f252cbf4c9cf58 and a total number of 980 components indexed
transformer is not installed, please install it if you want to use related modules
/home/hygx/anaconda3/envs/funaudio/lib/python3.8/site-packages/torch/_jit_internal.py:726: FutureWarning: ignore(True) has been deprecated. TorchScript will now drop the function call on compilation. Use torch.jit.unused now. {}
  warnings.warn(
/home/hygx/anaconda3/envs/funaudio/lib/python3.8/site-packages/diffusers/models/lora.py:393: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
  deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
2025-02-18 10:25:03.834324912 [W:onnxruntime:, session_state.cc:1162 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-02-18 10:25:03.834361929 [W:onnxruntime:, session_state.cc:1164 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
00000000000000000000000000000000000000000000000000000000000000000000
111111111111111111111111111
  File "server.py", line 52, in <module>
    cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-Instruct')
  File "/home/hygx/code/funaudiollm-app/FunAudioLLM-APP/voice_chat/../cosyvoice/cosyvoice/cli/cosyvoice.py", line 30, in __init__
    self.frontend = CosyVoiceFrontEnd(configs['get_tokenizer'],
  File "/home/hygx/code/funaudiollm-app/FunAudioLLM-APP/voice_chat/../cosyvoice/cosyvoice/cli/frontend.py", line 54, in __init__
    traceback.print_stack()
load leagacy transf breakmodel
load leagacy transf breakmodel
text.cc: festival_Text_init
open voice lang map failed
break model index not valid
Loading remote code successfully: ./sensevoice/model.py
2025-02-18 10:25:11,448 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2025-02-18 10:25:11,448 - modelscope - INFO - Use user-specified model revision: master
INFO:     Will watch for changes in these directories: ['/home/hygx/code/funaudiollm-app/FunAudioLLM-APP/voice_chat']
INFO:     Uvicorn running on http://0.0.0.0:5987 (Press CTRL+C to quit)
INFO:     Started reloader process [338184] using WatchFiles
2025-02-18 10:25:13,625 - modelscope - INFO - PyTorch version 2.0.1+cu118 Found.
2025-02-18 10:25:13,625 - modelscope - INFO - Loading ast index from /home/hygx/.cache/modelscope/ast_indexer
2025-02-18 10:25:13,685 - modelscope - INFO - Loading done! Current index file version is 1.15.0, with md5 20c375857ea53f3bb9f252cbf4c9cf58 and a total number of 980 components indexed
transformer is not installed, please install it if you want to use related modules
/home/hygx/anaconda3/envs/funaudio/lib/python3.8/site-packages/torch/_jit_internal.py:726: FutureWarning: ignore(True) has been deprecated. TorchScript will now drop the function call on compilation. Use torch.jit.unused now. {}
  warnings.warn(
/home/hygx/anaconda3/envs/funaudio/lib/python3.8/site-packages/diffusers/models/lora.py:393: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
  deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
2025-02-18 10:25:20.475357737 [W:onnxruntime:, session_state.cc:1162 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-02-18 10:25:20.475410315 [W:onnxruntime:, session_state.cc:1164 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
00000000000000000000000000000000000000000000000000000000000000000000
111111111111111111111111111
  File "<string>", line 1, in <module>
  File "/home/hygx/anaconda3/envs/funaudio/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/hygx/anaconda3/envs/funaudio/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/hygx/anaconda3/envs/funaudio/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/hygx/anaconda3/envs/funaudio/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/hygx/anaconda3/envs/funaudio/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/hygx/anaconda3/envs/funaudio/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/hygx/anaconda3/envs/funaudio/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/hygx/code/funaudiollm-app/FunAudioLLM-APP/voice_chat/server.py", line 52, in <module>
    cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-Instruct')
  File "/home/hygx/code/funaudiollm-app/FunAudioLLM-APP/voice_chat/../cosyvoice/cosyvoice/cli/cosyvoice.py", line 30, in __init__
    self.frontend = CosyVoiceFrontEnd(configs['get_tokenizer'],
  File "/home/hygx/code/funaudiollm-app/FunAudioLLM-APP/voice_chat/../cosyvoice/cosyvoice/cli/frontend.py", line 54, in __init__
    traceback.print_stack()
load leagacy transf breakmodel
load leagacy transf breakmodel
text.cc: festival_Text_init
open voice lang map failed
break model index not valid

My code:

import re

import torch
import os

os.environ['CUDA_VISIBLE_DEVICES'] = "0"

from fastapi import FastAPI, File, UploadFile, HTTPException
from typing import Optional
from fastapi.responses import StreamingResponse, FileResponse, JSONResponse
from pydantic import BaseModel
import uvicorn

from http import HTTPStatus
import dashscope
from dashscope import Generation
from dashscope.api_entities.dashscope_response import Role
from typing import List, Optional, Tuple, Dict
from uuid import uuid4
from modelscope import HubApi
import torchaudio
import sys
sys.path.insert(1, "../cosyvoice")
sys.path.insert(1, "../sensevoice")
sys.path.insert(1, "../cosyvoice/third_party/AcademiCodec")
sys.path.insert(1, "../cosyvoice/third_party/Matcha-TTS")
sys.path.insert(1, "../")
from utils.rich_format_small import format_str_v2
from cosyvoice.cli.cosyvoice import CosyVoice
from cosyvoice.utils.file_utils import load_wav

from funasr import AutoModel

app = FastAPI()


class TextToSpeechRequest(BaseModel):
    text: str


REPLY_FILES_DIR = "reply"
TMP_FILES_DIR = "tmp"

os.makedirs(REPLY_FILES_DIR, exist_ok=True)
os.makedirs(TMP_FILES_DIR, exist_ok=True)


dashscope.api_key = "sk-xxxxxx"

speaker_name = '中文女'
cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M-Instruct')

sense_voice_model = AutoModel(model="iic/SenseVoiceSmall",
                  vad_model="fsmn-vad",
                  vad_kwargs={"max_single_segment_time": 30000},
                  trust_remote_code=True, device="cuda:0", remote_code="./sensevoice/model.py")


def text_to_speech(text):
    pattern = r"生成风格:\s*([^;]+);播报内容:\s*(.+)"
    match = re.search(pattern, text)
    if match:
        style = match.group(1).strip()
        content = match.group(2).strip()
        tts_text = f"{style}<endofprompt>{content}"
        print(f"生成风格: {style}")
        print(f"播报内容: {content}")
    else:
        print("No match found")
        tts_text = text

    # text_list = preprocess(text)
    text_list = [tts_text]
    for i in text_list:
      output = cosyvoice.inference_sft(i, speaker_name)
      yield (22050, output['tts_speech'].numpy().flatten())


def asr(file_path):
    '''file_path can be a URL or a local path to a wav file'''
    # samplerate, data = audio
    # file_path = f"./tmp/asr_{uuid4()}.wav"

    # torchaudio.save(file_path, torch.from_numpy(data).unsqueeze(0), samplerate)

    res = sense_voice_model.generate(
        input=file_path,
        cache={},
        language="zh",
        text_norm="woitn",
        batch_size_s=0,
        batch_size=1,
        hotword='华衣共享'
    )
    
    return res[0]['text']

@app.post("/tts/")
async def tts(request: TextToSpeechRequest):
    """
    Convert the text to speech and save it as a WAV file.
    Return the path to the audio file.
    """
    print(11111111111111111111111111111111111)

    text = request.text.strip()
    if not text:
        raise HTTPException(status_code=400, detail="文本不能为空")
    
    # text_to_speech is a generator, so collect every (sample_rate, chunk) it
    # yields and concatenate the chunks before saving
    sample_rate = 22050
    chunks = []
    for sample_rate, chunk in text_to_speech(text):
        chunks.append(torch.from_numpy(chunk))
    speech_data = torch.cat(chunks).unsqueeze(0)

    file_name = "tts_tmp_file.wav"
    file_path = os.path.join(os.getcwd(), REPLY_FILES_DIR, file_name)
    print(f'file path is:{file_path}')
    # write the audio to file (torchaudio expects a (channels, frames) tensor)
    torchaudio.save(file_path, speech_data, sample_rate)

    # make sure the file exists
    if not os.path.exists(file_path):
        raise HTTPException(status_code=500, detail="音频生成失败")

    # return the audio file
    return FileResponse(file_path, media_type="audio/wav", filename="output.wav")

@app.post("/asr/")
async def funaudio_asr(file: UploadFile = File(...)):
    print(11111111111111111111111111111111111)
    # save the uploaded audio to a temporary file
    temp_audio_path = "temp_audio.wav"
    with open(temp_audio_path, "wb") as temp_audio_file:
        temp_audio_file.write(file.file.read())

    text = asr(temp_audio_path)
    # delete the temporary file
    os.remove(temp_audio_path)

    # return the recognition result
    return JSONResponse(content={"text": text}, status_code=200)

if __name__ == "__main__":
    # uvicorn.run("server:app", host="0.0.0.0", port=7999, reload=True, ssl_keyfile="./key.pem", ssl_certfile="./cert.pem")
    uvicorn.run("server:app", host="0.0.0.0", port=5987, reload=True)

Put this script at FunAudioLLM-APP/voice_chat/server.py, fill in the api_key, and then simply running python server.py will reproduce this bug.
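
For reference, this is how I intended the two endpoints to be called once the server is up (a minimal client sketch; the host/port come from the uvicorn.run call above, and the input text and test.wav file name are just placeholders):

import requests

# /tts/ expects JSON with a "text" field and returns a WAV file
resp = requests.post("http://localhost:5987/tts/", json={"text": "你好"})
with open("output.wav", "wb") as f:
    f.write(resp.content)

# /asr/ expects a multipart file upload and returns JSON with the recognized text
with open("test.wav", "rb") as f:
    resp = requests.post("http://localhost:5987/asr/", files={"file": f})
print(resp.json()["text"])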
