Installing Fish Speech Text-to-Speech Locally on Ubuntu

Installing the Fish Speech text-to-speech project on Ubuntu, and fixing the compiler version mismatch that CMake runs into during the build.

  • defagi

Installing Fish Speech on Ubuntu 22.04

Links: Fish Speech online demo, Fish Speech GitHub (https://github.com/fishaudio/fish-speech)

GPU memory: 4 GB (for inference), 8 GB (for fine-tuning)


git clone https://github.com/fishaudio/fish-speech.git

cd fish-speech

# Create a Python 3.10 virtual environment
conda create -n fish-speech python=3.10
conda activate fish-speech

# Install PyTorch
pip3 install torch torchvision torchaudio

# Install fish-speech
pip3 install -e .

# (Ubuntu / Debian users) Install sox; prefix with sudo if you are not root
apt install libsox-dev

Installation error

Running pip3 install -e . fails with the following error:

subprocess.CalledProcessError: Command '['cmake', '/tmp/pip-install-db7hqr_7/samplerate_db8da1023e214d099ff3d70d7d0951fc', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/pip-install-db7hqr_7/samplerate_db8da1023e214d099ff3d70d7d0951fc/build/lib.linux-x86_64-cpython-310/', '-DPYTHON_EXECUTABLE=/home/vaitk/miniconda3/envs/fish-speech/bin/python', '-DCMAKE_BUILD_TYPE=Release', '-DPACKAGE_VERSION_INFO=0.2.1']' returned non-zero exit status 1.
[end of output]
  
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for samplerate
Successfully built fish-speech
Failed to build samplerate
ERROR: Could not build wheels for samplerate, which is required to install pyproject.toml-based projects

To see the detailed error log:

pip3 install -e . --verbose

Error message: judging from the verbose output, the problem appears to be an LTO (link-time optimization) version mismatch.

lto1: fatal error: bytecode stream in file ‘CMakeFiles/python-samplerate.dir/src/samplerate.cpp.o’ generated with LTO version 11.3 instead of the expected 12.0

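It looks like the compiler version CMake used does not match the compiler version installed on the system: the object file was written with LTO version 11.3, but the lto1 step expected 12.0.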
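To confirm the mismatch without scrolling through the whole verbose log, the two version numbers can be pulled out of the lto1 line. The following is a minimal sketch; the function name lto_versions is a hypothetical helper, not part of pip or Fish Speech, and it only assumes the error wording shown above.

```shell
# Hypothetical helper: read a build log on stdin and print the
# "generated" and "expected" LTO versions from any lto1 mismatch line.
lto_versions() {
    sed -nE 's/.*generated with LTO version ([0-9.]+) instead of the expected ([0-9.]+).*/\1 \2/p'
}
```

Piping the verbose install into it, for example pip3 install -e . --verbose 2>&1 | lto_versions, should print the two versions side by side when the error occurs and nothing otherwise.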

Solution

Inside a Conda environment, the build uses the compilers provided by Conda. Installing the compilers package from conda-forge keeps the versions consistent:

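conda install -c conda-forge compilers
# Export the variables so pip's CMake subprocess can see them. The binary name
# may differ by package version; check with: ls $CONDA_PREFIX/bin/*-gcc
export CC=$CONDA_PREFIX/bin/x86_64-conda_cos6-linux-gnu-gcc
export CXX=$CONDA_PREFIX/bin/x86_64-conda_cos6-linux-gnu-g++
# For a manual CMake build, run from a build directory:
cmake -DCMAKE_C_COMPILER=$CC -DCMAKE_CXX_COMPILER=$CXX ..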
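The compiler file name under $CONDA_PREFIX/bin can vary with the channel and package version, so hard-coding it may point at a file that does not exist. Below is a rough sketch that looks the compilers up instead. The function name find_conda_compilers and the *-linux-gnu-gcc / *-linux-gnu-g++ glob patterns are assumptions for illustration, not something Conda itself provides.

```shell
# Hypothetical helper: print export lines for the first executable
# *-linux-gnu-gcc and *-linux-gnu-g++ found under <prefix>/bin.
# Returns non-zero if either compiler is missing.
find_conda_compilers() {
    prefix="$1"
    cc=""
    cxx=""
    for f in "$prefix"/bin/*-linux-gnu-gcc; do
        if [ -x "$f" ]; then cc="$f"; break; fi
    done
    for f in "$prefix"/bin/*-linux-gnu-g++; do
        if [ -x "$f" ]; then cxx="$f"; break; fi
    done
    if [ -z "$cc" ] || [ -z "$cxx" ]; then
        return 1
    fi
    printf 'export CC=%s\nexport CXX=%s\n' "$cc" "$cxx"
}
```

With the environment activated, eval "$(find_conda_compilers "$CONDA_PREFIX")" sets both variables in the current shell, after which pip3 install -e . can be run again so the samplerate build picks up the Conda compilers.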

Downloading the model

cd fish-speech 
huggingface-cli download fishaudio/fish-speech-1.2 --local-dir checkpoints/fish-speech-1.2

WebUI inference


You can start the WebUI with the following command:

python -m tools.webui \
    --llama-checkpoint-path "checkpoints/fish-speech-1.2" \
    --decoder-checkpoint-path "checkpoints/fish-speech-1.2/firefly-gan-vq-fsq-4x1024-42hz-generator.pth" \
    --decoder-config-name firefly_gan_vq
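If the download was interrupted, the launch command can fail because a checkpoint file is missing. A small pre-flight check can catch that early. This is only a sketch: check_checkpoints is a hypothetical helper, and the only file name it is given here is the decoder path already used in the command above.

```shell
# Hypothetical helper: report which of the named files exist under a
# checkpoint directory. Returns non-zero if any file is missing.
check_checkpoints() {
    dir="$1"
    shift
    missing=0
    for name in "$@"; do
        if [ -f "$dir/$name" ]; then
            echo "ok       $name"
        else
            echo "missing  $name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_checkpoints checkpoints/fish-speech-1.2 firefly-gan-vq-fsq-4x1024-42hz-generator.pth reports whether the decoder weights are in place before the WebUI is started.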
