Kaldi Environment Setup and Aishell Training
- 2025-09-14 10:33:02

I. Project Source
Code source: kaldi-asr/kaldi — the official location of the Kaldi project (GitHub).
Official docs: Kaldi: The build process (how Kaldi is compiled) (kaldi-asr.org).
Climbing out of the pits my labmate @李思成 already filled in his CSDN post "kaldi环境配置与aishell实践", I promptly stepped into new ones. Kaldi is not end-user software and ships no installer package; installing Kaldi means compiling its code and preparing the necessary tools and runtime environment.
II. Environment Setup
1. Installing the system development-library dependencies
Before compiling Kaldi, check for and install the system development libraries it depends on by running the extras/check_dependencies.sh script from Kaldi's tools directory.
```
~/project/kaldi/tools$ extras/check_dependencies.sh
extras/check_dependencies.sh: subversion is not installed
extras/check_dependencies.sh: python2.7 is not installed
extras/check_dependencies.sh: Some prerequisites are missing; install them using the command:
  sudo apt-get install subversion python2.7
```
The check showed I needed to install python2.7, python3, and subversion.
```shell
cd /home/chy524/kaldi/tools/
# The dependencies need python2.7 and python3; create a python folder under
# tools/ to hold the symlinks.
mkdir -p python/
# python2.7 and python3 are not installed yet, so install them under $HOME.
cd ~
mkdir -p ~/python
# Download python2.7 and python3.9
wget https://www.python.org/ftp/python/2.7.18/Python-2.7.18.tgz
wget https://www.python.org/ftp/python/3.9.0/Python-3.9.0.tgz
# Unpack the source tarballs
tar -xvzf Python-2.7.18.tgz
tar -xvzf Python-3.9.0.tgz
# Configure, build, and install each
cd Python-2.7.18
./configure --prefix=$HOME/python/python2.7
make
make install
cd ../Python-3.9.0
./configure --prefix=$HOME/python/python3
make
make install
# After installing, update the environment: open ~/.bashrc and append
export PATH=$HOME/python/python2.7/bin:$PATH
export PATH=$HOME/python/python3/bin:$PATH
# Save and close the file, then run
source ~/.bashrc
# Back to the symlink folder
cd /home/chy524/kaldi/tools/python/
# Create the version-named symlinks
ln -s ~/python/python2.7/bin/python ~/python/python2.7/bin/python2
ln -s ~/python/python3/bin/python ~/python/python3/bin/python3
```
Next, install subversion — and I have no sudo rights.
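As a sanity check that version-named links behave as intended, here is a tiny self-contained sketch (temporary paths only, not the real install) of creating a `python2`-style symlink and confirming it resolves to the intended interpreter:

```shell
# Illustrative only: a fake interpreter in a temp dir, linked under a
# version-specific name, mirroring the ln -s calls above.
set -e
demo=$(mktemp -d)
mkdir -p "$demo/bin"
printf '#!/bin/sh\necho 2.7.18\n' > "$demo/bin/python"  # stand-in for the real binary
chmod +x "$demo/bin/python"
ln -s "$demo/bin/python" "$demo/bin/python2"            # version-named link
"$demo/bin/python2"                                     # prints 2.7.18
rm -rf "$demo"
```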
```shell
# Download subversion
wget https://archive.apache.org/dist/subversion/subversion-1.14.1.tar.gz
# Unpack the source
tar -xvzf subversion-1.14.1.tar.gz
cd subversion-1.14.1
# Install Subversion. This failed for me with "error: no suitable APR found",
# so none of the following steps could run yet.
#./configure --prefix=$HOME/software/subversion
# make
# make install
# Add Subversion to the environment
# export PATH=$HOME/software/subversion/bin:$PATH
# Reload the environment
# source ~/.bashrc
```
The install failed with "error: no suitable APR found": Subversion's configure could not find the Apache Portable Runtime (APR) library.
```
configure: Apache Portable Runtime (APR) library configuration
checking for APR... no
configure: WARNING: APR not found
The Apache Portable Runtime (APR) library cannot be found.
Please install APR on this system and configure Subversion
with the appropriate --with-apr option.

You probably need to do something similar with the Apache Portable
Runtime Utility (APRUTIL) library and then configure Subversion
with both the --with-apr and --with-apr-util options.

configure: error: no suitable APR found
```
So the APR and APR Utility (APRUTIL) libraries must be installed before Subversion will build; their sources can be downloaded from the official Apache APR website.
```shell
# Download APR and APRUTIL into ~/software
cd ~/software/
wget https://downloads.apache.org/apr/apr-1.7.0.tar.gz
wget https://downloads.apache.org/apr/apr-util-1.6.1.tar.gz
# Unpack
tar -xvf apr-1.7.0.tar.gz
tar -xvf apr-util-1.6.1.tar.gz
# Build and install APR, then APRUTIL
cd apr-1.7.0
./configure --prefix=$HOME/software/apr
make
make install
cd ../apr-util-1.6.1
./configure --prefix=$HOME/software/apr-util --with-apr=$HOME/software/apr
# make for apr-util failed — the expat.h header is missing — so these
# could not run yet.
# make
# make install
```
Running make in apr-util failed because the expat.h header was missing.
```
-I/home/chy524/software/apr-util-1.6.3/include/private
-I/home/chy524/software/apr/include/apr-1 -o xml/apr_xml.lo -c xml/apr_xml.c && touch xml/apr_xml.lo
xml/apr_xml.c:35:10: fatal error: expat.h: No such file or directory
   35 | #include <expat.h>
      |          ^~~~~~~~~
compilation terminated.
make[1]: *** [/home/chy524/software/apr-util-1.6.3/build/rules.mk:207: xml/apr_xml.lo] Error 1
make[1]: Leaving directory '/s6home/chy524/software/apr-util-1.6.3'
make: *** [/home/chy524/software/apr-util-1.6.3/build/rules.mk:119: all-recursive] Error 1
```
expat.h is the header of the Expat XML parsing library, which APR-UTIL depends on for some features (e.g. XML handling), so Expat must be installed.
```shell
# Install Expat under ~/software
cd ~/software/
wget https://github.com/libexpat/libexpat/releases/download/R_2_4_9/expat-2.4.9.tar.bz2
# Unpack and build Expat
tar -xvf expat-2.4.9.tar.bz2
cd expat-2.4.9
# Build and install into the user directory
./configure --prefix=$HOME/software/expat/
make
make install
# Point APR-UTIL at this Expat
cd ../apr-util-1.6.3
./configure --prefix=$HOME/software/apr-util --with-apr=$HOME/software/apr --with-expat=$HOME/software/expat
make
make install
```
Going back to the Subversion build then surfaced two more missing libraries, serf and sqlite3; the errors look much like the ones above, so I won't paste them and just fixed each in turn. For serf, download the sources from the official "Download Apache Serf Sources" page; note that serf is a library, not an executable, so it has no bin/ directory.
```shell
# Install the serf library
cd ~/software/
wget https://archive.apache.org/dist/serf/serf-1.3.10.tar.bz2
tar -xvf serf-1.3.10.tar.bz2
cd serf-1.3.10
# serf has no configure script, so it cannot be built like the libraries
# above. Per its README, serf builds with SCons instead — the SConstruct
# file in the source tree is SCons's build-rules file.
# Download scons-local
mkdir -p ~/software/scons-local
cd ~/software/scons-local
wget http://prdownloads.sourceforge.net/scons/scons-local-2.3.0.tar.gz
tar -xvzf scons-local-2.3.0.tar.gz
# Check the install succeeded
scons --version
# Build Apache Serf.
# Find your OpenSSL path with `openssl version -a`; mine is under
# /home/chy524/miniconda3/ssl (install OpenSSL if you have none), so
# adjust the OPENSSL path accordingly.
cd ~/software/serf-1.3.10
scons APR=$HOME/software/apr \
      APU=$HOME/software/apr-util \
      OPENSSL=$HOME/miniconda3 \
      PREFIX=$HOME/software/serf
# Install serf
scons PREFIX=$HOME/software/serf install
echo 'export PATH=$HOME/software/serf/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
```
Next, install the sqlite3 library.
```shell
cd ~/software/
# Download the SQLite amalgamation source
wget https://www.sqlite.org/2015/sqlite-amalgamation-3081101.zip
# Unpack
unzip sqlite-amalgamation-3081101.zip
# Rename the folder to sqlite-amalgamation inside the Subversion tree
mv sqlite-amalgamation-3081101 /home/chy524/software/subversion-1.14.1/sqlite-amalgamation
```
With sqlite3 in place, the next Subversion configure run complained that the LZ4 library was missing; as the error message itself suggests, just add --with-lz4=internal.
```shell
cd ~/software/subversion-1.14.1
./configure --prefix=$HOME/software/subversion --with-apr=$HOME/software/apr --with-apr-util=$HOME/software/apr-util --with-serf=$HOME/software/serf --with-sqlite=$HOME/software/sqlite-amalgamation
make
# make install
```
```
......
checking for inflate in -lz... yes
configure: lz4 configuration without pkg-config
checking for LZ4_compress_default in -llz4... no
configure: error: Subversion requires LZ4 >= r129, or use --with-lz4=internal
```
One more library later needed the same =internal treatment; once added, configure ran without errors and everything could finally be installed. The final commands were:
```shell
cd ~/software/subversion-1.14.1
./configure --prefix=$HOME/software/subversion --with-apr=$HOME/software/apr --with-apr-util=$HOME/software/apr-util --with-serf=$HOME/software/serf --with-sqlite=$HOME/software/sqlite-amalgamation --with-lz4=internal --with-(I forget which one)=internal
make
make install
```
That completes installing the system development-library dependencies.
2. Installing the third-party tool dependencies
Without access to GitHub, download the third-party tool archives into the tools directory, adjust the library version numbers in tools/Makefile, and then run make to compile.
Archive link: pan.baidu.com/s/1la0nmku2zBxa1lRKiUTWcw?pwd=flho
Edit the version numbers in tools/Makefile:
```makefile
# SHELL += -x

CXX ?= g++
CC ?= gcc        # used for sph2pipe
# CXX = clang++  # Uncomment these lines...
# CC = clang     # ...to build with Clang.

WGET ?= wget

OPENFST_VERSION ?= 1.7.2
CUB_VERSION ?= 1.8.0
# No '?=', since there exists only one version of sph2pipe.
SPH2PIPE_VERSION = 2.5
# SCTK official repo does not have version tags. Here's the mapping:
# 2.4.9 = 659bc36; 2.4.10 = d914e1b; 2.4.11 = 20159b5.
SCTK_GITHASH = 2.4.12
```
Then run make to compile:
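The `?=` lines matter here: they assign a value only when the variable is not already set, which is why editing (or overriding) `OPENFST_VERSION` works at all. A rough shell analogue of that default-with-override behaviour, using made-up values:

```shell
# Shell analogue of Make's '?=': use the caller's value if set, else a default.
pick_openfst_version() {
  echo "${OPENFST_VERSION:-1.7.2}"   # mirrors OPENFST_VERSION ?= 1.7.2
}
pick_openfst_version                        # prints 1.7.2 (the default)
OPENFST_VERSION=1.8.2 pick_openfst_version  # prints 1.8.2 (the override wins)
```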
```shell
cd ~/project/kaldi/tools/
make
```
3. Installing the language-model tools IRSTLM / SRILM / kaldi_lm
Install IRSTLM:
```shell
cd ~/project/kaldi/tools/
# In install_irstlm.sh, change github.com/irstlm-team/irstlm.git to the
# mirror gitclone.com/github.com/irstlm-team/irstlm.git
extras/install_irstlm.sh
# After installation, load the environment variables
source ../tools/env.sh
```
Install kaldi_lm:
```shell
cd ~/project/kaldi/tools/
# In install_kaldi_lm.sh, change github.com/danpovey/kaldi_lm.git to the
# mirror gitclone.com/github.com/danpovey/kaldi_lm.git
extras/install_kaldi_lm.sh
# After installation, load the environment variables
source ../tools/env.sh
```
Install SRILM manually:
```shell
# Install the libLBFGS dependency before SRILM itself
cd ~/project/kaldi/tools/
extras/install_liblbfgs.sh
# Download the archive into tools/
# pan.baidu.com/s/1la0nmku2zBxa1lRKiUTWcw?pwd=flho
# Unpack and rename
tar -xvzf srilm-1.7.3.tar.gz
mv srilm-1.7.3/ srilm/
# Replace the contents of install_srilm.sh with the script below, then run
extras/install_srilm.sh
# After installation, load the environment variables
source ../tools/env.sh
```
The modified install_srilm.sh:
```shell
#!/usr/bin/env bash

current_path=`pwd`
current_dir=`basename "$current_path"`

if [ "tools" != "$current_dir" ]; then
    echo "You should run this script in tools/ directory!!"
    exit 1
fi

if [ ! -d liblbfgs-1.10 ]; then
    echo Installing libLBFGS library to support MaxEnt LMs
    bash extras/install_liblbfgs.sh || exit 1
fi

! command -v gawk > /dev/null && \
   echo "GNU awk is not installed so SRILM will probably not work correctly: refusing to install" && exit 1;

# Skip SRILM download and extraction
# if [ ! -f srilm.tgz ] && [ ! -f srilm.tar.gz ] && [ ! -d srilm ]; then
#   if [ $# -ne 4 ]; then
#     echo "SRILM download requires some information about you"
#     echo
#     echo "Usage: $0 <name> <organization> <email> <address>"
#     exit 1
#   fi
#   srilm_url="http://www.speech.sri.com/projects/srilm/srilm_download2.php"
#   post_data="file=1.7.3&name=$1&org=$2&email=$3&address=$4&license=on"
#   if ! wget --post-data "$post_data" -O ./srilm.tar.gz "$srilm_url"; then
#     echo 'There was a problem downloading the file.'
#     echo 'Check your internet connection and try again.'
#     exit 1
#   fi
#   if [ ! -s srilm.tar.gz ]; then
#     echo 'The file is empty. There was a problem downloading the file.'
#     exit 1
#   fi
# fi

mkdir -p srilm
cd srilm

# Skip extraction of SRILM package
# if [ -f ../srilm.tgz ]; then
#   tar -xvzf ../srilm.tgz || exit 1 # Old SRILM format
# elif [ -f ../srilm.tar.gz ]; then
#   tar -xvzf ../srilm.tar.gz || exit 1 # Changed format type from tgz to tar.gz
# fi

# Ensure SRILM directory exists
# if [ ! -d srilm ]; then
#   echo 'The SRILM directory does not exist. There was a problem with the extraction.'
#   exit 1
# fi

# if [ ! -f RELEASE ]; then
#   echo 'The file RELEASE does not exist. There was a problem extracting.'
#   exit 1
# fi

major=`gawk -F. '{ print $1 }' RELEASE`
minor=`gawk -F. '{ print $2 }' RELEASE`
micro=`gawk -F. '{ print $3 }' RELEASE`

if [ $major -le 1 ] && [ $minor -le 7 ] && [ $micro -le 1 ]; then
  echo "Detected version 1.7.1 or earlier. Applying patch."
  patch -p0 < ../extras/srilm.patch
fi

# set the SRILM variable in the top-level Makefile to this directory.
cp Makefile tmpf
cat tmpf | gawk -v pwd=`pwd` '/SRILM =/{printf("SRILM = %s\n", pwd); next;} {print;}' \
  > Makefile || exit 1
rm tmpf

mtype=`sbin/machine-type`

echo HAVE_LIBLBFGS=1 >> common/Makefile.machine.$mtype
grep ADDITIONAL_INCLUDES common/Makefile.machine.$mtype | \
    sed 's|$| -I$(SRILM)/../liblbfgs-1.10/include|' \
    >> common/Makefile.machine.$mtype

grep ADDITIONAL_LDFLAGS common/Makefile.machine.$mtype | \
    sed 's|$| -L$(SRILM)/../liblbfgs-1.10/lib/ -Wl,-rpath -Wl,$(SRILM)/../liblbfgs-1.10/lib/|' \
    >> common/Makefile.machine.$mtype

make || exit

cd ..
(
  [ ! -z "${SRILM}" ] && \
    echo >&2 "SRILM variable is already defined. Undefining..." && \
    unset SRILM

  [ -f ./env.sh ] && . ./env.sh

  [ ! -z "${SRILM}" ] && \
    echo >&2 "SRILM config is already in env.sh" && exit

  wd=`pwd`
  wd=`readlink -f $wd || pwd`

  echo "export SRILM=$wd/srilm"
  dirs="\${PATH}"
  for directory in $(cd srilm && find bin -type d ) ; do
    dirs="$dirs:\${SRILM}/$directory"
  done
  echo "export PATH=$dirs"
) >> env.sh

echo >&2 "Installation of SRILM finished successfully"
echo >&2 "Please source the tools/env.sh in your path.sh to enable it"
```
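The version gate near the top of the script just splits the `RELEASE` string on dots with `gawk -F.`; a quick standalone illustration with an invented version string:

```shell
# How the script parses a SRILM version like "1.7.3" (value invented here).
release="1.7.3"
major=$(echo "$release" | awk -F. '{ print $1 }')
minor=$(echo "$release" | awk -F. '{ print $2 }')
micro=$(echo "$release" | awk -F. '{ print $3 }')
echo "$major $minor $micro"   # prints: 1 7 3
# Versions 1.7.1 and earlier trigger the extras/srilm.patch step; 1.7.3 does not.
```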
4. Installing the linear-algebra library OpenBLAS (without GitHub access)
```shell
cd ~/project/kaldi/tools/
# Download OpenBLAS and build it manually
git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
# Parallelize the build across CPU cores
make -j 8
make PREFIX=/home/chy524/project/kaldi/tools/OpenBLAS-0.3.13/install install
# Add to the environment
echo 'export LD_LIBRARY_PATH=/home/chy524/project/kaldi/tools/OpenBLAS-0.3.13/install/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
echo 'export LIBRARY_PATH=/home/chy524/project/kaldi/tools/OpenBLAS-0.3.13/install/lib:$LIBRARY_PATH' >> ~/.bashrc
echo 'export C_INCLUDE_PATH=/home/chy524/project/kaldi/tools/OpenBLAS-0.3.13/install/include:$C_INCLUDE_PATH' >> ~/.bashrc
source ~/.bashrc
```
5. Installing CUDA 11.6. The CSDN post "代码实践——准备阶段" has steps for CUDA 12.1, but 11.6 still seems more widely used.
```
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
```
Create the virtual environment needed for the run:
```shell
# Create a conda virtual environment
conda create -n kaldi python=3.8
conda activate kaldi
# Install the PyTorch CUDA toolkit
conda install cudatoolkit=11.6 -c pytorch -c conda-forge
```
6. Compiling the Kaldi code
```shell
cd ~/project/kaldi/src
# --cudatk-dir and --openblas-root must point to your own paths;
# --cuda-arch=-arch must match your GPU.
./configure --shared --use-cuda=yes --cudatk-dir=/home/chy524/cuda/cuda-11.6/ --cuda-arch=-arch=sm_86 --mathlib=OPENBLAS --openblas-root=/home/chy524/project/kaldi/tools/OpenBLAS-0.3.13/install/
# Resolve all of Kaldi's internal dependencies; -j 8 lets make use
# 8 parallel threads to speed this up.
make depend -j 8
# Start compiling Kaldi
make -j 8
```
III. Running the Aishell Recipe
1. Data preparation — these commands can all be run directly in the terminal
```shell
# s5 is the speech-recognition recipe; s5/run.sh contains this whole
# pipeline. This is the data-preparation part.
data=/home/chy524/data/aishell   # change to where your data lives
data_url=www.openslr.org/resources/33
. ./cmd.sh
# I had already downloaded the data, so I skipped these two:
local/download_and_untar.sh $data $data_url data_aishell || exit 1;
local/download_and_untar.sh $data $data_url resource_aishell || exit 1;
# Prepare the dictionary
local/aishell_prepare_dict.sh $data/resource_aishell || exit 1;
# Prepare the data
local/aishell_data_prep.sh $data/data_aishell/wav $data/data_aishell/transcript || exit 1;
# Create a language directory containing the language model and lexicon.
# --position-dependent-phones false: whether to use position-dependent phones.
# data/local/dict: directory holding the lexicon (word-to-phone mapping).
# "<SPOKEN_NOISE>": stands in for noise and other items not in the lexicon.
# data/local/lang: intermediate output directory.
# data/lang: final language-directory output.
utils/prepare_lang.sh --position-dependent-phones false data/local/dict \
    "<SPOKEN_NOISE>" data/local/lang data/lang || exit 1;
```
2. Language model (LM) training — runs directly in the terminal and is fast:
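aishell_data_prep.sh produces the standard Kaldi data directory (wav.scp, text, utt2spk, spk2utt). One of the fix-ups the utils scripts perform is inverting utt2spk into spk2utt; a toy version of that inversion, with invented utterance and speaker IDs:

```shell
# Toy utt2spk -> spk2utt inversion (what utils/utt2spk_to_spk2utt.pl does).
# The IDs below are invented for illustration.
cd "$(mktemp -d)"
cat > utt2spk <<'EOF'
BAC009S0002W0122 S0002
BAC009S0002W0123 S0002
BAC009S0003W0121 S0003
EOF
# Group utterances under their speaker, one speaker per line.
awk '{utts[$2] = utts[$2] " " $1} END {for (s in utts) print s utts[s]}' utt2spk | sort
# S0002 BAC009S0002W0122 BAC009S0002W0123
# S0003 BAC009S0003W0121
```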
```shell
local/aishell_train_lms.sh
```
3. Formatting the language model — runs directly in the terminal and is fast
Use utils/format_lm.sh to convert the language model (LM) into the required FST format and produce data/lang_test.
```shell
utils/format_lm.sh data/lang data/local/lm/3gram-mincount/lm_unpruned.gz \
    data/local/dict/lexicon.txt data/lang_test
```
4. Generating MFCC and pitch features. From here on you need to comment out the other parts of run.sh; training writes output files and logs, and since it takes quite a while, submitting it as a batch job is recommended.
```shell
mfccdir=mfcc
for x in train dev test; do
  steps/make_mfcc_pitch.sh --cmd "$train_cmd" --nj 10 data/$x exp/make_mfcc/$x $mfccdir || exit 1;
  steps/compute_cmvn_stats.sh data/$x exp/make_mfcc/$x $mfccdir || exit 1;
  utils/fix_data_dir.sh data/$x || exit 1;
done
```
exp/make_mfcc/train: training-set logs
exp/make_mfcc/dev: dev-set logs
exp/make_mfcc/test: test-set logs
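compute_cmvn_stats.sh accumulates the statistics used for cepstral mean (and variance) normalisation. On a toy one-dimensional "feature" stream (numbers invented), the mean-normalisation half of the idea is just subtracting the per-stream mean from every frame:

```shell
# Toy cepstral mean normalisation: subtract the stream mean from each frame.
printf '1.0\n2.0\n3.0\n' | awk '
  { x[NR] = $1; sum += $1 }            # first pass: buffer values, sum them
  END {
    mean = sum / NR                    # mean over all frames
    for (i = 1; i <= NR; i++) printf "%.1f\n", x[i] - mean
  }'
# -1.0
# 0.0
# 1.0
```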
I did not run the remaining stages myself — everything takes too long on CPU — but my labmate's blog covers them: kaldi环境配置与aishell实践_kaldi 编译-CSDN博客 blog.csdn.net/weixin_46560570/article/details/141109113?spm=1001.2014.3001.5502
5. Monophone model
```shell
# Comment out the rest of run.sh, then run ./run.sh
# Train a monophone model on delta features.
steps/train_mono.sh --cmd "$train_cmd" --nj 10 \
  data/train data/lang exp/mono || exit 1;

# Decode with the monophone model.
utils/mkgraph.sh data/lang_test exp/mono exp/mono/graph || exit 1;
steps/decode.sh --cmd "$decode_cmd" --config conf/decode.config --nj 10 \
  exp/mono/graph data/dev exp/mono/decode_dev
steps/decode.sh --cmd "$decode_cmd" --config conf/decode.config --nj 10 \
  exp/mono/graph data/test exp/mono/decode_test

# Get alignments from monophone system.
steps/align_si.sh --cmd "$train_cmd" --nj 10 \
  data/train data/lang exp/mono exp/mono_ali || exit 1;
```
6. First triphone model (tri1)
```shell
# Comment out the rest of run.sh, then run ./run.sh
# Train the first triphone pass model tri1 on delta + delta-delta features.
steps/train_deltas.sh --cmd "$train_cmd" \
  2500 20000 data/train data/lang exp/mono_ali exp/tri1 || exit 1;

# decode tri1
utils/mkgraph.sh data/lang_test exp/tri1 exp/tri1/graph || exit 1;
steps/decode.sh --cmd "$decode_cmd" --config conf/decode.config --nj 10 \
  exp/tri1/graph data/dev exp/tri1/decode_dev
steps/decode.sh --cmd "$decode_cmd" --config conf/decode.config --nj 10 \
  exp/tri1/graph data/test exp/tri1/decode_test

# align tri1
steps/align_si.sh --cmd "$train_cmd" --nj 10 \
  data/train data/lang exp/tri1 exp/tri1_ali || exit 1;
```
7. A further triphone model (tri2)
```shell
# train tri2 [delta+delta-deltas]
steps/train_deltas.sh --cmd "$train_cmd" \
  2500 20000 data/train data/lang exp/tri1_ali exp/tri2 || exit 1;

# decode tri2
utils/mkgraph.sh data/lang_test exp/tri2 exp/tri2/graph
steps/decode.sh --cmd "$decode_cmd" --config conf/decode.config --nj 10 \
  exp/tri2/graph data/dev exp/tri2/decode_dev
steps/decode.sh --cmd "$decode_cmd" --config conf/decode.config --nj 10 \
  exp/tri2/graph data/test exp/tri2/decode_test

# Align training data with the tri2 model.
steps/align_si.sh --cmd "$train_cmd" --nj 10 \
  data/train data/lang exp/tri2 exp/tri2_ali || exit 1;
```
8. The tri3a model
```shell
# Train the second triphone pass model tri3a on LDA+MLLT features.
steps/train_lda_mllt.sh --cmd "$train_cmd" \
  2500 20000 data/train data/lang exp/tri2_ali exp/tri3a || exit 1;

# Run a test decode with the tri3a model.
utils/mkgraph.sh data/lang_test exp/tri3a exp/tri3a/graph || exit 1;
steps/decode.sh --cmd "$decode_cmd" --nj 10 --config conf/decode.config \
  exp/tri3a/graph data/dev exp/tri3a/decode_dev
steps/decode.sh --cmd "$decode_cmd" --nj 10 --config conf/decode.config \
  exp/tri3a/graph data/test exp/tri3a/decode_test

# align tri3a with fMLLR
steps/align_fmllr.sh --cmd "$train_cmd" --nj 10 \
  data/train data/lang exp/tri3a exp/tri3a_ali || exit 1;
```
9. The tri4a model
```shell
# Train the third triphone pass model tri4a on LDA+MLLT+SAT features.
# From now on, we start building a more serious system with Speaker
# Adaptive Training (SAT).
steps/train_sat.sh --cmd "$train_cmd" \
  2500 20000 data/train data/lang exp/tri3a_ali exp/tri4a || exit 1;

# decode tri4a
utils/mkgraph.sh data/lang_test exp/tri4a exp/tri4a/graph
steps/decode_fmllr.sh --cmd "$decode_cmd" --nj 10 --config conf/decode.config \
  exp/tri4a/graph data/dev exp/tri4a/decode_dev
steps/decode_fmllr.sh --cmd "$decode_cmd" --nj 10 --config conf/decode.config \
  exp/tri4a/graph data/test exp/tri4a/decode_test

# align tri4a with fMLLR
steps/align_fmllr.sh --cmd "$train_cmd" --nj 10 \
  data/train data/lang exp/tri4a exp/tri4a_ali
```
10. The tri5a model
```shell
# Train tri5a, which is LDA+MLLT+SAT
# Building a larger SAT system. You can see the num-leaves is 3500 and
# tot-gauss is 100000
steps/train_sat.sh --cmd "$train_cmd" \
  3500 100000 data/train data/lang exp/tri4a_ali exp/tri5a || exit 1;

# decode tri5a
utils/mkgraph.sh data/lang_test exp/tri5a exp/tri5a/graph || exit 1;
steps/decode_fmllr.sh --cmd "$decode_cmd" --nj 10 --config conf/decode.config \
  exp/tri5a/graph data/dev exp/tri5a/decode_dev || exit 1;
steps/decode_fmllr.sh --cmd "$decode_cmd" --nj 10 --config conf/decode.config \
  exp/tri5a/graph data/test exp/tri5a/decode_test || exit 1;

# align tri5a with fMLLR
steps/align_fmllr.sh --cmd "$train_cmd" --nj 10 \
  data/train data/lang exp/tri5a exp/tri5a_ali || exit 1;
```
11. The nnet3 model
Modify the configuration first:
If you see "This script is intended to be used with GPUs but you have not compiled Kaldi with CUDA. If you want to use GPUs (and have them), go to src/, and configure and make on a machine where 'nvcc' is installed.", fix it by editing the configure file in src/, adding your CUDA root (here /home/chy524/cuda/cuda-11.6) to the candidate list:
```shell
cd kaldi/src/
```
```shell
function configure_cuda {
  # Check for CUDA toolkit in the system
  if [ ! -d "$CUDATKDIR" ]; then
    for base in /usr/local/share/cuda /usr/local/cuda /home/chy524/cuda/cuda-11.6 /usr/; do
      if [ -f $base/bin/nvcc ]; then
        CUDATKDIR=$base
      fi
    done
  fi
```
```shell
# Recompile Kaldi
./configure --shared --use-cuda=yes --cudatk-dir=/home/chy524/cuda/cuda-11.6/ --cuda-arch=-arch=sm_86 --mathlib=OPENBLAS --openblas-root=/home/chy524/project/kaldi/tools/OpenBLAS-0.3.13/install/
make -j clean depend
make -j 16
```
Before actually starting, change --use-gpu true to --use-gpu wait in s5/local/nnet3/run_tdnn.sh.
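The configure_cuda excerpt above just probes a list of candidate roots for bin/nvcc and keeps the last hit. The same probing pattern in a self-contained form (fake temporary paths, not a real CUDA install):

```shell
# Sketch of the CUDA-root probe: scan candidate prefixes for bin/nvcc
# and remember the last one that has it. All paths here are fakes.
probe=$(mktemp -d)
mkdir -p "$probe/cuda-11.6/bin"
touch "$probe/cuda-11.6/bin/nvcc"        # pretend nvcc lives here
CUDATKDIR=""
for base in "$probe/cuda-10.2" "$probe/cuda-11.6"; do
  if [ -f "$base/bin/nvcc" ]; then
    CUDATKDIR=$base
  fi
done
echo "$CUDATKDIR"      # ends in /cuda-11.6
rm -rf "$probe"
```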
```shell
# nnet3
local/nnet3/run_tdnn.sh
```
12. The chain model
```shell
# chain
local/chain/run_tdnn.sh
```
13. Getting the results
```shell
# getting results (see RESULTS file)
for x in exp/*/decode_test; do
  [ -d $x ] && grep WER $x/cer_* | utils/best_wer.sh;
done 2>/dev/null
for x in exp/*/*/decode_test; do
  [ -d $x ] && grep WER $x/cer_* | utils/best_wer.sh;
done 2>/dev/null
```
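utils/best_wer.sh boils down to keeping the scoring line with the lowest error rate across the cer_* files. A toy equivalent over invented scoring lines:

```shell
# Toy best_wer: among %WER lines, keep the one with the smallest rate.
# The three scoring lines below are invented for illustration.
printf '%s\n' \
  '%WER 36.41 [ 38147 / 104765 ] exp/mono/decode_test/cer_9_0.0' \
  '%WER 35.20 [ 36879 / 104765 ] exp/mono/decode_test/cer_10_0.0' \
  '%WER 35.87 [ 37581 / 104765 ] exp/mono/decode_test/cer_11_0.0' |
awk 'NR == 1 || $2 < best { best = $2; line = $0 } END { print line }'
# %WER 35.20 [ 36879 / 104765 ] exp/mono/decode_test/cer_10_0.0
```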