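Yin Faming (阴法明), Zhao Yan (赵焱), Zhao Li (赵力). An improved deep belief network algorithm for continuous phoneme recognition*[J]. 2019, 38(1): 39-44
An improved deep belief network algorithm for continuous phoneme recognition*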
Phoneme recognition based on deep belief network
Submitted: 2018-04-25    Revised: 2018-12-29
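Chinese abstract:
To improve the phoneme recognition rate in continuous speech recognition, a phoneme recognition algorithm is proposed that is based on restricted Boltzmann machines (RBMs) trained with an improved parallel tempering method. First, the algorithm trains the RBMs using improved parallel tempering with equal-energy partitioning. The RBMs are then stacked to form a deep belief network (DBN), which serves as the base model for pre-training a deep neural network (DNN); adding a softmax output layer yields a DNN that estimates the posterior probabilities of phoneme states. Next, the network weights are fine-tuned by backpropagation using a small amount of labeled data. Finally, the resulting posterior probabilities are used as the emission probabilities of a hidden Markov model (HMM), and a Viterbi decoder performs phoneme recognition. Experiments on the TIMIT corpus show an improvement of about 4.5% over traditional contrastive-divergence-type algorithms, and of about 1% over the original parallel tempering algorithm without increasing the computational cost.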
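As a rough illustration of the sampling scheme the abstract builds on, the sketch below performs one sweep of plain (unmodified) parallel tempering for a binary RBM: each chain takes a Gibbs step at its own inverse temperature, and adjacent chains then exchange states with a Metropolis-Hastings acceptance test. The equal-energy partitioning that the paper adds is not described in the abstract and is not reproduced here. All names (`pt_step`, `energy`, `betas`) and the NumPy formulation are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h, W, b, c):
    """RBM energy E(v, h) = -b.v - c.h - v^T W h, one value per chain."""
    return -(v @ b) - (h @ c) - np.einsum("ki,ij,kj->k", v, W, h)

def pt_step(v, h, W, b, c, betas, rng):
    """One parallel tempering sweep (hypothetical helper).

    v: (K, n_vis) visible states, h: (K, n_hid) hidden states,
    betas: (K,) inverse temperatures, one per chain.
    """
    beta = betas[:, None]
    # Gibbs update of each chain under its tempered distribution exp(-beta * E).
    h = (rng.random(h.shape) < sigmoid(beta * (v @ W + c))).astype(float)
    v = (rng.random(v.shape) < sigmoid(beta * (h @ W.T + b))).astype(float)
    E = energy(v, h, W, b, c)
    # Propose swaps between neighbouring temperatures.
    for k in range(len(betas) - 1):
        log_accept = (betas[k] - betas[k + 1]) * (E[k] - E[k + 1])
        if rng.random() < np.exp(min(0.0, log_accept)):
            v[[k, k + 1]] = v[[k + 1, k]]
            h[[k, k + 1]] = h[[k + 1, k]]
            E[[k, k + 1]] = E[[k + 1, k]]
    return v, h
```

In a training loop, the chain with inverse temperature 1 would supply the negative-phase samples for the RBM gradient estimate, in place of the short Gibbs chain used by contrastive divergence.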
English abstract:
To improve the accuracy of phoneme recognition in continuous speech recognition, this paper proposes a modified parallel tempering (PT) algorithm for training restricted Boltzmann machines (RBMs). First, each RBM is trained using parallel tempering sampling with Metropolis-Hastings exchanges. The RBMs are then stacked to form a deep belief network (DBN), which serves as the basis for pre-training a deep neural network (DNN); adding a softmax output layer yields a DNN that estimates the posterior probabilities of phonemes. Subsequently, the backpropagation algorithm is applied to fine-tune the weights discriminatively using a small amount of labeled data. Finally, the sequence of predicted probability distributions is fed into a standard Viterbi decoder. Experiments on the TIMIT dataset show that the proposed method outperforms traditional approaches: its recognition rate is about 4.5% higher than that of contrastive divergence (CD) and about 1% higher than that of the original PT, without additional computation.
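The final decoding step can be sketched as follows: per-frame state posteriors from the network are treated as emission scores, and a log-domain Viterbi pass recovers the most likely state sequence. This is a minimal sketch under stated assumptions; the function name `viterbi`, the argument layout, and the toy transition matrix are hypothetical, and the paper's actual HMM topology and decoder settings are not given in the abstract.

```python
import numpy as np

def viterbi(log_emis, log_trans, log_init):
    """Most likely state path through an HMM (hypothetical helper).

    log_emis:  (T, S) per-frame log emission scores, e.g. log posteriors
    log_trans: (S, S) log transition probabilities, [i, j] = from i to j
    log_init:  (S,)   log initial state probabilities
    """
    T, S = log_emis.shape
    delta = log_init + log_emis[0]           # best score ending in each state
    backptr = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans  # rows: previous state, cols: next
        backptr[t] = np.argmax(scores, axis=0)
        delta = scores[backptr[t], np.arange(S)] + log_emis[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return path
```

A call would look like `viterbi(np.log(posteriors), np.log(transitions), np.log(initial))`, where `posteriors` holds one row of softmax outputs per frame.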
DOI: 10.11684/j.issn.1000-310X.2019.01.006
Keywords: parallel tempering; restricted Boltzmann machine; deep belief network; phoneme recognition
Funding: National Natural Science Foundation of China (61571106)
|