Chinese Prosodic Structure Prediction Based on a Pretrained Language Representation Model


Posted by the Free考研考试 site editor, 2022-01-16

Authors: Zhang Pengyuan^{1, 2}, Lu Chunhui^{1, 2}, Wang Ruimin^{1, 2}

Affiliations:
1. Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China;
2. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China

Abstract: Prosodic structure prediction is a key step in a text-to-speech system, and its results directly influence the naturalness and intelligibility of synthesized speech. This study proposes a prosodic structure prediction method based on a pretrained language representation model. With the character as the modeling unit, a separate output layer is set on top of the pretrained model for each prosody level, and the model is fine-tuned with prosody-labeled data. A word segmentation task is additionally introduced, and multitask learning is used to model the relationships among the prosody levels and between prosody and lexical words, so that the prosodic boundaries at all levels of the input text are predicted simultaneously. The experiments first demonstrate the rationality of the multi-output structure and the effectiveness of the pretrained model, and verify that adding the word segmentation task further improves model performance. Compared with the baseline conditional random field model, the best result yields absolute improvements of 2.48% and 4.50% in the F₁ scores of prosodic word prediction and prosodic phrase prediction, respectively; compared with the baseline bidirectional long short-term memory model, the absolute improvements are 6.2% and 5.4%, respectively. Finally, the experiments show that the proposed method reduces the demand for training data while maintaining prediction performance.

Keywords: prosodic structure prediction; pretrained language representation model; multitask learning; speech synthesis
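
As a rough illustration of the multi-output setup the abstract describes, the sketch below builds one character-level target sequence per task (prosodic word, prosodic phrase, and the auxiliary word segmentation task) and scores a prediction with F₁. The integer boundary encoding, the function names, the example sentence, and the rule that a prosodic phrase boundary also counts as a prosodic word boundary are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List, Sequence


def level_targets(boundaries: Sequence[int],
                  levels: Sequence[str] = ("pw", "pph")) -> Dict[str, List[int]]:
    """Build one binary target sequence per prosody level.

    boundaries[i] is the highest prosodic boundary after character i
    (assumed encoding: 0 = none, 1 = prosodic word, 2 = prosodic phrase).
    Assumption: a higher-level boundary also counts as a lower-level one.
    """
    return {
        name: [1 if b >= k else 0 for b in boundaries]
        for k, name in enumerate(levels, start=1)
    }


def segmentation_targets(words: Sequence[str]) -> List[int]:
    """Mark the last character of every lexical word with 1, all others with 0."""
    targets: List[int] = []
    for word in words:
        targets.extend([0] * (len(word) - 1) + [1])
    return targets


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """F1 over boundary labels (label 1 is the positive class)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical six-character sentence: 今天 / 天气 / 很 / 好
words = ["今天", "天气", "很", "好"]
boundaries = [0, 1, 0, 2, 0, 2]  # made-up annotation, one value per character

targets = level_targets(boundaries)
targets["seg"] = segmentation_targets(words)  # auxiliary task target

# A made-up prosodic word prediction, scored against the gold sequence
predicted_pw = [0, 1, 0, 0, 1, 1]
print(targets)
print(round(f1_score(targets["pw"], predicted_pw), 4))
```

In a multi-output model of this kind, each target sequence would feed its own output layer on top of the shared encoder, and F₁ would be reported separately for each prosody level.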

Full-text PDF: http://xbzrb.tju.edu.cn/#/digest?ArticleID=6421
