Tone training for Mandarin two-syllable words based on pitch projection synthesized speech
XIE Yanlu¹, ZHANG Bei², ZHANG Jinsong¹,²
1. College of Information Science, Beijing Language and Culture University, Beijing 100083, China; 2. Center for Studies of Chinese as a Second Language, Beijing Language and Culture University, Beijing 100083, China

Abstract: This study uses a pitch projection method to synthesize teaching speech for tone training. Given a suitably chosen standard voice, the method replaces the lexical tones in the learner's speech with the tones of the standard voice while keeping the segments and timbre unchanged. This reduces the redundancy and interference introduced by complex variations in the speech signal other than tone. Tone training experiments were conducted with Japanese learners, using synthesized Mandarin two-syllable words as the training material. The synthesized speech method outperformed the standard voice method in the relative improvement rates of tone perception and production and in generalized production, and was far better than an untrained control group. Most of the differences were statistically significant. The results corroborate the existence of a selective attention mechanism in the human brain during speech learning, and provide an experimental and theoretical basis for integrating synthesized speech methods into computer-assisted Mandarin tone learning systems.
Key words: phonetic teaching; language learning; speech synthesis; pitch projection; tone
Received: 2016-06-19; Published: 2017-02-21
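
The abstract describes pitch projection only at a high level: the learner's tones are replaced with those of a standard voice while segments and timbre stay the same. The following is a minimal illustrative sketch of that idea on frame-level F0 tracks, not the authors' implementation. The function name `project_pitch`, the convention that 0 Hz marks unvoiced frames, the linear time normalization over all voiced frames, and the shift to the learner's mean log-F0 are all assumptions made for illustration. A real system would likely align contours syllable by syllable and would resynthesize audio with a vocoder, both of which are outside this sketch.

```python
import numpy as np


def project_pitch(learner_f0, standard_f0):
    """Map a standard speaker's F0 contour onto a learner's utterance.

    Both inputs are per-frame F0 tracks in Hz, with 0 marking unvoiced
    frames (an assumed convention). The result keeps the learner's length
    and voicing pattern but carries the standard speaker's contour shape.
    """
    learner_f0 = np.asarray(learner_f0, dtype=float)
    standard_f0 = np.asarray(standard_f0, dtype=float)
    learner_voiced = learner_f0 > 0
    standard_voiced = standard_f0 > 0
    if not learner_voiced.any() or not standard_voiced.any():
        raise ValueError("both tracks need at least one voiced frame")

    # Work in the log domain, where pitch intervals are additive.
    log_std = np.log(standard_f0[standard_voiced])
    log_lrn = np.log(learner_f0[learner_voiced])

    # Time-normalize: resample the standard contour to the number of
    # voiced frames in the learner's utterance.
    n_voiced = int(learner_voiced.sum())
    src_pos = np.linspace(0.0, 1.0, log_std.size)
    dst_pos = np.linspace(0.0, 1.0, n_voiced)
    shape = np.interp(dst_pos, src_pos, log_std)

    # Keep the contour shape but move it to the learner's pitch register.
    shape = shape - shape.mean() + log_lrn.mean()

    projected = np.zeros_like(learner_f0)
    projected[learner_voiced] = np.exp(shape)
    return projected


if __name__ == "__main__":
    # Hypothetical data: a flat learner contour and a rising standard contour.
    learner = [0, 0, 200, 200, 200, 200, 0, 0]
    standard = [0, 100, 110, 120, 130, 140, 150, 0]
    print(project_pitch(learner, standard))
```

Shifting in the log domain preserves the relative pitch movement of the standard contour while keeping the result in the learner's own register, which is one plausible way to change the tone without making the voice sound like a different speaker.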