
Application of Residual Network to Infant Crying Recognition


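Xiang XIE,
Liqiang ZHANG,
Jing WANG
School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
Funds: National Natural Science Foundation of China (61473041, 11590772, 61571044)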

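Details
Author biographies: Xiang XIE: male, born in 1976, associate professor; research interest: speech recognition
Liqiang ZHANG: male, born in 1995, master's student; research interest: speech-based personality perception
Jing WANG: female, born in 1980, associate professor; research interest: audio signal processing
Corresponding author: Xiang XIE, xiexiang@bit.edu.cn
CLC number: TP391.42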

Publication history

Received: 2018-03-23
Revised: 2018-09-04
Published online: 2018-09-11
Issue date: 2019-01-01



Abstract
This paper recognizes infant crying with a deep learning model that combines spectrograms with a residual network. The corpus contains a balanced proportion of infant crying and non-crying samples. Under 5-fold cross-validation, the spectrogram-based residual network is compared with three other models: a Support Vector Machine (SVM), a Convolutional Neural Network (CNN), and a cochleagram residual network based on Gammatone filters (GT-Resnet). It achieves the best result, with an F1-score of 0.9965, and satisfies real-time requirements. The results indicate that the spectrogram reflects acoustic features intuitively and comprehensively in the infant crying recognition task, and that the spectrogram-based residual network is an effective solution to this task.
Key words: Infant crying recognition/
Deep learning/
Residual network/
Spectrogram
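
The abstract reports an F1-score obtained under 5-fold cross-validation but does not say how the folds were built or how the score was aggregated. The following is a minimal sketch of those two ideas only, assuming a binary labelling (1 = crying, 0 = non-crying) and contiguous, unshuffled folds. The function names `f1_score` and `k_fold_indices` and the toy labels are hypothetical illustrations, not the authors' code or data.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Return F1 = 2PR / (P + R) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Each fold serves as the test set exactly once; the remaining
    samples form the training set. No shuffling is done here.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# Toy labels (hypothetical, not from the paper): 1 = crying, 0 = non-crying.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

print("F1 on toy labels:", f1_score(y_true, y_pred))
for train_idx, test_idx in k_fold_indices(len(y_true), k=5):
    print("train:", train_idx, "test:", test_idx)
```

In a real experiment a model would be trained on each `train` index set and scored on the matching `test` set, and the per-fold scores would then be combined. Whether the paper averages per-fold F1 or pools the predictions is not stated in the abstract.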



Full-text PDF download:

https://jeit.ac.cn/article/exportPdf?id=b95473c5-3261-4ad6-8acf-6fcaaea1d15b