Huang Zhiqing (黄志清), Jia Xiang (贾翔), Guo Yifan (郭一帆), Zhang Jing (张菁)

Unit: Faculty of Information Science, Beijing University of Technology, Beijing 100022, China

Abstract: Optical music recognition (OMR) is an important technology in music information retrieval, and note recognition is its most critical part. To address the low accuracy and cumbersome multi-step pipelines of existing note recognition on score images, an end-to-end note recognition model based on deep learning is designed. The model uses a deep convolutional neural network that takes the whole score image as input and directly outputs the duration and pitch of each note. For data preprocessing, the score images and the corresponding label data required for training are obtained by parsing MusicXML files. Each label is a vector composed of note pitch, note duration, and note coordinates, so learning the label vectors turns note recognition into detection and classification tasks. Data augmentation methods such as added noise and random cropping then increase the diversity of the data and make the trained model more robust. For model design, an end-to-end object detection model is built on the darknet53 backbone network and feature fusion. darknet53 extracts a feature map of the score image on which notes have a sufficiently large receptive field; this feature map is then concatenated with an upper-layer feature map of the network, and the resulting feature fusion gives notes more distinct features and texture, allowing the model to detect small objects such as notes. The model adopts multi-task learning, jointly learning the pitch and duration classification tasks and the note coordinate regression task, which improves its generalization ability. Finally, on a test set generated with MuseScore, the model achieves a duration accuracy of 0.96 and a pitch accuracy of 0.98.
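
The label vector described in the abstract (pitch, duration, coordinates) can be illustrated with a minimal sketch. This is not the authors' preprocessing code: the sample score, the `note_labels` helper, the MIDI-number pitch encoding, the five-entry duration class list, and the use of the MusicXML `default-x`/`default-y` attributes as stand-in coordinates are all assumptions made for illustration. The paper presumably uses pixel coordinates in the rendered score image, which are not available from the XML alone.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-note MusicXML fragment (two pitched notes and one rest).
SAMPLE = """<score-partwise version="3.1">
  <part id="P1">
    <measure number="1">
      <note default-x="80.0" default-y="-30.0">
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>4</duration>
        <type>quarter</type>
      </note>
      <note default-x="120.0" default-y="-20.0">
        <pitch><step>F</step><alter>1</alter><octave>4</octave></pitch>
        <duration>2</duration>
        <type>eighth</type>
      </note>
      <note>
        <rest/>
        <duration>2</duration>
        <type>eighth</type>
      </note>
    </measure>
  </part>
</score-partwise>"""

# Semitone offset of each natural step within an octave.
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Assumed duration classes; the paper does not list its class set.
DURATION_CLASSES = ["whole", "half", "quarter", "eighth", "16th"]


def note_labels(musicxml_text):
    """Return one (pitch, duration_class, x, y) tuple per pitched note."""
    root = ET.fromstring(musicxml_text)
    labels = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        if pitch is None:
            # Rests have no <pitch> element; this sketch skips them.
            continue
        step = pitch.findtext("step")
        alter = int(float(pitch.findtext("alter", default="0")))
        octave = int(pitch.findtext("octave"))
        # Pitch encoded as a MIDI note number (C4 = 60), an assumed encoding.
        midi = 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter
        # Raises ValueError for note types outside the assumed class list.
        duration_class = DURATION_CLASSES.index(note.findtext("type"))
        # Stand-in coordinates taken from the layout attributes on <note>.
        x = float(note.get("default-x", "0"))
        y = float(note.get("default-y", "0"))
        labels.append((midi, duration_class, x, y))
    return labels


if __name__ == "__main__":
    for label in note_labels(SAMPLE):
        print(label)
```

With labels in this form, pitch and duration become classification targets and the coordinates become regression targets, which matches the multi-task framing in the abstract.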

Keywords: optical music recognition; note recognition; deep learning; end-to-end; object detection
PDF full-text download: http://xbzrb.tju.edu.cn/#/digest?ArticleID=6475
End-to-End Music Score Note Recognition Based on Deep Learning
Posted by the site editor, Free考研考试, 2022-01-16