
Video Foreground-Background Separation Based on a Spatiotemporal-Aware Cascaded Neural Network

Posted by the site editor, Free考研考试 / 2022-01-16

杨敬钰1,师雯1,李坤2,宋晓林1,岳焕景1

AuthorsListE: Yang Jingyu1, Shi Wen1, Li Kun2, Song Xiaolin1, Yue Huanjing1
Unit: 1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072;
2. School of Computer Science and Technology, Tianjin University, Tianjin 300350
Unit_English: 1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China;
2. School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
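Abstract_Chinese: To address the foreground leakage that affects video foreground-background separation in complex scenes, an end-to-end two-stage cascaded deep convolutional neural network is designed that accurately separates the foreground and background of an input video sequence. The network consists of a first-stage foreground detection sub-network and a second-stage background reconstruction sub-network connected in series. The first-stage network fuses temporal and spatial information, and its input has 2 parts: the first is 3 consecutive color RGB video frames (the previous, current, and next frames), and the second is 3 optical flow maps corresponding to those frames. By combining the 2 inputs, the foreground detection sub-network accurately detects the moving foreground in the video sequence and generates a binary foreground mask. This sub-network is an encoder-decoder network: the encoder uses the first 5 convolutional blocks of VGG16 to extract feature maps from the two inputs and fuses the two kinds of feature maps after each convolutional layer; the decoder consists of 5 deconvolution modules and generates the binary foreground mask of the current frame by learning a mapping from feature space to image space. The second-stage network has 3 parts: an encoder, a transmission layer, and a decoder. It uses the current frame and the generated foreground mask to repair and reconstruct the missing background image with high quality. Experimental results show that the proposed spatiotemporal-aware cascaded convolutional neural network achieves better results than other methods on public datasets, copes with a variety of complex scenes, and has strong generality and generalization ability; its foreground detection and background reconstruction results significantly surpass those of several existing methods.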
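As a rough illustration of the encoder-decoder layout described above, the following sketch traces feature-map shapes through five VGG16-style blocks and five upsampling modules. The channel widths are those of standard VGG16; the 2x2 pooling after every block, the factor-2 upsampling in every deconvolution module, and the function names are assumptions made for illustration, not details confirmed by the abstract.

```python
# Output channel widths of the five convolutional blocks in standard VGG16.
VGG16_BLOCK_CHANNELS = [64, 128, 256, 512, 512]


def encoder_shapes(height, width):
    """Trace (channels, H, W) after each encoder block.

    Assumption: every block ends with a 2x2 pooling step that halves H and W.
    """
    shapes = []
    h, w = height, width
    for channels in VGG16_BLOCK_CHANNELS:
        h, w = h // 2, w // 2
        shapes.append((channels, h, w))
    return shapes


def decoder_sizes(bottleneck_hw, num_modules=5):
    """Trace (H, W) after each decoder module.

    Assumption: every deconvolution module doubles H and W.
    """
    h, w = bottleneck_hw
    sizes = []
    for _ in range(num_modules):
        h, w = h * 2, w * 2
        sizes.append((h, w))
    return sizes


if __name__ == "__main__":
    enc = encoder_shapes(224, 224)
    for block, shape in enumerate(enc, start=1):
        print(f"encoder block {block}: {shape}")
    _, h, w = enc[-1]
    for module, size in enumerate(decoder_sizes((h, w)), start=1):
        print(f"decoder module {module}: {size}")
```

Under these assumptions a 224 x 224 frame shrinks to 7 x 7 at the bottleneck and is restored to 224 x 224 by the fifth module, which is why five upsampling steps pair naturally with five encoder blocks.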
Abstract_English: Separating foreground from background in video clips raises various problems, such as foreground leakage. To address them, this paper proposes an end-to-end cascaded deep convolutional neural network that accurately separates the foreground and background of video clips. The proposed method comprises a foreground detection sub-network and a background reconstruction sub-network, connected in cascade. The first sub-network fuses temporal and spatial information, and its input consists of two parts: the first part is three consecutive RGB video frames (the previous, current, and next frames), and the second part is three optical flow maps corresponding to those frames. The first sub-network combines these two inputs to detect moving objects and generate a binary foreground mask. The foreground detection sub-network is a multi-input encoder-decoder network: the encoder uses the first five convolutional blocks of VGG16 to extract feature maps from the two inputs, and the two types of feature maps are fused after each convolutional layer. The decoder consists of five transposed convolution layers and generates a binary mask for the current frame by learning a projection from feature space to image space. The background reconstruction sub-network contains three parts, an encoder, a transmission layer, and a decoder, and takes the generated mask and the current frame as input to reconstruct the background pixels occluded by the foreground. Experimental results show that the proposed spatiotemporal fused cascaded convolutional neural network achieves better performance than other methods on the public dataset and handles a variety of complex scenarios. Its foreground detection and background reconstruction results greatly outperform those of existing state-of-the-art methods.
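One plausible way to read the second stage is as an inpainting step: pixels flagged by the foreground mask are treated as holes, and reconstructed values are used only there. The sketch below shows that bookkeeping on a tiny grayscale image stored as nested lists. The function names, the zero fill value, and the compositing rule itself are assumptions, since the abstract does not state how the mask and the current frame are combined.

```python
def mask_foreground(frame, mask, fill=0):
    """Blank out pixels flagged as foreground (mask value 1), leaving holes.

    Assumption: the background network receives the frame with foreground
    pixels replaced by a constant fill value.
    """
    return [
        [fill if m else p for p, m in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, mask)
    ]


def composite_background(frame, mask, predicted):
    """Keep observed background pixels; use predicted values only inside the mask."""
    return [
        [q if m else p for p, m, q in zip(frame_row, mask_row, pred_row)]
        for frame_row, mask_row, pred_row in zip(frame, mask, predicted)
    ]


if __name__ == "__main__":
    frame = [[10, 20], [30, 40]]      # current frame (grayscale for brevity)
    mask = [[0, 1], [0, 0]]           # 1 marks a foreground pixel
    predicted = [[11, 22], [33, 44]]  # hypothetical network output
    print(mask_foreground(frame, mask))
    print(composite_background(frame, mask, predicted))
```

Compositing in this way leaves every pixel outside the mask identical to the input frame, so any reconstruction error is confined to the region the foreground occluded.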
Keyword_Chinese: background reconstruction; moving object detection; convolutional neural network; optical flow

Keywords_English: background reconstruction; moving object detection; convolutional neural network; optical flow


Full-text PDF download: http://xbzrb.tju.edu.cn/#/digest?ArticleID=6472