Authors: WANG Xiaoyu, LIN Peng
Abstract: A key problem in autonomous driving is how to run real-time, high-accuracy semantic segmentation models on low-power mobile devices. Existing semantic segmentation algorithms have too many parameters and too large a memory footprint to meet the requirements of real-world applications such as autonomous driving. Moreover, among the many factors that affect the accuracy and inference speed of a segmentation model, spatial information and contextual features are particularly important, and they are hard to preserve at the same time. To address this, the paper proposes a bilateral (two-path) semantic segmentation model that uses a partial ResNet18 as the backbone network; ResNet18 is a lightweight model with few parameters and a small memory footprint. A channel-spatial dual attention mechanism is added to both paths to capture more contextual and spatial information. In addition, an attention optimization module refines the context information, and a fusion module merges the outputs of the two paths; the added modules have little effect on parameter count and memory usage and are plug-and-play. The model is evaluated on the Cityscapes and CamVid datasets, reaching an mIoU of 77.3% on Cityscapes and 66.5% on CamVid. For an input resolution of 1024 × 2048, the inference time is 37.9 ms.
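The two kinds of figures reported in the abstract, mIoU and per-image inference time, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `mean_iou` and `latency_ms_to_fps`, the toy label lists, and the ignore label 255 (the usual Cityscapes convention) are assumptions made for illustration only.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over flat sequences of class labels.

    Assumes every prediction lies in [0, num_classes); pixels whose
    ground-truth label equals ignore_index are skipped (assumed convention).
    """
    inter = [0] * num_classes  # pixels where prediction and target agree
    union = [0] * num_classes  # pixels where either one names the class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[t] += 1
            union[p] += 1
    # Classes that never occur in pred or target are left out of the mean.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    if not ious:
        raise ValueError("no valid pixels to evaluate")
    return sum(ious) / len(ious)


def latency_ms_to_fps(latency_ms):
    """Convert a per-image inference time in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# Toy example with three classes; the last pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
print(round(mean_iou(pred, target, num_classes=3), 4))  # 0.7222
print(round(latency_ms_to_fps(37.9), 1))                # 26.4
```

Under this conversion, the reported 37.9 ms per 1024 × 2048 image corresponds to roughly 26 frames per second. Benchmark mIoU is normally accumulated over the whole validation set rather than per image, so in practice the label sequences passed in would cover all evaluated pixels.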