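Tao ZHAO1,
Wenxin XIONG1,
Tian YANG1,
Weiqing YAO2
1. School of Electronic Information, Wuhan University, Wuhan 430072, China
2. State Grid Hubei Information & Telecommunication Company Limited, Wuhan 430077, China
Funding: National Natural Science Foundation of China (61471272); 2019 Science and Technology Project of State Grid Hubei Electric Power Co., Ltd. (52153318004G)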
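Details
About the authors: Jinsheng XIAO: male, born in 1975, Ph.D., associate professor and master's supervisor; research interests: image and video processing
Tao ZHAO: male, born in 1996, master's student; research interests: image and video processing
Wenxin XIONG: female, born in 1998, master's student; research interests: image processing
Tian YANG: male, born in 1996, master's degree; research interests: back-end development
Weiqing YAO: female, born in 1990, Ph.D., technical engineer; research interests: document information processing
Corresponding author: Weiqing YAO, ywq1005@whu.edu.cn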
CLC number: TN911.73; TP391.1
Publication history
Received: 2020-11-30
Revised: 2021-03-26
Published online: 2021-04-15
Issue date: 2021-11-23
Seal Text Detection and Recognition Algorithm with Angle Optimization Network
Jinsheng XIAO1,
Tao ZHAO1,
Wenxin XIONG1,
Tian YANG1,
Weiqing YAO2
1. School of Electronic Information, Wuhan University, Wuhan 430072, China
2. State Grid Hubei Information & Telecommunication Company Limited, Wuhan 430077, China
Funds: The National Natural Science Foundation of China (61471272); The Technology Project of State Grid Hubei Electric Power Co., Ltd. (52153318004G)
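Abstract
Detecting and recognizing seal text with Optical Character Recognition (OCR) can speed up the classification of contracts and make their authentication more efficient. Because the text on a circular seal is arranged in a ring, this paper preprocesses the seal with a polar coordinate expansion, which overcomes the inconsistent orientation of the seal characters. For the undulating text regions that result from the expansion, a Connectionist Text Proposal Network (CTPN) with angle information detects the seal text regions, and Bezier fitting of the text regions yields accurate detection of the seal area. Finally, an attention transfer mechanism and the paper's matching algorithm recognize the detected text regions and output the seal text content. In experiments on a self-made Chinese seal dataset, the F-measure of text detection for seal content reaches 84.73% and the recall of text recognition reaches 84.4%, showing that the algorithm can effectively detect and recognize seal content and is of significance for research on document classification and authentication.
Keywords: Image processing/
Seal recognition/
Recurrent neural networks/
Polar coordinate conversion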
Abstract: Using Optical Character Recognition (OCR) to detect and recognize seal characters can speed up the classification of contracts and improve the efficiency of their identification. Because the characters of a circular seal are arranged in a ring, polar coordinate conversion is used to preprocess the seal, which overcomes the problem that the direction of the seal characters is not uniform. A Connectionist Text Proposal Network (CTPN) with angle information is used to detect the undulating text area that results from the conversion, and a Bezier curve is fitted to the text area to achieve accurate detection of the seal area. Finally, a method that combines the attention mechanism with a matching algorithm is used to recognize the detected text area and obtain the seal text content. In tests on a self-made Chinese seal data set, the F-measure of text detection for the seal content reaches 84.73% and the recall rate of character recognition is 84.4%, which shows that the algorithm can detect and recognize seal content effectively and is of significance for research on document classification and identification.
Key words:Image processing/
Seal recognition/
Recurrent neural networks/
Polar coordinate conversion
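The polar coordinate conversion named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `unwrap_polar`, its parameters, the nearest-neighbour sampling, and the assumption that the seal centre and the inner and outer radii of the text ring are already known are all hypothetical choices made for illustration. The sketch maps the ring of text around the centre onto a rectangular strip so that the characters lie along a roughly horizontal band.

```python
import math

def unwrap_polar(image, center, r_inner, r_outer, out_width=360):
    """Unwrap the ring between r_inner and r_outer into a rectangular strip.

    image is a list of rows of grayscale values. Output rows index radius
    (outer edge first); output columns index angle. Sampling is
    nearest-neighbour; samples that fall outside the image are set to 0.
    All names and conventions here are illustrative assumptions.
    """
    cx, cy = center
    height, width = len(image), len(image[0])
    out_height = int(round(r_outer - r_inner))
    strip = []
    for i in range(out_height):
        # Radius decreases linearly from r_outer (row 0) to r_inner (last row).
        t = i / (out_height - 1) if out_height > 1 else 0.0
        radius = r_outer + t * (r_inner - r_outer)
        row = []
        for j in range(out_width):
            # One output column per angular step around the full circle.
            theta = 2.0 * math.pi * j / out_width
            x = int(round(cx + radius * math.cos(theta)))
            y = int(round(cy + radius * math.sin(theta)))
            inside = 0 <= x < width and 0 <= y < height
            row.append(image[y][x] if inside else 0)
        strip.append(row)
    return strip
```

Calling `unwrap_polar(img, (cx, cy), r_inner, r_outer)` on a grayscale seal image returns a strip whose width covers the full 360 degrees. For real images, a library routine such as OpenCV's `warpPolar` with interpolation would likely be preferable to this pure-Python loop.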
PDF full-text download link:
https://jeit.ac.cn/article/exportPdf?id=6a388ebd-3f32-425b-a467-0f38b09ed3b8