翁洋,谷松原,李静,王枫,李俊良,李鑫
AuthorsHTML:翁洋 1,谷松原 2,李静 1,王枫 3,李俊良 3,李鑫 2
AuthorsListE:Weng Yang, Gu Songyuan, Li Jing, Wang Feng, Li Junliang, Li Xin
AuthorsHTMLE:Weng Yang 1, Gu Songyuan 2, Li Jing 1, Wang Feng 3, Li Junliang 3, Li Xin 2
Unit:1. 四川大学数学学院,成都 610064;
2. 四川大学法学院,成都 610207;
3. 数之联科技有限公司,成都 610041
Unit_EngLish:1. College of Mathematics, Sichuan University, Chengdu 610064, China;
2. Law School, Sichuan University, Chengdu 610207, China;
3. Union Big Data Technology Co., Ltd., Chengdu 610041, China
Abstract_Chinese:大数据和人工智能作为国家战略,使得新技术在司法领域应用的重要性凸显.同时,最高人民法院推动人工智能在司法领域的深度应用为相关研究提供了契机.最高人民法院主导的信息化建设以及司法公开等需求使得大量的裁判文书上网,裁判文书作为重要的法律文本信息资源,包含大量关键的案件审判信息,具有多元化的研究与应用价值.然而,裁判文书中存在着大量非结构化信息,妨碍了信息的准确抽取.对裁判文书进行结构化处理是基于裁判文书开展研究的重要前提.海量的裁判文书上网,人工处理将耗费大量的时间和精力,而裁判文书规范化改革为人工智能的司法应用提供基础.针对裁判文书结构化任务,已有的正则匹配方法或者基于文本分类模型的研究方法,未能利用文书上下文段落标签的结构特征,结构化效果较差.针对这一问题,提出了一种基于裁判文书段落级别的上下文语义特征信息的序列标注模型方法.通过学习完整的裁判文书中段落标签的结构信息、段落上下文之间的联系,实现良好的裁判文书结构化效果.结果表明:准确率、召回率和 F1 值较文本分类的基线模型有了全面提高,得到了几乎完全准确的分类效果.另外,本文采取的结构化方法核心在于利用裁判文书段落级别的上下文语义特征信息,该方法可以推广到各种类型的裁判文书的结构化任务.
Abstract_English:As a national strategy, big data and artificial intelligence (AI) are driving the application of new technologies in the judicial field. The Supreme People’s Court is also promoting the in-depth application of AI in the judicial system, which provides an opportunity for related research. The informatization drive led by the Supreme People’s Court, together with the demand for judicial openness, has brought a large number of judgments online. As an important legal text resource, these judgments contain a large amount of key trial information and have diverse research and application value. However, they also contain a large amount of unstructured information, which hinders accurate information extraction. Structuring the judgments is therefore an important prerequisite for any research based on them. Given the massive number of judgments published online, manual processing would consume a great deal of time and effort, while the standardization reform of judgments provides a basis for applying AI in the judicial system. For the task of structuring judgments, existing regular-expression matching methods and methods based on text classification models fail to exploit the structural features of paragraph labels in the context of the document, and therefore yield poor structuring results. To solve this problem, we propose a sequence labeling method based on paragraph-level contextual semantic features of the judgments. By learning the structural information of the paragraph labels in complete judgments and the relationships between neighboring paragraphs, the method structures the judgments well. The results show that accuracy, recall, and F1 score are all improved compared with the text classification baseline, with almost completely accurate classification results. In addition, because the core of the proposed method is the use of paragraph-level contextual semantic information, the method can be extended to structuring tasks for various types of judgments.
Keyword_Chinese:裁判文书;文本结构化;预训练模型
Keywords_English:judgment texts; text structuring; pre-training model
PDF full-text download: http://xbzrb.tju.edu.cn/#/digest?ArticleID=6620
面向大规模裁判文书结构化的文本分类算法 (Text Classification Algorithm for Structuring Large-Scale Judgment Documents)
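
The abstract contrasts classifying each paragraph independently with labeling the whole paragraph sequence. The abstract does not specify the authors' model, so the following is only a minimal sketch of the general idea. It assumes four hypothetical section labels, hand-set per-paragraph probabilities standing in for a classifier's output, and a hard "stay in the same section or move to the next one" transition rule. It then uses standard Viterbi decoding to pick the best label sequence. None of these names or numbers come from the paper.

```python
import math

# Hypothetical section labels, in the order they are assumed to appear.
LABELS = ["HEADER", "FACTS", "REASONING", "RULING"]

# Assumed transition rule: a paragraph either stays in the current section
# or moves on to the next one. Allowed transitions get log-score 0.0;
# every other transition is treated as forbidden (-inf) during decoding.
TRANSITIONS = {
    (a, b): 0.0
    for i, a in enumerate(LABELS)
    for b in LABELS[i:i + 2]
}

def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence for a list of paragraphs.

    emissions:   one dict per paragraph mapping label -> log-probability
    transitions: dict mapping (previous_label, label) -> log-score
    """
    # best[t][label] = (score of best path ending in label at t, previous label)
    best = [{lab: (emissions[0][lab], None) for lab in LABELS}]
    for t in range(1, len(emissions)):
        column = {}
        for lab in LABELS:
            score, prev = max(
                (best[t - 1][p][0] + transitions.get((p, lab), -math.inf), p)
                for p in LABELS
            )
            column[lab] = (score + emissions[t][lab], prev)
        best.append(column)
    # Backtrack from the best final label.
    last = max(LABELS, key=lambda lab: best[-1][lab][0])
    path = [last]
    for t in range(len(emissions) - 1, 0, -1):
        last = best[t][last][1]
        path.append(last)
    return path[::-1]

# Made-up per-paragraph probabilities (stand-ins for a classifier's output).
# The third paragraph is ambiguous between FACTS and RULING.
probs = [
    {"HEADER": 0.90, "FACTS": 0.05, "REASONING": 0.03, "RULING": 0.02},
    {"HEADER": 0.05, "FACTS": 0.85, "REASONING": 0.07, "RULING": 0.03},
    {"HEADER": 0.02, "FACTS": 0.40, "REASONING": 0.13, "RULING": 0.45},
    {"HEADER": 0.02, "FACTS": 0.08, "REASONING": 0.80, "RULING": 0.10},
    {"HEADER": 0.02, "FACTS": 0.03, "REASONING": 0.10, "RULING": 0.85},
]
emissions = [{lab: math.log(p) for lab, p in row.items()} for row in probs]

# Baseline: label each paragraph on its own, ignoring its neighbors.
independent = [max(LABELS, key=lambda lab: row[lab]) for row in emissions]
# Sequence labeling: choose the best label sequence under the transition rule.
decoded = viterbi(emissions, TRANSITIONS)

print("independent:", independent)
print("sequence:   ", decoded)
```

In this toy input, the independent baseline labels the ambiguous third paragraph RULING, which produces an out-of-order sequence. The sequence decoder assigns it FACTS because that is the only high-scoring choice consistent with the labels around it. A learned model would replace both the hand-set probabilities and the hard transition rule with trained parameters.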