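Kun ZHANG
1. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114
2. Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha 410114
Funds: National Natural Science Foundation of China (61772087)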
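Details
About the authors: Xi CHEN: male, born in 1963, professor, master's supervisor. Research interest: data mining
Kun ZHANG: male, born in 1993, master's student. Research interest: data mining
Corresponding author: Kun ZHANG, zonkis2016@outlook.com
CLC number: TP311.1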
Publication history
Received: 2018-09-18
Revised: 2019-03-27
Published online: 2019-04-20
Issue date: 2019-08-01
A Classifier Learning Method Based on Tree-Augmented Naïve Bayes
Xi CHEN, Kun ZHANG
1. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
2. Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha 410114, China
Funds: National Natural Science Foundation of China (61772087)
Abstract: The Tree-Augmented Naïve Bayes (TAN) structure forces every attribute node to have the class node and one attribute node as parents, and it ignores differences in how strongly each attribute correlates with the class, which leads to poor classification accuracy. To improve the classification accuracy of TAN, this paper first extends the TAN structure so that an attribute node may have no parent or only one attribute node as its parent. It then proposes a method for learning a tree-like Bayesian classifier with a decomposable scoring function: low-order Conditional Independence (CI) tests first eliminate useless attributes, and a greedy search guided by an improved Bayesian Information Criterion (BIC) scoring function then selects the parent node of each attribute node, which yields the classification model. In experiments, the resulting classifier outperforms Naïve Bayes (NB) and TAN on multiple classification metrics, which indicates that the learning method has certain advantages.
Key words: Bayesian classifier / Tree-Augmented Naïve Bayes (TAN) / Scoring function
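The pipeline outlined in the abstract (filter attributes with a CI test, then pick each attribute's parents by greedy search over a decomposable score) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the improved BIC function or the exact CI test, so the sketch substitutes the standard BIC and a zero-order mutual-information threshold. The names `learn_structure` and `mi_threshold`, and the choice of allowing the class node as an optional parent, are hypothetical.

```python
import math
from collections import Counter

def log_lik(rows, child, parents):
    """Maximized log-likelihood of column `child` given the columns in `parents`."""
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    marg = Counter(tuple(r[p] for p in parents) for r in rows)
    return sum(n * math.log(n / marg[pa]) for (pa, _), n in joint.items())

def n_params(rows, child, parents):
    """Free parameters of P(child | parents), using the observed value counts."""
    q = 1
    for p in parents:
        q *= len({r[p] for r in rows})
    return (len({r[child] for r in rows}) - 1) * q

def bic(rows, child, parents):
    """Standard BIC local score (an assumption; the paper uses an improved BIC)."""
    penalty = 0.5 * math.log(len(rows)) * n_params(rows, child, parents)
    return log_lik(rows, child, parents) - penalty

def mutual_info(rows, a, b):
    """Empirical mutual information I(a; b) in nats."""
    return (log_lik(rows, a, [b]) - log_lik(rows, a, [])) / len(rows)

def creates_cycle(attr_parent, child, new_parent):
    """True if making `new_parent` the attribute parent of `child` closes a cycle."""
    node = new_parent
    while node is not None:
        if node == child:
            return True
        node = attr_parent.get(node)
    return False

def learn_structure(rows, attrs, cls, mi_threshold=0.01):
    """Return {attribute: parent list} for the attributes that survive filtering."""
    # Step 1 (stand-in for the low-order CI test): drop attributes that are
    # nearly independent of the class.
    kept = [a for a in attrs if mutual_info(rows, a, cls) > mi_threshold]
    attr_parent = {a: None for a in kept}   # at most one attribute parent each
    has_class = {a: False for a in kept}    # whether the class node is a parent

    def parent_list(use_cls, ap):
        return ([cls] if use_cls else []) + ([ap] if ap is not None else [])

    # Step 2: greedy hill climbing. The score is decomposable, so changing one
    # node's parents only changes that node's local score.
    while True:
        best = None  # (gain, child, use_cls, attribute parent)
        for a in kept:
            current = bic(rows, a, parent_list(has_class[a], attr_parent[a]))
            for use_cls in (False, True):
                for ap in [None] + [b for b in kept if b != a]:
                    if ap is not None and creates_cycle(attr_parent, a, ap):
                        continue
                    gain = bic(rows, a, parent_list(use_cls, ap)) - current
                    if gain > 1e-9 and (best is None or gain > best[0]):
                        best = (gain, a, use_cls, ap)
        if best is None:
            break
        _, a, use_cls, ap = best
        has_class[a], attr_parent[a] = use_cls, ap
    return {a: parent_list(has_class[a], attr_parent[a]) for a in kept}
```

Here `rows` is a list of dictionaries mapping column names to discrete values. Because the score decomposes over nodes, each candidate change is evaluated by rescoring a single node, which keeps the greedy loop cheap.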
PDF full-text download:
https://jeit.ac.cn/article/exportPdf?id=05eb0735-af24-49cf-8b43-b06b12e47af6