
Local Estimation of Sure Explained Variability Independence Screening and Its Application for Ultrahigh-dimensional Data


LIAN Yimin1, CHEN Zhao2, SHU Mingliang3
1. Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China;
2. Department of Statistics, Pennsylvania State University, State College, PA 16802, U.S.A.;
3. Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
Abstract: In the era of big data, how to screen the truly important features out of ultrahigh-dimensional data has become a question of wide concern to researchers in many related fields. The core idea of feature screening is to achieve this by excluding features that are clearly unrelated to the response variable. The kernel-estimation-based SEVIS (Sure Explained Variability and Independence Screening) method outperforms earlier feature screening models to some extent on asymmetric and nonlinear data, but its use of kernel estimation for the nonparametric part still leaves room for improvement. Starting from this observation, this paper replaces the kernel estimator with a local linear estimator and also considers the variable selection procedure in some special cases. The results show that SEVIS based on local linear estimation outperforms kernel-estimation-based SEVIS in both accuracy and computational efficiency.
Received: 2016-11-18
PACS: O212.7
Funding: Supported by the National Natural Science Foundation of China (11671102, 11361011) and the Guangxi Natural Science Foundation (2016GXNSFAA3800163).
Cite this article:
LIAN Yimin, CHEN Zhao, SHU Mingliang. Local Estimation of Sure Explained Variability Independence Screening and Its Application for Ultrahigh-dimensional Data. Acta Mathematicae Applicatae Sinica, 2018, 41(1): 1-13.
Article link:
http://123.57.41.99/jweb_yysxxb/CN/Y2018/V41/I1/1






Full-text PDF download:

http://123.57.41.99/jweb_yysxxb/CN/article/downloadArticleFile.do?attachType=PDF&id=14436