Construction and Verification of Fusarium Head Blight Prediction Model in Haihe Plain Based on Boosted Regression Tree
TAO Bu¹, QI YongZhi¹, QU Yun², CAO ZhiYan¹, ZHAO XuSheng¹, ZHEN WenChao³
¹ College of Plant Protection, Hebei Agricultural University, Baoding 071001, Hebei
² Modern Educational Technology Center, Hebei Agricultural University, Baoding 071001, Hebei
³ College of Agronomy, Hebei Agricultural University/State Key Laboratory of North China Crop Improvement and Regulation/Key Laboratory of Regulation and Control of Crop Growth of Hebei, Baoding 071001, Hebei
Received: 2021-02-1 Accepted: 2021-03-2
About the authors: TAO Bu, Tel: 0312-7526131; E-mail: taobu@hebau.edu.cn. QI YongZhi, E-mail: qiyongzhi1981@163.com.
Abstract 【Background】 Since 1995, Fusarium head blight (FHB) has gradually spread across the Haihe Plain, progressing from sporadic to contiguous occurrence and rising from a secondary disease to one of the major diseases; in epidemic years it is characterized by rapid outbreaks, large affected areas and heavy losses. Accurate forecasting is both the key to, and the main difficulty in, effectively controlling the occurrence and development of FHB. 【Objective】 Based on monitoring and analysis of FHB occurrence in the Haihe Plain, a prediction model suited to the region was constructed to provide technical support for the scientific prevention and control of FHB. 【Method】 Using data on the diseased spike rate of FHB and on meteorological factors during the key growth stages of wheat in 21 major wheat-producing counties (cities) of the Haihe Plain from 2001 to 2016, the key meteorological factors affecting FHB occurrence were screened by stepwise regression analysis, and FHB prediction models based on multiple linear regression and on boosted regression trees were constructed. 【Result】 The prediction deviation of the boosted regression tree model was lowest when the learning rate (lr) was 0.005 and the tree complexity (tc) was 6, with a residual standard error of 0.006311. Eight key meteorological factors with a significant effect on FHB occurrence in the Haihe Plain were screened out, namely MRH15, Rain-35, MRH-55, SD15, LT-65, MWS-55, MT-25 and DRain15, and a multiple linear regression model with eight predictive variables was established (R²=0.8158, adjusted R²=0.8018, P<2.2×10⁻¹⁶). The importance of these eight factors, evaluated with the boosted regression tree model, was 69.62%, 14.08%, 4.89%, 4.34%, 3.35%, 2.02%, 1.20% and 0.50%, respectively.
Based on the most important predictive variables, the prediction model was further simplified, and a multiple linear regression model with four predictive variables was constructed (y=-19.45376+0.11689MRH15+0.17346Rain-35+0.04185SD15+0.26592MRH-55, R²=0.7575, adjusted R²=0.7468, P<2.2×10⁻¹⁶). When the number of predictive variables was reduced from 8 to 4, validation with historical data from Anxin, Dingzhou, Guantao and other sites in 2008, 2010 and 2012 showed that the accuracy of the multiple linear regression model in predicting the diseased spike rate decreased from 88.43% to 85.90%, whereas that of the boosted regression tree model increased from 87.72% to 91.23%. In validation with historical data from Zhengding and Luancheng from 2001 to 2016, the prediction accuracy of neither model changed significantly: that of the multiple linear regression model changed from 87.53% to 87.42%, and that of the boosted regression tree model from 89.20% to 89.21%. Overall, the prediction accuracy of the multiple linear regression model tended to decrease, while that of the boosted regression tree model tended to increase. 【Conclusion】 A boosted regression tree model with four predictive variables was constructed, with a prediction accuracy of 89.21%. The fluctuation trend of the predicted diseased spike rate was basically consistent with that of the observed values, indicating that the boosted regression tree model has good application prospects for forecasting FHB in the Haihe Plain.
Keywords: Fusarium head blight (FHB); Fusarium graminearum; prediction model; boosted regression tree (BRT)
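As a quick arithmetic check of the figures reported in the abstract, the sketch below pairs the eight factors with their importance values (assuming the percentages follow the order in which the factors are listed), checks which four factors rank highest, and evaluates the four-predictor equation. The inputs passed to `predict_mlr4` are hypothetical, since the units and definitions of the factors are not given in this section; the function and variable names are likewise illustrative.

```python
# Importance values reported for the BRT model, paired with the factors in the
# order the abstract lists them (this pairing is an assumption of the sketch).
importance = {
    "MRH15": 69.62, "Rain-35": 14.08, "MRH-55": 4.89, "SD15": 4.34,
    "LT-65": 3.35, "MWS-55": 2.02, "MT-25": 1.20, "DRain15": 0.50,
}

# Rank the factors and keep the four most important ones.
top4 = sorted(importance, key=importance.get, reverse=True)[:4]
top4_share = sum(importance[name] for name in top4)


def predict_mlr4(mrh15, rain_35, sd15, mrh_55):
    """Four-predictor linear model using the coefficients quoted in the abstract."""
    return (-19.45376 + 0.11689 * mrh15 + 0.17346 * rain_35
            + 0.04185 * sd15 + 0.26592 * mrh_55)


# Hypothetical input values, chosen only to show the arithmetic.
example = predict_mlr4(mrh15=70.0, rain_35=10.0, sd15=50.0, mrh_55=60.0)
print(top4, round(top4_share, 2), round(example, 5))
```

With the reported values, the four top-ranked factors are the same four that appear in the simplified equation, and together they account for 92.93% of the total importance.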
Cite this article: TAO Bu, QI YongZhi, QU Yun, CAO ZhiYan, ZHAO XuSheng, ZHEN WenChao. Construction and Verification of Fusarium Head Blight Prediction Model in Haihe Plain Based on Boosted Regression Tree. Scientia Agricultura Sinica, 2021, 54(18): 3860-3870 doi:10.3864/j.issn.0578-1752.2021.18.006
0 Introduction
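【Research significance】 Fusarium head blight (FHB) is one of the most widespread and most damaging diseases of wheat [1]. It is a fungal disease caused mainly by Fusarium graminearum [2]. It has been reported that the wheat planting area in the United States has shrunk continuously since 1990 because of FHB epidemics, with a reduction of 12 million hectares in 2018 [3], and that FHB caused economic losses of about US$1 billion in Saskatchewan, Canada, in 2016 [4]. In addition, the deoxynivalenol (DON) and zearalenone (ZEN) toxins produced by the pathogen endanger human and animal health and seriously affect wheat quality and yield [5,6]. In recent years, owing to changes in climate and cropping systems, FHB in China has become increasingly severe, spreading northward from the wheat region of the middle and lower reaches of the Yangtze River; the Huaihe River basin has become a severely affected area, and FHB is now also a frequent disease in the northern Huang-Huai wheat region and the northern winter wheat region [7]. Since 2010 the frequency of severe occurrence has exceeded 50%, and the affected area exceeded 5.5 million hectares in each of 2015, 2016 and 2018 [8]. Since 1995, FHB has gradually spread across the Haihe Plain (also called the Hebei Plain), progressing from sporadic to contiguous occurrence and rising from a secondary disease to one of the major diseases, with an average annual affected area of more than 267,000 hectares [7]. Because FHB breaks out rapidly over large areas and causes heavy losses within a short period in epidemic years, identifying the key meteorological factors that affect FHB in the Haihe Plain, establishing a prediction model suited to the region and providing accurate forecasts are of great importance for effectively preventing the spread of the disease. 【Previous research progress】 DE WOLF et al. [9,10] used meteorological factors in the 7 d before and the 10 d after wheat flowering as predictive variables and built FHB forecasting models by logistic regression. On this basis, SHAH et al. used cultivar resistance, maize residue and four meteorological factors obtained in earlier studies as variables and, in R, built a logistic regression model based on the leaps and bounds algorithm [11] and a boosted regression tree (BRT) model [12]; the misclassification rate of the BRT model was lower than that of the logistic regression model. HOOKER et al. [13] used the number of rainy days and the temperature 4-7 d before heading as predictive variables and built a model containing exponential terms to predict the DON content of wheat. DEL PONTE et al. [14] built a forecasting model based on airborne spore catches and susceptible tissue. ROSSI et al. [15], starting from inoculum quantity and the key growth stages of wheat, combined daily sporulation rate, spore dispersal rate, infection probability and wheat growth stage to predict the risk of FHB. MUSA et al. [16] developed FusaProg, a web-based FHB warning system for Switzerland, which predicts FHB occurrence and DON content and guides the rational use of fungicides. Researchers in China have studied prediction based on climate, on inoculum quantity and on the combination of the two, and have established long-term, medium-term and short-term prediction models; with the help of neural networks [17,18], support vector machines [19], unmanned aerial vehicle hyperspectral imagery [20] and other techniques, forecasting accuracy has been continually improved. In general, however, prediction models transfer poorly, and their accuracy declines when they are applied in other regions. 【Entry point of this study】 The boosted regression tree is a self-learning method based on the classification and regression tree (CART) algorithm; it generates multiple regression trees through self-learning and random selection, which improves model stability and prediction accuracy. YOU et al. used this model to clarify the effects of environmental variables and cultivar on the occurrence of a forage disease, with good results [21], which offers a new approach to assessing the importance of the main factors affecting FHB. 【Key problems to be solved】 Based on the key growth stages that influence FHB epidemics, meteorological factors such as temperature, humidity, rainfall, sunshine and wind speed were chosen as predictive variables; the important predictive variables were screened out and their effects on disease occurrence were analyzed, with the aim of improving prediction accuracy, providing a reference for forecasting FHB, and supplying technical support for an integrated and efficient control system for the disease.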
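1.3.2 Selection of model parameters
When running the model, several parameters need to be optimized, including the number of trees (nt), tree complexity (tc), learning rate (lr), bag fraction (bf), the form of the loss function (distribution) and the number of cross-validation folds (cv.folds) [12]. Tree complexity is the number of leaf nodes in a single decision tree and corresponds to the order of the interactions among environmental factors that the model can fit. All decision trees in a boosted regression tree model have the same number of leaf nodes; during training a tree stops growing once it reaches that number, so no pruning is needed [23,24]. The learning rate determines how long the model must be trained to reach its optimum: if lr is too small, convergence is slow and training takes longer; if lr is too large, noise is easily introduced during sampling, which makes the fitted function less smooth and less stable [25]. In general, the model only becomes stable once the number of trees (nt) reaches 1 000 or more; tc ranges from 1 to 16, lr ranges from 0.001 to 0.1, bf is 0.75 and the loss function is "gaussian". Because the values of tc and lr affect the prediction accuracy of the model, 70% of the training data were randomly selected to build the model and the remaining 30% were used to calculate its prediction deviation; the optimal tc and lr were then chosen according to the size of the prediction deviation.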
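The tuning procedure described above (fit on a random 70% of the rows, score on the other 30%, and keep the tc and lr with the smallest prediction deviation) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation, which is not shown in this section: the small gradient-boosting routine, the function names (`best_split`, `grow_tree`, `fit_brt`, `select_tc_lr`), the synthetic data and the use of held-out mean squared error as the prediction deviation are all hypothetical. To keep the run short, it uses far fewer trees and larger learning rates than the ranges quoted in the text.

```python
import random


def best_split(X, r, idx):
    """Best squared-error split of rows idx: returns (gain, feature, threshold)."""
    best = (0.0, None, None)
    n, total = len(idx), sum(r[i] for i in idx)
    for f in range(len(X[0])):
        order = sorted(idx, key=lambda i: X[i][f])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            lo, hi = X[order[k - 1]][f], X[order[k]][f]
            if lo == hi:
                continue  # cannot split between identical values
            gain = left_sum ** 2 / k + (total - left_sum) ** 2 / (n - k) - total ** 2 / n
            if gain > best[0]:
                best = (gain, f, lo)  # rows with value <= lo go to the left child
    return best


def make_leaf(r, idx):
    return {"idx": idx, "value": sum(r[i] for i in idx) / len(idx)}


def grow_tree(X, r, idx, n_leaves):
    """Grow one regression tree on residuals r, best split first, up to n_leaves leaves."""
    root = make_leaf(r, idx)
    leaves = [root]
    while len(leaves) < n_leaves:
        cands = [(best_split(X, r, leaf["idx"]), leaf) for leaf in leaves]
        cands = [c for c in cands if c[0][1] is not None]
        if not cands:
            break  # no leaf can be split any further
        (gain, f, thr), leaf = max(cands, key=lambda c: c[0][0])
        left = [i for i in leaf["idx"] if X[i][f] <= thr]
        right = [i for i in leaf["idx"] if X[i][f] > thr]
        leaf.update(feat=f, thr=thr, left=make_leaf(r, left), right=make_leaf(r, right))
        leaves = [l for l in leaves if l is not leaf] + [leaf["left"], leaf["right"]]
    return root


def predict_tree(node, x):
    while "feat" in node:
        node = node["left"] if x[node["feat"]] <= node["thr"] else node["right"]
    return node["value"]


def fit_brt(X, y, nt, tc, lr, bf=0.75, seed=0):
    """Gradient boosting with squared-error ("gaussian") loss and row subsampling."""
    rng = random.Random(seed)
    f0 = sum(y) / len(y)
    pred = [f0] * len(y)
    trees = []
    for _ in range(nt):
        resid = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient of squared error
        bag = rng.sample(range(len(y)), max(2, int(bf * len(y))))  # bag fraction
        tree = grow_tree(X, resid, bag, tc)
        trees.append(tree)
        pred = [p + lr * predict_tree(tree, x) for p, x in zip(pred, X)]
    return f0, lr, trees


def predict_brt(model, x):
    f0, lr, trees = model
    return f0 + lr * sum(predict_tree(t, x) for t in trees)


def select_tc_lr(X, y, tcs, lrs, nt=200, seed=1):
    """Fit on a random 70% of rows, score on the other 30%, return the best (tc, lr)."""
    rows = list(range(len(y)))
    random.Random(seed).shuffle(rows)
    cut = int(0.7 * len(rows))
    train, test = rows[:cut], rows[cut:]
    Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
    scores = {}
    for tc in tcs:
        for lr in lrs:
            model = fit_brt(Xtr, ytr, nt, tc, lr)
            scores[(tc, lr)] = sum((y[i] - predict_brt(model, X[i])) ** 2 for i in test) / len(test)
    return min(scores, key=scores.get), scores


# Synthetic stand-in data: two hypothetical predictors and a noisy linear response.
data_rng = random.Random(42)
X = [[data_rng.uniform(40, 95), data_rng.uniform(0, 30)] for _ in range(80)]
y = [0.1 * h + 0.2 * p + data_rng.gauss(0, 0.5) for h, p in X]
best, scores = select_tc_lr(X, y, tcs=[2, 6], lrs=[0.05, 0.1])
print("selected (tc, lr):", best, "held-out MSE:", round(scores[best], 3))
```

Here `tc` is treated as the number of leaves per tree, following the description in the text; boosting libraries may define the corresponding parameter differently (for example as a number of splits), so the mapping should be checked before reusing these values elsewhere.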
Fig. 1 Relationship between the prediction error of the BRT model and the number of decision trees
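With lr set to 0.01 and 0.005, the residual standard errors at different tc values (Fig. 2) show that, at tc = 6, the residual standard error of the boosted regression tree model was 0.01004 and 0.006311, respectively; as tc increased further, the prediction deviation of the model changed relatively little. Considering the prediction deviation under the different lr and tc values together, lr = 0.005 and tc = 6 were selected for the model.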
MLRⅠ and MLRⅡ represent the multiple linear regression models with 8 and 4 predictive variables, respectively; BRTⅠ and BRTⅡ represent the boosted regression tree models with 8 and 4 predictive variables, respectively. The same applies to Fig. 6.
Fig. 5 Comparison between the observed and predicted values of the multiple linear regression and BRT models of FHB at multiple monitoring sites in the same year
Fig. 6 Comparison between the observed and predicted values of the multiple linear regression and BRT models for the diseased spike rate of FHB at the fixed monitoring sites from 2001 to 2016
SAVARY S, WILLOCQUET L, PETHYBRIDGE SJ, ESKER P, MCROBERTS N, NELSON A. The global burden of pathogens and pests on major food crops. Nature Ecology and Evolution, 2019, 3(3): 430-439. DOI:10.1038/s41559-018-0793-y
LU WZ, CHENG SH, WANG YZ. Study on Wheat Scab. Beijing: Science Press, 2001. (in Chinese)
GHIMIRE B, SAPKOTA S, BAHRI BA, MARTINEZ-ESPINOZA AD, BUCK JW, MERGOUM M. Fusarium head blight and rust diseases in soft red winter wheat in the southeast United States: State of the art, challenges and future perspective for breeding. Frontiers in Plant Science, 2020, 11: 1080. DOI:10.3389/fpls.2020.01080
RUAN YF, ZHANG WT, KNOX RE, BERRAIES S, CAMPBELL HL, RAGUPATHY R, BOYLE K, POLLEY B, HENRIQUEZ MA, BURT A, KUMAR S, CUTHBERT RD, FOBERT PR, BUERSTMAYR H, DEPAUW RM. Characterization of the genetic architecture for Fusarium head blight resistance in durum wheat: The complex association of resistance, flowering time, and height genes. Frontiers in Plant Science, 2020, 11: 592064. DOI:10.3389/fpls.2020.592064
HU WJ, ZHANG CM, WU D, LU CB, DONG YC, CHENG XM, ZHANG Y, GAO DR. Screening for resistance to Fusarium head blight and agronomic traits of wheat germplasms from Yangtze river region. Scientia Agricultura Sinica, 2020, 53(21): 4313-4321. (in Chinese)
ZHANG AM, YANG WL, LI X, SUN JZ. Current status and perspective on research against Fusarium head blight in wheat. Hereditas, 2018, 40(10): 858-873. (in Chinese)
HUANG C, JIANG YY, LI CG. Occurrence, yield loss and dynamics of wheat diseases and insect pests in China from 1987 to 2018. Plant Protection, 2020, 46(6): 186-193. (in Chinese)
DE WOLF ED, MADDEN LV, LIPPS PE. Risk assessment models for wheat Fusarium head blight epidemics based on within-season weather data. Phytopathology, 2003, 93(4): 428-435. DOI:10.1094/PHYTO.2003.93.4.428
SHAH DA, MOLINEROS JE, PAUL PA, WILLYERD KT, MADDEN LV, DE WOLF ED. Predicting Fusarium head blight epidemics with weather-driven pre- and post-anthesis logistic regression models. Phytopathology, 2013, 103(9): 906-919. DOI:10.1094/PHYTO-11-12-0304-R
SHAH DA, DE WOLF ED, PAUL PA, MADDEN LV. Predicting Fusarium head blight epidemics with boosted regression trees. Phytopathology, 2014, 104(7): 702-714. DOI:10.1094/PHYTO-10-13-0273-R
HOOKER DC, SCHAAFSMA AW, TAMBURIC-ILINCIC L. Using weather variables pre- and post-heading to predict deoxynivalenol content in winter wheat. Plant Disease, 2002, 86(6): 611-619. DOI:10.1094/PDIS.2002.86.6.611
DEL PONTE EM, FERNANDES JMC, PAVAN W. A risk infection simulation model for Fusarium head blight of wheat. Fitopatologia Brasileira, 2005, 30(6): 634-642. DOI:10.1590/S0100-41582005000600011
ROSSI V, GIOSUE S, PATTORI E, SPANNA F, DEL VECCHIO A. A model estimating the risk of Fusarium head blight on wheat. Bulletin OEPP/EPPO Bulletin, 2003, 33(3): 421-425. DOI:10.1111/epp.2003.33.issue-3
MUSA T, HECKER A, VOGELGSANG S, FORRER HR. Forecasting of Fusarium head blight and deoxynivalenol content in winter wheat with FusaProg. Bulletin OEPP/EPPO Bulletin, 2007, 37(2): 283-289. DOI:10.1111/epp.2007.37.issue-2
ZHANG DY, WANG DY, GU CY, JIN N, ZHAO HT, CHEN G, LIANG HY, LIANG D. Using neural network to identify the severity of wheat Fusarium head blight in the field environment. Remote Sensing, 2019, 11(20): 2375. DOI:10.3390/rs11202375
LIU ZH, ZHANG L, YAN YF, ZHOU J, ZHANG ZJ, ZHANG HB. Application research on wheat scab forecasting based on BP neural network. Journal of Yunnan Agricultural University, 2010, 25(5): 680-685. (in Chinese)
HUANG LS, WU ZC, HUANG WJ, MA HQ, ZHAO JL. Identification of Fusarium head blight in winter wheat ears based on Fisher’s linear discriminant analysis and a support vector machine. Applied Sciences, 2019, 9(18): 3894. DOI:10.3390/app9183894
LIU LY, DONG YY, HUANG WJ, DU XP, MA HQ. Monitoring wheat Fusarium head blight using unmanned aerial vehicle hyperspectral imagery. Remote Sensing, 2020, 12(22): 3811. DOI:10.3390/rs12223811
YOU MP, RENSING K, RENTON M, BARBETTI MJ. Critical factors driving Aphanomyces damping-off and root disease in clover revealed and explained using linear and generalized linear models and boosted regression trees. Plant Pathology, 2018, 67(6): 1374-1387. DOI:10.1111/ppa.2018.67.issue-6
LI YH, YANG LH. Cultivation techniques of high yield, water saving and fertilizer saving in winter wheat in Hebei Province. Beijing: China Agriculture Press, 2017. (in Chinese)
ZHANG Y, CHEN HYH, REICH PB. Forest productivity increases with evenness, species richness and trait variation: A global meta-analysis. Journal of Ecology, 2012, 100(3): 742-749. DOI:10.1111/j.1365-2745.2011.01944.x
FRIEDMAN JH. Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 2001, 29(5): 1189-1232. DOI:10.1214/aos/1013203450
FRIEDMAN JH, MEULMAN JJ. Multiple additive regression trees with application in epidemiology. Statistics in Medicine, 2003, 22(9): 1365-1381. DOI:10.1002/(ISSN)1097-0258
XIAO YY. The studies on the reliability evaluation of pest prediction and forecasting. Plant Protection Technology and Extension, 1997, 17(4): 3-6. (in Chinese)
MOSCHINI RC, FORTUGNO C. Predicting wheat head blight incidence using models based on meteorological factors in Pergamino, Argentina. European Journal of Plant Pathology, 1996, 102(3): 211-218. DOI:10.1007/BF01877959
MOSCHINI RC, PIOLI R, CARMONA M, SACCHI O. Empirical predictions of wheat head blight in the Northern Argentinean Pampas Region. Crop Science, 2001, 41(5): 1541-1545. DOI:10.2135/cropsci2001.4151541x
GIROUX ME, BOURGEOIS G, DION Y, RIOUX S, PAGEAU D, ZOGHLAMI S, PARENT C, VACHON E, VANASSE A. Evaluation of forecasting models for Fusarium head blight of wheat under growing conditions of Quebec, Canada. Plant Disease, 2016, 100(6): 1192-1201. DOI:10.1094/PDIS-04-15-0404-RE
LANDSCHOOT S, WAEGEMAN W, AUDENAERT K, VAN DAMME P, VANDEPITTE J, DE BAETS B, HAESAERT G. A field-specific web tool for the prediction of Fusarium head blight and deoxynivalenol content in Belgium. Computers and Electronics in Agriculture, 2013, 93: 140-148. DOI:10.1016/j.compag.2013.02.011
XU XM, MADDEN LV, EDWARDS SG, DOOHAN FM, MORETTI A, HORNOK L, NICHOLSON P, RITIENI A. Developing logistic models to relate the accumulation of DON associated with Fusarium head blight to climatic conditions in Europe. European Journal of Plant Pathology, 2013, 137(4): 689-706. DOI:10.1007/s10658-013-0280-x
ZHANG XJ. The construction of meteorological forecasting model of wheat head blight occurring trend and the information release platform in the mid-south of Hebei Province [D]. Baoding: Hebei Agricultural University, 2017. (in Chinese)
SHAH DA, PAUL PA, DE WOLF ED, MADDEN LV. Predicting plant disease epidemics from functionally represented weather series. Philosophical Transactions of the Royal Society of London B, Biological Sciences, 2019, 374(1775): 20180273.
GUO XW. Development of models to predict Fusarium head blight disease and deoxynivalenol in wheat, and genetic causes for chemotype diversity and shifting of Fusarium graminearum in Manitoba [D]. Canada: University of Manitoba, 2008.
KRISS AB, PAUL PA, MADDEN LV. Characterizing heterogeneity of disease incidence in a spatial hierarchy: A case study from a decade of observations of Fusarium head blight of wheat. Phytopathology, 2012, 102(9): 867-877. DOI:10.1094/PHYTO-11-11-0323
ZUO YH, ZHENG LZ, ZHANG JH, LIU TR, PENG C, ZHANG YQ, SHI CP, XU QZ, PAN CY. Studies on the epidemic method of spring wheat scab in Heilongjiang Province. Acta Phytophylacica Sinica, 1995, 22(4): 297-302. (in Chinese)