Academy of Mathematics and Systems Science, CAS
Colloquia & Seminars

Speaker: Haili Zhang, Southern University of Science and Technology
Inviter:
Title: A Fast Distributed Screening for High-dimensional Linear Quantile Regression
Time & Venue: 2021.12.12 19:00-20:00, Tencent Meeting: 34241285354

Abstract:
We study variable selection by screening for distributed high-dimensional regression with a large sample size n under a limited memory constraint, where the memory of one machine can store only a subset of the data. This problem urgently needs a solution in the big data era. A naive divide-and-conquer approach splits the whole data set into N parts, runs each part on one of N machines, aggregates the results from all machines by averaging, and finally obtains the selected variables. However, this approach tends to select more noise variables, and the false discovery rate may not be well controlled. We improve on it by proposing a new cumulative sum chart monitoring algorithm for distributed high-dimensional quantile regression. Theoretically, we establish asymptotic properties of our estimator for the screening method with a diverging number of parameters. Under some regularity conditions, we establish oracle properties in the sense that our distributed estimator shares the same asymptotic efficiency as the estimator based on the full sample. Computationally, we propose a distributed data-driven screening algorithm for data with heavy-tailed errors. Simulations and a real example demonstrate the good performance of the proposed distributed procedure.
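To make the naive divide-and-conquer baseline from the abstract concrete, here is a minimal sketch. It is not the speaker's algorithm: the abstract does not say which screening statistic is used, so the marginal utility below (an absolute sample quantile covariance between the response and each predictor) is an assumption chosen for simplicity, and the function names `marginal_quantile_utility`, `naive_dc_screen`, and `demo` are hypothetical. The N machines are simulated by splitting row indices on a single machine.

```python
import numpy as np


def marginal_quantile_utility(X, y, tau=0.5):
    """Absolute sample quantile covariance between y and each column of X.

    Assumed screening statistic (not taken from the talk):
    |mean(psi_tau(y - q_tau) * (x_j - mean(x_j)))| / sd(x_j),
    where psi_tau(w) = tau - 1{w < 0} and q_tau is the sample tau-quantile of y.
    """
    q = np.quantile(y, tau)
    psi = tau - (y < q).astype(float)
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant columns
    return np.abs(psi @ Xc) / (len(y) * sd)


def naive_dc_screen(X, y, n_machines, n_keep, tau=0.5):
    """Naive divide-and-conquer screening.

    Split the rows into n_machines parts, compute the utilities on each part,
    average them across parts, and keep the n_keep top-ranked variables.
    """
    chunks = np.array_split(np.arange(len(y)), n_machines)
    utilities = np.mean(
        [marginal_quantile_utility(X[idx], y[idx], tau) for idx in chunks],
        axis=0,
    )
    selected = np.argsort(utilities)[::-1][:n_keep]
    return np.sort(selected), utilities


def demo(seed=0):
    """Simulated example: 3 active variables out of 200, heavy-tailed t(2) errors."""
    rng = np.random.default_rng(seed)
    n, p = 4000, 200
    X = rng.standard_normal((n, p))
    y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + 3.0 * X[:, 2] + rng.standard_t(df=2, size=n)
    selected, _ = naive_dc_screen(X, y, n_machines=8, n_keep=10)
    return selected


if __name__ == "__main__":
    print(demo())
```

In this sketch the retained set has a fixed size `n_keep`, so any slots beyond the truly active variables are filled with noise variables. This is the weakness of the naive approach that the abstract points out, and it is what a data-driven stopping rule would be meant to address.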