ABSTRACT
Error variance estimation plays an important role in statistical inference for high-dimensional regression models. This article concerns error variance estimation in high-dimensional sparse additive models. We study the asymptotic behavior of the traditional mean squared error, the naive estimate of error variance, and show that it may significantly underestimate the error variance because of spurious correlations, which are even higher in nonparametric models than in linear models. We further propose an accurate estimate of the error variance in ultrahigh-dimensional sparse additive models by effectively integrating sure independence screening and refitted cross-validation techniques. The root-n consistency and asymptotic normality of the resulting estimate are established. We conduct a Monte Carlo simulation study to examine the finite-sample performance of the newly proposed estimate. A real data example is used to illustrate the proposed methodology. Supplementary materials for this article are available online.
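The contrast between the naive estimate and refitted cross-validation can be illustrated with a minimal sketch. It is not the estimator studied in this article: as simplifying assumptions, it uses a linear model with marginal-correlation screening in place of a sparse additive model with a nonparametric basis, and a pure-noise response so that every selected variable is spurious. The function names, the sample sizes, and the number of retained variables `s` are hypothetical choices made only for illustration.

```python
import numpy as np


def screen(X, y, s):
    """Indices of the s columns of X with the largest absolute marginal correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.argsort(-np.abs(corr))[:s]


def residual_variance(X, y, idx):
    """Least-squares fit of y on an intercept and columns idx; returns RSS / residual degrees of freedom."""
    Z = np.column_stack([np.ones(len(y)), X[:, idx]])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return resid @ resid / (len(y) - Z.shape[1])


def naive_variance(X, y, s):
    """Screen and refit on the same data (prone to spurious-correlation bias)."""
    return residual_variance(X, y, screen(X, y, s))


def rcv_variance(X, y, s, rng):
    """Screen on one half, refit on the other half, swap the roles, and average."""
    perm = rng.permutation(len(y))
    a, b = perm[: len(y) // 2], perm[len(y) // 2:]
    v1 = residual_variance(X[b], y[b], screen(X[a], y[a], s))
    v2 = residual_variance(X[a], y[a], screen(X[b], y[b], s))
    return 0.5 * (v1 + v2)


# Assumed toy setting: n = 200 observations, p = 2000 independent predictors,
# response unrelated to every predictor, true error variance equal to 1.
rng = np.random.default_rng(0)
n, p, s = 200, 2000, 10
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

naive = naive_variance(X, y, s)
rcv = rcv_variance(X, y, s, rng)
print(f"naive estimate: {naive:.3f}")
print(f"RCV estimate:   {rcv:.3f}")
```

In this toy setting the naive estimate tends to fall well below the true value of 1, because the screened variables are exactly those most correlated with the noise in the same sample, whereas the refitted estimate uses variables selected on independent data.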
Supplementary Materials
The supplementary material consists of a rigorous proof of (A.3).
Acknowledgments
The authors thank the editor, the AE, and the reviewers for their constructive comments, which led to a dramatic improvement over the earlier version of this article. Jianqing Fan is the corresponding author. All authors contributed equally to this paper and are listed in alphabetical order.
Funding
Chen's research was supported by NSF grant DMS-1206464 and NIH grant R01-GM072611. Fan's research was supported by NSF grant DMS-1206464 and NIH grants R01-GM072611 and R01GM100474-01. Li's research was supported by NSF grant DMS 1512422, National Institute on Drug Abuse (NIDA) grants P50 DA039838, P50 DA036107, and R01 DA039854, and National Natural Science Foundation of China grant 11690015. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NSF, NIH, or NIDA.