CaltechAUTHORS
  A Caltech Library Service

Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism

Arias-Castro, Ery and Candès, Emmanuel J. and Plan, Yaniv (2011) Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism. Annals of Statistics, 39 (5). pp. 2533-2556. ISSN 0090-5364. https://resolver.caltech.edu/CaltechAUTHORS:20120210-151204173

PDF - Published Version (399Kb). See Usage Policy.
PDF (Supplementary Material) - Supplemental Material (252Kb). See Usage Policy.

Abstract

Testing for the significance of a subset of regression coefficients in a linear model, a staple of statistical analysis, goes back at least to the work of Fisher who introduced the analysis of variance (ANOVA). We study this problem under the assumption that the coefficient vector is sparse, a common situation in modern high-dimensional settings. Suppose we have p covariates and that under the alternative, the response only depends upon the order of p^(1−α) of those, 0 ≤ α ≤ 1. Under moderate sparsity levels, that is, 0 ≤ α ≤ 1/2, we show that ANOVA is essentially optimal under some conditions on the design. This is no longer the case under strong sparsity constraints, that is, α > 1/2. In such settings, a multiple comparison procedure is often preferred and we establish its optimality when α ≥ 3/4. However, these two very popular methods are suboptimal, and sometimes powerless, under moderately strong sparsity where 1/2 < α < 3/4. We suggest a method based on the higher criticism that is powerful in the whole range α > 1/2. This optimality property is true for a variety of designs, including the classical (balanced) multi-way designs and more modern “p > n” designs arising in genetics and signal processing. In addition to the standard fixed effects model, we establish similar results for a random effects model where the nonzero coefficients of the regression vector are normally distributed.
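
As a rough illustration of the higher criticism idea mentioned in the abstract, the sketch below computes a Donoho-Jin style statistic from a vector of z-scores. It is a minimal sketch under stated assumptions, not the authors' exact procedure: the function name higher_criticism, the parameter frac, and the use of two-sided normal p-values are all choices made here for illustration, and the z-scores are assumed to be approximately standard normal under the null.

```python
import math
import numpy as np


def higher_criticism(z, frac=0.5):
    """Donoho-Jin style higher criticism statistic (illustrative sketch).

    z    : z-scores, assumed approximately N(0, 1) under the null.
    frac : hypothetical tuning parameter; only the smallest frac * n
           p-values enter the maximum.
    """
    z = np.asarray(z, dtype=float)
    n = z.size
    # Two-sided normal p-values: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    pvals = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    pvals = np.sort(pvals)
    # Keep p-values strictly inside (0, 1) so the denominator is nonzero.
    pvals = np.clip(pvals, 1e-300, 1.0 - 1e-12)
    k = max(1, int(frac * n))
    i = np.arange(1, k + 1)
    # Standardized gap between the empirical fraction i/n and the
    # i-th smallest p-value, maximized over the first k order statistics.
    hc = math.sqrt(n) * (i / n - pvals[:k]) / np.sqrt(pvals[:k] * (1.0 - pvals[:k]))
    return float(hc.max())


def sparse_support_size(p, alpha):
    """Number of nonzero coefficients of order p^(1 - alpha), as in the abstract."""
    return max(1, int(round(p ** (1.0 - alpha))))
```

A large value of the statistic suggests that more small p-values are present than the null would produce. For example, with p = 10000 and alpha = 0.6, sparse_support_size gives about 40 nonzero coefficients. Calibrating a rejection threshold (for instance by simulation under the null) is left out of this sketch.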


Item Type: Article
Related URLs:
URL | URL Type | Description
http://dx.doi.org/10.1214/11-AOS910 | DOI | UNSPECIFIED
http://projecteuclid.org/euclid.aos/1322663467 | Publisher | UNSPECIFIED
Additional Information: © 2011 Institute of Mathematical Statistics. Received July 2010; revised April 2011. Supported in part by ONR Grant N00014-09-1-0258. We would like to thank Chiara Sabatti for stimulating discussions and for suggesting improvements on an earlier version of the manuscript, and Ewout van den Berg for help with the simulations. We also thank the anonymous referees for their inspiring comments which helped us improve the content of the paper.
Funders:
Funding Agency | Grant Number
Office of Naval Research (ONR) | N00014-09-1-0258
Subject Keywords: Detecting a sparse signal; analysis of variance; higher criticism; minimax detection; incoherence; random matrices; suprema of Gaussian processes; compressive sensing
Issue or Number: 5
Classification Code: Primary Subjects: 62G10, 94A13; Secondary Subjects: 62G20
Record Number: CaltechAUTHORS:20120210-151204173
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20120210-151204173
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 29246
Collection: CaltechAUTHORS
Deposited By: Jason Perez
Deposited On: 13 Feb 2012 21:15
Last Modified: 03 Oct 2019 03:39
