CaltechAUTHORS
  A Caltech Library Service

Fast Conditional Independence Test for Vector Variables with Large Sample Sizes

Chalupka, Krzysztof and Perona, Pietro and Eberhardt, Frederick (2018) Fast Conditional Independence Test for Vector Variables with Large Sample Sizes. (Unpublished) http://resolver.caltech.edu/CaltechAUTHORS:20180613-135346984

PDF - Submitted Version (7 MB). See Usage Policy.

Use this Persistent URL to link to this item: http://resolver.caltech.edu/CaltechAUTHORS:20180613-135346984

Abstract

We present and evaluate the Fast (conditional) Independence Test (FIT), a nonparametric conditional independence test. The test is based on the idea that when P(X∣Y,Z)=P(X∣Y), Z is not useful as a feature for predicting X as long as Y is already a regressor. Conversely, if P(X∣Y,Z)≠P(X∣Y), Z might improve prediction results. FIT handles thousand-dimensional random variables with a hundred thousand samples in a fraction of the time required by alternative methods. We provide an extensive evaluation that compares FIT to six existing nonparametric independence tests. The evaluation shows that FIT has a low probability of making both Type I and Type II errors compared to other tests, especially as the number of available samples grows. Our implementation of FIT is publicly available.
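
The prediction-based idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which regressor or which statistic FIT uses, so the k-nearest-neighbour regressor, the paired t-statistic on held-out squared errors, and the names knn_predict, prediction_gain_statistic and make_data are all assumptions chosen here for illustration. The sketch predicts X once from Y alone and once from (Y, Z), then asks whether adding Z lowers the held-out error.

```python
import math
import random
from statistics import mean, stdev


def knn_predict(train_feats, train_targets, query, k=5):
    """Predict the target at `query` as the mean target of its k nearest training points."""
    nearest = sorted(
        range(len(train_feats)),
        key=lambda i: math.dist(train_feats[i], query),
    )[:k]
    return mean(train_targets[i] for i in nearest)


def prediction_gain_statistic(x, y, z, k=5, train_fraction=0.5):
    """Paired t-statistic of held-out squared errors: X from Y alone vs. X from (Y, Z).

    x is a list of floats; y and z are equal-length lists of feature tuples.
    A large positive value suggests that Z helps predict X beyond Y, which is
    evidence against P(X|Y,Z) = P(X|Y). Values near or below zero give no such
    evidence. (Hypothetical statistic chosen for this sketch.)
    """
    n_train = int(len(x) * train_fraction)
    feats_y = [tuple(yi) for yi in y]
    feats_yz = [tuple(yi) + tuple(zi) for yi, zi in zip(y, z)]
    train_y, train_yz, train_x = feats_y[:n_train], feats_yz[:n_train], x[:n_train]

    # Per-sample difference in squared error on the held-out half.
    diffs = []
    for i in range(n_train, len(x)):
        err_y = (x[i] - knn_predict(train_y, train_x, feats_y[i], k)) ** 2
        err_yz = (x[i] - knn_predict(train_yz, train_x, feats_yz[i], k)) ** 2
        diffs.append(err_y - err_yz)

    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def make_data(n, z_matters, seed=0):
    """Synthetic data: X = Y + noise, plus Z when `z_matters` is True."""
    rng = random.Random(seed)
    y = [(rng.gauss(0, 1),) for _ in range(n)]
    z = [(rng.gauss(0, 1),) for _ in range(n)]
    x = [
        yi[0] + (zi[0] if z_matters else 0.0) + 0.1 * rng.gauss(0, 1)
        for yi, zi in zip(y, z)
    ]
    return x, y, z


if __name__ == "__main__":
    for z_matters in (True, False):
        x, y, z = make_data(400, z_matters)
        print(z_matters, prediction_gain_statistic(x, y, z))
```

In this toy setup the statistic should be large when X depends on Z given Y and small or negative when it does not, since an irrelevant extra feature tends to make nearest-neighbour predictions slightly worse. Turning the statistic into a calibrated p-value is left out of the sketch.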


Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL | URL Type | Description
http://arxiv.org/abs/1804.02747 | arXiv | Discussion Paper
ORCID:
Author | ORCID
Chalupka, Krzysztof | 0000-0002-1225-2112
Perona, Pietro | 0000-0002-7583-5809
Record Number: CaltechAUTHORS:20180613-135346984
Persistent URL: http://resolver.caltech.edu/CaltechAUTHORS:20180613-135346984
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 87074
Collection: CaltechAUTHORS
Deposited By: Caroline Murphy
Deposited On: 13 Jun 2018 21:02
Last Modified: 13 Jun 2018 21:02
