Conditional Linear Regression

Calderon, Diego and Juba, Brendan and Li, Sirui and Li, Zongyi and Ruan, Lisa (2020) Conditional Linear Regression. Proceedings of Machine Learning Research, 108, pp. 2164-2173. ISSN 2640-3498. doi:10.48550/arXiv.1806.02326. https://resolver.caltech.edu/CaltechAUTHORS:20201030-082208650

PDF - Published Version (422kB)
PDF - Submitted Version (4MB)

Abstract

Work in machine learning and statistics commonly focuses on building models that capture the vast majority of data, possibly ignoring a segment of the population as outliers. However, there may not exist a good, simple model for the distribution, so we seek to find a small subset where there exists such a model. We give a computationally efficient algorithm with theoretical analysis for the conditional linear regression task, which is the joint task of identifying a significant portion of the data distribution, described by a k-DNF, along with a linear predictor on that portion with a small loss. In contrast to work in robust statistics on small subsets, our loss bounds do not feature a dependence on the density of the portion we fit, and compared to previous work on conditional linear regression, our algorithm’s running time scales polynomially with the sparsity of the linear predictor. We also demonstrate empirically that our algorithm can leverage this advantage to obtain a k-DNF with a better linear predictor in practice.
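
To make the task in the abstract concrete, the following is a minimal brute-force sketch and not the algorithm analyzed in the paper. It assumes each example is a triple (x, y, z), where x is a tuple of Boolean attributes, y is a single real feature, and z is the real target. It searches only single conjunctions of at most k literals (a one-term k-DNF), keeps those that cover at least a fraction min_fraction of the data, and fits an ordinary least-squares line on the covered points. The names fit_line, best_term, and min_fraction, and the toy data, are all hypothetical and chosen for illustration.

```python
from itertools import combinations, product


def fit_line(points):
    """Ordinary least squares for z ~ a*y + b on a list of (y, z) pairs."""
    n = len(points)
    mean_y = sum(y for y, _ in points) / n
    mean_z = sum(z for _, z in points) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in points)
    if var_y == 0:
        a = 0.0  # degenerate case: all y equal, fall back to a constant fit
    else:
        a = sum((y - mean_y) * (z - mean_z) for y, z in points) / var_y
    b = mean_z - a * mean_y
    mse = sum((a * y + b - z) ** 2 for y, z in points) / n
    return a, b, mse


def best_term(data, k, min_fraction):
    """Brute-force search over conjunctions of at most k literals.

    data: list of (x, y, z) with x a tuple of bools.
    Returns (mse, term, (a, b)) for the lowest-loss term that covers at
    least min_fraction of the data, or None if no term qualifies.
    A term is a tuple of (attribute_index, required_value) pairs.
    """
    n_attrs = len(data[0][0])
    best = None
    for size in range(1, k + 1):
        for idxs in combinations(range(n_attrs), size):
            for signs in product([True, False], repeat=size):
                term = tuple(zip(idxs, signs))
                subset = [(y, z) for x, y, z in data
                          if all(x[i] == s for i, s in term)]
                # Skip terms that select too small a portion of the data.
                if len(subset) < 2 or len(subset) < min_fraction * len(data):
                    continue
                a, b, mse = fit_line(subset)
                if best is None or mse < best[0]:
                    best = (mse, term, (a, b))
    return best


# Toy data (an assumption for illustration): z = 2*y + 1 holds exactly
# when attribute 0 is True, and z is irregular otherwise.
data = []
for i in range(20):
    y = float(i)
    if i % 2 == 0:
        data.append(((True, i % 4 == 0), y, 2 * y + 1))
    else:
        data.append(((False, i % 3 == 0), y, float((i * 7) % 11) - 5.0))

mse, term, (a, b) = best_term(data, k=2, min_fraction=0.4)
print(term, a, b, mse)
```

On this toy data the search selects the term requiring attribute 0 to be True, where the line fits with essentially zero loss, even though no single line fits the whole data set well. The enumeration grows exponentially in k and handles only one feature, so it illustrates the problem statement rather than an efficient method.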


Item Type: Article
Related URLs:
URL | URL Type | Description
http://proceedings.mlr.press/v108/calderon20a.html | Publisher | Article
https://arxiv.org/abs/1806.02326 | arXiv | Discussion Paper
Additional Information: © 2020 by the author(s). Brendan Juba was supported by an AFOSR Young Investigator Award and NSF award CCF-1718380; part of this work was performed while visiting the Simons Institute for the Theory of Computing. Part of this work was performed as an REU at Washington University in St. Louis, when Diego Calderon was supported by WUSEF and Lisa Ruan was supported by the NSF Big Data Analytics REU Site, award IIS-1560191.
Funders:
Funding Agency | Grant Number
Air Force Office of Scientific Research (AFOSR) | UNSPECIFIED
NSF | CCF-1718380
Washington University | UNSPECIFIED
NSF | IIS-1560191
DOI: 10.48550/arXiv.1806.02326
Record Number: CaltechAUTHORS:20201030-082208650
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20201030-082208650
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 106351
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 30 Oct 2020 15:35
Last Modified: 02 Jun 2023 00:41
