CaltechAUTHORS
  A Caltech Library Service

Factor analysis for survival time prediction with informative censoring and diverse covariates

McCurdy, Shannon and Molinaro, Annette and Pachter, Lior (2019) Factor analysis for survival time prediction with informative censoring and diverse covariates. Statistics in Medicine, 38 (20). pp. 3719-3732. ISSN 0277-6715. https://resolver.caltech.edu/CaltechAUTHORS:20190610-075805993

PDF - Supplemental Material (1250Kb). See Usage Policy.

Abstract

Fulfilling the promise of precision medicine requires accurately and precisely classifying disease states. For cancer, this includes prediction of survival time from a surfeit of covariates. Such data presents an opportunity for improved prediction, but also a challenge due to high dimensionality. Furthermore, disease populations can be heterogeneous. Integrative modeling is sensible, as the underlying hypothesis is that joint analysis of multiple covariates provides greater explanatory power than separate analyses. We propose an integrative latent variable model that combines factor analysis for various data types and an exponential proportional hazards (EPH) model for continuous survival time with informative censoring. The factor and EPH models are connected through low‐dimensional latent variables that can be interpreted and visualized to identify subpopulations. We use this model to predict survival time. We demonstrate this model's utility in simulation and on four Cancer Genome Atlas datasets: diffuse lower‐grade glioma, glioblastoma multiforme, lung adenocarcinoma, and lung squamous cell carcinoma. These datasets have small sample sizes, high‐dimensional diverse covariates, and high censorship rates. We compare the predictions from our model to three alternative models. Our model outperforms in simulation and is competitive on real datasets. Furthermore, the low‐dimensional visualization for diffuse lower‐grade glioma displays known subpopulations.
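To make the structure described in the abstract concrete, the following is a minimal simulation sketch and not the authors' implementation. It assumes Gaussian latent variables, a single continuous data type for the factor model, and exponential event and censoring times whose constant hazards both depend on the same latent variables; this shared dependence is one simple way censoring can be informative. The function name `simulate_latent_survival`, the loadings `W`, the coefficients `beta` and `gamma`, and the noise scale are hypothetical choices made for illustration only.

```python
import numpy as np

def simulate_latent_survival(n=200, p=50, k=2, seed=0):
    """Simulate a toy latent-variable survival dataset (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Low-dimensional latent variables shared by the covariate and survival parts.
    z = rng.standard_normal((n, k))

    # Factor-analysis part: high-dimensional covariates are a noisy linear map of z.
    W = rng.standard_normal((p, k))          # hypothetical factor loadings
    x = z @ W.T + 0.5 * rng.standard_normal((n, p))

    # Exponential proportional hazards part: constant hazard exp(z @ beta).
    beta = rng.standard_normal(k)            # hypothetical survival coefficients
    t_event = rng.exponential(1.0 / np.exp(z @ beta))

    # Assumed informative censoring: the censoring hazard depends on the same z.
    gamma = rng.standard_normal(k)           # hypothetical censoring coefficients
    t_cens = rng.exponential(1.0 / np.exp(z @ gamma))

    # Only the earlier of the two times is observed, with an event indicator.
    time = np.minimum(t_event, t_cens)
    observed = t_event <= t_cens
    return x, z, time, observed
```

Calling `simulate_latent_survival(n=100, p=10, k=2)` returns a covariate matrix, the latent variables, the observed times, and a boolean event indicator. Because event and censoring times share the latent `z`, they are dependent given nothing else, which is why a model that ignores the censoring mechanism could be biased in this toy setting.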


Item Type: Article
Related URLs:
URL | URL Type | Description
https://doi.org/10.1002/sim.8151 | DOI | Article
ORCID:
Author | ORCID
McCurdy, Shannon | 0000-0001-5555-4156
Molinaro, Annette | 0000-0002-9854-7404
Pachter, Lior | 0000-0002-9164-6231
Additional Information: © 2019 John Wiley & Sons, Ltd. Version of Record online: 04 June 2019; Manuscript accepted: 03 March 2019; Manuscript revised: 15 January 2019; Manuscript received: 23 January 2018. Funding Information: National Human Genome Research Institute of the National Institutes of Health. Grant Number: F32HG008713.
Funders:
Funding Agency | Grant Number
NIH Postdoctoral Fellowship | F32HG008713
Subject Keywords: diffuse lower‐grade glioma; exponential proportional hazards; factor analysis; glioblastoma multiforme; informative censoring; integrative models; latent variables; lung adenocarcinoma; lung squamous cell carcinoma
Issue or Number: 20
Record Number: CaltechAUTHORS:20190610-075805993
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20190610-075805993
Official Citation: McCurdy, S, Molinaro, A, Pachter, L. Factor analysis for survival time prediction with informative censoring and diverse covariates. Statistics in Medicine. 2019; 38: 3719–3732 https://doi.org/10.1002/sim.8151
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 96223
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 10 Jun 2019 17:44
Last Modified: 03 Oct 2019 21:20
