
Integrating and mining the chromatin landscape of cell-type specificity using self-organizing maps

Mortazavi, Ali and Pepke, Shirley and Jansen, Camden and Marinov, Georgi K. and Ernst, Jason and Kellis, Manolis and Hardison, Ross C. and Myers, Richard M. and Wold, Barbara J. (2013) Integrating and mining the chromatin landscape of cell-type specificity using self-organizing maps. Genome Research, 23 (12). pp. 2136-2148. ISSN 1088-9051. PMCID PMC3847782.

PDF - Published Version
Creative Commons Attribution Non-commercial.

PDF (Supplemental Figures S1-S13) - Supplemental Material
Creative Commons Attribution Non-commercial.

PDF (Supplemental Figure S14) - Supplemental Material
Creative Commons Attribution Non-commercial.

MS Word (Supplemental Legends) - Supplemental Material
Creative Commons Attribution Non-commercial.

Archive (TGZ) (Supplemental Tables) - Supplemental Material
Creative Commons Attribution Non-commercial.


We tested whether self-organizing maps (SOMs) could be used to effectively integrate, visualize, and mine diverse genomics data types, including complex chromatin signatures. A fine-grained SOM was trained on 72 ChIP-seq histone modifications and DNase-seq data sets from six biologically diverse cell lines studied by The ENCODE Project Consortium. We mined the resulting SOM to identify chromatin signatures related to sequence-specific transcription factor occupancy, sequence motif enrichment, and biological functions. To highlight clusters enriched for specific functions such as transcriptional promoters or enhancers, we overlaid onto the map additional data sets not used during training, such as ChIP-seq, RNA-seq, CAGE, and information on cis-acting regulatory modules from the literature. We used the SOM to parse known transcriptional enhancers according to the cell-type-specific chromatin signature, and we further corroborated this pattern on the map by EP300 (also known as p300) occupancy. New candidate cell-type-specific enhancers were identified for multiple ENCODE cell types in this way, along with new candidates for ubiquitous enhancer activity. An interactive web interface was developed to allow users to visualize and custom-mine the ENCODE SOM. We conclude that large SOMs trained on chromatin data from multiple cell types provide a powerful way to identify complex relationships in genomic data at user-selected levels of granularity.
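As an illustration of the technique named in the abstract, the following is a minimal sketch of how a self-organizing map groups signal vectors. It is a toy example under stated assumptions, not the authors' implementation: the rectangular grid, the decay schedules, the function names `train_som` and `best_matching_unit`, and the synthetic "promoter-like" and "enhancer-like" signal vectors are all hypothetical and chosen only for demonstration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(data, rows, cols, steps=2000, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    radius0 = max(rows, cols) / 2.0
    for t in range(steps):
        frac = t / steps
        lr = lr0 * (1.0 - frac)                     # decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)   # shrinking neighborhood
        x = data[rng.randrange(len(data))]
        bmu = best_matching_unit(weights, x)
        for (r, c), w in weights.items():
            d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            h = math.exp(-d2 / (2.0 * radius ** 2))  # Gaussian neighborhood
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])       # pull unit toward sample
    return weights


# Hypothetical data: each vector stands for one genomic segment's signal
# across four made-up data sets (values are synthetic, not ENCODE data).
rng = random.Random(1)
promoter_like = [1.0, 1.0, 0.0, 0.0]
enhancer_like = [0.0, 0.0, 1.0, 1.0]
data = [
    [v + rng.gauss(0.0, 0.05) for v in proto]
    for proto in (promoter_like, enhancer_like)
    for _ in range(50)
]

weights = train_som(data, rows=4, cols=4)
promoter_unit = best_matching_unit(weights, promoter_like)
enhancer_unit = best_matching_unit(weights, enhancer_like)
print("promoter-like signature maps to unit", promoter_unit)
print("enhancer-like signature maps to unit", enhancer_unit)
```

In this sketch, segments with similar signal profiles land on the same or neighboring map units. Data not used in training could then be overlaid by counting how often each unit's segments overlap an external annotation, which is the general idea the abstract describes.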

Item Type: Article
Related URLs:
URL Type: DOI; Description: Article
URL Type: PubMed Central; Description: Article
ORCID:
Mortazavi, Ali: 0000-0002-4259-6362
Marinov, Georgi K.: 0000-0003-1822-7273
Wold, Barbara J.: 0000-0003-3235-8130
Additional Information: © 2013 Mortazavi et al. This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported). Published by Cold Spring Harbor Laboratory Press. Received March 29, 2013; accepted in revised form October 7, 2013. Published in Advance October 29, 2013. We gratefully acknowledge Ewan Birney, Ian Dunham, Eric Mjolsness, and Paul Sternberg for general discussion of SOMs and their applications to functional genomics; and Diane Trout, Henry Amrhein, and Anna Abelin for computational assistance. At Caltech, this work was supported by grants to B.J.W. from the Beckman Foundation, the Donald Bren Endowment, the Gordon Moore Cell Center at Caltech, NIH U54HG004576, NIH U54HG006998, and RC2HG005573; at Hudson Alpha Institute by grants NIH U54HG004576 and NIH U54HG006998 to R.M.M.; and at Penn State University by NIH R01DK065806, U54HG006998, and RC2HG005573 to R.C.H. A.M. was partly supported as a Caltech Beckman Fellow and a Moore Cell Center Fellow while at Caltech. The Mortazavi laboratory at UC Irvine was supported by grant NIH U54HG006998 as well as EU-FP7 project STATegra (306000) to A.M.
Funding Agency: Grant Number
Arnold and Mabel Beckman Foundation: UNSPECIFIED
Donald Bren Endowment: UNSPECIFIED
Caltech Gordon Moore Cell Center: UNSPECIFIED
Caltech Beckman Institute: UNSPECIFIED
Caltech Moore Cell Center Fellowship: UNSPECIFIED
European Research Council (ERC): 306000
Issue or Number: 12
PubMed Central ID: PMC3847782
Record Number: CaltechAUTHORS:20140113-101131322
Persistent URL:
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 43336
Deposited By: Jason Perez
Deposited On: 14 Jan 2014 19:38
Last Modified: 29 Oct 2019 22:55
