
Integrative genome modeling platform reveals essentiality of rare contact events in 3D genome organizations

Boninsegna, Lorenzo and Yildirim, Asli and Polles, Guido and Zhan, Yuxiang and Quinodoz, Sofia A. and Finn, Elizabeth H. and Guttman, Mitchell and Zhou, Xianghong Jasmine and Alber, Frank (2022) Integrative genome modeling platform reveals essentiality of rare contact events in 3D genome organizations. Nature Methods, 19 (8). pp. 938-949. ISSN 1548-7091. doi:10.1038/s41592-022-01527-x.

PDF - Published Version. Creative Commons Attribution.
PDF - Submitted Version. Creative Commons Attribution.
PDF (Supplementary Discussion and Supplementary Tables 1–3) - Supplemental Material. Creative Commons Attribution.
PDF (Reporting Summary) - Supplemental Material. Creative Commons Attribution.
Image (JPEG) (Extended Data Fig. 1: Flowchart of the Stepwise Iterative Optimization pipeline) - Supplemental Material. Creative Commons Attribution.
Image (JPEG) (Extended Data Fig. 2: Optimization statistics for HFFc6 all-data genome model) - Supplemental Material. Creative Commons Attribution.
Image (JPEG) (Extended Data Fig. 3: χ² goodness-of-fit test between the predicted data from IGM HDSF populations and the input data from experiments) - Supplemental Material. Creative Commons Attribution.
Image (JPEG) (Extended Data Fig. 4: Validating chromosome structures from HDSF population with single cell structures from imaging experiments) - Supplemental Material. Creative Commons Attribution.
Image (JPEG) (Extended Data Fig. 5: Reproducibility across IGM replicates) - Supplemental Material. Creative Commons Attribution.
Image (JPEG) (Extended Data Fig. 6: Prediction of experimental SPRITE and FISH data in HFFc6 H, HD, HDS, HDSF populations) - Supplemental Material. Creative Commons Attribution.
Image (JPEG) (Extended Data Fig. 7: Relevance of low frequency inter-chromosomal contacts) - Supplemental Material. Creative Commons Attribution.
Image (JPEG) (Extended Data Fig. 8: Comparing information content of lamina DamID data against increasingly larger radial distance distribution FISH data sets) - Supplemental Material. Creative Commons Attribution.




A multitude of sequencing-based and microscopy technologies provide the means to unravel the relationship between the three-dimensional organization of genomes and key regulatory processes of genome function. Here, we develop a multimodal data integration approach to produce populations of single-cell genome structures that are highly predictive for nuclear locations of genes and nuclear bodies, local chromatin compaction and spatial segregation of functionally related chromatin. We demonstrate that multimodal data integration can compensate for systematic errors in some of the data and can greatly increase accuracy and coverage of genome structure models. We also show that alternative combinations of different orthogonal data sources can converge to models with similar predictive power. Moreover, our study reveals the key contributions of low-frequency (‘rare’) interchromosomal contacts to accurately predicting the global nuclear architecture, including the positioning of genes and chromosomes. Overall, our results highlight the benefits of multimodal data integration for genome structure analysis, available through the Integrative Genome Modeling software package.

Item Type: Article
Related URLs (URL, URL Type, Description): Paper Item; Data Item; Data Item; Data Item; Data Item; IGM platform
ORCID:
Boninsegna, Lorenzo: 0000-0003-4529-3567
Polles, Guido: 0000-0002-3242-5818
Quinodoz, Sofia A.: 0000-0003-1862-5204
Finn, Elizabeth H.: 0000-0001-8320-2190
Guttman, Mitchell: 0000-0003-4748-9352
Zhou, Xianghong Jasmine: 0000-0001-9063-9518
Alber, Frank: 0000-0003-1981-8390
Additional Information: © The Author(s) 2022. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Received 22 August 2021; Accepted 18 May 2022; Published 11 July 2022.
This work was supported by the National Institutes of Health (NIH; grants U54DK107981 and UM1HG011593 to F.A.) and an NSF CAREER grant (1150287 to F.A.). We thank the laboratories of J. Dekker (University of Massachusetts Medical School), B. Van Steensel (Netherlands Cancer Institute), T. Misteli (NIH) and A. Belmont (University of Illinois Urbana-Champaign) for kindly providing the experimental data (in situ Hi-C, lamina DamID, 3D HIPMap FISH, DNA SPRITE and SON TSA-seq) used for generating and validating our genome models. We thank W. Li for proofreading the section about the probability functions.
Data availability: The following datasets were used to generate or validate the structures: ensemble Hi-C (4DN portal; accession code 4DNES2R6PUEK), lamin B1 DamID (4DN portal; accession code 4DNESXZ4FW4T), 3D HIPMap FISH (4DN portal), single-cell SPRITE (4DN portal; identifier 4DNESJYGTI8S, private), SON TSA-seq (4DN portal; accession code 4DNES85R9TIB) and transcription data (ENCODE; accession code ENCSR735JKB). Super-resolution single-cell imaging data are available in the referenced papers. The pre-processed experimental inputs of different data sources (Hi-C, lamin B1 DamID, 3D HIPMap FISH and single-cell SPRITE) for the HFF cell line and the simulated HDSF population are available online. Other data (including configuration files and synthetic data input files) are available upon request. The configuration files and pre-processed data input files are sufficient to reproduce the structure populations with the IGM software.
Code availability: The IGM platform is available online. This includes, but is not limited to, the source code, a README file detailing code installation and execution, accompanying documentation, and a demo that uses a reduced data input so that users can familiarize themselves with the input, expected outputs and execution steps.
Contributions: L.B. and F.A. designed research. L.B., A.Y. and Y.Z. performed all calculations and data analysis. L.B., A.Y. and F.A. interpreted results and data analysis with input from X.J.Z. G.P., L.B. and A.Y. wrote software and documentation. S.A.Q. and M.G. contributed new data sources. E.H.F. provided data and help in data interpretation. L.B., A.Y. and F.A. wrote the manuscript with input from X.J.Z. All authors approved the final manuscript. The authors declare no competing interests.
Peer review information: Nature Methods thanks Ming Hu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Lin Tang, in collaboration with the Nature Methods team.
Issue or Number: 8
Record Number: CaltechAUTHORS:20210824-174746931
Persistent URL:
Official Citation: Boninsegna, L., Yildirim, A., Polles, G. et al. Integrative genome modeling platform reveals essentiality of rare contact events in 3D genome organizations. Nat Methods 19, 938–949 (2022).
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 110401
Deposited By: Tony Diaz
Deposited On: 24 Aug 2021 18:17
Last Modified: 04 Aug 2022 17:55
