Occupancy maps of 208 chromatin-associated proteins in one human cell type

Creators: Partridge, E. Christopher; Chhetri, Surya B.; Prokop, Jeremy W.; Ramaker, Ryne C.; Jansen, Camden S.; Goh, Say-Tar; Mackiewicz, Mark; Newberry, Kimberly M.; Brandsmeier, Laurel A.; Meadows, Sarah K.; Messer, C. Luke; Hardigan, Andrew A.; Coppola, Candice J.; Dean, Emma C.; Jiang, Shan; Savic, Daniel; Mortazavi, Ali; Wold, Barbara J.; Myers, Richard M.; Mendenhall, Eric M.

Abstract

Transcription factors are DNA-binding proteins that have key roles in gene regulation. Genome-wide occupancy maps of transcriptional regulators are important for understanding gene regulation and its effects on diverse biological processes. However, only a minority of the more than 1,600 transcription factors encoded in the human genome has been assayed. Here we present, as part of the ENCODE (Encyclopedia of DNA Elements) project, data and analyses from chromatin immunoprecipitation followed by high-throughput sequencing (ChIP–seq) experiments using the human HepG2 cell line for 208 chromatin-associated proteins (CAPs). These comprise 171 transcription factors and 37 transcriptional cofactors and chromatin regulator proteins, and represent nearly one-quarter of CAPs expressed in HepG2 cells. The binding profiles of these CAPs form major groups associated predominantly with promoters or enhancers, or with both. We confirm and expand the current catalogue of DNA sequence motifs for transcription factors, and describe motifs that correspond to other transcription factors that are co-enriched with the primary ChIP target. For example, FOX family motifs are enriched in ChIP–seq peaks of 37 other CAPs. We show that motif content and occupancy patterns can distinguish between promoters and enhancers. This catalogue reveals high-occupancy target regions at which many CAPs associate, although each contains motifs for only a minority of the numerous associated transcription factors. These analyses provide a more complete overview of the gene regulatory networks that define this cell type, and demonstrate the usefulness of the large-scale production efforts of the ENCODE Consortium.

Additional Information

© 2020 The Author(s). This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Received 04 October 2017; Accepted 09 January 2020; Published 29 July 2020.

Research reported in this publication was supported by the National Human Genome Research Institute of the National Institutes of Health under Award Number U54HG006998 to R.M.M. and E.M.M. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This work was also supported by funds from The HudsonAlpha Institute for Biotechnology.

We thank R. Nguyen, D. Moore, and M. McEown for their technical efforts in this study; B. S. Roberts and G. M. Cooper for comments; HudsonAlpha's Genomic Services Laboratory led by S. Levy for the high-throughput sequencing of much of the data used in this paper; and members of the ENCODE Consortium for public deposition of data generated by other Consortium groups.

Data availability: Data sets generated from this study are available at the ENCODE portal or at the Gene Expression Omnibus under accession number GSE104247. CETCh–seq reagents are available at https://www.addgene.org/crispr/tagging/.

Code availability: All code is available at https://github.com/chhetribsurya/PartridgeChhetri_etal.

Author contributions: These authors contributed equally: E. Christopher Partridge, Surya B. Chhetri. E.C.P., M.M., K.M.N., L.A.B., S.K.M., C.L.M., C.J.C., E.C.D., and D.S. developed the CETCh–seq method and performed ChIP–seq and CETCh–seq experiments and accompanying validations; S.B.C. performed peak calling and mapped TF binding sites; S.B.C. and E.C.P. performed motif analyses, gene expression analyses, IDEAS segmentation analyses, and co-association analyses; J.W.P. and S.B.C. performed GATAD2A analyses and experiments; M.M. performed immunoprecipitation–mass spectrometry analyses and managed the production of ChIP–seq and CETCh–seq experiments; C.S.J., S.J., and A.M. performed SOM analyses; S.B.C. and S.-T.G. performed conservation and co-association analyses; S.B.C., R.C.R., and A.A.H. performed LS-GKM SVM, random forest, PCA, and TF footprint analyses; E.C.P., S.B.C., B.J.W., R.M.M., and E.M.M. conceived and designed the study; R.M.M. and E.M.M. directed the study; E.C.P., S.B.C., and E.M.M. wrote the manuscript with assistance from all authors; and all authors read and approved the manuscript.

The authors declare no competing interests.

Attached Files

Published - s41586-020-2023-4.pdf

Submitted - 464800.full.pdf

Supplemental Material - 41586_2020_2023_Fig10_ESM.webp

Supplemental Material - 41586_2020_2023_Fig11_ESM.webp

Supplemental Material - 41586_2020_2023_Fig12_ESM.webp

Supplemental Material - 41586_2020_2023_Fig13_ESM.webp

Supplemental Material - 41586_2020_2023_Fig14_ESM.webp

Supplemental Material - 41586_2020_2023_Fig15_ESM.webp

Supplemental Material - 41586_2020_2023_Fig16_ESM.webp

Name	MD5	Size
41586_2020_2023_Fig13_ESM.webp	8c30f86dad8f885d404130a53b53f802	523.3 kB
41586_2020_2023_Fig12_ESM.webp	b482f329d5c8eca8e28012c7ae453aaa	445.7 kB
41586_2020_2023_MOESM3_ESM.xlsx	c36bfb5b3a9e5abb7aab3d4652c103c3	9.8 MB
41586_2020_2023_Fig14_ESM.webp	68a2e52b8de3ceb2da2e4673a55a0baf	259.5 kB
41586_2020_2023_Fig15_ESM.webp	62e3fec440fb02e71575f50f62eef2c6	180.0 kB
41586_2020_2023_Fig8_ESM.webp	0add235b159eb4eeab69a7cdde406845	189.2 kB
41586_2020_2023_Fig16_ESM.webp	9f2a6a3d0d76d15310ca972b1e4715cd	194.5 kB
41586_2020_2023_Fig10_ESM.webp	3d8ccd0cd17638d9d98a39f739abe000	213.1 kB
41586_2020_2023_Fig9_ESM.webp	56af1e7cd1cfae15936e4ae97b02f992	123.4 kB
41586_2020_2023_Fig7_ESM.webp	7d351f037281ea4b64c9e0b02a00dc00	286.8 kB
41586_2020_2023_MOESM2_ESM.pdf	b979c64fb9b91f53657435ad3168147d	122.9 kB
41586_2020_2023_MOESM1_ESM.pdf	a3fc080cdfd8565fda18a8120612039b	2.1 MB
41586_2020_2023_Fig11_ESM.webp	0636cf4b818d1edb460079a26f8fbddd	374.8 kB
41586_2020_2023_Fig17_ESM.webp	3f9a191477ed9544d0b3760f7c74f2fb	240.2 kB
464800.full.pdf	58f319d949e2aff8b462a11effa6cd47	2.7 MB
s41586-020-2023-4.pdf	8039979cf6c52e942439065f0a2662be	12.8 MB

