Occupancy maps of 208 chromatin-associated proteins in one human cell type
- Creators
- Partridge, E. Christopher
- Chhetri, Surya B.
- Prokop, Jeremy W.
- Ramaker, Ryne C.
- Jansen, Camden S.
- Goh, Say-Tar
- Mackiewicz, Mark
- Newberry, Kimberly M.
- Brandsmeier, Laurel A.
- Meadows, Sarah K.
- Messer, C. Luke
- Hardigan, Andrew A.
- Coppola, Candice J.
- Dean, Emma C.
- Jiang, Shan
- Savic, Daniel
- Mortazavi, Ali
- Wold, Barbara J.
- Myers, Richard M.
- Mendenhall, Eric M.
Abstract
Transcription factors are DNA-binding proteins that have key roles in gene regulation. Genome-wide occupancy maps of transcriptional regulators are important for understanding gene regulation and its effects on diverse biological processes. However, only a minority of the more than 1,600 transcription factors encoded in the human genome has been assayed. Here we present, as part of the ENCODE (Encyclopedia of DNA Elements) project, data and analyses from chromatin immunoprecipitation followed by high-throughput sequencing (ChIP–seq) experiments using the human HepG2 cell line for 208 chromatin-associated proteins (CAPs). These comprise 171 transcription factors and 37 transcriptional cofactors and chromatin regulator proteins, and represent nearly one-quarter of CAPs expressed in HepG2 cells. The binding profiles of these CAPs form major groups associated predominantly with promoters or enhancers, or with both. We confirm and expand the current catalogue of DNA sequence motifs for transcription factors, and describe motifs that correspond to other transcription factors that are co-enriched with the primary ChIP target. For example, FOX family motifs are enriched in ChIP–seq peaks of 37 other CAPs. We show that motif content and occupancy patterns can distinguish between promoters and enhancers. This catalogue reveals high-occupancy target regions at which many CAPs associate, although each contains motifs for only a minority of the numerous associated transcription factors. These analyses provide a more complete overview of the gene regulatory networks that define this cell type, and demonstrate the usefulness of the large-scale production efforts of the ENCODE Consortium.
Additional Information
© 2020 The Author(s). This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Received 04 October 2017; Accepted 09 January 2020; Published 29 July 2020.
Research reported in this publication was supported by the National Human Genome Research Institute of the National Institutes of Health under Award Number U54HG006998 to R.M.M. and E.M.M. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This work was also supported by funds from The HudsonAlpha Institute for Biotechnology.
We thank R. Nguyen, D. Moore, and M. McEown for their technical efforts in this study; B. S. Roberts and G. M. Cooper for comments; HudsonAlpha's Genomic Services Laboratory led by S. Levy for the high-throughput sequencing of much of the data used in this paper; and members of the ENCODE Consortium for public deposition of data generated by other Consortium groups.
Data availability: Data sets generated from this study are available at the ENCODE portal or at the Gene Expression Omnibus under accession number GSE104247. CETCh–seq reagents are available at https://www.addgene.org/crispr/tagging/.
Code availability: All code is available at https://github.com/chhetribsurya/PartridgeChhetri_etal.
Author Contributions: These authors contributed equally: E. Christopher Partridge, Surya B. Chhetri. E.C.P., M.M., K.M.N., L.A.B., S.K.M., C.L.M., C.J.C., E.C.D., and D.S. developed the CETCh–seq method and performed ChIP–seq and CETCh–seq experiments and accompanying validations; S.B.C. performed peak calling and mapped TF binding sites; S.B.C. and E.C.P. performed motif analyses, gene expression analyses, IDEAS segmentation analyses, and co-association analyses; J.W.P. and S.B.C. performed GATAD2A analyses and experiments; M.M. performed immunoprecipitation–mass spectrometry analyses and managed the production of ChIP–seq and CETCh–seq experiments; C.S.J., S.J., and A.M. performed SOM analyses; S.B.C. and S.-T.G. performed conservation and co-association analyses; S.B.C., R.C.R., and A.A.H. performed LS-GKM SVM, random forest, PCA, and TF footprint analyses; E.C.P., S.B.C., B.J.W., R.M.M., and E.M.M. conceived and designed the study; R.M.M. and E.M.M. directed the study; E.C.P., S.B.C., and E.M.M. wrote the manuscript with assistance from all authors; and all authors read and approved the manuscript.
The authors declare no competing interests.
Attached Files
Published - s41586-020-2023-4.pdf
Submitted - 464800.full.pdf
Supplemental Material - 41586_2020_2023_Fig10_ESM.webp
Supplemental Material - 41586_2020_2023_Fig11_ESM.webp
Supplemental Material - 41586_2020_2023_Fig12_ESM.webp
Supplemental Material - 41586_2020_2023_Fig13_ESM.webp
Supplemental Material - 41586_2020_2023_Fig14_ESM.webp
Supplemental Material - 41586_2020_2023_Fig15_ESM.webp
Supplemental Material - 41586_2020_2023_Fig16_ESM.webp
Supplemental Material - 41586_2020_2023_Fig17_ESM.webp
Supplemental Material - 41586_2020_2023_Fig7_ESM.webp
Supplemental Material - 41586_2020_2023_Fig8_ESM.webp
Supplemental Material - 41586_2020_2023_Fig9_ESM.webp
Supplemental Material - 41586_2020_2023_MOESM1_ESM.pdf
Supplemental Material - 41586_2020_2023_MOESM2_ESM.pdf
Supplemental Material - 41586_2020_2023_MOESM3_ESM.xlsx
Files
Checksum | Size |
---|---|
md5:8c30f86dad8f885d404130a53b53f802 | 523.3 kB |
md5:b482f329d5c8eca8e28012c7ae453aaa | 445.7 kB |
md5:c36bfb5b3a9e5abb7aab3d4652c103c3 | 9.8 MB |
md5:68a2e52b8de3ceb2da2e4673a55a0baf | 259.5 kB |
md5:62e3fec440fb02e71575f50f62eef2c6 | 180.0 kB |
md5:0add235b159eb4eeab69a7cdde406845 | 189.2 kB |
md5:9f2a6a3d0d76d15310ca972b1e4715cd | 194.5 kB |
md5:3d8ccd0cd17638d9d98a39f739abe000 | 213.1 kB |
md5:56af1e7cd1cfae15936e4ae97b02f992 | 123.4 kB |
md5:7d351f037281ea4b64c9e0b02a00dc00 | 286.8 kB |
md5:b979c64fb9b91f53657435ad3168147d | 122.9 kB |
md5:a3fc080cdfd8565fda18a8120612039b | 2.1 MB |
md5:0636cf4b818d1edb460079a26f8fbddd | 374.8 kB |
md5:3f9a191477ed9544d0b3760f7c74f2fb | 240.2 kB |
md5:58f319d949e2aff8b462a11effa6cd47 | 2.7 MB |
md5:8039979cf6c52e942439065f0a2662be | 12.8 MB |
Additional details
- Alternative title
- Occupancy patterns of 208 DNA-associated proteins in a single human cell type
- Eprint ID
- 91283
- Resolver ID
- CaltechAUTHORS:20181128-093527238
- NIH
- U54HG006998
- HudsonAlpha Institute for Biotechnology
- Created
- 2018-11-28 (created from EPrint's datestamp field)
- Updated
- 2023-06-01 (created from EPrint's last_modified field)
- Caltech groups
- Division of Biology and Biological Engineering (BBE)