Expanded encyclopaedias of DNA elements in the human and mouse genomes
- Creators
- Moore, Jill E.
- Purcaro, Michael J.
- Pratt, Henry E.
- Epstein, Charles B.
- Shoresh, Noam
- Adrian, Jessika
- Kawli, Trupti
- Davis, Carrie A.
- Dobin, Alexander
- Kaul, Rajinder
- Halow, Jessica
- Van Nostrand, Eric L.
- Freese, Peter
- Gorkin, David U.
- Shen, Yin
- He, Yupeng
- Mackiewicz, Mark
- Pauli-Behn, Florencia
- Williams, Brian A.
- Mortazavi, Ali
- Keller, Cheryl A.
- Zhang, Xiao-Ou
- Elhajjajy, Shaimae I.
- Huey, Jack
- Dickel, Diane E.
- Snetkova, Valentina
- Wei, Xintao
- Wang, Xiaofeng
- Rivera-Mulia, Juan Carlos
- Rozowsky, Joel
- Zhang, Jing
- Chhetri, Surya B.
- Zhang, Jialing
- Victorsen, Alec
- White, Kevin P.
- Visel, Axel
- Yeo, Gene W.
- Burge, Christopher B.
- Lécuyer, Eric
- Gilbert, David M.
- Dekker, Job
- Rinn, John
- Mendenhall, Eric M.
- Ecker, Joseph R.
- Kellis, Manolis
- Klein, Robert J.
- Noble, William S.
- Kundaje, Anshul
- Guigó, Roderic
- Farnham, Peggy J.
- Cherry, J. Michael
- Myers, Richard M.
- Ren, Bing
- Graveley, Brenton R.
- Gerstein, Mark B.
- Pennacchio, Len A.
- Snyder, Michael P.
- Bernstein, Bradley E.
- Wold, Barbara
- Hardison, Ross C.
- Gingeras, Thomas R.
- Stamatoyannopoulos, John A.
- Weng, Zhiping
- ENCODE Project Consortium
Abstract
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
Additional Information
© 2020 Springer Nature Limited. Received 26 August 2017; Accepted 27 May 2020; Published 29 July 2020.
Data availability: All data are available on the ENCODE data portal: www.encodeproject.org.
Code availability: All code is available on GitHub from the links provided in the methods section. Code related to the Registry of cCREs can be found at https://github.com/weng-lab/ENCODE-cCREs. Code related to SCREEN can be found at https://github.com/weng-lab/SCREEN.
We thank additional members of our laboratories and institutions who contributed to the experimental and analytical components of this project. We also thank the external advisors of the ENCODE Project for providing valuable input. This work was supported by grants from the NIH under U01HG007019, U01HG007033, U01HG007036, U01HG007037, U41HG006992, U41HG006993, U41HG006994, U41HG006995, U41HG006996, U41HG006997, U41HG006998, U41HG006999, U41HG007000, U41HG007001, U41HG007002, U41HG007003, U54HG006991, U54HG006997, U54HG006998, U54HG007004, U54HG007005, U54HG007010 and UM1HG009442.
These authors contributed equally: Jill E. Moore, Michael J. Purcaro, Henry E. Pratt, Charles B. Epstein, Noam Shoresh, Jessika Adrian, Trupti Kawli, Carrie A. Davis, Alexander Dobin, Rajinder Kaul, Jessica Halow, Eric L. Van Nostrand, Peter Freese, David U. Gorkin, Yin Shen, Yupeng He, Mark Mackiewicz, Florencia Pauli-Behn.
These authors jointly supervised this work: J. Michael Cherry, Richard M. Myers, Bing Ren, Brenton R. Graveley, Mark B. Gerstein, Len A. Pennacchio, Michael P. Snyder, Bradley E. Bernstein, Barbara Wold, Ross C. Hardison, Thomas R. Gingeras, John A. Stamatoyannopoulos & Zhiping Weng.
Author Contributions: See the consortium author list in the Supplementary Information for full details of author contributions.
Data analysis coordination (data analysis): J.E.M., M.J.P., H.E.P., B.W., R.C.H., T.R.G., J.A.S., Z.W.
Data production coordination (data production): C.B.E., N.S., J.A., T.K., C.A.D., A.D., R.K., J.H., E.L.V.N., P.F., D.U.G., Y.S., Y.H., M.M., F.P.-B., R.M.M., B.R., B.R.G., L.A.P., M.P.S., B.E.B., B.W., R.C.H., T.R.G., J.A.S.
Data analysis leads (data analysis): J.E.M., M.J.P., H.E.P., X.-O.Z., S.I.E., J.H., J.R., J.Z., M.K., R.J.K., W.S.N., A.K., R.G., M.B.G., B.W., R.C.H., Z.W.
Data production leads (data production): C.B.E., N.S., J.A., T.K., C.A.D., A.D., R.K., J.H., E.L.V.N., P.F., D.U.G., Y.S., Y.H., M.M., F.P.-B., B.A.W., A.M., C.A.K., S.B.C., J.Z., A.V., K.P.W., A.V., G.W.Y., C.B.B., E.L., D.M.G., J.D., J.R., E.M.M., J.R.E., P.J.F., R.M.M., B.R., B.R.G., L.A.P., M.P.S., B.E.B., B.W., R.C.H., T.R.G., J.A.S.
Writing group: R.M.M., B.R., B.R.G., L.A.P., M.P.S., B.E.B., B.W., R.C.H., T.R.G., J.A.S., Z.W.
Principal investigators (steering committee): J.M.C., R.M.M., B.R., B.R.G., M.P.S., B.E.B., T.R.G., J.A.S., Z.W.
Competing interests: B.E.B. declares outside interests in Fulcrum Therapeutics, 1CellBio, HiFiBio, Arsenal Biosciences, Cell Signaling Technologies, BioMillenia, and Nohla Therapeutics. P. Flicek is a member of the Scientific Advisory Boards of Fabric Genomics, Inc. and Eagle Genomics, Ltd. M.P.S. is cofounder of Personalis, SensOmics, Mirvie, Qbio, January, Filtircine, and Genome Heart. He serves on the scientific advisory board of these companies and Genapsys and Jupiter. Z. Weng is a cofounder of Rgenta Therapeutics and she serves on its scientific advisory board. G.W.Y. is co-founder, member of the Board of Directors, on the SAB, equity holder, and paid consultant for Locana and Eclipse BioInnovations, and a visiting professor at the National University of Singapore. G.W.Y.'s interests have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. E.L.V.N. is co-founder, member of the Board of Directors, on the SAB, equity holder, and paid consultant for Eclipse BioInnovations. E.L.V.N.'s interests have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. B.R. is a co-founder and member of SAB of Arima Genomics, Inc. The authors declare no other competing financial interests.
Errata
The ENCODE Project Consortium., Moore, J.E., Purcaro, M.J. et al. Author Correction: Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature (2022). https://doi.org/10.1038/s41586-021-04226-3
Attached Files
Supplemental Material - 41586_2020_2493_Fig5_ESM.webp
Supplemental Material - 41586_2020_2493_Fig6_ESM.webp
Supplemental Material - 41586_2020_2493_Fig7_ESM.webp
Supplemental Material - 41586_2020_2493_Fig8_ESM.webp
Supplemental Material - 41586_2020_2493_MOESM1_ESM.pdf
Supplemental Material - 41586_2020_2493_MOESM2_ESM.pdf
Supplemental Material - 41586_2020_2493_Tab1_ESM.jpg
Supplemental Material - Tables.zip
Erratum - s41586-021-04226-3.pdf
Files
Checksum | Size |
---|---|
md5:cae4fd82b4c73091c2322bede552fb91 | 101.1 kB |
md5:68e4e029b3a41f6a171acc25ae08d7a3 | 181.8 kB |
md5:90ab7d088a25f76372fbc9655f603bcf | 176.0 kB |
md5:2aaa527a85e701afc00fe3635488e788 | 164.3 kB |
md5:ab4dae74f34addcd5bb4b7ee713be611 | 170.8 kB |
md5:3fa459e891a76ae49e8f12e20000a288 | 17.4 MB |
md5:b3256486e14d90d6ff5507a29003dfd4 | 666.8 kB |
md5:e61c23b5ebab0c2fdb69fdcf37079094 | 57.3 MB |
md5:2ae49a15468ba1c8457c76e3f6b7a4b2 | 135.6 kB |
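The md5: values listed for the files above can be used to check that a downloaded copy is intact. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `matches_listed_md5` are illustrative and not part of this record. Which checksum belongs to which attached file is not stated here, so any pairing of a local file with a listed value is an assumption to confirm against the download page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large downloads (for example a
    multi-megabyte archive) do not have to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a local file against a listed value such as 'md5:cae4...'.

    The optional 'md5:' prefix is stripped and the comparison is
    case-insensitive.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("Tables.zip", "md5:e61c23b5ebab0c2fdb69fdcf37079094")` would return `True` only if the local file hashes to that value. The pairing of `Tables.zip` with that checksum is a guess based on file size and is used here purely for illustration.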
Additional details
- Eprint ID
- 104642
- Resolver ID
- CaltechAUTHORS:20200729-134110926
- NIH
- U01HG007019
- NIH
- U01HG007033
- NIH
- U01HG007036
- NIH
- U01HG007037
- NIH
- U41HG006992
- NIH
- U41HG006993
- NIH
- U41HG006994
- NIH
- U41HG006995
- NIH
- U41HG006996
- NIH
- U41HG006997
- NIH
- U41HG006998
- NIH
- U41HG006999
- NIH
- U41HG007000
- NIH
- U41HG007001
- NIH
- U41HG007002
- NIH
- U41HG007003
- NIH
- U54HG006991
- NIH
- U54HG006997
- NIH
- U54HG006998
- NIH
- U54HG007004
- NIH
- U54HG007005
- NIH
- U54HG007010
- NIH
- UM1HG009442
- Created
- 2020-07-29 (created from EPrint's datestamp field)
- Updated
- 2022-04-26 (created from EPrint's last_modified field)