The changing mouse embryo transcriptome at whole tissue and single-cell resolution
Abstract
During mammalian embryogenesis, differential gene expression gradually builds the identity and complexity of each tissue and organ system. Here we systematically quantified mouse polyA-RNA from day 10.5 of embryonic development to birth, sampling 17 tissues and organs. The resulting developmental transcriptome is globally structured by dynamic cytodifferentiation, body-axis and cell-proliferation gene sets that were further characterized by the transcription factor motif codes of their promoters. We decomposed the tissue-level transcriptome using single-cell RNA-seq (sequencing of RNA reverse transcribed into cDNA) and found that neurogenesis and haematopoiesis dominate at both the gene and cellular levels, jointly accounting for one-third of differential gene expression and more than 40% of identified cell types. By integrating promoter sequence motifs with companion ENCODE epigenomic profiles, we identified a prominent promoter de-repression mechanism in neuronal expression clusters that was attributable to known and novel repressors. Focusing on the developing limb, single-cell RNA data identified 25 candidate cell types that included progenitor and differentiating states with computationally inferred lineage relationships. We extracted cell-type transcription factor networks and complementary sets of candidate enhancer elements by using single-cell RNA-seq to decompose integrative cis-element (IDEAS) models that were derived from whole-tissue epigenome chromatin data. These ENCODE reference data, computed network components and IDEAS chromatin segmentations are companion resources to the matching epigenomic developmental matrix, and are available for researchers to further mine and integrate.
Additional Information
© 2020 Springer Nature Limited. Received 20 September 2018; Accepted 22 June 2020; Published 29 July 2020.
We thank G. Ace Dan for scientific illustration of limb development; S. Upchurch and S. Balasubramanian for data handling; Z. Weng and A. van der Velde for providing consolidated datasets to Y.Z.; S. A. Teichmann, L. Pachter, C. Trapnell and M. Thomson for discussions; H. Zhang and K. Polański for discussion and advice on computing; I. Antoshechkin at the Caltech Jacobs Genetics and Genomics Laboratory for sequencing the Illumina libraries; S. Chen and J. Park of the Single-Cell Profiling and Engineering Center at Caltech for building 10x Genomics libraries; A. Collazo at the Beckman Institute Imaging Center for IF imaging work; and E. H. Shim and R. Loving for supporting immunocytochemistry. B.J.W. was supported by NIH U54HG006998 and the Caltech Beckman Institute BIFGRC. R.C.H. and Y.Z. were supported by R24DK106766 and R01GM121613. P.H. was supported by The Arthur McCallum Scholarship. A.V., D.E.D. and L.A.P. were supported by U54HG006997. Research conducted at the E.O. Lawrence Berkeley National Laboratory was performed under US Department of Energy Contract DE-AC02-05CH11231, University of California.
Data availability: These data are part of the ENCODE Consortium mouse embryo project, which provides companion microRNA-seq, DNA methylation, histone mark ChIP–seq, and chromatin accessibility datasets for the sample matrix (https://www.encodeproject.org/matrix/?type=Experiment&status=released&perturbed=false&lab.title=Barbara+Wold%2C+Caltech&award.rfa=ENCODE4). The raw and first level processed data can be accessed at the ENCODE portal (https://www.encodeproject.org) with the following experiment accession numbers: bulk RNA-seq: ENCSR574CRQ; Fluidigm C1 SMART-seq: ENCSR226XLF; 10x Genomics (raw data only): ENCSR713GIS. For convenient viewing on the UCSC single-cell browser (https://mouse-limb.cells.ucsc.edu/), we have uploaded the AnnData matrices corresponding to ENCSR226XLF (Fluidigm C1 SMART-Seq) and ENCSR713GIS (10x Genomics). The processed data matrix for the Fluidigm C1 is available at https://cells.ucsc.edu/mouse-limb/C1_200325/200315_C1_categorical.h5ad and the 10x Genomics processed matrix is available at https://cells.ucsc.edu/mouse-limb/10x/200120_10x.h5ad.
Code availability: Standard ENCODE RNA-seq pipeline: https://www.encodeproject.org/pipelines/ENCPL002LSE/; ENCODE ChIP–seq pipeline: https://www.encodeproject.org/pipelines/ENCPL220NBH/; all MATLAB scripts: https://github.com/brianpenghe/Matlab-genomics. 10x single-cell RNA-seq data were processed using CellRanger with a compatible GTF annotation and default parameters. deepTools2.4.1: https://github.com/fidelram/deepTools/tree/2.4.1; FuncAssociate 3.0: http://llama.mshri.on.ca/funcassociate/; TFDB: http://bioinfo.life.hust.edu.cn/AnimalTFDB/; motifs annotated in the CIS-BP database: http://cisbp.ccbr.utoronto.ca/; STRING: https://string-db.org/. The complete code base for promoter motif graphs, STRING interaction graphs, as well as Docker and Singularity container recipes can be accessed on the GitHub repository: https://github.com/hamrhein/mouse_embryo. The IDEAS segmentation can be accessed by the Hub link at http://woldlab.caltech.edu/ENCODE3_Mouse_RNA_paper_yuzhang_me66n/. CIBERSORT: https://cibersort.stanford.edu/.
These authors contributed equally: Peng He, Brian A. Williams.
Author Contributions: P.H.: bioinformatics and computational data analysis, figures, wrote the paper; B.A.W.: performed all bulk and single-cell RNA-seq experiments, data analysis, wrote the paper; G.K.M.: DNA motif analysis, edited paper; D.T.: performed sequencing analysis, data submission, figure generation, edited paper; H.A.: network visualization, figure generation; L.B. and S.-T.G.: IF experiments, imaging and analysis; I.P.-F. and V.A.: staged and dissected mouse embryos; L.A.P.: mouse developmental matrix design, oversight, and VISTA resource; D.E.D. and A.V.: coordinated and supervised mouse dissection and staging; B.R.: mouse developmental matrix design and oversight of mouse ENCODE effort; R.C.H.: IDEAS development and edited the paper; Y.Z.: developed and implemented IDEAS; B.J.W.: supervised the project, analysed the data, wrote the paper.
The authors declare no competing interests.
Attached Files
Submitted - 2020.06.14.150599v3.full.pdf
Supplemental Material - 41586_2020_2536_Fig10_ESM.webp
Supplemental Material - 41586_2020_2536_Fig11_ESM.webp
Supplemental Material - 41586_2020_2536_Fig12_ESM.webp
Supplemental Material - 41586_2020_2536_Fig13_ESM.webp
Supplemental Material - 41586_2020_2536_Fig14_ESM.webp
Supplemental Material - 41586_2020_2536_Fig15_ESM.webp
Supplemental Material - 41586_2020_2536_Fig16_ESM.webp
Supplemental Material - 41586_2020_2536_Fig17_ESM.webp
Supplemental Material - 41586_2020_2536_Fig18_ESM.webp
Supplemental Material - 41586_2020_2536_Fig5_ESM.webp
Supplemental Material - 41586_2020_2536_Fig6_ESM.webp
Supplemental Material - 41586_2020_2536_Fig7_ESM.webp
Supplemental Material - 41586_2020_2536_Fig8_ESM.webp
Supplemental Material - 41586_2020_2536_Fig9_ESM.webp
Supplemental Material - 41586_2020_2536_MOESM1_ESM.docx
Supplemental Material - 41586_2020_2536_MOESM2_ESM.pdf
Supplemental Material - 41586_2020_2536_MOESM3_ESM.xlsx
Supplemental Material - 41586_2020_2536_MOESM4_ESM.xlsx
Supplemental Material - 41586_2020_2536_MOESM5_ESM.xlsx
Supplemental Material - 41586_2020_2536_MOESM6_ESM.xlsx
Supplemental Material - 41586_2020_2536_MOESM7_ESM.xlsx
Supplemental Material - 41586_2020_2536_MOESM8_ESM.xlsx
Supplemental Material - 41586_2020_2536_MOESM9_ESM.mp4
Files
Checksum | Size |
---|---|
md5:500f09c964013a4e21190fe623e0b080 | 351.8 kB |
md5:eb9e828a286dc044bfebc9e4b041708f | 130.1 kB |
md5:ad4a4d720073d002b40a76d87d6dfa47 | 7.2 MB |
md5:0ce42df4b7fd2356da8c0640c8b977c6 | 26.1 MB |
md5:8bdcb53d3f0b903864858986724f6da6 | 555.5 kB |
md5:86a4b03cc44cf0e7d5a916609162e449 | 302.1 kB |
md5:8a443bfcd51ba5ba62bfa20d5218823e | 653.1 kB |
md5:95b6c959a24cee47413814a29d007156 | 87.8 kB |
md5:6db4d40300567a571ffd290af771fd51 | 2.0 MB |
md5:4bf2500fe460452ee75121f97133f03f | 263.5 kB |
md5:e8669576f88ee9fc50990dfceb281984 | 716.9 kB |
md5:e10538416023a54ad044767d0dac0b41 | 281.6 kB |
md5:8e4ce169c7ad8292c16035ed6f051c68 | 105.6 kB |
md5:6728dcfa8287a69ab2219fa3f975f660 | 264.9 kB |
md5:f4f8809158c0071821b43eeb245c9a9e | 431.7 kB |
md5:0b07e4809503e8a5d50a98985ed652bc | 842.8 kB |
md5:0f6babd3b91688f00377f5569de172c3 | 240.6 kB |
md5:e7731ea94d7c171ed82ec697c54e91f4 | 734.2 kB |
md5:9b4e7b570a0f7689e83053bba67e81c6 | 937.1 kB |
md5:74ed2fbe1e83cbe0a78576e4e0a82424 | 487.4 kB |
md5:75687936e60594a91c7b7952cc4e7da3 | 91.1 kB |
md5:54e19ac03eff7ea36078eb3ccf555542 | 731.2 kB |
md5:e827ad96b8b0e676057c1e261fb989bf | 204.7 kB |
md5:4c07c2be0621b8e403cc8645d1330e16 | 21.3 MB |
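The md5 values listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of the record. The record does not state which checksum belongs to which attached file, so that pairing has to be taken from the download page itself.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:500f09c9...'.

    The 'md5:' prefix is optional; comparison is case-insensitive.
    """
    expected = listed.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Self-contained demonstration on a temporary file (not a file from the record).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
        demo_path = tmp.name
    try:
        print(md5_of_file(demo_path))
        # The first listed checksum belongs to a record file, so this is expected to be False.
        print(matches_listed_checksum(demo_path, "md5:500f09c964013a4e21190fe623e0b080"))
    finally:
        os.remove(demo_path)
```

In practice, `path` would point at a locally saved copy of one of the attached files, and `listed` would be the checksum shown next to that file on the download page.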
Additional details
- Eprint ID: 103947
- Resolver ID: CaltechAUTHORS:20200618-081148283
- NIH: U54HG006998
- Caltech Beckman Institute
- NIH: R24DK106766
- NIH: R01GM121613
- Arthur McCallum Fund
- NIH: U54HG006997
- Department of Energy (DOE): DE-AC02-05CH11231
- Created: 2020-06-22 (created from EPrint's datestamp field)
- Updated: 2023-06-01 (created from EPrint's last_modified field)
- Caltech groups: Millard and Muriel Jacobs Genetics and Genomics Laboratory, Division of Biology and Biological Engineering