
Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants

van Baren, Marijke J. and Bachy, Charles and Nahas Reistetter, Emily and Purvine, Samuel O. and Grimwood, Jane and Sudek, Sebastian and Yu, Hang and Poirier, Camille and Deerinck, Thomas J. and Kuo, Alan and Grigoriev, Igor V. and Wong, Chee-Hong and Smith, Richard D. and Callister, Stephen J. and Wei, Chia-Lin and Schmutz, Jeremy and Worden, Alexandra Z. (2016) Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants. BMC Genomics, 17. Art. No. 267. ISSN 1471-2164. PMCID PMC4815162. doi:10.1186/s12864-016-2585-6.

PDF - Published Version
Creative Commons Attribution.

MS Excel (Table S1) - Supplemental Material
Creative Commons Public Domain Dedication.

PDF (Figure S1) - Supplemental Material
Creative Commons Attribution.




Background: Prasinophytes are widespread marine green algae that are related to plants. Cellular abundance of the prasinophyte Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these unicellular eukaryotes are important for marine ecology and for understanding Viridiplantae evolution and diversification.
Results: We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb genome of Micromonas commoda (RCC299; named herein) shows they share ≤8,141 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percent (26 %) GC splice donors. Micromonas has more genus-specific protein families (19 %) than other genome sequenced prasinophytes (11 %). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Surprisingly, multiple vascular plants also have the PG pathway, except the Penicillin-Binding Protein, and share a unique bi-domain protein potentially associated with the pathway. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages.
Conclusions: Extensive differences in gene loss and architecture between related prasinophytes underscore their divergence. PG biosynthesis genes from the cyanobacterial endosymbiont that became the plastid have been selectively retained in multiple plants and algae, implying a biological function. Our studies provide robust genomic resources for emerging model algae, advancing knowledge of marine phytoplankton and plant evolution.

Item Type: Article
ORCID:
Bachy, Charles: 0000-0001-8013-8066
Purvine, Samuel O.: 0000-0002-2257-2400
Grimwood, Jane: 0000-0002-8356-8325
Yu, Hang: 0000-0002-7600-1582
Kuo, Alan: 0000-0003-3514-3530
Grigoriev, Igor V.: 0000-0002-3136-8903
Wong, Chee-Hong: 0000-0003-4546-4979
Callister, Stephen J.: 0000-0003-1785-2755
Wei, Chia-Lin: 0000-0001-6820-0461
Schmutz, Jeremy: 0000-0001-8062-9172
Worden, Alexandra Z.: 0000-0002-9888-9324
Additional Information: © 2016 van Baren et al. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.
Received: 21 November 2015; Accepted: 11 March 2016; Published online: 31 March 2016.
We thank D McRose, S Yan and M Cuvelier for assistance with growing algae. We thank N Turland for guidance on the International Code of Nomenclature (http://www.iapt-taxon.org/nomen/main.php) protocol for species naming and N Simon for proofreading it. We are deeply grateful to V Jimenez for leading manual annotation efforts and J-H Lee, C-J Choi, J Guo, M Gutowska, C Poirier and S Wilken for contributions. We also thank the anonymous reviewers for comments on the manuscript.
Electron microscopic imaging was supported by an award from the National Institute of General Medical Sciences (GM103412) to MH Ellisman. Proteomics were performed in the EMSL, a DOE/BER national scientific user facility located at PNNL and operated for the DOE by Battelle under Contract DE-AC05-76RLO1830. Additional support was provided by BER as part of the Pan-omics Program. Portal construction for release of Wlab models was supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 to the U.S. Department of Energy Joint Genome Institute. Major support also came from a JGI Technology Development Grant, the David and Lucile Packard Foundation, the Gordon and Betty Moore Foundation (GBMF3788) and NSF (IOS0843119) grants to AZW. Primary funding was by DOE-DE-SC0004765 (to AZW, SJC and RDS).
Data deposition: The ribosomal RNA operon sequence from RCC299 was deposited under the accession KU612123. RNA-Seq data has been deposited in the SRA under BioProject accessions PRJNA309330 (CCMP1545) and PRJNA309331 (RCC299). LC-MS/MS peptide data has been deposited in the MASSIVE database under accession MSV000079483. The new gene model sets can be downloaded at http://genome.jgi.doe.gov/Micromonas_pusilla/ and http://genome.jgi.doe.gov/Micromonas_commoda/.
Authors' contributions: MJvB, AZW, SJC, and RDS conceptualized and designed the study. JG and JS finished the four CCMP1545 chromosomes. ENR, SS and HY grew algae. AZW prepared TEM blocks and TJD performed imaging. CLW and CHW performed Illumina sequencing and initial analyses. RDS and SJC designed and executed empirical proteomics experiments/data generation and peptide identification strategies. SJC and SOP performed informatics for MS data processing and proteomics database construction. MJvB constructed evidence-based gene Wlab models and comparative analyses with input from AZW. MJvB, CB and AZW performed manual inspection and annotation of models. CB and CP carried out phylogenetic analyses and contributed to PG analysis. CB designed and executed antibiotic experiments. AK and IVG generated automatic annotations and constructed new genome browsers for public release of the 2015 Wlab models. SS, CB and AZW performed sequence data deposition. MJvB and AZW wrote the manuscript and all authors read and approved the manuscript.
Competing interests: The authors declare that they have no competing interests.
Funding Agency: Grant Number
Department of Energy (DOE): DE-AC05-76RLO1830
Department of Energy (DOE): DE-AC02-05CH11231
David and Lucile Packard Foundation: UNSPECIFIED
Gordon and Betty Moore Foundation: GBMF3788
Department of Energy (DOE): DE-SC0004765
National Institute of General Medical Sciences: UNSPECIFIED
Subject Keywords: GreenCut, Archaeplastida evolution, Viridiplantae, Introner Elements, RNA sequencing, Proteomics, Evidence-based gene models, Peptidoglycan, PPASP
PubMed Central ID: PMC4815162
Record Number: CaltechAUTHORS:20160404-152844868
Persistent URL:
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 65908
Deposited By: Tony Diaz
Deposited On: 04 Apr 2016 23:09
Last Modified: 05 May 2022 17:12
