
Gene identification and genome annotation in Caenorhabditis briggsae by high throughput 5’ RNA end determination

Jhaveri, Nikita and van den Berg, Wouter and Hwang, Byung Joon and Müller, Hans-Michael and Sternberg, Paul W. and Gupta, Bhagwati P. (2021) Gene identification and genome annotation in Caenorhabditis briggsae by high throughput 5’ RNA end determination. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20210929-161511442

PDF - Submitted Version (2MB)
Creative Commons Attribution Non-commercial.

Archive (ZIP) (Supplemental data files 1-10) - Supplemental Material (2MB)
Creative Commons Attribution Non-commercial.


Abstract

The nematode Caenorhabditis briggsae is routinely used in comparative and evolutionary studies involving its well-known cousin C. elegans. The C. briggsae genome sequence has accelerated research by facilitating the generation of new resources, tools, and functional studies of genes. While substantial progress has been made in predicting genes and start sites, experimental evidence is still lacking in many cases. Here, we report an improved annotation of the C. briggsae genome using the Trans-spliced Exon Coupled RNA End Determination (TEC-RED) technique. In addition to identifying 5’ ends of expressed genes, the technique has enabled the discovery of operons and paralogs. Application of TEC-RED yielded 10,243 unique 5’ end sequences with matches in the C. briggsae genome. Of these, 6,395 were found to represent 4,252 unique genes along with 362 paralogs and 52 previously unknown exons. The method also identified 493 operons, including 334 that are fully supported by tags. Additionally, two SL1-type operons were discovered. Comparisons with C. elegans revealed that 40% of operons are conserved. Further, we identified 73 novel operons, including 12 that entirely lack orthologs in C. elegans. Among other results, we found that 14 genes are trans-spliced exclusively in C. briggsae compared with C. elegans. Altogether, the data presented here serve as a rich resource to aid biological studies involving C. briggsae. Additionally, this work demonstrates the use of TEC-RED for the first time in a non-elegans nematode and suggests that it could apply to other organisms with a trans-splicing reaction from spliced leader RNA.


Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL | URL Type | Description
https://doi.org/10.1101/2021.09.24.461604 | DOI | Discussion Paper
ORCID:
Author | ORCID
Sternberg, Paul W. | 0000-0002-7699-0173
Gupta, Bhagwati P. | 0000-0001-8572-7054
Additional Information: The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license. This version posted September 24, 2021. We thank WormBase for assistance with some aspects of data analysis, Mary Ann Allen and Tom Blumenthal for discussions on C. elegans operons, and members of the Gupta lab for feedback on the manuscript. This work was supported by grants to BPG (NSERC Discovery) and PWS (U24-HG002223). PWS was an Investigator with the HHMI, which partially supported this work. The authors have declared no competing interest.
Funders:
Funding Agency | Grant Number
Natural Sciences and Engineering Research Council of Canada (NSERC) | UNSPECIFIED
NIH | U24-HG002223
Howard Hughes Medical Institute (HHMI) | UNSPECIFIED
Subject Keywords: Nematode, C. briggsae, Trans-splicing, Spliced leader, Operons, Paralog, Genome annotation
DOI: 10.1101/2021.09.24.461604
Record Number: CaltechAUTHORS:20210929-161511442
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20210929-161511442
Official Citation: Gene identification and genome annotation in Caenorhabditis briggsae by high throughput 5’ RNA end determination. Nikita Jhaveri, Wouter van den Berg, Byung Joon Hwang, Hans-Michael Muller, Paul W. Sternberg, Bhagwati P. Gupta. bioRxiv 2021.09.24.461604; doi: https://doi.org/10.1101/2021.09.24.461604
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 111088
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 29 Sep 2021 17:31
Last Modified: 16 Nov 2021 19:43
