Published March 31, 2021 | Supplemental Material + Published
Journal Article Open

Wormicloud: a new text summarization tool based on word clouds to explore the C. elegans literature


Finding relevant information in newly published scientific papers is becoming increasingly difficult because of the pace at which articles are published every year and the increasing amount of information per paper. Biocuration and model organism databases provide a map for researchers to navigate the complex structure of the biomedical literature by distilling knowledge into curated and standardized information. In addition, scientific search engines such as PubMed and text-mining tools such as Textpresso allow researchers to easily search for specific biological aspects in newly published papers, facilitating knowledge transfer. However, digesting the information returned by these systems (often a large number of documents) still requires considerable effort. In this paper, we present Wormicloud, a new tool that summarizes scientific articles graphically through word clouds. This tool is aimed at facilitating the discovery of new experimental results not yet curated by model organism databases and is designed for both researchers and biocurators. Wormicloud is customized for the Caenorhabditis elegans literature and provides several advantages over existing solutions, including the ability to perform full-text searches through Textpresso, which provides more accurate results than other existing literature search engines. Wormicloud is integrated through direct links from gene interaction pages in WormBase. Additionally, it allows the gene sets obtained from literature searches to be analyzed with other WormBase tools such as SimpleMine and Gene Set Enrichment.

Additional Information

Published by Oxford University Press 2021. This work is written by (a) US Government employee(s) and is in the public domain in the US. Received 15 December 2020; Revised 19 February 2021; Published: 31 March 2021. We thank Ranjana Kishore, Daniela Raciti and Eduardo da Veiga Beltrame for their comments and suggestions on the manuscript. Funding: National Institutes of Health/National Human Genome Research Institute grants [U24HG002223 (WormBase) and U24HG010859 (Alliance Central)]. The authors declare that they have no conflict of interest.

Attached Files

Published - baab015.pdf

Supplemental Material - baab015_supp.zip


