
Wormicloud: a new text summarization tool based on word clouds to explore the C. elegans literature

Arnaboldi, Valerio and Cho, Jaehyoung and Sternberg, Paul W. (2021) Wormicloud: a new text summarization tool based on word clouds to explore the C. elegans literature. Database: The Journal of Biological Databases and Curation, 2021. Art. No. baab015. ISSN 1758-0463. PMCID PMC8011436. doi:10.1093/database/baab015.

PDF - Published Version
Creative Commons Public Domain Dedication.

Archive (ZIP) (Supplementary data) - Supplemental Material
Creative Commons Public Domain Dedication.


Finding relevant information from newly published scientific papers is becoming increasingly difficult due to the pace at which articles are published every year as well as the increasing amount of information per paper. Biocuration and model organism databases provide a map for researchers to navigate through the complex structure of the biomedical literature by distilling knowledge into curated and standardized information. In addition, scientific search engines such as PubMed and text-mining tools such as Textpresso allow researchers to easily search for specific biological aspects from newly published papers, facilitating knowledge transfer. However, digesting the information returned by these systems—often a large number of documents—still requires considerable effort. In this paper, we present Wormicloud, a new tool that summarizes scientific articles in a graphical way through word clouds. This tool is aimed at facilitating the discovery of new experimental results not yet curated by model organism databases and is designed for both researchers and biocurators. Wormicloud is customized for the Caenorhabditis elegans literature and provides several advantages over existing solutions, including being able to perform full-text searches through Textpresso, which provides more accurate results than other existing literature search engines. Wormicloud is integrated through direct links from gene interaction pages in WormBase. Additionally, it allows analysis on the gene sets obtained from literature searches with other WormBase tools such as SimpleMine and Gene Set Enrichment.

Item Type: Article
Related URLs:
https://wormicloud.textpressolab.com (Related Item: Wormicloud Database)
ORCID:
Arnaboldi, Valerio: 0000-0002-2563-5374
Sternberg, Paul W.: 0000-0002-7699-0173
Additional Information: Published by Oxford University Press 2021. This work is written by (a) US Government employee(s) and is in the public domain in the US. Received 15 December 2020; Revised 19 February 2021; Published: 31 March 2021. We thank Ranjana Kishore, Daniela Raciti and Eduardo da Veiga Beltrame for their comments and suggestions on the manuscript. Funding: National Institutes of Health/National Human Genome Research Institute grants [U24HG002223 (WormBase) and U24HG010859 (Alliance Central)]. The authors declare that they have no conflict of interest.
PubMed Central ID: PMC8011436
Record Number: CaltechAUTHORS:20210406-083450889
Official Citation: Valerio Arnaboldi, Jaehyoung Cho, Paul W Sternberg, Wormicloud: a new text summarization tool based on word clouds to explore the C. elegans literature, Database, Volume 2021, 2021, baab015,
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 108628
Deposited By: Tony Diaz
Deposited On: 08 Apr 2021 22:27
Last Modified: 19 Apr 2021 17:32
