
COVID19 Tweeter Dataset Sentiment Analysis

Kumar, Anubhav and Yun, Kyongsik and Gebregzabiher, Teklay and Tesfay, Berihu Yohannes and Adane, Solomon Gebremeskel (2021) COVID19 Tweeter Dataset Sentiment Analysis. In: 2021 Fourth International Conference on Computational Intelligence and Communication Technologies (CCICT). IEEE, Piscataway, NJ, pp. 110-115. ISBN 978-1-6654-2392-2.

Full text is not posted in this repository. Consult Related URLs below.

COVID-19 (‘CO’ stands for corona, ‘VI’ for virus, and ‘D’ for disease) has been declared a global pandemic by the WHO. At the start of 2020 it was confined to China, but more than 206 countries are now affected; more than 35 million people have been infected worldwide, and more than 1 million of them have died from this incurable disease. As of writing, the WHO has not approved any vaccine. People around the globe have been affected by COVID-19 and have shared their views on social media, mainly on Twitter. Over the last 9 months, hundreds of billions of pieces of text have been written on Twitter. Sentiment analysis is a natural language processing (NLP) application used to categorize the sentiment of a text as positive, negative, or neutral. Different machine learning (ML) algorithms are used to extract sentiment from text, but these algorithms require the text in a specific format. Producing that format is a major step in the sentiment analysis process, because the data available on Twitter is in raw form and requires extensive preprocessing and cleaning before it can be used. This article discusses Twitter data related to COVID-19 in detail: the different ways to use Twitter data for sentiment analysis, the difficulties involved, the steps in Twitter data preprocessing, and the final, ready-to-use form of the dataset. Python is used as the programming language for sentiment analysis in this article, as well as for data cleaning and preprocessing. The Python libraries used for data preprocessing are also discussed.
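
As an illustration of the kind of cleaning the abstract refers to, the following is a minimal sketch using only Python's standard `re` module. It is not the authors' pipeline: the function name `clean_tweet`, the specific cleaning steps, and their order are assumptions chosen for illustration, and the sample tweet is invented.

```python
import re

# Hypothetical cleaning rules; the paper's actual steps are not given in this record.
RETWEET_RE = re.compile(r"^RT\s+", re.IGNORECASE)   # leading retweet marker
URL_RE = re.compile(r"https?://\S+|www\.\S+")       # web links
MENTION_RE = re.compile(r"@\w+")                    # user mentions
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")           # punctuation, emoji, symbols
WHITESPACE_RE = re.compile(r"\s+")                  # runs of whitespace


def clean_tweet(text: str) -> str:
    """Return a lowercased tweet with links, mentions, and symbols removed."""
    text = RETWEET_RE.sub("", text.strip())
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")        # keep the hashtag word, drop the '#'
    text = text.lower()
    text = NON_ALNUM_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    raw = "RT @WHO: Stay safe! #COVID19 cases rising https://t.co/abc123"
    print(clean_tweet(raw))
```

Cleaned text of this form could then be passed to a tokenizer or a sentiment classifier; which libraries the authors used for those stages is not stated in this record.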

Item Type: Book Section
Related URLs:
ORCID:
Yun, Kyongsik: 0000-0002-6103-7187
Tesfay, Berihu Yohannes: 0000-0002-6086-0948
Adane, Solomon Gebremeskel: 0000-0003-4294-7289
Additional Information: © 2021 IEEE.
Subject Keywords: COVID19, Sentiment Analysis, Tweeter Data Set, Machine Learning, NLP, Python Libraries
Record Number: CaltechAUTHORS:20211217-98208000
Official Citation: A. Kumar, K. Yun, T. Gebregzabiher, B. Y. Tesfay and S. G. Adane, "COVID19 Tweeter Dataset Sentiment Analysis," 2021 Fourth International Conference on Computational Intelligence and Communication Technologies (CCICT), 2021, pp. 110-115, doi: 10.1109/CCICT53244.2021.00032
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 112523
Deposited By: George Porter
Deposited On: 17 Dec 2021 23:59
Last Modified: 17 Dec 2021 23:59