
A Review of Datasets for Aspect-based Sentiment Analysis

IJCNLP-AACL 2023

Publication date: November 4, 2023

Siva Uday Sampreeth Chebolu, Franck Dernoncourt, Nedim Lipka, Thamar Solorio

Aspect-based sentiment analysis (ABSA) is a natural language processing problem that analyzes user-generated reviews to determine: a) the target entity being reviewed, b) the high-level aspect to which it belongs, and c) the sentiment expressed toward the targets and the aspects. ABSA corpora are numerous but scattered, which makes it difficult for researchers to quickly identify the corpora best suited to a specific ABSA subtask. This study presents a database of corpora that can be used to train and evaluate autonomous ABSA systems. Additionally, we provide an overview of the major corpora for ABSA and its subtasks and highlight several features that researchers should consider when selecting a corpus. Finally, we discuss the advantages and disadvantages of existing dataset collection approaches and make recommendations for future corpus creation. This survey examines 82 publicly available ABSA datasets covering over 25 domains, including 61 English datasets and 21 datasets in other languages (https://anonymous.4open.science/r/ABSA-Datasets-7366).
