Multi-level structured models for document-level sentiment classification
In this paper, we investigate structured models for document-level sentiment classification. When predicting the sentiment of a subjective document (e.g., as positive or negative), it is well known that not all sentences are equally discriminative or informative. But identifying the useful sentences automatically is itself a difficult learning problem. This paper proposes a joint two-level approach for document-level sentiment classification that simultaneously extracts useful (i.e., subjective) sentences and predicts document-level sentiment based on the extracted sentences. Unlike previous joint learning methods for the task, our approach (1) does not rely on gold standard sentence-level subjectivity annotations (which may be expensive to obtain), and (2) optimizes directly for document-level performance. Empirical evaluations on movie reviews and U.S. Congressional floor debates show improved performance over previous approaches.
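The joint two-level idea (choose a subset of sentences, then label the document from that subset alone) can be illustrated with a minimal sketch. This is not the authors' model or training procedure: the toy lexicon `TOY_WEIGHTS`, the functions `sentence_score` and `joint_predict`, and the fixed subset size `k` are all hypothetical names and assumptions introduced here for illustration. In particular, the weights are hand-set rather than learned. The sketch only shows the shape of the inference step, in which the selected sentences act as a hidden variable chosen jointly with the document label.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (an assumption for illustration, not learned weights):
# word -> polarity weight, positive > 0 and negative < 0.
TOY_WEIGHTS: Dict[str, float] = {
    "great": 1.0,
    "moving": 0.5,
    "dull": -1.0,
    "boring": -1.0,
}


def sentence_score(sentence: str, label: int) -> float:
    """Score how strongly one sentence supports `label` (+1 or -1)."""
    words = sentence.lower().split()
    return label * sum(TOY_WEIGHTS.get(w, 0.0) for w in words)


def joint_predict(sentences: List[str], k: int) -> Tuple[int, List[int]]:
    """Jointly pick a document label and the k sentences that best support it.

    For each candidate label, keep the k highest-scoring sentences and sum
    their scores; return the label with the larger total together with the
    indices of its selected sentences.
    """
    best_total = None
    best_label = 0
    best_subset: List[int] = []
    for label in (+1, -1):
        scored = sorted(
            ((sentence_score(s, label), i) for i, s in enumerate(sentences)),
            reverse=True,
        )
        top = scored[:k]
        total = sum(score for score, _ in top)
        if best_total is None or total > best_total:
            best_total = total
            best_label = label
            best_subset = sorted(i for _, i in top)
    return best_label, best_subset
```

Calling `joint_predict` on a list of sentences returns both the predicted label and the indices of the sentences it relied on, so uninformative sentences (those scoring near zero under either label) tend to be left out of the decision. No sentence-level subjectivity labels are used at any point, which mirrors point (1) of the abstract only loosely, since nothing is learned in this sketch.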
© 2010 Association for Computational Linguistics. This work was supported in part by National Science Foundation Grants BCS-0904822, BCS-0624277, IIS-0535099; by a gift from Google; and by the Department of Homeland Security under ONR Grant N0014-07-1-0152. The second author was also supported in part by a Microsoft Research Graduate Fellowship. The authors thank Yejin Choi, Thorsten Joachims, Nikos Karampatziakis, Lillian Lee, Chun-Nam Yu, and the anonymous reviewers for their helpful comments.