Controversy and sentiment: An exploratory study

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations


Automatic keyword analysis is often performed around the world to limit individual access to online content. To enable citizens to communicate freely and openly on the Internet, research is needed on how well single words predict controversial content. This paper extends our previous work with a larger, topic-diverse dataset of 1,068,621 words collected from 23 RSS feeds over a 2-month period. The reliability of prior results and the relationship between controversy and sentiment are examined by reproducing a crowd-sourced experiment. Results from the experiment suggest that human annotators classify controversial and non-controversial words with a high degree of reliability, but unlike previous research we determine that single words are not useful for detecting controversy. In addition, while we cannot conclude that sentiment alone can be used to predict controversy, we find that the variance of sentiment may be a useful metric for partitioning data into distinct clusters. Specifically, we find that higher sentiment variance provides greater discriminative quality than positive and negative sentiment for classifying controversial documents.
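To make the idea of sentiment variance concrete, the following is a minimal sketch of how a per-document sentiment mean and variance could be computed. It is not the authors' method, which the abstract does not describe: the toy lexicon, its scores, the whitespace tokenization, and the use of population variance are all assumptions chosen for illustration.

```python
from statistics import mean, pvariance

# Hypothetical toy lexicon (word -> score in [-1, 1]); NOT taken from the paper.
TOY_LEXICON = {
    "good": 0.7, "great": 0.9, "fine": 0.3,
    "bad": -0.7, "terrible": -0.9, "poor": -0.4,
}


def sentiment_scores(text, lexicon=TOY_LEXICON):
    """Return the scores of the words in `text` that appear in the lexicon."""
    # Assumption: lowercase whitespace tokenization is good enough for a sketch.
    tokens = text.lower().split()
    return [lexicon[t] for t in tokens if t in lexicon]


def sentiment_profile(text, lexicon=TOY_LEXICON):
    """Return (mean, population variance) of word sentiment.

    Returns None when no word in the text has a lexicon score.
    """
    scores = sentiment_scores(text, lexicon)
    if not scores:
        return None
    return mean(scores), pvariance(scores)


# Two made-up documents: one mixes strong positive and negative words,
# the other is consistently positive.
mixed = "great policy but terrible rollout good intent bad outcome"
uniform = "good service great staff fine food"

mixed_mean, mixed_var = sentiment_profile(mixed)
uniform_mean, uniform_var = sentiment_profile(uniform)
```

In this toy setup the mixed document has a mean sentiment near zero but a larger variance than the uniformly positive one, which illustrates why variance can separate documents that average sentiment alone would not.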

Original language: English
Title of host publication: Proceedings - 10th Hellenic Conference on Artificial Intelligence, SETN 2018
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450364331
State: Published - 9 Jul 2018
Event: 10th Hellenic Conference on Artificial Intelligence, SETN 2018 - Patras, Greece
Duration: 9 Jul 2018 → 12 Jul 2018

Publication series

Name: ACM International Conference Proceeding Series


Other: 10th Hellenic Conference on Artificial Intelligence, SETN 2018


  • Classification
  • Controversy
  • Internet censorship
  • Sentiment analysis

