Detecting censorable content on Sina Weibo: A pilot study

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

This study provides preliminary insights into the linguistic features that contribute to Internet censorship in mainland China. We collected a corpus of 344 censored and uncensored microblog posts published on Sina Weibo and built a Naive Bayes classifier based on linguistic, topic-independent features. The classifier achieves 79.34% accuracy in predicting whether a post would be censored on Sina Weibo.
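
As a rough illustration of the kind of pipeline the abstract describes, the sketch below trains a small Gaussian Naive Bayes classifier on topic-independent surface features. It is not the authors' implementation: the feature set in `extract_features`, the English placeholder posts, the labels, and the variance floor are all assumptions made for this example, and the abstract does not say which Naive Bayes variant or which features were used.

```python
import math
from collections import defaultdict


def extract_features(post):
    """Hypothetical topic-independent features (not the paper's feature set)."""
    tokens = post.split()
    n_chars = max(len(post), 1)
    return [
        float(len(tokens)),                                 # token count
        sum(ch in "!?！？" for ch in post) / n_chars,       # emphatic punctuation ratio
        sum(len(t) for t in tokens) / max(len(tokens), 1),  # mean token length
    ]


class GaussianNaiveBayes:
    """Naive Bayes with one independent Gaussian per feature and class."""

    def fit(self, X, y):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        self.priors, self.stats = {}, {}
        for label, rows in by_class.items():
            self.priors[label] = len(rows) / len(X)
            columns = list(zip(*rows))
            means = [sum(col) / len(col) for col in columns]
            # Assumed variance floor (1e-3) keeps the log-density finite
            # when a feature is constant within a class.
            variances = [
                sum((v - m) ** 2 for v in col) / len(col) + 1e-3
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (means, variances)
        return self

    def log_posterior(self, row):
        """Unnormalised log P(class | row) for every class."""
        scores = {}
        for label, (means, variances) in self.stats.items():
            score = math.log(self.priors[label])
            for v, m, var in zip(row, means, variances):
                score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_posterior(row)
        return max(scores, key=scores.get)


# Invented placeholder posts and labels, used only to exercise the code.
train_posts = [
    ("Why ?! Why ?! Tell us now !!", "censored"),
    ("Who did this ?! Answer !!", "censored"),
    ("Lovely weather in the park today", "uncensored"),
    ("Trying a new noodle recipe tonight", "uncensored"),
]
X = [extract_features(text) for text, _ in train_posts]
y = [label for _, label in train_posts]
model = GaussianNaiveBayes().fit(X, y)

prediction = model.predict(extract_features("Quiet walk by the river today"))
```

With a real corpus, accuracy would be estimated on held-out posts (for example by cross-validation) rather than on the training examples themselves.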

Original language: English
Title of host publication: Proceedings - 10th Hellenic Conference on Artificial Intelligence, SETN 2018
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450364331
State: Published - 9 Jul 2018
Event: 10th Hellenic Conference on Artificial Intelligence, SETN 2018 - Patras, Greece
Duration: 9 Jul 2018 to 12 Jul 2018

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 10th Hellenic Conference on Artificial Intelligence, SETN 2018
Country/Territory: Greece
City: Patras
Period: 9/07/18 to 12/07/18

Keywords

  • Chinese social media, censorship detection
