Our goal is to use natural language processing to identify deceptive and non-deceptive passages in transcribed narratives. We begin by motivating an analysis of language-based deception that relies on specific linguistic indicators to discover deceptive statements. The indicator tags are assigned to a document using a mix of automated and manual methods. Once the tags are assigned, an interpreter automatically discriminates between deceptive and truthful statements based on tag densities. The texts used in our study come entirely from "real world" sources: criminal statements, police interrogations and legal testimony. The corpus was hand-tagged for the truth value of all propositions that could be externally verified as true or false. Classification and Regression Tree techniques suggest that the approach is feasible, with the model correctly identifying 74.9% of the T/F propositions. An automatic tagger implementing a large subset of the tags performed well on test data, averaging 68.6% recall and 85.3% precision when compared to the performance of human taggers on the same subset.
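To make the reported recall and precision figures concrete, the following is a minimal sketch of how an automatic tagger's output might be scored against human tags. It is not the authors' evaluation code: the `(token index, label)` representation, the function name `precision_recall`, and the tag labels are all hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation: a tag is a (token index, label) pair.
Tag = Tuple[int, str]


def precision_recall(auto: Set[Tag], human: Set[Tag]) -> Tuple[float, float]:
    """Score automatic tags against human tags treated as the reference."""
    matched = len(auto & human)  # tags on which both taggers agree
    # Precision: share of automatic tags that the human tagger also assigned.
    precision = matched / len(auto) if auto else 0.0
    # Recall: share of human tags that the automatic tagger recovered.
    recall = matched / len(human) if human else 0.0
    return precision, recall


# Illustrative labels only; they are not the tag set used in the paper.
human_tags = {(0, "HEDGE"), (3, "NEGATION"), (7, "HEDGE"), (9, "TENSE_SHIFT")}
auto_tags = {(0, "HEDGE"), (3, "NEGATION"), (5, "HEDGE")}

p, r = precision_recall(auto_tags, human_tags)
print(f"precision={p:.3f} recall={r:.3f}")
```

In this toy case the automatic tagger agrees with the human on two of its three tags and recovers two of the four human tags, which mirrors the pattern in the abstract: a tagger can be fairly precise while still missing a share of the indicators a human would mark.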
|Title of host publication||Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference|
|Number of pages||8|
|State||Published - 1 Dec 2008|
|Event||22nd International Conference on Computational Linguistics, Coling 2008 - Manchester, United Kingdom (18 Aug 2008 → 22 Aug 2008)|