Automatic detection of idiomatic clauses

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

We describe several experiments whose goal is to automatically identify idiomatic expressions in written text. We explore two approaches to the task: 1) idiom recognition as outlier detection; and 2) supervised classification of sentences. For outlier detection, we apply principal component analysis. Because detecting idioms as lexical outliers does not exploit class label information, the subsequent experiments use linear discriminant analysis to obtain a discriminant subspace and then apply a three-nearest-neighbor classifier in that subspace to measure accuracy. We discuss the pros and cons of each approach. All the approaches are more general than previous algorithms for idiom detection: they neither rely on target idiom types, lexicons, or large manually annotated corpora, nor limit the search space to a particular type of linguistic construction.
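The two approaches named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how sentences are turned into feature vectors, so the toy 2-D points, the function names, and the use of reconstruction error as the outlier score are all assumptions made for illustration. The first part scores each point by its distance from the leading principal component; the second computes a two-class Fisher discriminant direction and classifies queries with three nearest neighbors in the projected space.

```python
import math
from collections import Counter

def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

# --- Approach 1 (assumed form): PCA reconstruction error as outlier score ---
def top_component(rows, iters=100):
    """Leading principal direction via power iteration on the covariance."""
    mu = mean(rows)
    centered = [sub(r, mu) for r in rows]
    d = len(mu)
    cov = [[sum(c[i] * c[j] for c in centered) / len(rows) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary start; must not be orthogonal to the answer
    for _ in range(iters):
        w_vec = [dot(row, v) for row in cov]
        norm = math.sqrt(dot(w_vec, w_vec))
        v = [x / norm for x in w_vec]
    return mu, v

def outlier_scores(rows):
    """Distance of each point from the line spanned by the top component."""
    mu, v = top_component(rows)
    scores = []
    for r in rows:
        c = sub(r, mu)
        p = dot(c, v)
        resid = [ci - p * vi for ci, vi in zip(c, v)]
        scores.append(math.sqrt(dot(resid, resid)))
    return scores

# --- Approach 2 (assumed form): two-class Fisher LDA in 2-D, then 3-NN ---
def fisher_direction(class0, class1):
    """w = Sw^{-1} (m1 - m0), using the closed-form 2x2 inverse (2-D only)."""
    m0, m1 = mean(class0), mean(class1)
    a = b = d = 0.0  # entries of the symmetric within-class scatter matrix
    for rows, m in ((class0, m0), (class1, m1)):
        for r in rows:
            cx, cy = sub(r, m)
            a += cx * cx
            b += cx * cy
            d += cy * cy
    det = a * d - b * b
    dx, dy = sub(m1, m0)
    return [(d * dx - b * dy) / det, (a * dy - b * dx) / det]

def knn3_predict(train, labels, w, query):
    """Majority vote among the 3 training points nearest in the 1-D projection."""
    zq = dot(w, query)
    nearest = sorted(range(len(train)),
                     key=lambda i: abs(dot(w, train[i]) - zq))[:3]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy data: ten points on a line plus one point off it.
points = [[float(i), float(i)] for i in range(10)] + [[7.5, 1.5]]
scores = outlier_scores(points)
outlier_index = max(range(len(scores)), key=scores.__getitem__)

# Hypothetical labeled toy data for the supervised variant.
literal = [[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 2.0]]
idiomatic = [[6.0, 5.0], [7.0, 7.0], [8.0, 6.0], [7.0, 5.0]]
train = literal + idiomatic
labels = ["literal"] * len(literal) + ["idiomatic"] * len(idiomatic)
w = fisher_direction(literal, idiomatic)

print(outlier_index)
print(knn3_predict(train, labels, w, [2.5, 2.5]))
print(knn3_predict(train, labels, w, [6.5, 6.0]))
```

The outlier score needs no labels, which matches the abstract's remark that outlier detection does not exploit class label information. The discriminant step is the one that uses labels before the nearest-neighbor vote.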

Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 14th International Conference, CICLing 2013, Proceedings
Pages: 435-446
Number of pages: 12
Edition: PART 1
DOI: https://doi.org/10.1007/978-3-642-37247-6_35
State: Published - 3 Apr 2013
Event: 14th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2013 - Samos, Greece
Duration: 24 Mar 2013 - 30 Mar 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7816 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2013
Country: Greece
City: Samos
Period: 24/03/13 - 30/03/13

Fingerprint

Outlier Detection
Supervised Classification
Discriminant analysis
Discriminant
Linguistics
Principal component analysis
Search Space
Outlier
Experiment
Labels
Nearest Neighbor
Classifiers
Experiments
Classifier
Subspace
Target
Class
Corpus

Cite this

Feldman, A., & Peng, J. (2013). Automatic detection of idiomatic clauses. In Computational Linguistics and Intelligent Text Processing - 14th International Conference, CICLing 2013, Proceedings (PART 1 ed., pp. 435-446). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7816 LNCS, No. PART 1). https://doi.org/10.1007/978-3-642-37247-6_35
@inproceedings{b0682dd94ad3447288a48d93d2c20d06,
title = "Automatic detection of idiomatic clauses",
abstract = "We describe several experiments whose goal is to automatically identify idiomatic expressions in written text. We explore two approaches for the task: 1) idiom recognition as outlier detection; and 2) supervised classification of sentences. We apply principal component analysis for outlier detection. Detecting idioms as lexical outliers does not exploit class label information. So, in the following experiments, we use linear discriminant analysis to obtain a discriminant subspace and later use the three nearest neighbor classifier to obtain accuracy. We discuss pros and cons of each approach. All the approaches are more general than the previous algorithms for idiom detection - neither do they rely on target idiom types, lexicons, or large manually annotated corpora, nor do they limit the search space by a particular type of linguistic construction.",
author = "Anna Feldman and Jing Peng",
year = "2013",
month = "4",
day = "3",
doi = "10.1007/978-3-642-37247-6_35",
language = "English",
isbn = "9783642372469",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 1",
pages = "435--446",
booktitle = "Computational Linguistics and Intelligent Text Processing - 14th International Conference, CICLing 2013, Proceedings",
edition = "PART 1",
}

TY  - GEN
T1  - Automatic detection of idiomatic clauses
AU  - Feldman, Anna
AU  - Peng, Jing
PY  - 2013/4/3
Y1  - 2013/4/3
N2  - We describe several experiments whose goal is to automatically identify idiomatic expressions in written text. We explore two approaches for the task: 1) idiom recognition as outlier detection; and 2) supervised classification of sentences. We apply principal component analysis for outlier detection. Detecting idioms as lexical outliers does not exploit class label information. So, in the following experiments, we use linear discriminant analysis to obtain a discriminant subspace and later use the three nearest neighbor classifier to obtain accuracy. We discuss pros and cons of each approach. All the approaches are more general than the previous algorithms for idiom detection - neither do they rely on target idiom types, lexicons, or large manually annotated corpora, nor do they limit the search space by a particular type of linguistic construction.
UR  - http://www.scopus.com/inward/record.url?scp=84875491615&partnerID=8YFLogxK
U2  - 10.1007/978-3-642-37247-6_35
DO  - 10.1007/978-3-642-37247-6_35
M3  - Conference contribution
AN  - SCOPUS:84875491615
SN  - 9783642372469
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 435
EP  - 446
BT  - Computational Linguistics and Intelligent Text Processing - 14th International Conference, CICLing 2013, Proceedings
ER  -
