TY - GEN
T1 - MEDs for PETs
T2 - 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024 - Findings of EACL 2024
AU - Lee, Patrick
AU - Trujillo, Alain Chirino
AU - Plancarte, Diana Cuevas
AU - Ojo, Olumide Ebenezer
AU - Liu, Xinyi
AU - Shode, Iyanuoluwa
AU - Zhao, Yuan
AU - Peng, Jing
AU - Feldman, Anna
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
AB - This study investigates the computational processing of euphemisms, a universal linguistic phenomenon, across multiple languages. We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. In line with current trends, we demonstrate that zero-shot learning across languages takes place. We also show cases where multilingual models perform better on the task compared to monolingual models by a statistically significant margin, indicating that multilingual data presents additional opportunities for models to learn about cross-lingual, computational properties of euphemisms. In a follow-up analysis, we focus on universal euphemistic “categories” such as death and bodily functions among others. We test to see whether cross-lingual data of the same domain is more important than within-language data of other domains to further understand the nature of the cross-lingual transfer.
UR - http://www.scopus.com/inward/record.url?scp=85188697328&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85188697328
T3 - EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2024
SP - 875
EP - 881
BT - EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2024
A2 - Graham, Yvette
A2 - Purver, Matthew
PB - Association for Computational Linguistics (ACL)
Y2 - 17 March 2024 through 22 March 2024
ER -