TY - GEN
T1 - Annotating an Arabic learner corpus for error
AU - Abuhakema, Ghazi
AU - Faraj, Reem
AU - Feldman, Anna
AU - Fitzpatrick, Eileen
PY - 2008
Y1 - 2008
N2 - This paper describes an ongoing project in which we are collecting a learner corpus of Arabic, developing a tagset for error annotation and performing Computer-aided Error Analysis (CEA) on the data. We adapted the French Interlanguage Database FRIDA tagset (Granger, 2003a) to the data. We chose FRIDA in order to follow a known standard and to see whether the changes needed to move from a French to an Arabic tagset would give us a measure of the distance between the two languages with respect to learner difficulty. The current collection of texts, which is constantly growing, contains intermediate and advanced-level student writings. We describe the need for such corpora, the learner data we have collected and the tagset we have developed. We also describe the error frequency distribution of both proficiency levels and the ongoing work.
UR - http://www.scopus.com/inward/record.url?scp=85037170718&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85037170718
T3 - Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008
SP - 1347
EP - 1350
BT - Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008
PB - European Language Resources Association (ELRA)
T2 - 6th International Conference on Language Resources and Evaluation, LREC 2008
Y2 - 28 May 2008 through 30 May 2008
ER -