Abstract
As big data heads towards big knowledge, data management and machine learning techniques work together to address several interesting problems. In this paper, we address a problem in natural language processing that involves learning by mining large text databases. More specifically, we deal with the problem of preposition prediction, especially for ESL (English as a second language) learners. Prepositions are function words that typically show a relationship between a noun or a pronoun and other elements of a sentence. They play a key role in determining the meaning of a sentence. Accurately predicting the correct preposition in a sentence is challenging because preposition usage is one of the most subtle aspects of English grammar, which makes it difficult for non-native speakers. This paper proposes an approach for preposition prediction called WordPrep, on which we build a tool. WordPrep relies on mining the words themselves rather than their lexical or syntactic connotations. This addresses the challenge of prepositions appearing in idiomatic phrases or in different semantic contexts, where the actual words are better predictors than their grammatical positions. Our proposed solution is a direct data-driven approach that predicts the missing preposition in a sentence by learning from matching tokens consisting of n-grams with words before and after the preposition. Using various searches and pattern-matching methods against a large number of database records from big text corpora, this approach predicts the missing preposition(s). We describe our pilot approach, tool implementation and experiments in this paper. This work is particularly helpful for pedagogical applications.
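The word-based matching idea in the abstract can be illustrated with a minimal Python sketch. It is not the WordPrep implementation: the preposition list, the toy corpus, the in-memory dictionary (standing in for the database records), and the function names `build_index` and `predict` are all assumptions made for illustration. The sketch counts which preposition occurs between the literal words on either side and, at prediction time, backs off from a wider word context to a narrower one.

```python
from collections import Counter, defaultdict

# Small illustrative preposition list (an assumption, not the paper's inventory).
PREPOSITIONS = {"in", "on", "at", "of", "for", "to", "with", "by", "from"}


def build_index(sentences, window=2):
    """Count prepositions by the literal words that surround them.

    Each key is a (left_words, right_words) pair. Contexts of every width
    from 1 to `window` are stored so that lookups can back off.
    """
    index = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok not in PREPOSITIONS:
                continue
            for k in range(1, window + 1):
                left = tuple(tokens[max(0, i - k):i])
                right = tuple(tokens[i + 1:i + 1 + k])
                index[(left, right)][tok] += 1
    return index


def predict(index, left_words, right_words, window=2):
    """Return the most frequent preposition for the widest matching context.

    Tries `window` words on each side first, then narrower contexts.
    Returns None when no stored context matches.
    """
    for k in range(window, 0, -1):
        left = tuple(w.lower() for w in left_words[-k:])
        right = tuple(w.lower() for w in right_words[:k])
        counts = index.get((left, right))
        if counts:
            return counts.most_common(1)[0][0]
    return None


# Hypothetical toy corpus; a real system would mine a large text database.
corpus = [
    "She is interested in music",
    "He is interested in art",
    "They arrived at the station",
    "We depend on the weather",
]
index = build_index(corpus)

# Sentence with a gap: "She was interested ___ art"
print(predict(index, ["was", "interested"], ["art"]))
```

Because the lookup keys are the words themselves, an idiomatic pairing such as "interested in" is captured directly, without any part-of-speech or parse information. A production system would also need tie-breaking and handling for contexts that never appear in the corpus.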
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019 |
| Editors | Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 2169-2176 |
| Number of pages | 8 |
| ISBN (Electronic) | 9781728108582 |
| DOIs | |
| State | Published - Dec 2019 |
| Event | 2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States; Duration: 9 Dec 2019 → 12 Dec 2019 |
Publication series
| Name | Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019 |
|---|---|
Conference
| Conference | 2019 IEEE International Conference on Big Data, Big Data 2019 |
|---|---|
| Country/Territory | United States |
| City | Los Angeles |
| Period | 9/12/19 → 12/12/19 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Big Data and Big Knowledge
- ESL Learners
- Intelligent Tutoring Systems
- Machine Learning
- Natural Language Processing
- Pedagogical Tools
- Text Mining
- Writing Aids
Fingerprint
Dive into the research topics of 'WordPrep: Word-based Preposition Prediction Tool'. Together they form a unique fingerprint.