WordPrep: Word-based Preposition Prediction Tool

Pooja Bhagat, Aparna Varde, Anna Feldman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations


As big data heads towards big knowledge, data management and machine learning techniques work together to address several interesting problems. In this paper, we address a problem in natural language processing that involves learning by mining large text databases. More specifically, we deal with the problem of preposition prediction, especially for ESL (English as a second language) learners. Prepositions are function words that typically show a relationship between a noun or pronoun and other elements of a sentence, and they play a key role in determining the meaning of a sentence. Accurately predicting the correct preposition in a sentence is challenging because preposition usage is one of the most subtle aspects of English grammar, which makes it difficult for non-native speakers. This paper proposes an approach to preposition prediction called WordPrep, on which we build a tool. WordPrep relies on mining based on the words themselves rather than on their lexical or syntactic connotations. This addresses the challenge of prepositions appearing in idiomatic phrases or in different semantic contexts, where the actual words are better predictors than their grammatical positions. Our proposed solution entails a direct data-driven approach that predicts the missing preposition in a sentence by learning from matching tokens consisting of n-grams with words before and after the preposition. Using various search and pattern-matching methods against a large number of database records from big text corpora, this approach predicts the missing preposition(s). We describe our pilot approach, tool implementation and experiments in this paper. This work is particularly helpful for pedagogical applications.
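The idea of predicting a missing preposition from the words immediately before and after it can be illustrated with a minimal sketch. This is not the authors' implementation: the preposition list, the two-word context window, the in-memory dictionary standing in for the database records, and the back-off order from wider to narrower contexts are all assumptions made for illustration, and the function names build_context_index and predict_preposition are hypothetical.

```python
from collections import Counter

# Small illustrative preposition inventory (an assumption, not the paper's list).
PREPOSITIONS = {"in", "on", "at", "of", "for", "with", "to", "by", "from", "about"}


def build_context_index(sentences, window=2):
    """Count which preposition occurs between each (left words, right words) context.

    Stands in for the corpus-derived database records; contexts of every size
    up to `window` words on each side are stored so lookups can back off.
    """
    index = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok not in PREPOSITIONS:
                continue
            for left_n in range(window + 1):
                for right_n in range(window + 1):
                    if left_n == 0 and right_n == 0:
                        continue  # an empty context carries no information
                    if i - left_n < 0 or i + right_n >= len(tokens):
                        continue  # not enough words on that side
                    left = tuple(tokens[i - left_n:i])
                    right = tuple(tokens[i + 1:i + 1 + right_n])
                    index.setdefault((left, right), Counter())[tok] += 1
    return index


def predict_preposition(index, left_words, right_words, window=2):
    """Return the most frequent preposition for the widest context that matches."""
    left_words = [w.lower() for w in left_words]
    right_words = [w.lower() for w in right_words]
    # Candidate context sizes, widest (most specific) first.
    sizes = sorted(
        ((l, r) for l in range(window + 1) for r in range(window + 1) if l + r > 0),
        key=lambda size: -(size[0] + size[1]),
    )
    for l, r in sizes:
        if l > len(left_words) or r > len(right_words):
            continue
        left = tuple(left_words[len(left_words) - l:])
        right = tuple(right_words[:r])
        counts = index.get((left, right))
        if counts:
            return counts.most_common(1)[0][0]
    return None  # no stored context matched


# Toy corpus, invented for this example.
corpus = [
    "She is interested in music",
    "He is interested in art",
    "They arrived at the station",
    "We depend on the weather",
]
index = build_context_index(corpus)
print(predict_preposition(index, ["is", "interested"], ["painting"]))
print(predict_preposition(index, ["arrived"], ["the", "station"]))
```

Because the lookup keys are the surrounding words themselves rather than part-of-speech tags or parse positions, an idiomatic pairing such as "interested in" is matched directly, which is the word-based intuition the abstract describes.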

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors: Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 8
ISBN (Electronic): 9781728108582
State: Published - Dec 2019
Event: 2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States
Duration: 9 Dec 2019 to 12 Dec 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019


Conference: 2019 IEEE International Conference on Big Data, Big Data 2019
Country/Territory: United States
City: Los Angeles


  • Big Data and Big Knowledge
  • ESL Learners
  • Intelligent Tutoring Systems
  • Machine Learning
  • Natural Language Processing
  • Pedagogical Tools
  • Text Mining
  • Writing Aids

