Exploration of designing an automatic classifier for questions containing code snippets—A case study of Oracle SQL certification exam questions

Hung Yi Chen, Po Chou Shih, Yunsen Wang

Research output: Contribution to journal › Article › peer-review

Abstract

This study uses Oracle SQL certification exam questions to explore the design of automatic classifiers for exam questions that contain code snippets. Question classification assigns each question a class label drawn from the exam topics. With these labels, questions can be selected from the test bank according to the testing scope to assemble a more suitable test paper. Classifying questions that contain code snippets is more challenging than classifying questions written as general text descriptions. We use factorial experiments to identify the effects of two factors, the feature representation scheme and the machine learning method, on the performance of the question classifiers. Our experimental results show that the classifier combining the TF-IDF scheme with a Logistic Regression model performed best on the weighted macro-average AUC and F1 performance indices, while the classifier combining TF-IDF with a Support Vector Machine performed best on weighted macro-average Precision. Moreover, across all performance indices, the feature representation scheme was the main factor affecting classifier performance, followed by the machine learning method.
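As a rough illustration of the TF-IDF feature representation named in the abstract, the sketch below turns a few questions containing SQL snippets into weighted term vectors. It is only a minimal sketch under stated assumptions: the sample questions, the `tokenize` rule, and the plain `tf * log(N / df)` weighting are hypothetical choices made for this example, since the abstract does not specify the study's tokenization or TF-IDF variant. The classification step (Logistic Regression or Support Vector Machine) is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical toy questions; not taken from the study's test bank.
QUESTIONS = [
    "Which rows does SELECT name FROM emp WHERE sal > 1000 return?",
    "What does SELECT COUNT(*) FROM emp GROUP BY dept display?",
    "Which statement is true about CREATE TABLE emp (id NUMBER)?",
]


def tokenize(text):
    """Lowercase the text and keep words, numbers, and a few SQL symbols.

    Assumed rule for this sketch: other punctuation (such as '?') is dropped.
    """
    return re.findall(r"[a-z_][a-z0-9_]*|\d+|[*<>=()]", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many questions each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return vectors


VECTORS = tfidf(QUESTIONS)
```

In this toy corpus, a term that occurs in every question (such as the table name `emp`) receives a weight of zero, while a term that occurs in only one question (such as `where`) receives a positive weight. This is the property that lets TF-IDF emphasize the tokens that distinguish one exam topic from another.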

Original language: English
Article number: e0309050
Journal: PLoS ONE
Volume: 20
Issue number: 1
State: Published - Jan 2025
