Interface for querying and data mining for the IMDb dataset

Martin Butler, Stefan Robila

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Research; peer-review

2 Citations (Scopus)

Abstract

This paper describes the design and implementation of a tool to extract the IMDb dataset files and import them into a database. This approach differs from other published tools or research in that the previous work used relational databases. This tool uses document oriented data structures, and allows others to augment the code to change structures based on their needs. The project development required the use of technologies currently in demand for web developers and software engineers, which allows other developers to fork a copy of the work and utilize in their own work. In addition, it provided the project team an opportunity to develop additional marketable skills. Finally, a web interface to perform queries against the import data to validate the import process was also developed. These queries include searching by people's names, searching by movie/tv titles, or viewing specific data on an individual person or movie/tv title.

Original language: English
Title of host publication: 2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781467384902
DOIs: https://doi.org/10.1109/LISAT.2016.7494103
State: Published - 16 Jun 2016
Event: IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016 - Farmingdale, United States
Duration: 29 Apr 2016 → …

Publication series

Name: 2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016

Other

Other: IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016
Country: United States
City: Farmingdale
Period: 29/04/16 → …

Fingerprint

Data mining
Data structures
Engineers

Keywords

  • IMDb Database
  • Large Data Set Processing
  • Unstructured Databases

Cite this

Butler, M., & Robila, S. (2016). Interface for querying and data mining for the IMDb dataset. In 2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016 [7494103] (2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/LISAT.2016.7494103
Butler, Martin ; Robila, Stefan. / Interface for querying and data mining for the IMDb dataset. 2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016. Institute of Electrical and Electronics Engineers Inc., 2016. (2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016).
@inproceedings{06dd8a42e6fd40778d122110fc75d434,
title = "Interface for querying and data mining for the IMDb dataset",
abstract = "This paper describes the design and implementation of a tool to extract the IMDb dataset files and import them into a database. This approach differs from other published tools or research in that the previous work used relational databases. This tool uses document oriented data structures, and allows others to augment the code to change structures based on their needs. The project development required the use of technologies currently in demand for web developers and software engineers, which allows other developers to fork a copy of the work and utilize in their own work. In addition, it provided the project team an opportunity to develop additional marketable skills. Finally, a web interface to perform queries against the import data to validate the import process was also developed. These queries include searching by people's names, searching by movie/tv titles, or viewing specific data on an individual person or movie/tv title.",
keywords = "IMDb Database, Large Data Set Processing, Unstructured Databases",
author = "Martin Butler and Stefan Robila",
year = "2016",
month = "6",
day = "16",
doi = "10.1109/LISAT.2016.7494103",
language = "English",
series = "2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016",

}

Butler, M & Robila, S 2016, Interface for querying and data mining for the IMDb dataset. in 2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016., 7494103, 2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016, Institute of Electrical and Electronics Engineers Inc., IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016, Farmingdale, United States, 29/04/16. https://doi.org/10.1109/LISAT.2016.7494103

Interface for querying and data mining for the IMDb dataset. / Butler, Martin; Robila, Stefan.

2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016. Institute of Electrical and Electronics Engineers Inc., 2016. 7494103 (2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016).


TY - GEN
T1 - Interface for querying and data mining for the IMDb dataset
AU - Butler, Martin
AU - Robila, Stefan
PY - 2016/6/16
Y1 - 2016/6/16
N2 - This paper describes the design and implementation of a tool to extract the IMDb dataset files and import them into a database. This approach differs from other published tools or research in that the previous work used relational databases. This tool uses document oriented data structures, and allows others to augment the code to change structures based on their needs. The project development required the use of technologies currently in demand for web developers and software engineers, which allows other developers to fork a copy of the work and utilize in their own work. In addition, it provided the project team an opportunity to develop additional marketable skills. Finally, a web interface to perform queries against the import data to validate the import process was also developed. These queries include searching by people's names, searching by movie/tv titles, or viewing specific data on an individual person or movie/tv title.
AB - This paper describes the design and implementation of a tool to extract the IMDb dataset files and import them into a database. This approach differs from other published tools or research in that the previous work used relational databases. This tool uses document oriented data structures, and allows others to augment the code to change structures based on their needs. The project development required the use of technologies currently in demand for web developers and software engineers, which allows other developers to fork a copy of the work and utilize in their own work. In addition, it provided the project team an opportunity to develop additional marketable skills. Finally, a web interface to perform queries against the import data to validate the import process was also developed. These queries include searching by people's names, searching by movie/tv titles, or viewing specific data on an individual person or movie/tv title.
KW - IMDb Database
KW - Large Data Set Processing
KW - Unstructured Databases
UR - http://www.scopus.com/inward/record.url?scp=84978505032&partnerID=8YFLogxK
U2 - 10.1109/LISAT.2016.7494103
DO - 10.1109/LISAT.2016.7494103
M3 - Conference contribution
T3 - 2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016
BT - 2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016
PB - Institute of Electrical and Electronics Engineers Inc.
ER -

Butler M, Robila S. Interface for querying and data mining for the IMDb dataset. In 2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016. Institute of Electrical and Electronics Engineers Inc. 2016. 7494103. (2016 IEEE Long Island Systems, Applications and Technology Conference, LISAT 2016). https://doi.org/10.1109/LISAT.2016.7494103