Trust-but-Verify

Verifying Result Correctness of Outsourced Frequent Itemset Mining in Data-Mining-As-a-Service Paradigm

Boxiang Dong, Ruilin Liu, Hui Wang

Research output: Contribution to journal, Article, Research, peer-review

6 Citations (Scopus)

Abstract

Cloud computing is popularizing the computing paradigm in which data is outsourced to a third-party service provider (server) for data mining. Outsourcing, however, raises a serious security issue: how can a client with weak computational power verify that the server returned the correct mining result? In this paper, we focus on the specific task of frequent itemset mining. We consider a server that is potentially untrusted and tries to evade verification by using its prior knowledge of the outsourced data. We propose efficient probabilistic and deterministic verification approaches to check whether the server has returned correct and complete frequent itemsets. Our probabilistic approach can catch incorrect results with high probability, while our deterministic approach measures result correctness with 100 percent certainty. We also design efficient verification methods for both the case in which the data is updated and the case in which the mining setup is updated. We demonstrate the effectiveness and efficiency of our methods using an extensive set of empirical results on real datasets.
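To make the terms in the abstract concrete, the following is a minimal sketch of what "support" and a sampled correctness check could look like on the client side. It is not the authors' verification method, which the abstract does not describe; the function names `support` and `spot_check`, the toy transactions, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import random
from typing import FrozenSet, Iterable, List, Set

Itemset = FrozenSet[str]


def support(itemset: Itemset, transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def spot_check(
    claimed: Iterable[Itemset],
    transactions: List[Set[str]],
    min_support: float,
    sample_size: int,
    rng: random.Random,
) -> List[Itemset]:
    """Recompute support for a random sample of the claimed itemsets.

    Returns the sampled itemsets whose real support is below `min_support`,
    i.e. the ones the server wrongly reported as frequent.
    """
    ordered = sorted(claimed, key=sorted)  # fixed order, so sampling is reproducible
    sample = rng.sample(ordered, min(sample_size, len(ordered)))
    return [s for s in sample if support(s, transactions) < min_support]


# Hypothetical toy data: five transactions over items a, b, c.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

# Hypothetical server answer: {a, b, c} has support 2/5 and is not frequent at 0.5.
claimed = [frozenset({"a", "b"}), frozenset({"a", "b", "c"})]

wrong = spot_check(claimed, transactions, 0.5, len(claimed), random.Random(0))
print(wrong)
```

With a sample smaller than the claimed set, a check like this catches an incorrect itemset only with some probability, and checking every claimed itemset still says nothing about completeness, that is, whether the server omitted itemsets that are in fact frequent.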

Original language: English
Article number: 7122916
Pages (from-to): 18-32
Number of pages: 15
Journal: IEEE Transactions on Services Computing
Volume: 9
Issue number: 1
DOI: 10.1109/TSC.2015.2436387
State: Published - 1 Jan 2016

Fingerprint

Data mining
Outsourcing
Cloud computing
Service provider
Paradigm

Keywords

  • Cloud computing
  • data mining as a service
  • result integrity verification
  • security

Cite this

@article{211c951292ee48f19c9d9eea0da2a024,
title = "Trust-but-Verify: Verifying Result Correctness of Outsourced Frequent Itemset Mining in Data-Mining-As-a-Service Paradigm",
abstract = "Cloud computing is popularizing the computing paradigm in which data is outsourced to a third-party service provider (server) for data mining. Outsourcing, however, raises a serious security issue: how can the client of weak computational power verify that the server returned correct mining result? In this paper, we focus on the specific task of frequent itemset mining. We consider the server that is potentially untrusted and tries to escape from verification by using its prior knowledge of the outsourced data. We propose efficient probabilistic and deterministic verification approaches to check whether the server has returned correct and complete frequent itemsets. Our probabilistic approach can catch incorrect results with high probability, while our deterministic approach measures the result correctness with 100 percent certainty. We also design efficient verification methods for both cases that the data and the mining setup are updated. We demonstrate the effectiveness and efficiency of our methods using an extensive set of empirical results on real datasets.",
keywords = "Cloud computing, data mining as a service, result integrity verification, security",
author = "Boxiang Dong and Ruilin Liu and Hui Wang",
year = "2016",
month = "1",
day = "1",
doi = "10.1109/TSC.2015.2436387",
language = "English",
volume = "9",
pages = "18--32",
journal = "IEEE Transactions on Services Computing",
issn = "1939-1374",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "1",

}

Trust-but-Verify: Verifying Result Correctness of Outsourced Frequent Itemset Mining in Data-Mining-As-a-Service Paradigm. / Dong, Boxiang; Liu, Ruilin; Wang, Hui.

In: IEEE Transactions on Services Computing, Vol. 9, No. 1, 7122916, 01.01.2016, p. 18-32.

TY - JOUR

T1 - Trust-but-Verify

T2 - Verifying Result Correctness of Outsourced Frequent Itemset Mining in Data-Mining-As-a-Service Paradigm

AU - Dong, Boxiang

AU - Liu, Ruilin

AU - Wang, Hui

PY - 2016/1/1

Y1 - 2016/1/1

AB - Cloud computing is popularizing the computing paradigm in which data is outsourced to a third-party service provider (server) for data mining. Outsourcing, however, raises a serious security issue: how can the client of weak computational power verify that the server returned correct mining result? In this paper, we focus on the specific task of frequent itemset mining. We consider the server that is potentially untrusted and tries to escape from verification by using its prior knowledge of the outsourced data. We propose efficient probabilistic and deterministic verification approaches to check whether the server has returned correct and complete frequent itemsets. Our probabilistic approach can catch incorrect results with high probability, while our deterministic approach measures the result correctness with 100 percent certainty. We also design efficient verification methods for both cases that the data and the mining setup are updated. We demonstrate the effectiveness and efficiency of our methods using an extensive set of empirical results on real datasets.

KW - Cloud computing

KW - data mining as a service

KW - result integrity verification

KW - security

UR - http://www.scopus.com/inward/record.url?scp=84961990551&partnerID=8YFLogxK

U2 - 10.1109/TSC.2015.2436387

DO - 10.1109/TSC.2015.2436387

M3 - Article

VL - 9

SP - 18

EP - 32

JO - IEEE Transactions on Services Computing

JF - IEEE Transactions on Services Computing

SN - 1939-1374

IS - 1

M1 - 7122916

ER -