Budget-constrained result integrity verification of outsourced data mining computations

Bo Zhang, Boxiang Dong, Wendy Wang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

When outsourcing data mining needs to an untrusted service provider in the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify whether the service provider (server) returns correct mining results (in the format of data mining objects). We consider the setting in which each data mining object is associated with a weight for its importance. Given a client who is equipped with limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the given budget, while the total weight of the selected results is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard. Therefore, the server may execute a heuristic algorithm to select a subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client has to pay more for verification of the mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. It is challenging to verify the result integrity of the heuristic algorithms as the results are non-deterministic. We design a probabilistic verification method by including negative candidates (NCs) that are guaranteed to be excluded from the budgeted maximization result of the ratio-based BMC solutions. We perform extensive experiments on real-world datasets, and show that the NC-based verification approach can achieve high guarantee with small overhead.

Original languageEnglish
Title of host publicationData and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings
EditorsSencun Zhu, Giovanni Livraga
PublisherSpringer Verlag
Pages311-324
Number of pages14
ISBN (Print)9783319611754
DOIs
StatePublished - 1 Jan 2017
Event31st Annual IFIP WG 11.3 Conference on Data and Applications Security and Privacy, DBSec 2017 - Philadelphia, United States
Duration: 19 Jul 201721 Jul 2017

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume10359 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other31st Annual IFIP WG 11.3 Conference on Data and Applications Security and Privacy, DBSec 2017
CountryUnited States
CityPhiladelphia
Period19/07/1721/07/17

Fingerprint

Integrity
Data mining
Data Mining
Mining
Heuristic algorithms
Verify
Heuristic algorithm
Outsourcing
Coverage
Set theory
Subset
Incentives
Exceed
NP-complete problem
Paradigm
Heuristics
Costs
Output
Experiments
Requirements

Keywords

  • Budgeted maximization
  • Cloud computing
  • Data-Mining-as-a-Service (DMaS)
  • Result integrity

Cite this

Zhang, B., Dong, B., & Wang, W. (2017). Budget-constrained result integrity verification of outsourced data mining computations. In S. Zhu, & G. Livraga (Eds.), Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings (pp. 311-324). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10359 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-61176-1_17
Zhang, Bo ; Dong, Boxiang ; Wang, Wendy. / Budget-constrained result integrity verification of outsourced data mining computations. Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings. editor / Sencun Zhu ; Giovanni Livraga. Springer Verlag, 2017. pp. 311-324 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{aa7d2881ea9c484489daad657f349e3b,
title = "Budget-constrained result integrity verification of outsourced data mining computations",
abstract = "When outsourcing data mining needs to an untrusted service provider in the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify whether the service provider (server) returns correct mining results (in the format of data mining objects). We consider the setting in which each data mining object is associated with a weight for its importance. Given a client who is equipped with limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the given budget, while the total weight of the selected results is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard. Therefore, the server may execute a heuristic algorithm to select a subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client has to pay more for verification of the mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. It is challenging to verify the result integrity of the heuristic algorithms as the results are non-deterministic. We design a probabilistic verification method by including negative candidates (NCs) that are guaranteed to be excluded from the budgeted maximization result of the ratio-based BMC solutions. We perform extensive experiments on real-world datasets, and show that the NC-based verification approach can achieve high guarantee with small overhead.",
keywords = "Budgeted maximization, Cloud computing, Data-Mining-as-a-Service (DMaS), Result integrity",
author = "Bo Zhang and Boxiang Dong and Wendy Wang",
year = "2017",
month = "1",
day = "1",
doi = "10.1007/978-3-319-61176-1_17",
language = "English",
isbn = "9783319611754",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "311--324",
editor = "Sencun Zhu and Giovanni Livraga",
booktitle = "Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings",

}

Zhang, B, Dong, B & Wang, W 2017, Budget-constrained result integrity verification of outsourced data mining computations. in S Zhu & G Livraga (eds), Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10359 LNCS, Springer Verlag, pp. 311-324, 31st Annual IFIP WG 11.3 Conference on Data and Applications Security and Privacy, DBSec 2017, Philadelphia, United States, 19/07/17. https://doi.org/10.1007/978-3-319-61176-1_17

Budget-constrained result integrity verification of outsourced data mining computations. / Zhang, Bo; Dong, Boxiang; Wang, Wendy.

Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings. ed. / Sencun Zhu; Giovanni Livraga. Springer Verlag, 2017. p. 311-324 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10359 LNCS).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - Budget-constrained result integrity verification of outsourced data mining computations

AU - Zhang, Bo

AU - Dong, Boxiang

AU - Wang, Wendy

PY - 2017/1/1

Y1 - 2017/1/1

N2 - When outsourcing data mining needs to an untrusted service provider in the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify whether the service provider (server) returns correct mining results (in the format of data mining objects). We consider the setting in which each data mining object is associated with a weight for its importance. Given a client who is equipped with limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the given budget, while the total weight of the selected results is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard. Therefore, the server may execute a heuristic algorithm to select a subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client has to pay more for verification of the mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. It is challenging to verify the result integrity of the heuristic algorithms as the results are non-deterministic. We design a probabilistic verification method by including negative candidates (NCs) that are guaranteed to be excluded from the budgeted maximization result of the ratio-based BMC solutions. We perform extensive experiments on real-world datasets, and show that the NC-based verification approach can achieve high guarantee with small overhead.

AB - When outsourcing data mining needs to an untrusted service provider in the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify whether the service provider (server) returns correct mining results (in the format of data mining objects). We consider the setting in which each data mining object is associated with a weight for its importance. Given a client who is equipped with limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the given budget, while the total weight of the selected results is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard. Therefore, the server may execute a heuristic algorithm to select a subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client has to pay more for verification of the mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. It is challenging to verify the result integrity of the heuristic algorithms as the results are non-deterministic. We design a probabilistic verification method by including negative candidates (NCs) that are guaranteed to be excluded from the budgeted maximization result of the ratio-based BMC solutions. We perform extensive experiments on real-world datasets, and show that the NC-based verification approach can achieve high guarantee with small overhead.

KW - Budgeted maximization

KW - Cloud computing

KW - Data-Mining-as-a-Service (DMaS)

KW - Result integrity

UR - http://www.scopus.com/inward/record.url?scp=85022074580&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-61176-1_17

DO - 10.1007/978-3-319-61176-1_17

M3 - Conference contribution

AN - SCOPUS:85022074580

SN - 9783319611754

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 311

EP - 324

BT - Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings

A2 - Zhu, Sencun

A2 - Livraga, Giovanni

PB - Springer Verlag

ER -

Zhang B, Dong B, Wang W. Budget-constrained result integrity verification of outsourced data mining computations. In Zhu S, Livraga G, editors, Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings. Springer Verlag. 2017. p. 311-324. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-319-61176-1_17