TY - GEN
T1 - Budget-constrained result integrity verification of outsourced data mining computations
AU - Zhang, Bo
AU - Dong, Boxiang
AU - Wang, Wendy
N1 - Publisher Copyright:
© IFIP International Federation for Information Processing 2017.
PY - 2017
Y1 - 2017
N2 - When outsourcing data mining needs to an untrusted service provider in the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify whether the service provider (server) returns correct mining results (in the format of data mining objects). We consider the setting in which each data mining object is associated with a weight for its importance. Given a client who is equipped with limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the given budget, while the total weight of the selected results is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard. Therefore, the server may execute a heuristic algorithm to select a subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client has to pay more for verification of the mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. It is challenging to verify the result integrity of the heuristic algorithms as the results are non-deterministic. We design a probabilistic verification method by including negative candidates (NCs) that are guaranteed to be excluded from the budgeted maximization result of the ratio-based BMC solutions. We perform extensive experiments on real-world datasets, and show that the NC-based verification approach can achieve high guarantee with small overhead.
AB - When outsourcing data mining needs to an untrusted service provider in the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify whether the service provider (server) returns correct mining results (in the format of data mining objects). We consider the setting in which each data mining object is associated with a weight for its importance. Given a client who is equipped with limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the given budget, while the total weight of the selected results is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard. Therefore, the server may execute a heuristic algorithm to select a subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client has to pay more for verification of the mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. It is challenging to verify the result integrity of the heuristic algorithms as the results are non-deterministic. We design a probabilistic verification method by including negative candidates (NCs) that are guaranteed to be excluded from the budgeted maximization result of the ratio-based BMC solutions. We perform extensive experiments on real-world datasets, and show that the NC-based verification approach can achieve high guarantee with small overhead.
KW - Budgeted maximization
KW - Cloud computing
KW - Data-Mining-as-a-Service (DMaS)
KW - Result integrity
UR - http://www.scopus.com/inward/record.url?scp=85022074580&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-61176-1_17
DO - 10.1007/978-3-319-61176-1_17
M3 - Conference contribution
AN - SCOPUS:85022074580
SN - 9783319611754
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 311
EP - 324
BT - Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings
A2 - Zhu, Sencun
A2 - Livraga, Giovanni
PB - Springer Verlag
T2 - 31st Annual IFIP WG 11.3 Conference on Data and Applications Security and Privacy, DBSec 2017
Y2 - 19 July 2017 through 21 July 2017
ER -