Abstract
When outsourcing data mining needs to an untrusted service provider in the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify whether the service provider (server) returns correct mining results (in the format of data mining objects). We consider the setting in which each data mining object is associated with a weight for its importance. Given a client who is equipped with limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the given budget, while the total weight of the selected results is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard. Therefore, the server may execute a heuristic algorithm to select a subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client has to pay more for verification of the mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. It is challenging to verify the result integrity of the heuristic algorithms as the results are non-deterministic. We design a probabilistic verification method by including negative candidates (NCs) that are guaranteed to be excluded from the budgeted maximization result of the ratio-based BMC solutions. We perform extensive experiments on real-world datasets, and show that the NC-based verification approach can achieve high guarantee with small overhead.
Original language | English |
---|---|
Title of host publication | Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings |
Editors | Sencun Zhu, Giovanni Livraga |
Publisher | Springer Verlag |
Pages | 311-324 |
Number of pages | 14 |
ISBN (Print) | 9783319611754 |
DOIs | |
State | Published - 1 Jan 2017 |
Event | 31st Annual IFIP WG 11.3 Conference on Data and Applications Security and Privacy, DBSec 2017 - Philadelphia, United States Duration: 19 Jul 2017 → 21 Jul 2017 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 10359 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Other
Other | 31st Annual IFIP WG 11.3 Conference on Data and Applications Security and Privacy, DBSec 2017 |
---|---|
Country | United States |
City | Philadelphia |
Period | 19/07/17 → 21/07/17 |
Fingerprint
Keywords
- Budgeted maximization
- Cloud computing
- Data-Mining-as-a-Service (DMaS)
- Result integrity
Cite this
}
Budget-constrained result integrity verification of outsourced data mining computations. / Zhang, Bo; Dong, Boxiang; Wang, Wendy.
Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings. ed. / Sencun Zhu; Giovanni Livraga. Springer Verlag, 2017. p. 311-324 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10359 LNCS).Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - Budget-constrained result integrity verification of outsourced data mining computations
AU - Zhang, Bo
AU - Dong, Boxiang
AU - Wang, Wendy
PY - 2017/1/1
Y1 - 2017/1/1
N2 - When outsourcing data mining needs to an untrusted service provider in the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify whether the service provider (server) returns correct mining results (in the format of data mining objects). We consider the setting in which each data mining object is associated with a weight for its importance. Given a client who is equipped with limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the given budget, while the total weight of the selected results is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard. Therefore, the server may execute a heuristic algorithm to select a subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client has to pay more for verification of the mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. It is challenging to verify the result integrity of the heuristic algorithms as the results are non-deterministic. We design a probabilistic verification method by including negative candidates (NCs) that are guaranteed to be excluded from the budgeted maximization result of the ratio-based BMC solutions. We perform extensive experiments on real-world datasets, and show that the NC-based verification approach can achieve high guarantee with small overhead.
AB - When outsourcing data mining needs to an untrusted service provider in the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify whether the service provider (server) returns correct mining results (in the format of data mining objects). We consider the setting in which each data mining object is associated with a weight for its importance. Given a client who is equipped with limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the given budget, while the total weight of the selected results is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard. Therefore, the server may execute a heuristic algorithm to select a subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client has to pay more for verification of the mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. It is challenging to verify the result integrity of the heuristic algorithms as the results are non-deterministic. We design a probabilistic verification method by including negative candidates (NCs) that are guaranteed to be excluded from the budgeted maximization result of the ratio-based BMC solutions. We perform extensive experiments on real-world datasets, and show that the NC-based verification approach can achieve high guarantee with small overhead.
KW - Budgeted maximization
KW - Cloud computing
KW - Data-Mining-as-a-Service (DMaS)
KW - Result integrity
UR - http://www.scopus.com/inward/record.url?scp=85022074580&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-61176-1_17
DO - 10.1007/978-3-319-61176-1_17
M3 - Conference contribution
AN - SCOPUS:85022074580
SN - 9783319611754
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 311
EP - 324
BT - Data and Applications Security and Privacy XXXI - 31st Annual IFIP WG 11.3 Conference, DBSec 2017, Proceedings
A2 - Zhu, Sencun
A2 - Livraga, Giovanni
PB - Springer Verlag
ER -