Authenticated Outlier Mining for Outsourced Databases

Boxiang Dong, Hui Wendy Wang, Anna Monreale, Dino Pedreschi, Fosca Giannotti, Wenge Guo

Research output: Contribution to journal, Article, Research, peer-review

Abstract

The Data-Mining-as-a-Service (DMaS) paradigm is becoming a focus of research, as it allows a data owner (client) who lacks expertise and/or computational resources to outsource data and mining needs to a third-party service provider (server). Outsourcing, however, raises concerns about result integrity: how can the client verify that the mining results returned by the server are both sound and complete? In this paper, we focus on outlier mining, an important mining task. Previous verification techniques use an authenticated data structure (ADS) for correctness authentication, which may incur substantial space and communication costs. We propose a novel solution that provides a probabilistic result integrity guarantee at a much lower verification cost. The key idea is to insert a set of artificial records (ARs) into the dataset, from which a set of artificial outliers (AOs) and artificial non-outliers (ANOs) is constructed. The client uses the AOs and ANOs to detect incomplete and/or incorrect mining results with a probabilistic guarantee. The main challenge that we address is how to construct ARs so that they do not change the (non-)outlierness of the original records, while guaranteeing that the client can identify ANOs and AOs without executing the mining itself. Furthermore, we build a strategic game and show that a Nash equilibrium exists only when the server returns correct outliers. Our implementation and experiments demonstrate that our verification solution is efficient and lightweight.
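
The client-side check described in the abstract can be illustrated with a minimal sketch. It assumes the client already knows which record IDs are AOs and which are ANOs; how the ARs are constructed so that this holds is the paper's contribution and is not reproduced here. The function name `verify_outlier_result` and all record IDs are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Set


def verify_outlier_result(
    returned_outliers: Iterable[str],
    artificial_outliers: Set[str],
    artificial_non_outliers: Set[str],
) -> bool:
    """Return True if the server's answer passes both artificial-record checks.

    Hypothetical helper: a sketch of the idea in the abstract, not the
    authors' implementation.
    """
    returned = set(returned_outliers)
    # Completeness check: every planted outlier (AO) must be reported.
    missing_aos = artificial_outliers - returned
    # Correctness check: no planted non-outlier (ANO) may be reported.
    reported_anos = artificial_non_outliers & returned
    return not missing_aos and not reported_anos


if __name__ == "__main__":
    aos = {"ao1", "ao2"}      # assumed artificial outliers
    anos = {"ano1", "ano2"}   # assumed artificial non-outliers

    honest = ["r17", "ao1", "ao2"]        # reports every AO, no ANO
    incomplete = ["r17", "ao1"]           # drops ao2
    incorrect = ["r17", "ao1", "ao2", "ano1"]  # reports an ANO

    print(verify_outlier_result(honest, aos, anos))
    print(verify_outlier_result(incomplete, aos, anos))
    print(verify_outlier_result(incorrect, aos, anos))
```

Passing this check does not prove the result is correct. A cheating server is caught only if its errors happen to touch an AO or an ANO, which is why the guarantee is probabilistic.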

Original language: English
Journal: IEEE Transactions on Dependable and Secure Computing
DOI: 10.1109/TDSC.2017.2754493
State: Accepted/In press - 21 Sep 2017

Fingerprint

Outsourcing
Authentication
Data mining
Data structures
Costs
Acoustic waves
Communication
Experiments

Keywords

  • Anomaly detection
  • Authentication
  • Databases
  • Outsourcing
  • Probabilistic logic
  • Servers
  • authentication
  • game theory
  • outlier mining
  • outsourcing
  • probabilistic guarantees

Cite this

Dong, Boxiang ; Wang, Hui Wendy ; Monreale, Anna ; Pedreschi, Dino ; Giannotti, Fosca ; Guo, Wenge. / Authenticated Outlier Mining for Outsourced Databases. In: IEEE Transactions on Dependable and Secure Computing. 2017.
@article{f9f05ceb187947d1b4823c17c8e1ab65,
title = "Authenticated Outlier Mining for Outsourced Databases",
keywords = "Anomaly detection, Authentication, Databases, Outsourcing, Probabilistic logic, Servers, authentication, game theory, outlier mining, outsourcing, probabilistic guarantees",
author = "Boxiang Dong and Wang, {Hui Wendy} and Anna Monreale and Dino Pedreschi and Fosca Giannotti and Wenge Guo",
year = "2017",
month = "9",
day = "21",
doi = "10.1109/TDSC.2017.2754493",
language = "English",
journal = "IEEE Transactions on Dependable and Secure Computing",
issn = "1545-5971",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

Authenticated Outlier Mining for Outsourced Databases. / Dong, Boxiang; Wang, Hui Wendy; Monreale, Anna; Pedreschi, Dino; Giannotti, Fosca; Guo, Wenge.

In: IEEE Transactions on Dependable and Secure Computing, 21.09.2017.

TY - JOUR

T1 - Authenticated Outlier Mining for Outsourced Databases

AU - Dong, Boxiang

AU - Wang, Hui Wendy

AU - Monreale, Anna

AU - Pedreschi, Dino

AU - Giannotti, Fosca

AU - Guo, Wenge

PY - 2017/9/21

Y1 - 2017/9/21

KW - Anomaly detection

KW - Authentication

KW - Databases

KW - Outsourcing

KW - Probabilistic logic

KW - Servers

KW - authentication

KW - game theory

KW - outlier mining

KW - outsourcing

KW - probabilistic guarantees

UR - http://www.scopus.com/inward/record.url?scp=85031101070&partnerID=8YFLogxK

U2 - 10.1109/TDSC.2017.2754493

DO - 10.1109/TDSC.2017.2754493

M3 - Article

JO - IEEE Transactions on Dependable and Secure Computing

JF - IEEE Transactions on Dependable and Secure Computing

SN - 1545-5971

ER -