Privacy-Preserving and Outsourced Multi-user K-Means Clustering

Fang Yu Rao, Bharath K. Samanthula, Elisa Bertino, Xun Yi, Dongxi Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Citations (Scopus)

Abstract

Many techniques for privacy-preserving data mining (PPDM) have been investigated over the past decade. Such techniques, however, usually impose heavy computational and communication costs on the participating parties, so entities with limited resources may have to refrain from participating in the PPDM process. One promising way to address this issue is to outsource the tasks to a cloud environment. In this paper, we propose a novel and efficient solution to privacy-preserving outsourced distributed clustering (PPODC) for multiple users based on the k-means clustering algorithm. The main novelty of our solution lies in avoiding, through efficient transformation techniques, the secure division operations required in computing cluster centers. In addition, we discuss two strategies, namely offline computation and pipelined execution, that aim to boost the performance of our protocol. We implement our protocol on a cluster of 16 nodes and demonstrate, through extensive experiments using a real dataset, how our two strategies combined with parallelism can significantly improve its performance.
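
The abstract does not spell out the transformation used to avoid division, so the following is only a rough plaintext sketch of the general idea, not the paper's protocol. It assumes integer data and represents each cluster by its component-wise sum and its member count instead of by the divided center; the names `scaled_sq_dist` and `nearest_cluster` are hypothetical, and no encryption is involved.

```python
from typing import Sequence, Tuple

# A cluster is kept as (component-wise sum of its members, member count),
# so the center sum/count is never computed explicitly.
Cluster = Tuple[Sequence[int], int]


def scaled_sq_dist(x: Sequence[int], cluster: Cluster) -> int:
    """Return n^2 * ||x - s/n||^2 using only integer arithmetic."""
    s, n = cluster
    return sum((n * xi - si) ** 2 for xi, si in zip(x, s))


def nearest_cluster(x: Sequence[int], clusters: Sequence[Cluster]) -> int:
    """Index of the closest center, found without any division.

    Assumes every cluster has a positive member count.
    """
    best = 0
    for j in range(1, len(clusters)):
        # d_j < d_best  <=>  D_j * n_best^2 < D_best * n_j^2,
        # where D = n^2 * d is the scaled squared distance.
        lhs = scaled_sq_dist(x, clusters[j]) * clusters[best][1] ** 2
        rhs = scaled_sq_dist(x, clusters[best]) * clusters[j][1] ** 2
        if lhs < rhs:
            best = j
    return best
```

Cross-multiplying by the squared counts keeps every comparison in integers, which is the kind of property that matters when values are encrypted and division is expensive; how the paper achieves this securely is not stated in this record.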

Original language: English
Title of host publication: Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 80-89
Number of pages: 10
ISBN (Electronic): 9781509000890
DOIs: https://doi.org/10.1109/CIC.2015.20
State: Published - 1 Mar 2016
Event: 1st IEEE International Conference on Collaboration and Internet Computing, CIC 2015 - Hangzhou, China
Duration: 28 Oct 2015 to 30 Oct 2015

Publication series

Name: Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015

Other

Other: 1st IEEE International Conference on Collaboration and Internet Computing, CIC 2015
Country: China
City: Hangzhou
Period: 28/10/15 to 30/10/15

Fingerprint

Data mining
Cluster computing
Clustering algorithms
Communication
Costs
Experiments

Keywords

  • Cloud computing
  • Encrypted data
  • K-means clustering
  • Privacy

Cite this

Rao, F. Y., Samanthula, B. K., Bertino, E., Yi, X., & Liu, D. (2016). Privacy-Preserving and Outsourced Multi-user K-Means Clustering. In Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015 (pp. 80-89). [7423068] (Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CIC.2015.20
@inproceedings{798ad5a0be1e4efba5ecb3e1581e2219,
title = "Privacy-Preserving and Outsourced Multi-user K-Means Clustering",
abstract = "Many techniques for privacy-preserving data mining (PPDM) have been investigated over the past decade. Such techniques, however, usually incur heavy computational and communication cost on the participating parties and thus entities with limited resources may have to refrain from participating in the PPDM process. To address this issue, one promising solution is to outsource the tasks to the cloud environment. In this paper, we propose a novel and efficient solution to privacy-preserving outsourced distributed clustering (PPODC) for multiple users based on the k-means clustering algorithm. The main novelty of our solution lies in avoiding the secure division operations required in computing cluster centers through efficient transformation techniques. In addition, we discuss two strategies, namely offline computation and pipelined execution that aim to boost the performance of our protocol. We implement our protocol on a cluster of 16 nodes and demonstrate how our two strategies combined with parallelism can significantly improve the performance of our protocol through extensive experiments using a real dataset.",
keywords = "Cloud computing, Encrypted data, K-means clustering, Privacy",
author = "Rao, {Fang Yu} and Samanthula, {Bharath K.} and Elisa Bertino and Xun Yi and Dongxi Liu",
year = "2016",
month = "3",
day = "1",
doi = "10.1109/CIC.2015.20",
language = "English",
series = "Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "80--89",
booktitle = "Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015",

}

TY - GEN
T1 - Privacy-Preserving and Outsourced Multi-user K-Means Clustering
AU - Rao, Fang Yu
AU - Samanthula, Bharath K.
AU - Bertino, Elisa
AU - Yi, Xun
AU - Liu, Dongxi
PY - 2016/3/1
Y1 - 2016/3/1
N2 - Many techniques for privacy-preserving data mining (PPDM) have been investigated over the past decade. Such techniques, however, usually incur heavy computational and communication cost on the participating parties and thus entities with limited resources may have to refrain from participating in the PPDM process. To address this issue, one promising solution is to outsource the tasks to the cloud environment. In this paper, we propose a novel and efficient solution to privacy-preserving outsourced distributed clustering (PPODC) for multiple users based on the k-means clustering algorithm. The main novelty of our solution lies in avoiding the secure division operations required in computing cluster centers through efficient transformation techniques. In addition, we discuss two strategies, namely offline computation and pipelined execution that aim to boost the performance of our protocol. We implement our protocol on a cluster of 16 nodes and demonstrate how our two strategies combined with parallelism can significantly improve the performance of our protocol through extensive experiments using a real dataset.

KW - Cloud computing
KW - Encrypted data
KW - K-means clustering
KW - Privacy
UR - http://www.scopus.com/inward/record.url?scp=84964812437&partnerID=8YFLogxK
U2 - 10.1109/CIC.2015.20
DO - 10.1109/CIC.2015.20
M3 - Conference contribution
AN - SCOPUS:84964812437
T3 - Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015
SP - 80
EP - 89
BT - Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015
PB - Institute of Electrical and Electronics Engineers Inc.
ER -
