Secure Multiset Intersection Cardinality and its Application to Jaccard Coefficient

Research output: Contribution to journalArticleResearchpeer-review

6 Citations (Scopus)

Abstract

The Jaccard Coefficient, as an information similarity measure, has wide variety of applications, such as cluster analysis and image segmentation. Due to the concerns of personal privacy, the Jaccard Coefficient cannot be computed directly between two independently owned datasets. The problem, secure computation of the Jaccard Coefficient for multisets (SJCM), considers the situation where two parties want to securely compute the random shares of the Jaccard Coefficient between their multisets. During the process, the content of each party's multiset is not disclosed to the other party and also the value of Jaccard Coefficient should be hidden from both parties. Secure computation of multiset intersection cardinality is an important sub-problem of SJCM. Existing methods when applied to solve such a problem can lead to either insecure or inefficient solutions. Our work addresses this gap. We first present a basic SJCM protocol constructed using the existing secure dot product method as a sub-routine. Then, as a major contribution, we propose an approximated version of our basic protocol to improve efficiency without compromising accuracy much. We provide various experimental results to show that the proposed protocols are significantly more efficient than the existing techniques when the domain size is small using both simulated and real datasets.

Original languageEnglish
Article number7065287
Pages (from-to)591-604
Number of pages14
JournalIEEE Transactions on Dependable and Secure Computing
Volume13
Issue number5
DOIs
StatePublished - 1 Sep 2016

Fingerprint

Cluster analysis
Image segmentation

Keywords

  • Garbled circuits
  • Jaccard coefficient
  • Security
  • homomorphic encryption
  • multiset intersection

Cite this

@article{2a1163c631dc4b8fb05b9ec0a6734fa8,
title = "Secure Multiset Intersection Cardinality and its Application to Jaccard Coefficient",
abstract = "The Jaccard Coefficient, as an information similarity measure, has wide variety of applications, such as cluster analysis and image segmentation. Due to the concerns of personal privacy, the Jaccard Coefficient cannot be computed directly between two independently owned datasets. The problem, secure computation of the Jaccard Coefficient for multisets (SJCM), considers the situation where two parties want to securely compute the random shares of the Jaccard Coefficient between their multisets. During the process, the content of each party's multiset is not disclosed to the other party and also the value of Jaccard Coefficient should be hidden from both parties. Secure computation of multiset intersection cardinality is an important sub-problem of SJCM. Existing methods when applied to solve such a problem can lead to either insecure or inefficient solutions. Our work addresses this gap. We first present a basic SJCM protocol constructed using the existing secure dot product method as a sub-routine. Then, as a major contribution, we propose an approximated version of our basic protocol to improve efficiency without compromising accuracy much. We provide various experimental results to show that the proposed protocols are significantly more efficient than the existing techniques when the domain size is small using both simulated and real datasets.",
keywords = "Garbled circuits, Jaccard coefficient, Security, homomorphic encryption, multiset intersection",
author = "Samanthula, {Bharath Kumar} and Wei Jiang",
year = "2016",
month = "9",
day = "1",
doi = "10.1109/TDSC.2015.2415482",
language = "English",
volume = "13",
pages = "591--604",
journal = "IEEE Transactions on Dependable and Secure Computing",
issn = "1545-5971",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "5",

}

Secure Multiset Intersection Cardinality and its Application to Jaccard Coefficient. / Samanthula, Bharath Kumar; Jiang, Wei.

In: IEEE Transactions on Dependable and Secure Computing, Vol. 13, No. 5, 7065287, 01.09.2016, p. 591-604.

Research output: Contribution to journalArticleResearchpeer-review

TY - JOUR

T1 - Secure Multiset Intersection Cardinality and its Application to Jaccard Coefficient

AU - Samanthula, Bharath Kumar

AU - Jiang, Wei

PY - 2016/9/1

Y1 - 2016/9/1

N2 - The Jaccard Coefficient, as an information similarity measure, has wide variety of applications, such as cluster analysis and image segmentation. Due to the concerns of personal privacy, the Jaccard Coefficient cannot be computed directly between two independently owned datasets. The problem, secure computation of the Jaccard Coefficient for multisets (SJCM), considers the situation where two parties want to securely compute the random shares of the Jaccard Coefficient between their multisets. During the process, the content of each party's multiset is not disclosed to the other party and also the value of Jaccard Coefficient should be hidden from both parties. Secure computation of multiset intersection cardinality is an important sub-problem of SJCM. Existing methods when applied to solve such a problem can lead to either insecure or inefficient solutions. Our work addresses this gap. We first present a basic SJCM protocol constructed using the existing secure dot product method as a sub-routine. Then, as a major contribution, we propose an approximated version of our basic protocol to improve efficiency without compromising accuracy much. We provide various experimental results to show that the proposed protocols are significantly more efficient than the existing techniques when the domain size is small using both simulated and real datasets.

AB - The Jaccard Coefficient, as an information similarity measure, has wide variety of applications, such as cluster analysis and image segmentation. Due to the concerns of personal privacy, the Jaccard Coefficient cannot be computed directly between two independently owned datasets. The problem, secure computation of the Jaccard Coefficient for multisets (SJCM), considers the situation where two parties want to securely compute the random shares of the Jaccard Coefficient between their multisets. During the process, the content of each party's multiset is not disclosed to the other party and also the value of Jaccard Coefficient should be hidden from both parties. Secure computation of multiset intersection cardinality is an important sub-problem of SJCM. Existing methods when applied to solve such a problem can lead to either insecure or inefficient solutions. Our work addresses this gap. We first present a basic SJCM protocol constructed using the existing secure dot product method as a sub-routine. Then, as a major contribution, we propose an approximated version of our basic protocol to improve efficiency without compromising accuracy much. We provide various experimental results to show that the proposed protocols are significantly more efficient than the existing techniques when the domain size is small using both simulated and real datasets.

KW - Garbled circuits

KW - Jaccard coefficient

KW - Security

KW - homomorphic encryption

KW - multiset intersection

UR - http://www.scopus.com/inward/record.url?scp=84985919195&partnerID=8YFLogxK

U2 - 10.1109/TDSC.2015.2415482

DO - 10.1109/TDSC.2015.2415482

M3 - Article

VL - 13

SP - 591

EP - 604

JO - IEEE Transactions on Dependable and Secure Computing

JF - IEEE Transactions on Dependable and Secure Computing

SN - 1545-5971

IS - 5

M1 - 7065287

ER -