N-gram based secure similar document detection

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Research; peer-review

11 Citations (Scopus)

Abstract

Secure similar document detection (SSDD) plays an important role in many applications, such as justifying the need-to-know basis and facilitating communication between government agencies. The SSDD problem considers situations where Alice with a query document wants to find similar information from Bob's document collection. During this process, the content of the query document is not disclosed to Bob, and Bob's document collection is not disclosed to Alice. Existing SSDD protocols are developed under the vector space model, which has the advantage of identifying global similar information. To effectively and securely detect similar documents with overlapping text fragments, this paper proposes a novel n-gram based SSDD protocol.
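To make the idea of matching overlapping text fragments concrete, the following is a minimal plaintext sketch of n-gram similarity. It is not the protocol proposed in the paper: the abstract does not specify the similarity measure or the cryptographic machinery, so the use of character trigrams, the Jaccard coefficient, and the function names `char_ngrams`, `jaccard`, and `ngram_similarity` are all assumptions made for illustration. A secure variant would have to compute a comparable score without either party seeing the other's n-grams.

```python
from __future__ import annotations


def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Return the set of character n-grams of a text.

    Assumption: lowercase the text and collapse whitespace runs first,
    so formatting differences do not affect the comparison.
    """
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard coefficient |A & B| / |A | B|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ngram_similarity(doc_a: str, doc_b: str, n: int = 3) -> float:
    """Plaintext (non-secure) similarity of two documents via shared n-grams."""
    return jaccard(char_ngrams(doc_a, n), char_ngrams(doc_b, n))


if __name__ == "__main__":
    query = "Secure similar document detection"
    candidate = "A protocol for secure document detection"
    print(f"similarity = {ngram_similarity(query, candidate):.3f}")
```

Because the score depends only on which short fragments two documents share, it responds to locally overlapping passages, which is the case the abstract contrasts with vector-space approaches that capture global similarity.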

Original language: English
Title of host publication: Data and Applications Security and Privacy XXV - 25th Annual IFIP WG 11.3 Conference, DBSec 2011, Proceedings
Pages: 239-246
Number of pages: 8
DOI: https://doi.org/10.1007/978-3-642-22348-8_19
State: Published - 18 Jul 2011
Event: 25th Annual WG 11.3 Conference on Data and Applications Security and Privacy, DBSec 2011 - Richmond, VA, United States
Duration: 11 Jul 2011 to 13 Jul 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6818 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Keywords

  • n-gram
  • privacy
  • security

Cite this

Jiang, W., & Samanthula, B. K. (2011). N-gram based secure similar document detection. In Data and Applications Security and Privacy XXV - 25th Annual IFIP WG 11.3 Conference, DBSec 2011, Proceedings (pp. 239-246). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6818 LNCS). https://doi.org/10.1007/978-3-642-22348-8_19
@inproceedings{387c08e2eb7043aa84a08765fc04a123,
  title = "N-gram based secure similar document detection",
  author = "Jiang, Wei and Samanthula, {Bharath Kumar}",
  booktitle = "Data and Applications Security and Privacy XXV - 25th Annual IFIP WG 11.3 Conference, DBSec 2011, Proceedings",
  series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
  volume = "6818",
  pages = "239--246",
  year = "2011",
  month = "7",
  day = "18",
  doi = "10.1007/978-3-642-22348-8_19",
  isbn = "9783642223471",
  language = "English",
  keywords = "n-gram, privacy, security"
}
