RoVEr

Robust and verifiable erasure code for hadoop distributed file systems

Teng Wang, Nam Son Nguyen, Jiayin Wang, Tengpeng Li, Xiaoqian Zhang, Ningfang Mi, Bin Zhao, Bo Sheng

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Research, peer-review

Abstract

Erasure Coding based Storage (ECS) is replacing traditional replica-based systems because of its low storage overhead. In an ECS, however, every task needs to fetch remote pieces of data for its execution, and data verification is missing in the current framework. As security concerns grow and security incidents have already occurred on big data platforms, compromised nodes in a computing cluster may manipulate the data they host before it is fed to other nodes, yielding misleading results. Without replicas, it is challenging to verify data integrity efficiently in an ECS. In this paper, we develop ROVER, an efficient and verifiable ECS for big data platforms. In ROVER, every piece of data is monitored through its checksums, which are stored on a set of witnesses. Each witness uses a Bloom filter to keep records of the checksums efficiently. Data verification is based on majority voting. ROVER also supports quick reconstruction of a Bloom filter when a node recovers from a failure. We present a complete system framework, a security analysis, and guidelines for setting the parameters. The implementation and evaluation show that ROVER is robust and efficient against attacks from compromised nodes.
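
The abstract names three mechanisms: checksums held by witnesses, a Bloom filter on each witness, and majority voting. The sketch below illustrates how these could fit together; it is not the authors' implementation. The class and function names (`BloomFilter`, `Witness`, `verify`), the filter size, the number of hash functions, the use of SHA-256, and the modelling of a compromised witness as one that inverts its answer are all assumptions made for illustration.

```python
import hashlib
from typing import List


class BloomFilter:
    """Fixed-size Bloom filter; sizes here are illustrative, not the paper's."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes) -> List[int]:
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def checksum(block_id: str, data: bytes) -> bytes:
    """Checksum bound to the block id (an assumed construction)."""
    return hashlib.sha256(block_id.encode() + b"\x00" + data).digest()


class Witness:
    """A node that remembers checksums of blocks hosted elsewhere."""

    def __init__(self, honest: bool = True) -> None:
        self.filter = BloomFilter()
        self.honest = honest

    def record(self, block_id: str, data: bytes) -> None:
        self.filter.add(checksum(block_id, data))

    def vouch(self, block_id: str, data: bytes) -> bool:
        answer = checksum(block_id, data) in self.filter
        # Assumed attack model: a compromised witness inverts its answer.
        return answer if self.honest else not answer


def verify(block_id: str, data: bytes, witnesses: List[Witness]) -> bool:
    """Accept the data only if a strict majority of witnesses vouch for it."""
    votes = sum(w.vouch(block_id, data) for w in witnesses)
    return votes > len(witnesses) / 2


# Usage: five witnesses, one of them compromised.
witnesses = [Witness() for _ in range(4)] + [Witness(honest=False)]
for w in witnesses:
    w.record("block-7", b"original bytes")

ok_original = verify("block-7", b"original bytes", witnesses)
ok_tampered = verify("block-7", b"tampered bytes", witnesses)
```

Because a Bloom filter can return false positives but never false negatives, an honest witness in this sketch always vouches for data it recorded and only rarely vouches for data it did not; the majority vote then absorbs a minority of compromised witnesses.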

Original language: English
Title of host publication: ICCCN 2018 - 27th International Conference on Computer Communications and Networks
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538651568
DOIs: https://doi.org/10.1109/ICCCN.2018.8487406
State: Published - 9 Oct 2018
Event: 27th International Conference on Computer Communications and Networks, ICCCN 2018 - Hangzhou City, Zhejiang Province, China
Duration: 30 Jul 2018 to 2 Aug 2018

Publication series

Name: Proceedings - International Conference on Computer Communications and Networks, ICCCN
Volume: 2018-July
ISSN (Print): 1095-2055

Other

Other: 27th International Conference on Computer Communications and Networks, ICCCN 2018
Country: China
City: Hangzhou City, Zhejiang Province
Period: 30/07/18 to 2/08/18

Fingerprint

Cluster computing
Security systems
Big data

Cite this

Wang, T., Nguyen, N. S., Wang, J., Li, T., Zhang, X., Mi, N., ... Sheng, B. (2018). RoVEr: Robust and verifiable erasure code for hadoop distributed file systems. In ICCCN 2018 - 27th International Conference on Computer Communications and Networks [8487406] (Proceedings - International Conference on Computer Communications and Networks, ICCCN; Vol. 2018-July). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCCN.2018.8487406
@inproceedings{be78d3d608dc4ef69ebac1113b74f8df,
title = "RoVEr: Robust and verifiable erasure code for hadoop distributed file systems",
author = "Teng Wang and Nguyen, {Nam Son} and Jiayin Wang and Tengpeng Li and Xiaoqian Zhang and Ningfang Mi and Bin Zhao and Bo Sheng",
year = "2018",
month = "10",
day = "9",
doi = "10.1109/ICCCN.2018.8487406",
language = "English",
series = "Proceedings - International Conference on Computer Communications and Networks, ICCCN",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "ICCCN 2018 - 27th International Conference on Computer Communications and Networks",

}
