Probabilistic diagnosis of clustered faults for hypercube-based multiprocessor system

Mengjie Lv, Shuming Zhou, Xueli Sun, Guanqin Lian, Jiafei Liu, Dajin Wang

Research output: Contribution to journal › Article

Abstract

As multiprocessor systems grow in size, the chance that processors become faulty increases, which makes diagnosing faulty nodes in the system an important issue. Different models have been proposed and studied. One, proposed by Huang et al., employed a probabilistic fault model to determine the status of a cluster of nodes in rectangular grid structures [8]. Later, Tang et al. extended the diagnosis algorithm to more general regular topologies in which any pair of adjacent nodes has no common neighbors [25]. In a recent work by Lu et al., the algorithm was further extended to regular topologies where any pair of adjacent nodes has a certain number of common neighbors [18]. In this paper, we extend the threshold to apply the probabilistic diagnosis algorithm to hypercube-based multiprocessor systems, and analyze the algorithm's effectiveness. The analysis shows a very high rate of correct diagnosis, both for an individual node and for the nodes as a whole. Although the analysis is done for one particular regular network (the hypercube), the outcome can serve as a useful reference and can shed light on the effectiveness of probabilistic diagnosis for a large group of triangle-free multiprocessor systems.
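The structural property the abstract relies on, that adjacent hypercube nodes share no common neighbors (the network is triangle-free), can be checked directly for small dimensions. The following is a minimal sketch for illustration only and is not taken from the paper: the function names `hypercube_neighbors` and `is_triangle_free` are hypothetical, and nodes are assumed to be labeled by n-bit integers, with two nodes adjacent exactly when their labels differ in one bit. It does not implement the probabilistic diagnosis algorithm itself.

```python
def hypercube_neighbors(node: int, n: int) -> list[int]:
    """Neighbors of `node` in the n-dimensional hypercube Q_n.

    Assumes nodes are labeled 0 .. 2**n - 1; flipping one of the
    n address bits yields one neighbor, so every node has degree n.
    """
    return [node ^ (1 << bit) for bit in range(n)]


def is_triangle_free(n: int) -> bool:
    """Return True if no pair of adjacent nodes in Q_n has a common neighbor."""
    for u in range(1 << n):
        nbrs_u = set(hypercube_neighbors(u, n))
        for v in nbrs_u:
            # A shared neighbor of u and v would close a triangle u-v-w.
            if nbrs_u & set(hypercube_neighbors(v, n)):
                return False
    return True


if __name__ == "__main__":
    n = 4
    degrees = {len(set(hypercube_neighbors(u, n))) for u in range(1 << n)}
    print("regular of degree n:", degrees == {n})
    print("triangle-free:", is_triangle_free(n))
```

The brute-force check is only practical for small n, since Q_n has 2^n nodes; it is meant to make the "no common neighbors" condition concrete rather than to serve as a proof.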

Original language: English
Pages (from-to): 113-131
Number of pages: 19
Journal: Theoretical Computer Science
Volume: 793
DOI: 10.1016/j.tcs.2019.06.023
State: Published - 12 Nov 2019

Fingerprint

Multiprocessor Systems
Hypercube
Fault
Vertex of a graph
Topology
Adjacent
Triangle-free
Grid
Model

Keywords

  • Clustered faults
  • Diagnosis algorithm
  • Fault tolerance
  • Hypercube
  • Probabilistic diagnosis model
  • Reliability

Cite this

Lv, Mengjie ; Zhou, Shuming ; Sun, Xueli ; Lian, Guanqin ; Liu, Jiafei ; Wang, Dajin. / Probabilistic diagnosis of clustered faults for hypercube-based multiprocessor system. In: Theoretical Computer Science. 2019 ; Vol. 793. pp. 113-131.
@article{cd8ef05ca796439790458c0081408cb9,
title = "Probabilistic diagnosis of clustered faults for hypercube-based multiprocessor system",
abstract = "As the sizes of multiprocessor systems grow, chances of processors becoming faulty increase, making it an important issue to diagnose faulty nodes in the system. Different models have been proposed and studied. One proposed by Huang et al. employed a probabilistic fault model to determine the status of a cluster of nodes in rectangular grid structures [8]. Later, Tang et al. extended the diagnosis algorithm to more general, regular topologies, in which any pair of adjacent nodes has no common neighbors [25]. In a recent work by Lu et al., the algorithm was further extended to regular topologies where any pair of adjacent nodes has a certain number of common neighbors [18]. In this paper, we extend the threshold to apply the probabilistic diagnosis algorithm for hypercube-based multiprocessor system, and carry out an analysis on the algorithm's effectiveness. The analysis shows a very high rate of correct diagnosis, both for an individual node and for nodes as a whole. Although the analysis is done for a particular regular network (the hypercube), the outcome can serve as a useful reference, and can shed light on the effectiveness of the probabilistic diagnosis for a large group of triangle-free multiprocessor systems.",
keywords = "Clustered faults, Diagnosis algorithm, Fault tolerance, Hypercube, Probabilistic diagnosis model, Reliability",
author = "Mengjie Lv and Shuming Zhou and Xueli Sun and Guanqin Lian and Jiafei Liu and Dajin Wang",
year = "2019",
month = "11",
day = "12",
doi = "10.1016/j.tcs.2019.06.023",
language = "English",
volume = "793",
pages = "113--131",
journal = "Theoretical Computer Science",
issn = "0304-3975",
publisher = "Elsevier",

}

Probabilistic diagnosis of clustered faults for hypercube-based multiprocessor system. / Lv, Mengjie; Zhou, Shuming; Sun, Xueli; Lian, Guanqin; Liu, Jiafei; Wang, Dajin.

In: Theoretical Computer Science, Vol. 793, 12.11.2019, p. 113-131.

TY - JOUR
T1 - Probabilistic diagnosis of clustered faults for hypercube-based multiprocessor system
AU - Lv, Mengjie
AU - Zhou, Shuming
AU - Sun, Xueli
AU - Lian, Guanqin
AU - Liu, Jiafei
AU - Wang, Dajin
PY - 2019/11/12
Y1 - 2019/11/12
AB - As the sizes of multiprocessor systems grow, chances of processors becoming faulty increase, making it an important issue to diagnose faulty nodes in the system. Different models have been proposed and studied. One proposed by Huang et al. employed a probabilistic fault model to determine the status of a cluster of nodes in rectangular grid structures [8]. Later, Tang et al. extended the diagnosis algorithm to more general, regular topologies, in which any pair of adjacent nodes has no common neighbors [25]. In a recent work by Lu et al., the algorithm was further extended to regular topologies where any pair of adjacent nodes has a certain number of common neighbors [18]. In this paper, we extend the threshold to apply the probabilistic diagnosis algorithm for hypercube-based multiprocessor system, and carry out an analysis on the algorithm's effectiveness. The analysis shows a very high rate of correct diagnosis, both for an individual node and for nodes as a whole. Although the analysis is done for a particular regular network (the hypercube), the outcome can serve as a useful reference, and can shed light on the effectiveness of the probabilistic diagnosis for a large group of triangle-free multiprocessor systems.

KW - Clustered faults
KW - Diagnosis algorithm
KW - Fault tolerance
KW - Hypercube
KW - Probabilistic diagnosis model
KW - Reliability
UR - http://www.scopus.com/inward/record.url?scp=85071286520&partnerID=8YFLogxK
U2 - 10.1016/j.tcs.2019.06.023
DO - 10.1016/j.tcs.2019.06.023
M3 - Article
AN - SCOPUS:85071286520
VL - 793
SP - 113
EP - 131
JO - Theoretical Computer Science
JF - Theoretical Computer Science
SN - 0304-3975
ER -