Automation and management of scientific workflows in distributed network environments

Qishi Wu, Michelle Zhu, Xukang Lu, Patrick Brown, Yunyue Lin, Yi Gu, Fei Cao, Michael A. Reuter

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Research > peer-review

18 Citations (Scopus)

Abstract

Large-scale computation-intensive applications in various science fields feature complex DAG-structured workflows comprised of distributed computing modules with intricate inter-module dependencies. Supporting such workflows in heterogeneous network environments and optimizing their end-to-end performance are crucial to the success of large-scale collaborative scientific applications. We design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a set of easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in distributed network environments. The current version of SWAMP integrates the graphical user interface of Kepler to compose abstract workflows and employs Condor DAGMan for workflow dispatch and execution. SWAMP provides a web-based user interface to automate and manage workflow executions and uses a special workflow mapper to optimize the end-to-end workflow performance. A case study of the workflow for Spallation Neutron Source datasets in real networks is presented to show the efficacy of the proposed platform.

Original language: English
Title of host publication: Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010
DOIs: https://doi.org/10.1109/IPDPSW.2010.5470720
State: Published - 2 Jul 2010
Event: 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010 - Atlanta, GA, United States
Duration: 19 Apr 2010 to 23 Apr 2010

Publication series

Name: Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010

Other

Other: 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010
Country: United States
City: Atlanta, GA
Period: 19/04/10 to 23/04/10

Fingerprint

Distributed Networks
Work Flow
Automation
Scientific Workflow
Neutron sources
Heterogeneous networks
Distributed computer systems
Graphical user interfaces
User interfaces
Module
Kepler
Computing
Distributed Computing
Neutron
Networking
Web-based
User Interface
Efficacy

Keywords

  • Distributed computing
  • Scientific workflow
  • Workflow management

Cite this

Wu, Q., Zhu, M., Lu, X., Brown, P., Lin, Y., Gu, Y., ... Reuter, M. A. (2010). Automation and management of scientific workflows in distributed network environments. In Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010 [5470720] (Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010). https://doi.org/10.1109/IPDPSW.2010.5470720
Wu, Qishi ; Zhu, Michelle ; Lu, Xukang ; Brown, Patrick ; Lin, Yunyue ; Gu, Yi ; Cao, Fei ; Reuter, Michael A. / Automation and management of scientific workflows in distributed network environments. Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010. 2010. (Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010).
@inproceedings{f6c4c3c9629f47918d3690c5aada40e1,
title = "Automation and management of scientific workflows in distributed network environments",
abstract = "Large-scale computation-intensive applications in various science fields feature complex DAG-structured workflows comprised of distributed computing modules with intricate inter-module dependencies. Supporting such workflows in heterogeneous network environments and optimizing their end-to-end performance are crucial to the success of large-scale collaborative scientific applications. We design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a set of easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in distributed network environments. The current version of SWAMP integrates the graphical user interface of Kepler to compose abstract workflows and employs Condor DAGMan for workflow dispatch and execution. SWAMP provides a web-based user interface to automate and manage workflow executions and uses a special workflow mapper to optimize the end-to-end workflow performance. A case study of the workflow for Spallation Neutron Source datasets in real networks is presented to show the efficacy of the proposed platform.",
keywords = "Distributed computing, Scientific workflow, Workflow management",
author = "Qishi Wu and Michelle Zhu and Xukang Lu and Patrick Brown and Yunyue Lin and Yi Gu and Fei Cao and Reuter, {Michael A.}",
year = "2010",
month = "7",
day = "2",
doi = "10.1109/IPDPSW.2010.5470720",
language = "English",
isbn = "9781424465347",
series = "Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010",
booktitle = "Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010",

}

Wu, Q, Zhu, M, Lu, X, Brown, P, Lin, Y, Gu, Y, Cao, F & Reuter, MA 2010, Automation and management of scientific workflows in distributed network environments. in Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010., 5470720, Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010, 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010, Atlanta, GA, United States, 19/04/10. https://doi.org/10.1109/IPDPSW.2010.5470720

Automation and management of scientific workflows in distributed network environments. / Wu, Qishi; Zhu, Michelle; Lu, Xukang; Brown, Patrick; Lin, Yunyue; Gu, Yi; Cao, Fei; Reuter, Michael A.

Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010. 2010. 5470720 (Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010).

TY - GEN

T1 - Automation and management of scientific workflows in distributed network environments

AU - Wu, Qishi

AU - Zhu, Michelle

AU - Lu, Xukang

AU - Brown, Patrick

AU - Lin, Yunyue

AU - Gu, Yi

AU - Cao, Fei

AU - Reuter, Michael A.

PY - 2010/7/2

Y1 - 2010/7/2

AB - Large-scale computation-intensive applications in various science fields feature complex DAG-structured workflows comprised of distributed computing modules with intricate inter-module dependencies. Supporting such workflows in heterogeneous network environments and optimizing their end-to-end performance are crucial to the success of large-scale collaborative scientific applications. We design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a set of easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in distributed network environments. The current version of SWAMP integrates the graphical user interface of Kepler to compose abstract workflows and employs Condor DAGMan for workflow dispatch and execution. SWAMP provides a web-based user interface to automate and manage workflow executions and uses a special workflow mapper to optimize the end-to-end workflow performance. A case study of the workflow for Spallation Neutron Source datasets in real networks is presented to show the efficacy of the proposed platform.

KW - Distributed computing

KW - Scientific workflow

KW - Workflow management

UR - http://www.scopus.com/inward/record.url?scp=77954040561&partnerID=8YFLogxK

U2 - 10.1109/IPDPSW.2010.5470720

DO - 10.1109/IPDPSW.2010.5470720

M3 - Conference contribution

SN - 9781424465347

T3 - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010

BT - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010

ER -

Wu Q, Zhu M, Lu X, Brown P, Lin Y, Gu Y et al. Automation and management of scientific workflows in distributed network environments. In Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010. 2010. 5470720. (Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010). https://doi.org/10.1109/IPDPSW.2010.5470720