Budget constrained dataflow scheduling for minimized completion time on the cloud

Dabin Ding, Fei Cao, Dunren Che, Michelle Zhu, Wen Chi Hou

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Cloud computing provides high-end computing capabilities so that users can access data and applications anywhere in the world on demand and pay for what they use. It is emerging as a promising computing paradigm for large-scale data-intensive queries, which are usually modeled as complex Directed Acyclic Graph (DAG)-structured data processing dataflows with arbitrary data operators as nodes and producer-consumer interactions as directed edges. Scheduling dataflows on the Cloud is a complex and challenging optimization problem, similar to query optimization. The optimization must satisfy a variety of objectives and constraints while taking into account the particular characteristics of the underlying Cloud environment. In addition to achieving minimum query completion time, the commercialization of Clouds requires policies that take users' economic concerns into account as well. In this paper, we formulate the scheduling of dataflows onto Cloud resources with the objective of minimizing the query completion time under a given budget constraint. A heuristic scheduling algorithm, Layer-oriented Resource Allocation within Budget constraint (LRA-B), is proposed and evaluated. Experiments are conducted on numerous dataflows and Cloud environment configurations, and the overall results are promising and indicate the effectiveness of our algorithm.
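
The abstract states the problem (a DAG of operators, completion time to minimize, a budget cap) but gives no details of LRA-B itself. The following is only a minimal sketch of that problem setting, not the authors' algorithm. Everything in it is an assumption made for illustration: the function names `layers` and `plan`, the cost model (each operator runs on its own VM, billed as work / speed * price), the rule that all operators in a layer share one VM type, and the greedy upgrade rule.

```python
from collections import defaultdict


def layers(dag):
    """Group DAG nodes into layers by longest distance from a source node.

    dag maps each node to the list of its successors (hypothetical format).
    """
    nodes = set(dag) | {s for succs in dag.values() for s in succs}
    preds = defaultdict(set)
    for u, succs in dag.items():
        for v in succs:
            preds[v].add(u)

    level = {}

    def depth(n):
        # A source has no predecessors, so its depth is 1 + (-1) = 0.
        if n not in level:
            level[n] = 1 + max((depth(p) for p in preds[n]), default=-1)
        return level[n]

    grouped = defaultdict(list)
    for n in sorted(nodes):
        grouped[depth(n)].append(n)
    return [grouped[i] for i in range(len(grouped))]


def plan(dag, work, vm_types, budget):
    """Greedy, layer-by-layer VM selection under a budget (illustrative only).

    work: node -> amount of work; vm_types: list of (speed, price per time
    unit), ordered from slowest/cheapest to fastest/most expensive.
    Returns (layers, chosen VM type index per layer, completion time, cost).
    """
    lay = layers(dag)
    choice = [0] * len(lay)  # start every layer on the cheapest type

    def layer_time(i, t):
        # Operators in a layer run in parallel; the slowest one decides.
        return max(work[n] for n in lay[i]) / vm_types[t][0]

    def layer_cost(i, t):
        speed, price = vm_types[t]
        return sum(work[n] / speed * price for n in lay[i])

    def total_cost():
        return sum(layer_cost(i, c) for i, c in enumerate(choice))

    if total_cost() > budget:
        raise ValueError("budget too small even for the cheapest VM type")

    while True:
        spent = total_cost()
        best = None  # (time saved per extra cost, layer index)
        for i, c in enumerate(choice):
            if c + 1 >= len(vm_types):
                continue
            extra = layer_cost(i, c + 1) - layer_cost(i, c)
            gain = layer_time(i, c) - layer_time(i, c + 1)
            if gain <= 0 or spent + extra > budget:
                continue
            ratio = gain / extra if extra > 0 else float("inf")
            if best is None or ratio > best[0]:
                best = (ratio, i)
        if best is None:
            break
        choice[best[1]] += 1

    makespan = sum(layer_time(i, c) for i, c in enumerate(choice))
    return lay, choice, makespan, total_cost()


# Hypothetical four-operator dataflow: two scans feed a join, then an aggregate.
example_dag = {"scan_a": ["join"], "scan_b": ["join"], "join": ["agg"], "agg": []}
example_work = {"scan_a": 4, "scan_b": 8, "join": 6, "agg": 2}
example_vms = [(1.0, 1.0), (2.0, 3.0)]
```

Calling `plan(example_dag, example_work, example_vms, budget=25)` on this toy input keeps the two scans on the cheap VM type and upgrades the join and aggregate layers, because upgrading the scan layer would exceed the budget. A real scheduler would also have to model data transfer between operators and the provider's billing granularity, which this sketch ignores.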

Original language: English
Pages (from-to): 208-220
Number of pages: 13
Journal: International Journal of Computers and their Applications
Volume: 20
Issue number: 4
State: Published - 1 Dec 2013

Fingerprint

Scheduling
Heuristic algorithms
Cloud computing
Scheduling algorithms
Resource allocation
Economics
Experiments

Keywords

  • Budget constraint
  • Cloud computing
  • Dataflows
  • Query completion time
  • Scheduling

Cite this

@article{a7ed96cf45cc4d31931c629787f77da2,
title = "Budget constrained dataflow scheduling for minimized completion time on the cloud",
abstract = "Cloud computing provides high-end computing capabilities so that users can access data and applications anywhere in the world on demand and pay for what they use. It is emerging as a promising computing paradigm for large-scale data-intensive queries, which are usually modeled as complex Directed Acyclic Graph (DAG)-structured data processing dataflows with arbitrary data operators as nodes and producer-consumer interactions as directed edges. Scheduling dataflows on the Cloud is a complex and challenging optimization problem, similar to query optimization. The optimization must satisfy a variety of objectives and constraints while taking into account the particular characteristics of the underlying Cloud environment. In addition to achieving minimum query completion time, the commercialization of Clouds requires policies that take users' economic concerns into account as well. In this paper, we formulate the scheduling of dataflows onto Cloud resources with the objective of minimizing the query completion time under a given budget constraint. A heuristic scheduling algorithm, Layer-oriented Resource Allocation within Budget constraint (LRA-B), is proposed and evaluated. Experiments are conducted on numerous dataflows and Cloud environment configurations, and the overall results are promising and indicate the effectiveness of our algorithm.",
keywords = "Budget constraint, Cloud computing, Dataflows, Query completion time, Scheduling",
author = "Ding, Dabin and Cao, Fei and Che, Dunren and Zhu, Michelle and Hou, {Wen Chi}",
year = "2013",
month = "12",
day = "1",
language = "English",
volume = "20",
pages = "208--220",
journal = "International Journal of Computers and their Applications",
issn = "1076-5204",
publisher = "International Society for Computers and Their Applications (ISCA)",
number = "4",
}

Budget constrained dataflow scheduling for minimized completion time on the cloud. / Ding, Dabin; Cao, Fei; Che, Dunren; Zhu, Michelle; Hou, Wen Chi.

In: International Journal of Computers and their Applications, Vol. 20, No. 4, 01.12.2013, p. 208-220.

TY - JOUR

T1 - Budget constrained dataflow scheduling for minimized completion time on the cloud

AU - Ding, Dabin

AU - Cao, Fei

AU - Che, Dunren

AU - Zhu, Michelle

AU - Hou, Wen Chi

PY - 2013/12/1

Y1 - 2013/12/1

AB - Cloud computing provides high-end computing capabilities so that users can access data and applications anywhere in the world on demand and pay for what they use. It is emerging as a promising computing paradigm for large-scale data-intensive queries, which are usually modeled as complex Directed Acyclic Graph (DAG)-structured data processing dataflows with arbitrary data operators as nodes and producer-consumer interactions as directed edges. Scheduling dataflows on the Cloud is a complex and challenging optimization problem, similar to query optimization. The optimization must satisfy a variety of objectives and constraints while taking into account the particular characteristics of the underlying Cloud environment. In addition to achieving minimum query completion time, the commercialization of Clouds requires policies that take users' economic concerns into account as well. In this paper, we formulate the scheduling of dataflows onto Cloud resources with the objective of minimizing the query completion time under a given budget constraint. A heuristic scheduling algorithm, Layer-oriented Resource Allocation within Budget constraint (LRA-B), is proposed and evaluated. Experiments are conducted on numerous dataflows and Cloud environment configurations, and the overall results are promising and indicate the effectiveness of our algorithm.

KW - Budget constraint

KW - Cloud computing

KW - Dataflows

KW - Query completion time

KW - Scheduling

UR - http://www.scopus.com/inward/record.url?scp=84892849656&partnerID=8YFLogxK

M3 - Article

VL - 20

SP - 208

EP - 220

JO - International Journal of Computers and their Applications

JF - International Journal of Computers and their Applications

SN - 1076-5204

IS - 4

ER -