Job scheduler for streaming applications in heterogeneous distributed processing systems

Ali Al-Sinayyid, Michelle Zhu

Research output: Contribution to journal (Article)

Abstract

In this study, we investigated the problem of scheduling streaming applications in a heterogeneous cluster environment and, based on our previous work, developed the maximum throughput scheduler algorithm (MT-Scheduler) for streaming applications. The proposed algorithm uses a dynamic programming technique to efficiently map the application topology onto the heterogeneous distributed system based on computing and data transfer requirements, while also taking into account the capacity of the underlying cluster resources. The proposed approach maximizes the system throughput by identifying and minimizing the time incurred at the computing/transfer bottleneck. The MT-Scheduler supports scheduling applications structured as a directed acyclic graph. We conducted experiments using three Storm microbenchmark topologies in both simulation and real Apache Storm environments. For the performance evaluation, we compared the proposed MT-Scheduler with the simulated round robin and the default Storm scheduler algorithms. The results indicated that the MT-Scheduler outperforms the default round robin approach in terms of both the average system latency and throughput.
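
The bottleneck-minimizing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' MT-Scheduler: it assumes a linear pipeline rather than a general directed acyclic graph, it ignores contention when several tasks share one node, and every name in it (`min_bottleneck_mapping`, `work`, `data`, `speed`, `bandwidth`) is hypothetical. It only shows how dynamic programming can pick a task-to-node mapping whose slowest computing or transfer stage is as fast as possible.

```python
from math import inf


def min_bottleneck_mapping(work, data, speed, bandwidth):
    """Map a linear pipeline onto heterogeneous nodes, minimizing the slowest stage.

    Illustrative assumptions (not taken from the paper):
      work[i]         compute demand of task i
      data[i]         data sent from task i to task i + 1 (len(work) - 1 entries)
      speed[j]        processing rate of node j
      bandwidth[k][j] link rate from node k to node j (unused when k == j)
    Returns (mapping, bottleneck), where mapping[i] is the node chosen for task i.
    """
    n, m = len(work), len(speed)
    # best[i][j]: smallest bottleneck time for tasks 0..i with task i on node j
    best = [[inf] * m for _ in range(n)]
    # prev[i][j]: node hosting task i - 1 in that best partial mapping
    prev = [[None] * m for _ in range(n)]

    for j in range(m):
        best[0][j] = work[0] / speed[j]

    for i in range(1, n):
        for j in range(m):
            compute = work[i] / speed[j]
            for k in range(m):
                # Co-located tasks are assumed to exchange data at no cost.
                transfer = 0.0 if k == j else data[i - 1] / bandwidth[k][j]
                candidate = max(best[i - 1][k], transfer, compute)
                if candidate < best[i][j]:
                    best[i][j] = candidate
                    prev[i][j] = k

    # Pick the best node for the last task, then walk the choices backwards.
    j = min(range(m), key=lambda c: best[n - 1][c])
    bottleneck = best[n - 1][j]
    mapping = [j]
    for i in range(n - 1, 0, -1):
        j = prev[i][j]
        mapping.append(j)
    mapping.reverse()
    return mapping, bottleneck


if __name__ == "__main__":
    mapping, bottleneck = min_bottleneck_mapping(
        work=[4, 8, 2],
        data=[10, 1],
        speed=[1, 4],
        bandwidth=[[0, 5], [5, 0]],
    )
    print("mapping:", mapping)
    print("bottleneck time:", bottleneck)
    print("throughput (items per time unit):", 1 / bottleneck)
```

In steady state a pipeline emits one item per bottleneck interval, so under these assumptions the throughput is the reciprocal of the returned bottleneck time, which is why minimizing the bottleneck maximizes throughput.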

Original language: English
Journal: Journal of Supercomputing
State: Accepted/In press - 1 Jan 2020

Keywords

  • Apache Storm
  • DAG scheduling
  • Data stream
  • Distributed systems
  • Heterogeneous scheduling
