Abstract
Hadoop YARN is an open-source Apache Software Foundation project that provides a resource management framework for large-scale parallel data processing, such as MapReduce jobs. The Fair Scheduler is widely used in YARN to share resources fairly among applications. However, the Fair Scheduler has a problem when the resources requested by applications exceed what the cluster can provide. In that case, the YARN system halts if all resources are occupied by Application Masters, the special task in each job that negotiates resources for processing tasks and coordinates job execution. To solve this problem, we propose an automatic and dynamic admission control mechanism that prevents this halt from occurring when the requested resources exceed the cluster's capacity, and that dynamically reserves resources for processing tasks to improve performance, e.g., by reducing the makespans of MapReduce jobs. After collecting resource usage information from each worker node, our mechanism dynamically predicts the amount of resources to reserve for processing tasks and automatically controls the running jobs based on that prediction. We implement the new mechanism in Hadoop YARN and evaluate it with representative MapReduce benchmarks. The experimental results show the effectiveness and robustness of the mechanism under both homogeneous and heterogeneous workloads.
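The halt described in the abstract can be illustrated with a minimal sketch. This is not the mechanism proposed in the paper: it assumes a cluster of equal-sized containers, one Application Master container per job, and a fixed reserve for processing tasks rather than a dynamically predicted one. The names `Cluster`, `admit`, and `is_stalled` are hypothetical and are not part of the YARN API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Cluster:
    # Assumption: capacity is counted in equal-sized containers.
    total_containers: int
    # Assumption: a fixed number of containers is kept free of Application Masters.
    reserved_for_tasks: int = 0

    def max_application_masters(self) -> int:
        """Upper bound on concurrently running Application Masters."""
        return max(self.total_containers - self.reserved_for_tasks, 0)


def admit(cluster: Cluster, pending_jobs: int) -> Tuple[int, int]:
    """Admit jobs up to the Application Master limit.

    Returns (admitted_jobs, containers_left_for_tasks), where each admitted
    job consumes exactly one container for its Application Master.
    """
    admitted = min(pending_jobs, cluster.max_application_masters())
    return admitted, cluster.total_containers - admitted


def is_stalled(admitted_jobs: int, containers_left: int) -> bool:
    """Jobs are running, but no container remains for any processing task."""
    return admitted_jobs > 0 and containers_left == 0


if __name__ == "__main__":
    # No reserve: 10 jobs on 8 containers fill the cluster with Application Masters.
    no_reserve = admit(Cluster(total_containers=8), pending_jobs=10)
    print(no_reserve, is_stalled(*no_reserve))      # (8, 0) True

    # Reserving 4 containers caps admission at 4 jobs and leaves room for tasks.
    with_reserve = admit(Cluster(total_containers=8, reserved_for_tasks=4), pending_jobs=10)
    print(with_reserve, is_stalled(*with_reserve))  # (4, 4) False
```

In this toy model the reserve trades concurrency for progress: fewer jobs are admitted at once, but every admitted job can obtain containers for its tasks.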
Original language | English |
---|---|
Pages (from-to) | 53-67 |
Number of pages | 15 |
Journal | Scalable Computing |
Volume | 19 |
Issue number | 1 |
State | Published - 2018 |
Keywords
- Admission control
- Big data
- Cloud computing
- Cluster computing
- Hadoop
- MapReduce
- Resource management
- Scalable computing
- Scheduling
- YARN