Abstract
Efficiently managing resources and improving throughput in large-scale clusters has become a crucial problem with the explosion of data processing applications in recent years. Hadoop YARN and Mesos, two universal resource management platforms, have been widely adopted in commodity clusters for co-deploying multiple data processing frameworks, such as Hadoop MapReduce and Apache Spark. However, under existing resource management, a fixed amount of resources is exclusively allocated to a running task and can only be re-assigned after that task completes. This exclusive mode can under-utilize cluster resources and degrade system performance. To address this issue, we propose a novel opportunistic and efficient resource allocation scheme, named OPERA, which breaks the barriers among the encapsulated resource containers by leveraging knowledge of actual runtime resource utilization to re-assign opportunistically available resources to pending tasks. To avoid severe performance interference with active tasks, OPERA further uses two approaches that efficiently balance the starvation of reserved tasks and normal queued tasks. We implement and evaluate OPERA in Hadoop YARN v2.5.
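The core idea of re-assigning reserved but idle resources can be illustrated with a minimal sketch. This is not the OPERA implementation and does not use the YARN API: the `Container` type, the `headroom` safety margin, and the greedy placement order are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Container:
    """Hypothetical view of one running task's resource container."""
    task_id: str
    reserved: float  # resources exclusively allocated to the task (e.g. vcores)
    used: float      # resources the task actually uses at runtime


def opportunistic_capacity(containers, headroom=0.1):
    """Sum reserved-but-idle resources across containers.

    `headroom` is an assumed safety margin (a fraction of each reservation)
    held back to limit interference with the active task.
    """
    total = 0.0
    for c in containers:
        idle = c.reserved - c.used - headroom * c.reserved
        total += max(idle, 0.0)
    return total


def assign_pending(containers, pending, headroom=0.1):
    """Greedily launch pending (task_id, demand) pairs, in queue order,
    on the idle capacity; tasks that do not fit are skipped."""
    available = opportunistic_capacity(containers, headroom)
    launched = []
    for task_id, demand in pending:
        if demand <= available:
            launched.append(task_id)
            available -= demand
    return launched
```

For example, with one container reserving 4 units but using 1, and another fully using its 2 reserved units, only the first contributes idle capacity that pending tasks could borrow.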
Original language | English |
---|---|
Journal | IEEE Transactions on Cloud Computing |
State | Accepted/In press - 28 Aug 2018 |
Keywords
- Containers
- Data processing
- Hadoop YARN
- MapReduce Scheduling
- Opportunistic
- Reservation
- Resource Allocation
- Resource management
- Runtime
- Spark
- Sparks
- Starvation
- Task analysis
- Yarn