Hadoop MapReduce Multi-Job Workloads Using a Resource-Aware Scheduler


Shivakumar N., Rashmi, Anirban Basu


Cloud computing offers a flexible computing infrastructure for large-scale data processing. MapReduce is a typical model providing a logical framework for cloud computing, and Hadoop, an open-source implementation of MapReduce, is a common platform for realizing this kind of parallel computing model. We present a resource-aware scheduling technique for MapReduce multi-job workloads that aims to improve resource utilization across machines while observing completion-time goals. Existing MapReduce schedulers define a static number of slots to represent the capacity of a cluster, creating a fixed number of execution slots per machine. This abstraction works for homogeneous workloads but fails to capture the differing resource requirements of individual jobs in multi-user environments. Our technique leverages job-profiling information to dynamically adjust the number of slots on each machine, as well as workload placement across machines, to maximize the resource utilization of the cluster.
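To make the contrast with fixed slots concrete, the following is a minimal sketch (not the paper's actual algorithm) of resource-aware slot sizing: given a hypothetical per-job resource profile, the number of slots on a machine is derived from the machine's CPU and memory capacity rather than set statically. All names, field keys, and capacity figures here are illustrative assumptions.

```python
def dynamic_slots(machine, job_profile):
    """Return how many tasks of a given job fit on a machine, bounded by
    whichever resource (CPU or memory) is exhausted first. This replaces
    a fixed slots-per-machine setting with a per-job, per-machine count.
    (Illustrative sketch; not the scheduler described in the paper.)"""
    cpu_slots = machine["cpu_cores"] // job_profile["cpu_per_task"]
    mem_slots = machine["mem_mb"] // job_profile["mem_per_task_mb"]
    return int(min(cpu_slots, mem_slots))

# A hypothetical machine with 8 cores and 16 GB of memory.
machine = {"cpu_cores": 8, "mem_mb": 16384}

# Two jobs with different profiles (assumed values for illustration).
cpu_heavy = {"cpu_per_task": 4, "mem_per_task_mb": 512}   # CPU-bound
mem_heavy = {"cpu_per_task": 1, "mem_per_task_mb": 4096}  # memory-bound

print(dynamic_slots(machine, cpu_heavy))  # 2 slots: limited by CPU
print(dynamic_slots(machine, mem_heavy))  # 4 slots: limited by memory
```

A static scheduler would assign both jobs the same number of slots on this machine; sizing slots from the job profile yields 2 for the CPU-bound job and 4 for the memory-bound one, which is the intuition behind adjusting slot counts and placement per machine.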

Keywords: MapReduce, scheduling, resource-awareness, performance management, large-scale data processing, Hadoop.


