Implementing and Performance Evaluation of Queue Management Scheme to Schedule Hadoop Jobs to Achieve Utilization Comparable To Distributed Scheme

Mukesh Singla


Finding the best scheduling method for a particular data processing request remains an important challenge. This work presents a performance analysis of static schedulers, namely the Capacity scheduler and the Fair scheduler, using four parameters: Average Map Time, Average Reduce Time, CPU Time and Physical Memory. Centralized schedulers offer predictable execution at the expense of utilization, whereas distributed schedulers increase cluster utilization but suffer from high job completion times when workloads are heterogeneous. To address this trade-off, queues can be introduced at the worker nodes. The contribution of our work is that, by using queues in centralized frameworks, we achieve utilization comparable to that of distributed schemes. We can then develop policies for active queue management that choose which task to execute next whenever a running task exits, with the goal of fast job completion times. Making ideal use of cluster resources in a distributed environment in turn decreases the completion time of a job. The goal of this work is to reduce the execution time of jobs while utilizing system resources in the best possible manner. The present work implements a queue management system that emphasizes decreasing job execution time with maximum utilization of resources, and assesses the performance of the proposed algorithm using three parameters: Total Time, Average Map Time and Capacity Used.
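
The idea of a worker-side queue that picks the next task whenever a running task exits can be illustrated with a minimal sketch. It is not the algorithm evaluated in this work: the class name WorkerQueue, the method names and the estimated task durations are hypothetical, and the Shortest Processing Time (SPT) ordering is only one possible policy, chosen here because SPT appears among the keywords.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class QueuedTask:
    est_seconds: float              # assumed estimate of task duration
    seq: int                        # arrival order, breaks ties (FIFO)
    name: str = field(compare=False)


class WorkerQueue:
    """Hypothetical per-worker queue ordered by Shortest Processing Time."""

    def __init__(self):
        self._heap = []
        self._seq = count()

    def enqueue(self, name, est_seconds):
        # Tasks wait at the worker instead of at the central scheduler.
        heapq.heappush(self._heap, QueuedTask(est_seconds, next(self._seq), name))

    def on_task_exit(self):
        # Called when a running task finishes: return the name of the
        # task to start next, or None if nothing is waiting.
        if not self._heap:
            return None
        return heapq.heappop(self._heap).name


queue = WorkerQueue()
queue.enqueue("reduce-1", 40.0)
queue.enqueue("map-7", 5.0)
queue.enqueue("map-9", 12.0)
order = [queue.on_task_exit() for _ in range(4)]
print(order)
```

Because the tasks already wait at the worker, a freed slot can be refilled immediately, which is the mechanism by which queuing is expected to raise utilization; the ordering policy then determines how quickly short tasks, and hence jobs, complete.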


Hadoop Distributed File System (HDFS), First In First Out (FIFO), Shortest Processing Time (SPT), Extraction, Transformation and Loading (ETL), Dynamic Task Splitting Scheduler (DTSS)


IJAIKD is currently indexed by Journal Seek.