Author(s): Mahesh.A, Pallavi.B, Shankar Thalla, Krishna Chaitanya Katkam

Most cloud applications today process a great deal of data to produce their results, and the data volumes to be processed are growing much faster than computing power. This disparity motivates new approaches to data analysis and processing. This project explores the use of the Hadoop MapReduce framework to execute scientific workflows in the cloud. Cloud computing can provide enormous clusters for efficient large-scale data partitioning and analysis. A distributed file system, a classical file-system model, serves as a key building block for cloud computing. Such file systems partition a file into a number of chunks and allocate each chunk to a distinct node. Because files are dynamically created, appended, and deleted, load imbalance can arise in a distributed file system; that is, the file chunks are no longer uniformly distributed among the nodes. Distributed file systems in production systems strongly depend on a central node for chunk reallocation. To overcome this problem, this paper presents a distributed load-rebalancing algorithm. To remove this dependence, each storage node performs the load-rebalancing algorithm independently, without acquiring global knowledge.
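As a rough illustration of the decentralized idea described above, the following is a minimal sketch (not the paper's actual algorithm) in which each node samples a few peers, estimates the average load from that local view only, and migrates chunks to its least-loaded sampled peer when it is above the estimate. The node names, the `sample_size` parameter, and the unit-chunk migration are all illustrative assumptions.

```python
import random

def rebalance_step(loads, sample_size=3, rng=random):
    """One decentralized rebalancing round (illustrative sketch).

    `loads` maps node id -> number of chunks held. Each node samples
    `sample_size` peers, estimates the average load from this local
    view alone (no global knowledge), and sheds chunks one at a time
    to its least-loaded sampled peer while it exceeds the estimate.
    """
    nodes = list(loads)
    rng.shuffle(nodes)
    for node in nodes:
        peers = rng.sample([n for n in nodes if n != node], sample_size)
        view = peers + [node]
        # Local estimate of the system-wide average load.
        avg = sum(loads[n] for n in view) / len(view)
        target = min(peers, key=lambda n: loads[n])
        # Migrate chunks until this node drops to its local estimate
        # or the target peer reaches it.
        while loads[node] > avg and loads[target] < avg:
            loads[node] -= 1
            loads[target] += 1
    return loads
```

Repeating such rounds tends to spread a skewed chunk distribution toward uniformity, with each node acting only on sampled, local information rather than relying on a central coordinator.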