The effect of local scheduling in load balancing designs
Load balancing is a common approach to task assignment in distributed architectures such as web server farms, database systems, grid computing clusters, and others. In such designs there is a dispatcher that seeks to balance the assignment of service requests (jobs) across the servers in the system so that the response time of jobs at each server is (nearly) the same. Such designs are popular due to the increased robustness they provide to bursts of traffic, server failures, etc., as well as the inherent scalability they provide. However, there is also a major drawback to load balancing designs – some performance is sacrificed. Specifically, it would be possible to reduce user response times by moving away from load balancing designs. Our goal in this paper is to study the degree of inefficiency in load balancing designs. Further, we will show that the degree of inefficiency depends on the scheduling discipline used locally at each of the servers, i.e., the local scheduler. Our results (see Section 3) show that the local scheduling policy has a significant impact on the degree of inefficiency in load balancing designs. In particular, the local scheduler in traditional designs is often modeled by Processor Sharing (PS), which shares the server evenly among all jobs in the system. When the local scheduler is PS, the degree of inefficiency grows linearly with the number of servers in the system. In contrast, if the local scheduler is changed to Shortest Remaining Processing Time first (SRPT), as has been suggested in a variety of modern designs [7, 3, 10], the degree of inefficiency can be independent of the number of servers in the system and instead depend only on the heterogeneity of server speeds.
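To make the PS-versus-SRPT contrast concrete, the following is a minimal single-server sketch (not the paper's model or analysis). It simulates SRPT event by event and compares its mean response time against the exact M/M/1-PS value, which by insensitivity equals 1/(μ − λ). The names `simulate_srpt`, `lam`, and `mu` are illustrative, and the Poisson-arrival/exponential-size workload is an assumption chosen only so the PS benchmark has a closed form.

```python
import heapq
import random

def simulate_srpt(arrivals, sizes):
    """Event-driven SRPT at a single server.

    arrivals: sorted arrival times; sizes: matching job sizes.
    Always serves the job with the smallest remaining work,
    preempting whenever a shorter job arrives.
    Returns the mean response time (completion - arrival).
    """
    n = len(arrivals)
    t = 0.0
    i = 0                 # index of next arrival
    active = []           # min-heap of (remaining_work, arrival_time)
    responses = []
    while i < n or active:
        if not active:
            # server idle: jump to the next arrival
            t = max(t, arrivals[i])
            heapq.heappush(active, (sizes[i], arrivals[i]))
            i += 1
            continue
        rem, arr = active[0]
        next_arr = arrivals[i] if i < n else float("inf")
        if t + rem <= next_arr:
            # shortest job finishes before the next arrival
            heapq.heappop(active)
            t += rem
            responses.append(t - arr)
        else:
            # work until the arrival, then reconsider who is shortest
            heapq.heapreplace(active, (rem - (next_arr - t), arr))
            t = next_arr
            heapq.heappush(active, (sizes[i], arrivals[i]))
            i += 1
    return sum(responses) / len(responses)

# Hypothetical workload: Poisson arrivals at rate lam, exponential sizes at rate mu.
random.seed(1)
lam, mu, n = 0.8, 1.0, 50_000
t = 0.0
arrivals = []
for _ in range(n):
    t += random.expovariate(lam)
    arrivals.append(t)
sizes = [random.expovariate(mu) for _ in range(n)]

ps_mean = 1.0 / (mu - lam)   # exact M/M/1-PS mean response time (insensitive)
srpt_mean = simulate_srpt(arrivals, sizes)
print(f"PS mean response: {ps_mean:.2f}, SRPT mean response: {srpt_mean:.2f}")
```

Under this workload SRPT's mean response time comes out well below the PS value, which is the per-server gap the abstract refers to; the paper's contribution concerns how this gap interacts with the dispatcher's load balancing across many servers.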
© 2008 ACM.