On Channel Failures, File Fragmentation Policies, and Heavy-Tailed Completion Times
It has been recently discovered that heavy-tailed completion times can result from protocol interaction even when file sizes are light-tailed. A key to this phenomenon is the use of a restart policy where if the file is interrupted before it is completed, it needs to restart from the beginning. In this paper, we show that fragmenting a file into pieces whose sizes are either bounded or independently chosen after each interruption guarantees light-tailed completion time as long as the file size is light-tailed; i.e., in this case, heavy-tailed completion time can only originate from heavy-tailed file sizes. If the file size is heavy-tailed, then the completion time is necessarily heavy-tailed. For this case, we show that when the file size distribution is regularly varying, then under independent or bounded fragmentation, the completion time tail distribution function is asymptotically bounded above by that of the original file size stretched by a constant factor. We then prove that if the distribution of times between interruptions has nondecreasing failure rate, the expected completion time is minimized by dividing the file into equal-sized fragments; this optimal fragment size is unique but depends on the file size. We also present a simple blind fragmentation policy where the fragment sizes are constant and independent of the file size and prove that it is asymptotically optimal. Both these policies are also shown to have desirable completion time tail behavior. Finally, we bound the error in expected completion time due to error in modeling of the failure process.
© 2014 IEEE. Manuscript received February 24, 2014; revised October 03, 2014; accepted October 31, 2014; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor D. Manjunath. Date of publication December 09, 2014; date of current version February 12, 2016. This work was supported by the ARO under MURI Grant W911NF-08-1-0233, the NSF under the NetSE Grant, the Caltech Lee Center for Advanced Networking, and the Australian Research Council under Grant DP0985322. The work of J. Nair was supported in part by an NWO VIDI Grant. A shorter version of this paper appeared in the IEEE International Conference on Computer Communications (INFOCOM), San Diego, CA, USA, March 15–19, 2010. A preliminary version was also presented at the Workshop on MAthematical Performance Modeling and Analysis (MAMA), Seattle, WA, USA, June 15, 2009. The authors thank A. Wierman, L. Chen, and M. Chandy for helpful discussions.
Accepted Version - FULLTEXT01.pdf