
Unicorn: Unified resource orchestration for multi-domain, geo-distributed data analytics

Xiang, Qiao and Wang, X. Tony and Zhang, J. Jensen and Newman, Harvey and Yang, Y. Richard and Liu, Y. Jace (2019) Unicorn: Unified resource orchestration for multi-domain, geo-distributed data analytics. Future Generation Computer Systems, 93. pp. 188-197. ISSN 0167-739X.

Full text is not posted in this repository. Consult Related URLs below.


As the data volume increases exponentially over time, data-intensive analytics benefits substantially from multi-organizational, geographically-distributed, collaborative computing, where different organizations contribute various yet scarce resources, e.g., computation, storage and networking resources, to collaboratively collect, share and analyze extremely large amounts of data. By analyzing the data analytics trace from the Compact Muon Solenoid (CMS) experiment, one of the largest scientific experiments in the world, and systematically examining the design of existing resource management systems for clusters, we show that the multi-domain, geo-distributed, resource-disaggregated nature of this new paradigm calls for a framework that manages a large set of distributively-owned, heterogeneous resources, with the objective of efficient resource utilization while respecting the autonomy and privacy of different domains. We also show that the fundamental challenge in designing such a framework is: how to accurately discover and represent the resource availability of a large set of distributively-owned, heterogeneous resources across different domains with minimal information exposure from each domain? Existing resource management systems are designed for single-domain clusters and cannot address this challenge. In this paper, we design Unicorn, the first unified resource orchestration framework for multi-domain, geo-distributed data analytics. In Unicorn, we encode the resource availability of each domain into a resource state abstraction, a variant of the network view abstraction extended to accurately represent the availability of multiple resources with minimal information exposure using a set of linear inequalities. We then design a novel, efficient cross-domain query algorithm and a privacy-preserving resource information integration protocol to discover and integrate the accurate, minimal resource availability information for a set of data analytics jobs across different domains. In addition, Unicorn contains a global resource orchestrator that computes optimal resource allocation decisions for data analytics jobs. We implement a prototype of Unicorn and present preliminary evaluation results to demonstrate its efficiency and efficacy. We also give a full demonstration of the Unicorn system at SuperComputing 2017.
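The abstract says that each domain's resource availability is expressed as a set of linear inequalities, which are then integrated across domains and handed to a global orchestrator. The following is only a minimal illustrative sketch of that general idea, not Unicorn's actual algorithm or protocol, which this record does not describe. The function names, the two-flow setup, and all capacities are hypothetical, and the privacy-preserving aspect is not modelled at all.

```python
from itertools import combinations

# Hypothetical setup: x[0] and x[1] are the bandwidths given to two flows.
# One inequality is (coeffs, bound), meaning coeffs[0]*x[0] + coeffs[1]*x[1] <= bound.

def integrate(*domain_views):
    """Stack the inequalities exposed by each domain into one system."""
    return [c for view in domain_views for c in view]

def is_feasible(x, constraints, eps=1e-9):
    """True if x is non-negative and satisfies every inequality."""
    if any(v < -eps for v in x):
        return False
    return all(
        sum(a * v for a, v in zip(coeffs, x)) <= bound + eps
        for coeffs, bound in constraints
    )

def max_total_bandwidth(constraints):
    """Maximize x[0] + x[1] by checking vertices of the feasible polygon.

    Assumes exactly two flows and a bounded feasible region.
    """
    # Non-negativity written in the same form: -x[i] <= 0.
    lines = list(constraints) + [((-1.0, 0.0), 0.0), ((0.0, -1.0), 0.0)]
    best = None
    for (a, b), (c, d) in combinations(lines, 2):
        det = a[0] * c[1] - a[1] * c[0]
        if abs(det) < 1e-12:
            continue  # parallel boundaries, no intersection point
        # Cramer's rule for the intersection of the two boundary lines.
        x = ((b * c[1] - a[1] * d) / det, (a[0] * d - b * c[0]) / det)
        if is_feasible(x, constraints) and (best is None or sum(x) > sum(best)):
            best = x
    return best

# Made-up views: domain A has a link shared by both flows,
# domain B has a link used only by flow 0.
domain_a = [((1.0, 1.0), 100.0)]
domain_b = [((1.0, 0.0), 40.0)]

system = integrate(domain_a, domain_b)
best = max_total_bandwidth(system)
```

In this toy instance the two domains reveal only one inequality each rather than their topologies, and the integrated system still lets the orchestrator find an allocation with a total of 100 units that respects both domains' limits.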

Item Type: Article
Related URLs:
URL | URL Type | Description
ORCID:
Xiang, Qiao | 0000-0002-3394-6279
Newman, Harvey | 0000-0003-0964-1480
Additional Information: © 2018 Elsevier B.V. Received 31 January 2018, Revised 25 June 2018, Accepted 19 September 2018, Available online 1 November 2018.
Funding Agency | Grant Number
National Natural Science Foundation of China | 61672385
China Postdoctoral Science Foundation | 2017-M611618
Army Research Laboratory (ARL) | W911NF-16-3-0001
Department of Energy (DOE) | DE-AC02-07CH11359
Department of Energy (DOE) | 000219898
Fermi National Accelerator Laboratory | 626507
Record Number: CaltechAUTHORS:20181106-140609850
Persistent URL:
Official Citation: Qiao Xiang, X. Tony Wang, J. Jensen Zhang, Harvey Newman, Y. Richard Yang, Y. Jace Liu, Unicorn: Unified resource orchestration for multi-domain, geo-distributed data analytics, Future Generation Computer Systems, Volume 93, 2019, Pages 188-197, ISSN 0167-739X.
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 90675
Deposited By: Tony Diaz
Deposited On: 06 Nov 2018 22:18
Last Modified: 09 Mar 2020 13:19
