
ActiveSpaces: Exploring dynamic code deployment for extreme scale data processing

Docan, Ciprian and Zhang, Fan and Jin, Tong and Bui, Hoang and Sun, Qian and Cummings, Julian and Podhorszki, Norbert and Klasky, Scott and Parashar, Manish (2015) ActiveSpaces: Exploring dynamic code deployment for extreme scale data processing. Concurrency and Computation: Practice and Experience, 27 (14). pp. 3724-3745. ISSN 1532-0626. https://resolver.caltech.edu/CaltechAUTHORS:20151106-104036001

Full text is not posted in this repository. Consult Related URLs below.


Abstract

Managing the large volumes of data produced by emerging scientific and engineering simulations running on leadership-class resources has become a critical challenge. The data have to be extracted off the computing nodes and transported to consumer nodes so that they can be processed, analyzed, visualized, archived, and so on. Several recent research efforts have addressed data-related challenges at different levels. One attractive approach is to offload expensive input/output operations to a smaller set of dedicated computing nodes known as a staging area. However, even using this approach, the data still have to be moved from the staging area to consumer nodes for processing, which continues to be a bottleneck. In this paper, we investigate an alternate approach, namely moving the data-processing code to the staging area instead of moving the data to the data-processing code. Specifically, we describe the ActiveSpaces framework, which provides (1) programming support for defining the data-processing routines to be downloaded to the staging area and (2) runtime mechanisms for transporting codes associated with these routines to the staging area, executing the routines on the nodes that are part of the staging area, and returning the results. We also present an experimental performance evaluation of ActiveSpaces using applications running on the Cray XT5 at Oak Ridge National Laboratory. Finally, we use a coupled fusion application workflow to explore the trade-offs between transporting data and transporting the code required for data processing during coupling, and we characterize sweet spots for each option.
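The trade-off described in the abstract can be illustrated with a toy, single-process sketch. This is not the ActiveSpaces API: the `StagingArea` class, its `put`/`get`/`run` methods, the `ROUTINE` source string, and the use of pickled byte counts as a stand-in for network traffic are all assumptions made for illustration only.

```python
import pickle


class StagingArea:
    """Toy in-memory stand-in for a staging area (hypothetical, not ActiveSpaces)."""

    def __init__(self):
        self._store = {}

    def put(self, name, values):
        # A simulation would write its output here.
        self._store[name] = list(values)

    def get(self, name):
        # Data-to-code path: the whole object is serialized and shipped out.
        payload = pickle.dumps(self._store[name])
        return pickle.loads(payload), len(payload)

    def run(self, name, routine_source, entry_point):
        # Code-to-data path: the routine's source is shipped in, executed
        # next to the data, and only the result is shipped back.
        # exec() is used purely for illustration; never run untrusted code this way.
        namespace = {}
        exec(routine_source, namespace)
        result = namespace[entry_point](self._store[name])
        payload = pickle.dumps(result)
        moved = len(routine_source.encode()) + len(payload)
        return pickle.loads(payload), moved


# Hypothetical data-processing routine, defined as source text so it can be "transported".
ROUTINE = """
def mean(values):
    return sum(values) / len(values)
"""

staging = StagingArea()
staging.put("temperature", [float(i) for i in range(10000)])

# Option 1: move the data to the consumer, then process it locally.
data, bytes_data_path = staging.get("temperature")
local_mean = sum(data) / len(data)

# Option 2: move the code to the staging area and return only the result.
remote_mean, bytes_code_path = staging.run("temperature", ROUTINE, "mean")

print(local_mean, remote_mean)
print(bytes_data_path, bytes_code_path)
```

In this sketch both options compute the same value, and for a large array the code-to-data path moves far fewer bytes. For a very small array the routine's source text can outweigh the data, which is a crude analogue of the sweet spots the abstract mentions for each option.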


Item Type: Article
Related URLs:
URL | URL Type | Description
http://dx.doi.org/10.1002/cpe.3407 | DOI | Article
http://onlinelibrary.wiley.com/doi/10.1002/cpe.3407/abstract | Publisher | Article
ORCID:
Author | ORCID
Zhang, Fan | 0000-0002-3643-018X
Additional Information: © 2014 John Wiley & Sons, Ltd. Received 29 March 2014; Revised 18 August 2014; Accepted 25 August 2014. The research presented in this work is supported in part by the US National Science Foundation (NSF) via grant numbers ACI 1339036, ACI 1310283, DMS 1228203, and IIP 0758566; by the Director, Office of Advanced Scientific Computing Research, Office of Science, of the US Department of Energy through the Scientific Discovery through Advanced Computing (SciDAC) Institute of Scalable Data Management, Analysis and Visualization (SDAV) under award number DE-SC0007455; by the Advanced Scientific Computing Research and Fusion Energy Sciences Partnership for Edge Physics Simulations (EPSI) under award number DE-FG02-06ER54857; by the ExaCT Combustion Co-Design Center via subcontract number 4000110839 from UT Battelle; by the RSVP grant via subcontract number 4000126989 from UT Battelle; and by an IBM Faculty Award. The research was conducted as part of the NSF Cloud and Autonomic Computing (CAC) Center at Rutgers University and the Rutgers Discovery Informatics Institute (RDI2).
Funders:
Funding Agency | Grant Number
NSF | ACI 1339036
NSF | ACI 1310283
NSF | DMS 1228203
NSF | IIP 0758566
Department of Energy (DOE) | DE-SC0007455
Department of Energy (DOE) | DE-FG02-06ER54857
UT Battelle | 4000110839
UT Battelle | 4000126989
IBM Faculty Award | UNSPECIFIED
Subject Keywords: dynamic code deployment; in situ data processing; data-intensive application workflows; coupled simulations
Issue or Number: 14
Record Number: CaltechAUTHORS:20151106-104036001
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20151106-104036001
Official Citation: Docan, C., Zhang, F., Jin, T., Bui, H., Sun, Q., Cummings, J., Podhorszki, N., Klasky, S., and Parashar, M. (2015) ActiveSpaces: Exploring dynamic code deployment for extreme scale data processing. Concurrency Computat.: Pract. Exper., 27: 3724–3745. doi: 10.1002/cpe.3407
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 61940
Collection: CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On: 06 Nov 2015 19:10
Last Modified: 09 Mar 2020 13:19
