Data handling with SAM and art at the NOνA experiment

Aurisano, A. and Backhouse, C. and Davies, G. S. and Illingworth, R. and Mayer, N. and Mengel, M. and Norman, A. and Rocco, D. and Zirnstein, J. (2015) Data handling with SAM and art at the NOνA experiment. Journal of Physics: Conference Series, 664 . Art. No. 042001. ISSN 1742-6596. https://resolver.caltech.edu/CaltechAUTHORS:20160427-075631357

PDF - Published Version (2707 KB). Creative Commons Attribution.

Abstract

During operations, NOvA produces between 5,000 and 7,000 raw files per day with peaks in excess of 12,000. These files must be processed in several stages to produce fully calibrated and reconstructed analysis files. In addition, many simulated neutrino interactions must be produced and processed through the same stages as data. To accommodate the large volume of data and Monte Carlo, production must be possible both on the Fermilab grid and on off-site farms, such as the ones accessible through the Open Science Grid. To handle the challenge of cataloging these files and to facilitate their off-line processing, we have adopted the SAM system developed at Fermilab. SAM indexes files according to metadata, keeps track of each file's physical locations, provides dataset management facilities, and facilitates data transfer to off-site grids. To integrate SAM with Fermilab's art software framework and the NOvA production workflow, we have developed methods to embed metadata into our configuration files, art files, and standalone ROOT files. A module in the art framework propagates the embedded information from configuration files into art files, and from input art files to output art files, allowing us to maintain a complete processing history within our files. Embedding metadata in configuration files also allows configuration files indexed in SAM to be used as inputs to Monte Carlo production jobs. Further, SAM keeps track of the input files used to create each output file. Parentage information enables the construction of self-draining datasets, which have become the primary production paradigm used at NOvA. In this paper we will present an overview of SAM at NOvA and how it has transformed the file production framework used by the experiment.
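
The role of parentage information in a self-draining dataset can be illustrated with a toy model. The sketch below is not the SAM API: `ToyCatalog`, `FileRecord`, the `data_tier` metadata key, and the file names are all hypothetical, chosen only to show the idea that a dataset defined as "inputs with no output of a given kind yet" shrinks as outputs are declared with their parents.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """One catalogued file: name, metadata, and the inputs it was made from."""
    name: str
    metadata: dict
    parents: list = field(default_factory=list)


class ToyCatalog:
    """In-memory stand-in for a file catalog (hypothetical, not the SAM API)."""

    def __init__(self):
        self.files = {}

    def declare(self, name, metadata, parents=()):
        # Register a file together with its metadata and parent file names.
        self.files[name] = FileRecord(name, dict(metadata), list(parents))

    def children(self, name):
        # All files that list `name` among their parents.
        return [f for f in self.files.values() if name in f.parents]

    def draining_dataset(self, input_tier, output_tier):
        """Files of input_tier that have no child of output_tier yet."""
        pending = []
        for f in self.files.values():
            if f.metadata.get("data_tier") != input_tier:
                continue
            done = any(c.metadata.get("data_tier") == output_tier
                       for c in self.children(f.name))
            if not done:
                pending.append(f.name)
        return sorted(pending)


catalog = ToyCatalog()
catalog.declare("run1.raw", {"data_tier": "raw"})
catalog.declare("run2.raw", {"data_tier": "raw"})
before = catalog.draining_dataset("raw", "reco")

# Declaring an output with its parent removes that parent from the dataset.
catalog.declare("run1.reco.root", {"data_tier": "reco"}, parents=["run1.raw"])
after = catalog.draining_dataset("raw", "reco")
print(before, after)
```

In this toy model a production job only needs to ask for the current contents of the dataset; files that already have an output drop out on their own, so resubmitting the same request does not reprocess them.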


Item Type:Article
Related URLs:
URL | URL Type | Description
http://dx.doi.org/10.1088/1742-6596/664/4/042001 | DOI | Article
http://iopscience.iop.org/article/10.1088/1742-6596/664/4/042001/meta | Publisher | Article
http://chep2015.kek.jp/ | Organization | Conference Website
Additional Information:© 2015 Published under licence by IOP Publishing Ltd. Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. The authors acknowledge the support of the Fermilab scientific and technical staff in carrying out this research. Fermilab is operated by Fermi Research Alliance, LLC under contract No. DE-AC02-07CH11359 with the United States Department of Energy.
Funders:
Funding Agency | Grant Number
Department of Energy (DOE) | DE-AC02-07CH11359
Record Number:CaltechAUTHORS:20160427-075631357
Persistent URL:https://resolver.caltech.edu/CaltechAUTHORS:20160427-075631357
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:66496
Collection:CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On:27 Apr 2016 17:48
Last Modified:03 Oct 2019 09:57