Description
Simulation is an essential pillar of knowledge generation in science. The numerical models used to describe, predict, and understand real-world systems are typically complex. Consequently, applying these models by means of simulation often poses high demands on computational resources, and requires high-performance computing (HPC) or other dedicated hardware architectures. Metadata describing the details of a numerical experiment arise at all stages of the simulation process: the conceptual description of the model, the model implementation, and the tools and machines used to run the simulation. Capturing these metadata and provenance information along the processing chain is a vital requirement for several purposes, e.g. reproducibility, benchmarking and validation, assessment of the reliability of the simulations, and data exploration¹². The ability to search, share, and evaluate metadata and provenance traces from heterogeneous simulations and environments is a major challenge in provenance-driven analysis. The availability of a common metadata framework, which can be adopted by scientists from different scientific domains, would foster the meta-analysis of HPC simulation workflows³. Here, we develop a metadata management framework for generic HPC-based simulation research comprising concepts and tools for efficiently generating, organizing, and exploring metadata along a given simulation workflow. The derived solutions cope with the modularity and flexibility demands of rapidly progressing science and are applicable to diverse research fields. As a proof of concept, we will apply these solutions to use cases from environmental research and computational neuroscience.
References:
1. Guilyardi, E., et al. (2013) doi: 10.1175/BAMS-D-11-00035.1
2. Manninen, T., et al. (2018) doi: 10.3389/fninf.2018.00020
3. Ivie, P., & Thain, D. (2018) doi: 10.1145/3186266
Acknowledgements:
The authors would like to thank Jan Bumberger, Helen Kollai, Michael Denker, Rainer Stotzka, Guido Trensch, and Stefan Sandfeld for ongoing fruitful discussion. This project was funded by Helmholtz Metadata Collaboration (HMC) ZT-I-PF-3-026, EU Grant 945539 (HBP), Helmholtz IVF Grant SO-092 (ACA), and Joint lab SMHB; compute time was granted by VSR computation grant JINB33, Jülich. The work was carried out in part within the HMC Hub Information at the Forschungszentrum Jülich.
Keywords:
Metadata-Framework, High-Performance-Computing, Simulation-Workflow, Reproducibility, Re-usability
| Field | Selection |
|---|---|
| Poster topic | Processes/Policies |
| Stakeholder group (presenting author) | Scientist/Data Producer |