DESY - Deutsches Elektronen Synchrotron

Cancelled: The National Science Data Fabric: Democratizing Data Access for Science and Society

by Prof. Valerio Pascucci

Europe/Berlin
Seminar Room 03 (DESY)

Description

The meeting is cancelled!

 

Effective use of data management techniques to analyze and visualize massive scientific data is a crucial ingredient for the success of any experimental facility, supercomputing center, or cyberinfrastructure that supports data-intensive science. This is particularly true for high-volume/high-velocity datasets and resource-constrained institutions. However, universal data delivery remains elusive, limiting the scientific impact of these facilities. 

This talk will present the National Science Data Fabric (NSDF) testbed, which introduces a novel trans-disciplinary data fabric integrating access to and use of shared storage, networking, computing, and educational resources. The NSDF technology addresses the key data management challenges in constructing complex streaming workflows that take advantage of data processing opportunities that may arise while data is in motion. This technology finds practical use in many research and industrial applications, including materials science, precision agriculture, ecology, climate modeling, astronomy, connectomics, and telemedicine. Practical use cases include real-time data acquisition from an Advanced Photon Source (APS) beamline, which lets remote users monitor the progress of an experiment, and direct integration with the Materials Commons community repository. Full integration with Python scripting facilitates the use of external libraries for data processing. For example, hundreds of terabytes of NASA climate modeling data can be distributed and visualized from a Jupyter notebook, as a live demonstration will show.

Overall, this approach supports flexible data streaming workflows for massive models without compromising the interactive nature of the exploratory process, the most effective characteristic of discovery activities in science and engineering. The presentation will feature a few live demonstrations, including Jupyter notebooks that show (i) how hundreds of terabytes of NASA climate data in the cloud can be distributed and visualized on any computer and (ii) how undergraduate students at a minority-serving institution (UTEP) can be given real-time access to large-scale materials science data normally used only by established scientists in well-funded research groups.

The talk takes place at Seminar room 03, building 1b.


Valerio Pascucci is the Inaugural John R. Parks Endowed Chair, the founding Director of the Center for Extreme Data Management Analysis and Visualization (CEDMAV), a faculty member of the Scientific Computing and Imaging Institute, and a Professor in the School of Computing at the University of Utah. Valerio received the 2022 IEEE VGTC Visualization Technical Achievement Award and the 2022-2023 Distinguished Research Award (DRA) from the University of Utah, and was inducted into the IEEE VGTC Visualization Academy in 2022.

Valerio is also the President of ViSOAR LLC, a University of Utah spin-off, and the founder of Data Intensive Science, a 501(c) nonprofit providing outreach and training to promote the use of advanced technologies for science and engineering. Valerio's research interests include Big Data management and analytics, progressive multi-resolution techniques in scientific visualization, discrete topology, and compression. Valerio is the coauthor of more than two hundred refereed journal and conference papers and was an Associate Editor of the IEEE Transactions on Visualization and Computer Graphics.

Organised by

Uwe Jandt