In an ever-changing world, field surveys, inventories and monitoring data are essential for predicting biodiversity responses to global drivers such as land use and climate change. This knowledge provides the basis for appropriate management. However, field biodiversity data collected across terrestrial, freshwater and marine realms are highly complex and heterogeneous. The successful...
An important factor for the widespread dissemination of FAIR data is keeping the entry barrier for preparing and providing data to other scientists according to the FAIR criteria as low as possible. If scientists have to manually extract, transform and annotate their data according to the FAIR criteria and then export it to make it available to the public, this requires a significant investment of time that...
Automating Metadata Handling in Research Software Engineering
Mustafa Soylu^ 1
Anton Pirogov^ 1
Volker Hofmann 1
Stefan Sandfeld 1
^ The authors contributed equally to this work
Institute for Advanced Simulation - Materials Data Science and Informatics (IAS9), Forschungszentrum Jülich, Jülich, Germany
Modern research is heavily dependent on software. The landscape of...
The Helmholtz Metadata Collaboration (HMC) and the Helmholtz Open Science Office launched a joint initiative at the end of 2022 to strengthen and connect research data repositories in the Helmholtz Association, and to increase their visibility in the international research landscape. Research data repositories form central hubs for metadata on the Road to FAIR: They generate, consolidate...
Bioimaging is an important methodological procedure widely applied in the life sciences. Bioimaging unites the power of microscopy, biology, biophysics and advanced computational methods, allowing scientists to study biological functions at levels ranging from single molecules up to the complete organism. In parallel, high-content screening (HCS) bioimaging approaches are powerful...
The management of spatial data is facing ever greater challenges. In addition to the large number of datasets and products, technical aspects such as data size and efficient workflows play an increasingly important role for data users and providers. Open access following the FAIR principles is also becoming increasingly important in the field of research data management. Data should be made...
Materials Science and Engineering (MSE) is concerned with the design, synthesis, properties, and performance of materials. Metals and semiconductors are important types of crystalline materials that usually contain defects. One of the most common line defects is the dislocation, which strongly affects numerous material properties, including strength, fracture toughness, and...
Editing [Linked Data][1] documents represents an enormous challenge for users with limited technical expertise. These users struggle with language rules, relationships between entities, and interconnected concepts, which can result in frustration and low data quality. To address this challenge, we introduce a new editor designed to facilitate effortless editing of [JSON-LD...
In their endeavor to generate and share FAIR research data, scientists face various challenges. High-level recommendations such as the FAIR principles [^1] require prior knowledge and a set of technical skills that are typically not part of academic education. The successful implementation of FAIR research data guidelines therefore urgently requires well-trained, data-literate and...
FAIR research data and the adoption of semantic technologies hold great promise to improve the quality, openness, and efficiency of research in the physical sciences. However, the FAIR building we wish to construct rests on foundations that are still shaky: metadata often lack the quantity and quality needed to harness the full potential of advanced search functionalities, knowledge graphs, and AI...
Open science promotes innovation, improves the transfer of knowledge to society and the economy, and ensures quality and transparency in research. The Helmholtz Association, Germany's largest research-performing organization, thus adopted an Open Science Policy in September 2022 [1].
This policy supports openness as a central endeavor of science and makes open science the standard for...
The Helmholtz Metadata Collaboration (HMC) has developed the HMC
dashboard on Open and FAIR Data in Helmholtz. The dashboard allows users
to monitor and interactively analyze statistics on open and FAIR data
produced by researchers in the Helmholtz Association. It can be used to
analyze in which repositories Helmholtz researchers make their data
publicly available, to monitor...
In 2021 HMC conducted its first community survey to align its services with the needs of Helmholtz researchers. A question catalogue, with 49 (sub-)questions based on an expertise-adaptive approach, was designed and disseminated among researchers in all six Helmholtz research fields. 631 completed survey replies were obtained for analysis.
The HMC Community Survey 2021 provides insight into...
This research poster examines the impact of four simple but crucial elements in research data policies: clear titles, persistent identifiers, publication dates, and open availability. These elements, often underestimated in policy, play a pivotal role in enhancing data discoverability, transparency, and collaboration, ultimately strengthening the foundation of modern scientific...
Since its establishment the Helmholtz Metadata Collaboration (HMC) has compiled a multitude of information on the state of the research data communities and practices within the Helmholtz Association and beyond.
The Information Portal is a web application for capturing FAIR data practices across all Helmholtz domains, offering a unified user interface for collecting and exploring...
Standardized metadata and its proper storage are essential for effective management of scientific research data. The challenge lies in manually compiling such metadata, a process which can be both tedious and prone to human error. To address this problem, we introduce the Mapping Service, developed within the framework of HMC.
The Mapping Service helps to streamline the process of metadata...
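The core operation of any such service is translating records from one metadata schema into another. A minimal sketch of this idea is shown below; the field names, prefixes, and mapping table are invented for illustration and do not reflect the actual Mapping Service API:

```python
# Hypothetical schema-to-schema metadata mapping: each source field is
# renamed to a target field and its value cast to the expected type.
# All field names here are made up for illustration.
MAPPING = {
    "creator":     ("dc:creator", str),
    "issued":      ("dc:date", str),
    "temperature": ("ex:temperatureK", float),
}

def map_record(source: dict) -> dict:
    """Translate a source record into the target schema, casting values."""
    target = {}
    for src_key, (dst_key, cast) in MAPPING.items():
        if src_key in source:
            target[dst_key] = cast(source[src_key])
    return target

# Example: a raw record with a string-valued temperature
mapped = map_record({"creator": "A. Author", "temperature": "293"})
```

Centralizing the mapping table in one declarative structure is what makes such transformations reusable and auditable, rather than scattered across ad hoc scripts.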
To be sustainable and useful, scientific data should be FAIR. This goal can only be achieved through the definition and adoption of metadata standards and the implementation of tools and services that support these standards. Unfortunately, the diversity of needs with respect to scientific (meta)data leads to a large gap between the scope and pace of large-scale standardization efforts and the day-to-day...
The marked increase in the efficiency of perovskite-based solar cells (PSCs) over the last decade is the result of scientific work that has produced a huge quantity of literature and datasets (between 2014 and 2022, almost 30,000 reports were published). The aim of this work is to elaborate an ontology that can primarily be used to classify literature paragraphs according to the subject discussed...
The PATOF project builds on work at the A4 particle physics experiment at MAMI. A4 produced a stream of valuable data over many years, which has already yielded scientific output of high quality and still provides a solid basis for future publications. The A4 data set consists of 100 TB and 300 million files of different types (hierarchical folder structure and file format with minimal metadata provided...
The openCost project aims to contribute to a fair reform of the scientific publishing system by establishing comprehensive cost transparency in the publishing process.
To this end, openCost creates the required technical infrastructure to freely access publication costs and exchange these data via automated, standardized interfaces and formats.
Standardized recording and open provision of...
In this presentation we will introduce the current HMC activities and outcomes in HUB Earth and Environment. Our guideline development is planned as a coordinated procedure: for every implementation guide, we work through the same set of questions, up to tests based on use cases and the definition of abstract test classes, so that the implementation can be validated. Our...
The Helmholtz digital ecosystem connects diverse scientific domains with differing (domain-specific) standards and best practices for handling metadata. Ensuring interoperability within such a system, e.g. of developed tools, offered services and circulated research data, requires a semantically harmonized, machine-actionable, and coherent understanding of the relevant concepts. Further, this...
Research across the Helmholtz Association is based on inter- and multidisciplinary collaborations across its 18 Centres and beyond. However, the (meta)data generated through Helmholtz research and operations is typically siloed within institutional infrastructures and often within individual teams. The result is that the wealth of the association’s (meta)data is stored in a scattered manner,...
Persistent identifiers (PIDs) are an integral element of the FAIR principles (Wilkinson et al. 2016), as they are recommended for referring to data sets and metadata. They can, however, also be used to refer to other entities, such as people, organizations, projects, laboratories, repositories, publications, vocabularies, samples, instruments, licenses, methods and others....
[Research Object Crate][1] (RO-Crate) is an open, community-driven data package specification for describing all kinds of file-based data, as well as entities outside the package. To do so, it uses the widespread JSON format for Linked Data ([JSON-LD][2]), which allows linking to external information. This makes the format flexible and machine-readable. These packages are being...
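As a rough illustration of that JSON-LD layout, a minimal RO-Crate metadata document can be built as a plain dictionary and serialized with the standard library; the dataset name, file name, and author identifier below are invented examples, not part of the specification:

```python
import json

# Sketch of a minimal ro-crate-metadata.json following the RO-Crate 1.1
# layout: a JSON-LD @context plus a flat @graph of entities.
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {   # the metadata file descriptor, pointing at the root dataset
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {   # the root dataset describing the package as a whole
            "@id": "./",
            "@type": "Dataset",
            "name": "Example measurement package",  # invented example
            "hasPart": [{"@id": "data.csv"}],
        },
        {   # a file in the package, linked to an external author entity
            "@id": "data.csv",
            "@type": "File",
            "author": {"@id": "https://orcid.org/0000-0002-1825-0097"},
        },
    ],
}

metadata_text = json.dumps(crate, indent=2)
```

The `author` reference shows the "entities outside the package" idea: the ORCID IRI is resolvable on the web, so the crate links out to external Linked Data rather than duplicating it.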
NeXus is a well-established standard for data exchange at neutron, X-ray and muon large-scale facilities. Having been around for over 20 years with dedicated governance structures, it serves as a successful example of a long-lived standard. The NeXus ecosystem can be difficult to navigate, as people refer to its parts using varying terminology, sometimes having different concepts in mind even...
Computer simulations are an essential pillar of knowledge generation in science. Understanding, reproducing, and exploring the results of simulations relies on tracking and organizing metadata describing numerical experiments. However, the models used to understand real-world systems, and the computational machinery required to simulate them, are typically complex, and produce large amounts of...
In the last two years, the endeavor of realizing FAIR Digital Objects (FDOs) took a huge leap internationally as well as nationally in Germany, and in particular within HMC. By finding consensus on a common Helmholtz Kernel Information Profile [(1)][1], which defines the basic kernel metadata attributes each FDO must provide to serve as a top-level commonality across all research fields,...
Controlled vocabularies are used to describe knowledge within a particular domain, encompassing a comprehensive collection of domain-specific terms. Using controlled vocabularies not only mitigates the challenge of data ambiguity, but also offers several advantages, including references to term definitions, particularly within metadata schemas. Additionally, they foster semantic...
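The way a controlled vocabulary mitigates ambiguity can be sketched in a few lines: a metadata field is only accepted if its value is a known term, and the free-text label is replaced by a reference to the term's definition. The vocabulary, the `example.org` IRIs, and the `annotate` helper below are all hypothetical:

```python
# Hypothetical controlled vocabulary mapping each allowed term to a
# definition IRI (the terms and IRIs are invented for illustration).
VOCAB = {
    "neutron diffraction": "https://example.org/vocab/neutron-diffraction",
    "x-ray tomography": "https://example.org/vocab/xray-tomography",
}

def annotate(record: dict, field: str) -> dict:
    """Replace a free-text term with a vocabulary reference,
    rejecting terms that are not in the controlled vocabulary."""
    term = record[field].lower()
    if term not in VOCAB:
        raise ValueError(f"{record[field]!r} is not in the controlled vocabulary")
    return {**record, field: {"label": term, "@id": VOCAB[term]}}

# Spelling variants normalize to the same unambiguous reference:
annotated = annotate({"method": "Neutron Diffraction"}, "method")
```

Because every accepted value carries an IRI, two records using different capitalizations or synonyms mapped to the same term become directly comparable by machines.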