Making research reproducible and FAIR (Findable, Accessible, Interoperable, and Reusable) often requires more information than what is commonly published within scientific articles. There is a growing number of repositories for publishing additional material like data or code. However, articles are still at the center of most scientific work and thus efforts on gathering information which is...
Annotation is one of the oldest cultural techniques of mankind. While in past centuries pen and paper were the means of choice to add annotations to a source, this activity has increasingly shifted to the digital world in recent years. With the W3C recommendation 'Web Annotation Data Model', a powerful tool has been available since 2017 to model annotations in a wide variety of disciplines and...
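As an illustration of the W3C Web Annotation Data Model mentioned above, the following minimal sketch builds one annotation as JSON-LD using only the Python standard library. The helper name `make_annotation` and the `example.org` IRIs are assumptions made for this example and do not come from the abstract. The keys `@context`, `id`, `type`, `body` and `target` follow the W3C recommendation.

```python
import json


def make_annotation(anno_id, target_iri, comment):
    """Build a minimal Web Annotation as a plain dict (hypothetical helper)."""
    return {
        # JSON-LD context published with the W3C recommendation
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        # The body carries the annotation content, here a plain-text comment
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        # The target is the resource being annotated
        "target": target_iri,
    }


# Placeholder IRIs, assumed for illustration only
annotation = make_annotation(
    "http://example.org/anno1",
    "http://example.org/page1",
    "An example comment",
)
annotation_json = json.dumps(annotation, indent=2)
print(annotation_json)
```

Because the result is ordinary JSON, it can be stored or exchanged with any tool that understands JSON-LD. Richer targets, such as selectors for a text range, would replace the plain target IRI.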
The year 2022 marks the 10th anniversary of the Registry of Research Data Repositories - re3data. The global index currently lists over 2,800 digital repositories across all scientific disciplines – critical infrastructures to enable the global exchange of research data. The openly accessible service is used by researchers and services worldwide. It provides extensive descriptions of...
The Flanders Research Information Space, a.k.a. the FRIS-portal, is a research discovery platform hosted by the Flemish Department of Economy, Science and Innovation, where you can find information on publicly financed research in Flanders. The FRIS-portal operates as a regional metadata hub where you can search for metadata on researchers, research groups, projects, and publications and recently also...
Recording data with the help of photons and neutrons is limited to bigger institutes. Besides the limited time slots, this process is also quite expensive. To save resources, DAPHNE4NFDI focuses on creating ontologies and infrastructure to make all data from its participants FAIR. This enables users not only to use existing data but also to automatically fetch data for analysis. This analysis...
For research data to be used efficiently, it must be easy to find and access. This is a requirement in all areas of science. The Data Collections Explorer, developed within NFDI4Ing for the engineering sciences, targets these needs. It is an information system that provides an overview of research data repositories, archives, databases as well as individual datasets published in the field. ...
Introduction: The National Research Data Infrastructure for Personal Health Data (NFDI4Health) aims to improve the FAIRness of health-related data from epidemiological, public health and clinical studies as well as registries and administrative health databases[1]. One key service of NFDI4Health is the German Central Health Study Hub[2] that supports a standardised publication and search...
Within the current project, we plan to optimise the data and metadata curation workflow by automating the creation of community-standard StationXML metadata, including the generated PIDs and linking them to the parent dataset DOIs. Moreover, we plan to enrich the metadata with terms from standard and community-specific vocabularies. Specific guidelines, describing the OBS data management...
The [FAIR Digital Object Lab][1] is an extendable and adjustable software stack for generic FAIR Digital Object (FAIR DO) tasks. It consists of a set of interacting components with services and tools for creation, validation, discovery, curation, and more.
Preprocessing data for research, such as finding, accessing, unifying or converting it, takes up to 80% of research time. The FAIR...
HELIPORT is a data management solution that aims at making the components and steps of the entire research experiment’s life cycle discoverable, accessible, interoperable and reusable according to the FAIR principles.
Among other information, HELIPORT integrates documentation, scientific workflows, and the final publication of the research results - all via already established solutions for...
The Helmholtz digital ecosystem connects diverse scientific domains with differing (domain-specific) standards and best practices for handling metadata. Ensuring interoperability within such a system, e.g. of developed tools, offered services and circulated research data, requires a semantically harmonized, machine-actionable, and coherent understanding of the relevant concepts. Further, this...
The desired interoperability of data, as outlined by the FAIR principles, requires a harmonization of data handling processes among data infrastructures. To support the adoption of agreements on such processes and thus further develop the “ROAD TO FAIR”, HMC is currently establishing a FAIR-IMP-lementation Network (F-IMP). With this communication network we encourage the data management...
HMC Earth and Environment (E&E) strives to define, create and activate a Helmholtz FAIR Data Space (HFDS) as a "decentralized infrastructure for trustworthy data sharing and exchange in data ecosystems based on commonly agreed principles" (Nagel L., Lycklama D., 2021). Within HMC E&E the data space consists of common agreements to implement the FAIR building blocks (see below), leading to...
In pursuit of deep and expressive semantic interoperability, the Earth and Environment Hub is adopting a three-pillared approach to develop strategically and technically aligned capacity within the Helmholtz Association and globally.
The first pillar is implementation of high-quality, future-oriented semantic solutions for Earth and environmental applications. HMC E&E personnel lead the...
PIDs (Persistent Identifiers) are a core concept at the center of FAIR data architectures such as FAIR Digital Objects. They point to a digital resource such as a publication, dataset or a set of information in a distinctive and lasting fashion and are assured to persist over longer, defined periods of time.
We looked into six established PID systems (ROR, ORCID, PIDINST, IGSN, DataCite...
Within the research project LOD-GEOSS (https://lod-geoss.github.io) and the Helmholtz Metadata Hub Energy, we are developing a distributed data architecture for sharing and improved discovery of research data in the domain of energy systems analysis. A central element is the databus (https://databus.dbpedia.org), which acts as a central searchable metadata catalog. Data will be annotated on the...
A central mission of HMC is to support the data producers of the Helmholtz community in making their data FAIR. Developing a sustainable strategy for doing so requires a detailed understanding of community-specific practices, strengths, and limitations related to the application of each FAIR data guideline. We have applied the FAIR Data Maturity Model, developed by the respective RDA working...
The demanding requirements of fundamental physics at large-scale facilities are forcing researchers to use and further develop sophisticated computer science for highly efficient data processing, analysis, curation and preservation. PUNCH4NFDI (Particles, Universe, NuClei and Hadrons for the NFDI) is a consortium of particle, astroparticle, astro-, hadron, and nuclear physics, looking forward to...
Metadata plays a key role in the scientific publication process. It is only through metadata and identifiers that each contribution, from research data to article publication and beyond, becomes findable, accessible, interoperable and reusable. The digitization of scholarly communication allows the creation of metadata locally or in a distributed manner, and global exchange, enabled by...
Cross-domain research is often hampered by the lack of harmonized metadata schemas and standards. Metadata of different domains vary in origin, format and scope, so they cannot be merged routinely. In the interdisciplinary field of environmental epidemiology, an efficient linkage of health data with the multitude of environmental and earth observation data is crucial to quantify human...
MetaStore is a metadata repository for managing metadata documents. It supports communities in storing metadata documents in a predefined schema. It is therefore an important building block for more precise automated evaluation and/or retrieval of digital objects.
With the help of the metadata documents, digital objects can also be evaluated/compared according to content-related aspects. XML...
Physical samples with informative metadata are more easily discoverable, shareable, and reusable. Metadata provides the framework for consistent, systematic, and standardized collection and documentation of sample information. This poster explores practical implementation of the FAIR Principles through creation of a framework centralized around biospecimens, linked datasets, sample...
Within NFDI-MatWerk (“National Research Data Infrastructure for Material Sciences” / “Nationale Forschungsdateninfrastruktur für Materialwissenschaften und Werkstofftechnik”), the Task Area Materials Data Infrastructure (TA-MDI) will provide tools and services to easily store, share, search, and analyze data and metadata. Such a digital materials environment will ensure data integrity,...
Researchers in the social sciences use various software for statistical analysis of rectangular, structured data. The various data formats, which are only partially compatible, impede data exchange and reuse. In particular, proprietary data formats endanger the demand for interoperability enshrined in the FAIR principles. The project [Open Data Format][1] aims to develop a non-proprietary...
The Open Researcher and Contributor ID [ORCID][1] strives to enable transparent and trustworthy connections between researchers, their contributions, and their affiliations by providing a unique, persistent identifier for individuals to use as they engage in research, scholarship, and innovation activities. ORCID is therefore an essential piece of the puzzle for increasing the discoverability...
For research data to be reusable by scientists or machines, the data and associated metadata should comply with the so-called "FAIR principles", i.e. they should be findable, accessible, interoperable, and reusable [1]. Realizing this is not a straightforward task, as researchers do not know how FAIR or un-FAIR their data actually are or how to improve their FAIRness. A quantitative measure,...
[Research Object Crate][1] (RO-Crate) is an open, community-driven data package specification to describe all kinds of file-based data, as well as entities outside the package. To do so, it uses the widespread JSON format to represent Linked Data (JSON-LD), which allows linking to external information. This makes the format flexible and machine-readable. These packages are being referred...
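To make the JSON-LD packaging concrete, here is a minimal sketch of how an `ro-crate-metadata.json` document could be assembled with the Python standard library alone. It assumes the RO-Crate 1.1 layout (a metadata descriptor plus a root dataset inside `@graph`). The helper name `make_ro_crate_metadata`, the file names and the dataset description are hypothetical and chosen purely for illustration. A real crate would usually carry further properties, such as a license.

```python
import json


def make_ro_crate_metadata(name, description, date_published, files):
    """Assemble a minimal RO-Crate 1.1 metadata document (hypothetical helper)."""
    graph = [
        {
            # Metadata descriptor: identifies this file and points to the root dataset
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root dataset: describes the package as a whole
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "datePublished": date_published,
            "hasPart": [{"@id": path} for path in files],
        },
    ]
    # One entity per packaged file, referenced from hasPart above
    graph.extend({"@id": path, "@type": "File"} for path in files)
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": graph,
    }


# Example values are placeholders, assumed for illustration only
crate = make_ro_crate_metadata(
    name="Example measurement series",
    description="Two CSV files from a hypothetical experiment.",
    date_published="2022-10-05",
    files=["data/run1.csv", "data/run2.csv"],
)
crate_json = json.dumps(crate, indent=2)
print(crate_json)
```

Writing `crate_json` to a file named `ro-crate-metadata.json` in the top-level directory of the package is what turns an ordinary folder into a crate under this specification. Entities outside the package would be added to `@graph` with absolute IRIs as their `@id`.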
Data sharing at both the national and international level benefits genomic medicine. Specifically, in diseases driven by genomic factors, such as cancer types or rare diseases, data sharing maximizes the utility and impact of cohorts, thereby aiding in translating research findings to therapies. The successful discovery of new findings, however, requires linking genomic data to health data and...
Here, we report on our approach to establish a durable, rigid connection between the Aquarius beamline at the synchrotron source BESSY II and its digital counterpart built in the simulation software Ray-UI [1]. While simulations play a crucial role in the instrument design as a digital precursor of the real-world object and contain a comprehensive description of the setup, usually the digital...
Supporting Helmholtz's research communities in making their data FAIR is one of the key missions of HMC. A multi-method approach combining quantitative and qualitative methods was developed to understand current data management practices in the research field Matter. Quantitative information was obtained from data that was self-reported by Helmholtz's researchers in the HMC Community Survey 2021....