Description
Our Poster and Demo Session will take place in conjunction with the evening reception of SE25.
Recent developments in the open data policies of meteorological agencies have greatly expanded the set of up-to-date weather observation and forecast data publicly available for meteorological research and education. To improve the use of this open data, we have developed 3-D visualization products that extract and display meteorological information in novel ways. In this demo, we present...
The curation of software metadata safeguards its quality and compliance with institutional software policies. Moreover, metadata enriched with development and usage information can be used for the evaluation and reporting of academic KPIs. Software CaRD ("Software Curation and Reporting Dashboard"; ZT-I-PF-3-080), a project funded by the Helmholtz Metadata Collaboration (HMC), develops...
The Coccinelle project was established to ease maintenance of the Linux kernel driver code, written in the C programming language.
Today, Coccinelle is part of the standard toolkit of Linux kernel maintainers.
We are working to enable another ambitious goal -- large-scale code refactoring, with HPC and C++ in mind.
This poster presents last year's progress in our collaboration, evidencing...
Code development and maintenance in a team can be a daunting process, especially when multiple interconnected modules carry varied dependencies, are dispersed over several git repositories, and/or are developed in different versions of the software. Consequently, VENQS was established to set up an infrastructure and workflow for semi-automated version and dependency management. This is achieved...
Music-related projects dealing with complex metadata have a very long tradition in musicology and have produced a great variety of project-specific data formats and structures. This, however, hinders interoperability between data corpora and, ultimately, the full exploitation of the unprecedented potential of cutting-edge computer science. In this context, the schema defined within the Music...
Interdisciplinary collaborative scientific networks often rely on a multitude of different software systems for data storage and data exchange. Keeping data findable and in sync between different sites, working groups, and institutes can be challenging. We developed a solution based on the open source software LinkAhead that combines metadata from different repositories into a single research...
cff2pages is a tool that generates HTML files from metadata collected in the Citation File Format (CFF). It can be used to create a static page on GitHub or GitLab Pages to showcase a project. This is particularly useful for small research software projects, offering an easy-to-use workflow that converts machine-readable metadata into human-readable formats for several purposes:
Enhancing...
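The conversion cff2pages performs can be sketched in miniature: take CFF-style metadata and render a small human-readable HTML snippet. This is not the cff2pages implementation; the metadata dict and field names below are a hypothetical example of the CFF fields involved.

```python
# Minimal sketch of turning CITATION.cff-style metadata into HTML.
# NOT the cff2pages implementation; `meta` is an invented example.
from html import escape

def render_citation_html(meta: dict) -> str:
    """Render a small HTML snippet from CFF-like metadata."""
    authors = ", ".join(
        f'{escape(a["given-names"])} {escape(a["family-names"])}'
        for a in meta.get("authors", [])
    )
    parts = [
        f'<h1>{escape(meta["title"])}</h1>',
        f'<p>Version {escape(meta.get("version", "n/a"))}</p>',
        f'<p>Authors: {authors}</p>',
    ]
    return "\n".join(parts)

meta = {
    "title": "Example Research Tool",
    "version": "1.2.0",
    "authors": [{"given-names": "Ada", "family-names": "Lovelace"}],
}
print(render_citation_html(meta))
```

The real tool reads the YAML file itself and produces a complete page; the sketch only shows the metadata-to-HTML direction of the workflow.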
In domains with relevant security or privacy concerns, open data sharing among cooperation partners is often not an option. Here, cryptography offers alternative solutions to reconcile cooperation and data protection. Participants engage in peer-to-peer computation on encrypted data, arriving jointly at the intended result, without ever having access to each other’s input data. While elegant...
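One cryptographic building block behind such peer-to-peer computation on private data is additive secret sharing: each input is split into random shares, and parties combine shares locally so that only the joint result is ever reconstructed. The toy sketch below illustrates the idea only; it is not a production MPC protocol, and the party count and values are invented.

```python
# Toy additive secret sharing over a prime field: two private inputs are
# summed without any single party seeing either input in the clear.
import secrets

P = 2**61 - 1  # Mersenne prime used as the field modulus

def share(value: int, n_parties: int) -> list[int]:
    """Split `value` into n additive shares that sum to it mod P."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

# Each data owner shares a private input among three parties.
a_shares = share(42, 3)
b_shares = share(100, 3)
# Each party adds its local shares; reconstructing the result reveals
# only the sum a + b, never the individual inputs.
sum_shares = [(x + y) % P for x, y in zip(a_shares, b_shares)]
print(sum(sum_shares) % P)  # 142
```

Real protocols add secure channels, multiplication gates, and malicious-security checks on top of this primitive.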
Collaborative OPen Omics (COPO) is a data and metadata broker that advances open science by supporting the principles of Findability, Accessibility, Interoperability, and Reusability (FAIR). As reliance on shared data grows, COPO addresses metadata management challenges by using community-sanctioned standards, specifically Darwin Core (DwC) and Minimum Information about any Sequence (MIxS)....
DataLad (Halchenko et al., 2021 [1]) is free and open source software for managing digital objects and their relationships, built on top of Git and git-annex. Its initial commit in 2013 marked the beginning of more than a decade of academic software history so far, supported by various grants, institutions, and underlying research endeavors. Over time, the software became an extendable...
With the ever-increasing data sizes employed at large experiments and their associated computing needs, many applications can benefit from access to dedicated cluster resources, in particular server-grade GPUs for machine learning applications. However, computing clusters are more often tailored to batch job submission and not to online data visualisation. Infrastructure-as-a-Service (IaaS)...
Image-based cell sorting is a key technology in molecular and cellular biology, as well as in medicine, enabling the isolation of desired cells based on spatial and temporal information extracted from live microscopy. Beyond the extensive application of sorting methods in the fields of immunology and oncology, growing interest from other disciplines like personalized medicine underscores the...
The "Digital Edition Levezow Album" project is an interdisciplinary collaboration between the Hub of Computing and Data Science (HCDS), the Department of Art History at the University of Hamburg, and the State and University Library Hamburg. The project aims to digitally process and interactively visualize a previously unexplored sketchbook from the late 17th century, containing drawings on...
The relationship between methods, the tools (software) that implement them, and their usefulness for investigating a research question and its subject matter is of inherent interest to the computational humanities. Consequently, the tool registry has by now established itself as a genre of its own in the Digital Humanities: from TAPoR (3.0)[^1] (Grant et al....
As scientific research increasingly relies on software to handle complex data, limited formal training in software development among researchers often leads to issues with documentation, code reliability, and reproducibility. In this study, we conducted an empirical analysis of 5,300 open-source research repositories, focusing on practices aligned with FAIR4RS recommendations. Python was the...
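Analyses of this kind often include presence checks for files associated with good practice. The sketch below shows one such check in hedged form: the indicator files and their interpretation are illustrative and are not the study's actual methodology.

```python
# Hedged sketch: test a repository checkout for files commonly associated
# with FAIR4RS-aligned practices. The indicator list is illustrative.
import tempfile
from pathlib import Path

INDICATORS = {
    "README.md": "documentation",
    "LICENSE": "legal reuse",
    "CITATION.cff": "citability",
    "requirements.txt": "declared dependencies",
}

def fairness_indicators(repo: Path) -> dict[str, bool]:
    """Report which indicator files are present in a repository root."""
    return {name: (repo / name).exists() for name in INDICATORS}

# Demonstrate on a throwaway directory standing in for a cloned repo.
with tempfile.TemporaryDirectory() as d:
    repo = Path(d)
    (repo / "README.md").write_text("# demo")
    (repo / "LICENSE").write_text("MIT")
    report = fairness_indicators(repo)
    print(report["README.md"], report["CITATION.cff"])  # True False
```

At scale, such per-repository reports can be aggregated into the kind of practice statistics the study describes.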
Students, postdocs, and other researchers continuously seek to develop beneficial skills for their work.
One traditional way to up-skill is through workshops, but scheduling conflicts and varied learning styles can be barriers to effective learning. To address these challenges, we propose a learning framework that leverages GitHub’s capabilities. The idea follows from a digital version of a...
In the realm of biomedical research, the ability to accurately assess document-to-document similarity is crucial for efficiently navigating vast amounts of literature. OntoClue is a comprehensive framework designed to evaluate and implement a variety of vector-based approaches to enhance document-to-document recommendations based on similarity, using the RELISH corpus as reference. RELISH is...
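At the core of any vector-based document-to-document recommendation is a similarity measure over document vectors, most commonly cosine similarity. The sketch below shows that measure in isolation; the three-dimensional vectors are invented stand-ins for the embeddings a real pipeline would produce.

```python
# Cosine similarity between document vectors -- the comparison step that
# vector-based recommendation pipelines rely on. Vectors are invented.
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

doc_a = [0.9, 0.1, 0.0]
doc_b = [0.8, 0.2, 0.1]  # similar topic profile to doc_a
doc_c = [0.0, 0.1, 0.9]  # different topic profile
print(cosine(doc_a, doc_b) > cosine(doc_a, doc_c))  # True
```

Ranking all candidate documents by this score against a query document yields a recommendation list.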
The [European Virtual Institute for Research Software Excellence (EVERSE)][1] is an EC-funded project that aims to establish a framework for research software excellence. The project brings together a consortium of European research institutions, universities, and infrastructure providers to collaboratively design and champion good practices for high-quality, sustainable research software. You...
In this demo, we present the [TIDO Viewer][1], a flexible application developed by SUB Göttingen, specifically designed for the interactive presentation of digital texts and objects. In combination with the [TextAPI][2], the TIDO Viewer enables the dynamic integration and visualization of digitized content. This synergy supports various use cases in research and library environments, offering...
Research software development is a fundamental aspect of academic research, and it has now been acknowledged that the FAIR (Findable, Accessible, Interoperable, Reusable) principles, historically established to improve the reusability of research data, should also be applied to research software. However, specific aspects of Research Software like executability or evolution over time require...
Being cross-disciplinary at its core, research in Earth System Science comprises divergent domains such as Climate, Marine, Atmospheric Sciences and Geology. Within the various disciplines, distinct methods and terms for indexing, cataloguing, describing and finding scientific data have been developed, resulting in a large number of controlled vocabularies, taxonomies and thesauri. However,...
There are many methods for conducting research in the literature. The research and transfer cycle within energy system research projects by Stephan Ferenz describes how to carry out a FAIR research project in six steps. However, these steps are very general and do not focus on research software. In energy research, simulation software in particular is a vital research artifact. Therefore, we are...
Researchers from a broad spectrum of scientific fields use computers to aid their research, often starting at their own laptop or institutional workstation. At some point in the research, additional help in form of algorithmic or software engineering consultancy or even additional computational resources in form of access to high-performance computing (HPC) systems may become necessary....
Earth System Modeling (ESM) involves a wide variety and complexity of processes to be simulated, which has resulted in the development of numerous models, each aiming at the simulation of different aspects of the system. These components are written in various languages, use different High-Performance Computing (HPC) techniques and tools, and overlap in, or lack, functionality.
To use the national...
Managing projects with external collaborators sometimes comes with the burden of ensuring that inbound contributions respect legal obligations. Where a lightweight 'Developer Certificate of Origin' (DCO) approach only introduces certain checks, 'Contributor License Agreements' (CLAs), on the other hand, rely on documenting signed CLAs and thus require dedicated book-keeping.
In this poster, we showcase...
The poster will show what actions we’ve taken to create and engage an RSE Community at FZJ, so that other centres might be encouraged to do the same. It will show the initiatives and tools that we have created, such as a publication monitor, Code of the Month, Open Hours, and newsletters. We will show how we’re encouraging good practice through our ‘Resources’ website, which...
The relevance of Open Science and Open Data is becoming increasingly obvious in modern-day publications. Frequently, scientists write their own analysis code, as the complexity of analyses increases and the combination of methods becomes more relevant – from code conversion to measuring and comparing. These functions and methods are not stable, are subject to change, and are constrained to the use...
In scientific research, effective data management is crucial, especially when handling experimental data. The increasing volume and complexity of data collected in experimental settings necessitate rigorous methodologies to ensure that such data remains findable, accessible, interoperable, and reusable (FAIR). These requirements are seamlessly met by RDF graphs, a type of...
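The data model behind RDF graphs is a set of subject-predicate-object triples. The sketch below shows that model with plain tuples instead of an RDF library; the URIs and measurement values are invented for illustration only.

```python
# Subject-predicate-object triples, the core of the RDF data model,
# represented with plain tuples. All identifiers here are invented.
triples = {
    ("ex:run42", "ex:instrument", "ex:spectrometerA"),
    ("ex:run42", "ex:temperature", "293.0"),
    ("ex:run43", "ex:instrument", "ex:spectrometerA"),
}

def objects(graph, subject, predicate):
    """All objects stored for a given subject/predicate pair."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)

# A simple pattern query: which temperature was recorded for run 42?
print(objects(triples, "ex:run42", "ex:temperature"))  # ['293.0']
```

Real RDF stores add URIs with global meaning, typed literals, and SPARQL querying on top of this triple pattern, which is what makes the resulting data interoperable across repositories.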
Particle accelerators are complex machines consisting of hundreds of devices. Control systems and commissioning applications are used to steer, control, and optimise them. Online models allow deriving characteristic parameters during operation.
These online models need to combine components that use different views of the same physical quantity. Therefore, appropriate support has to be...
Research in linguistics is increasingly data-driven and requires access to language corpora, i.e. “collection[s] of linguistic data, either written texts or a transcription of recorded speech, which can be used as a starting-point of linguistic description or as a means of verifying hypotheses about a language” (Crystal 2003). Here, language itself is the object of study, and not just an...
The growing volume of high-resolution time series data in Earth system science requires the implementation of standardised and reproducible quality control workflows to ensure compliance with the FAIR data standards. Automated tools such as SaQC[1] address this need, but lack the capacity for manual data review and flagging. It is therefore the intention of this project to develop a...
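The automated part of such a workflow can be illustrated by a simple range check, one of the standard tests tools like SaQC provide: values outside plausible physical bounds receive a quality flag. The sketch below is not the SaQC API; the thresholds and the fill value are hypothetical.

```python
# Hedged sketch of an automated range check for time series QC
# (illustrative only, not the SaQC API; thresholds are hypothetical).
def flag_range(values, lo, hi):
    """Return one flag per value: 'OK' inside [lo, hi], 'BAD' outside."""
    return ["OK" if lo <= v <= hi else "BAD" for v in values]

# -999.0 is a typical sensor fill value; 85.2 is physically implausible.
air_temp = [12.3, 13.1, -999.0, 14.0, 85.2]
print(flag_range(air_temp, lo=-60.0, hi=60.0))
# ['OK', 'OK', 'BAD', 'OK', 'BAD']
```

The manual review the project adds on top would let a domain expert inspect and override exactly these automatically assigned flags.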
Effective monitoring of (computing) infrastructure, especially in complex systems with various dependencies, is crucial for ensuring high availability and early detection of performance issues. This poster demonstrates the integration of Prometheus and GitLab CI/CD to modernize our existing infrastructure monitoring methods. As infrastructure checks increase, our legacy monitoring system faces...
Increasing energy demand and the need for sustainable energy systems have initiated the global and German energy transition. The building and mobility sectors promise high potential for savings in final energy and greenhouse gas emissions through renewable energy technologies. NESSI was developed to reduce the complexity of decisions for an efficient, resilient, affordable, and low-emission...
We present a new cohort-based training program by OLS (formerly Open Life Science). OLS is a non-profit organisation dedicated to capacity building and diversifying leadership in research worldwide (https://we-are-ols.org/). Since 2020, we have trained 380+ participants across 50+ countries in Open Science practices, with the help of 300+ mentors and experts.
**The...
Neuroscience is a multi-disciplinary field that involves scientists from diverse backgrounds such as biology, computer science, engineering, and medicine. These scientists work together to understand how the brain operates in health and disease. The areas of application in neuroscience that require software are as diverse as the scientific backgrounds and programming skills of the scientists,...
Effective management of research data and software is essential for promoting open and trustworthy research. Structured methods are needed to ensure that research artifacts remain accessible and easy to locate, in line with the FAIR principles of making research data and software findable, accessible, interoperable, and reusable [1, 2]. However, fully implementing these principles remains...
The Institute of Neuroscience and Medicine: Brain and Behavior (INM-7) at the research center Jülich combines clinical science with open source software development in different areas: Individual groups independently develop open software tools for data and reproducibility management (DataLad; https://datalad.org; Halchenko et al. 2021), mobile health applications (JTrack;...
OpenLB is one of the leading open source software projects for Lattice Boltzmann Method (LBM) based simulations in computational fluid dynamics and beyond. Developed since 2007 by an interdisciplinary and international community, it not only provides a flexible framework for implementing novel LBM schemes but also contains a large collection of academic and advanced engineering examples. It...
The ParFlow hydrologic model is an integrated, variably saturated groundwater and surface water flow simulator that incorporates subsurface energy transport and land surface processes through the integration of the Common Land Model (CLM) as a module. In addition, ParFlow has been coupled to atmospheric models such as WRF, COSMO, and ICON. ParFlow is also integrated in the German climate and...
Helmholtz-Zentrum Hereon operates multiple X-ray diffraction (XRD) experiments for external users, and while the experiments are very similar, their analysis is not. The variety in data analysis workflows makes creating FAIR analysis workflows challenging, because much of the analysis is traditionally done with small scripts and is not necessarily easily reproducible.
Pydidas [1, 2] is a...
Particle accelerators are widely used around the world for both research and industrial purposes. The largest facilities include synchrotron light sources, high-energy physics colliders, and nuclear physics research facilities. These are essential tools for scientists in a broad range of fields, from life sciences to cultural heritage and engineering, and use significant national or...
Research Software Engineering is fundamental to the German National Research Data Infrastructure (NFDI). Accordingly, the "deRSE Arbeitskreis NFDI" serves as a connection point for RSEs in the NFDI within deRSE e.V.
Within the NFDI e.V., several "sections" are dealing with overarching topics, e.g., the "Sektion Common Infrastructures" with its working groups on "Data Integration (DI)",...
Rapid and precise knowledge retrieval is essential to support research in exact sciences such as materials science, optimising time management and enhancing research efficiency. With a database of over 2,500 materials science research papers, an automated method for reliably and effectively accessing and querying this repository is necessary.
Here, we show a Retrieval-Augmented Generation...
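The retrieval step of a Retrieval-Augmented Generation pipeline can be sketched in its most naive form: rank documents by overlap with the query terms, then hand the best match to a language model as context. The corpus and query below are invented, and real systems use embedding-based retrieval rather than term counting.

```python
# Toy sketch of the retrieval step in a RAG pipeline: rank documents by
# term overlap with the query. Corpus and query are invented examples;
# production systems would use vector embeddings instead.
def score(query: str, doc: str) -> int:
    """Count query terms occurring in the document (naive retrieval)."""
    doc_terms = set(doc.lower().split())
    return sum(term in doc_terms for term in query.lower().split())

corpus = [
    "perovskite solar cell efficiency measurements",
    "steel alloy fatigue under cyclic load",
    "graphene thermal conductivity study",
]
query = "perovskite cell efficiency"
best = max(corpus, key=lambda d: score(query, d))
print(best)  # perovskite solar cell efficiency measurements
```

In the augmented-generation step, `best` (or the top-k documents) would be inserted into the language model prompt so answers stay grounded in the paper database.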
Background:
Research associates at our institute frequently develop methods for investigating building systems and indoor climate technology. While these researchers excel in their domains and create valuable computational methods, they often lack formal software development training. This leads to challenges in code maintainability and accessibility, particularly when sharing research...
The traceable and collaborative collection and FAIRification of research data is becoming ever more important in the Citizen Science community, in order to become part of, for example, an archaeological knowledge graph and to enrich the already interlinked data network with qualified data. Only in this way can these data be linked with other data and feed into international initiatives such as...
The model proposed in this study aims to prevent the loss of key elements within the Scrum framework, commonly used in software development and management processes, and to facilitate their reuse. Software developers handle numerous tasks, and over time, these tasks are completed. New tasks arise, while existing tasks accumulate issues (bugs) or performance improvements. When this historical...
More and more natural history collections are publishing the data on their collection objects in digital catalogues or portals. These data include, for example, taxonomic information such as species and genus, or information on collection localities and persons. They are of interest to scientists and the general public alike. However, taxonomic information in particular, drawn from...
The German National Research Data Infrastructure (NFDI) and its Base4NFDI initiative have introduced the role of Service Stewards to drive the development and integration of NFDI-wide basic services. They support the service developer teams, acting as a crucial interface connecting the teams and NFDI consortia and ensuring basic services are known and meet the communities’ needs. Especially...
The qualitative data analysis software OpenQDA¹ is already available as a free public beta for anyone to use. In this DEMO we will showcase the upcoming 1.0 release with real-world live coding, involving an entirely redesigned user interface as well as a set of fundamental AI plugins for preparation, coding, and data analysis.
1 https://github.com/openqda
The MaterialDigital Platform (PMD) project, launched in 2019, aims to advance digitalization in material science and engineering in Germany. The project focuses on creating a robust infrastructure for managing and sharing material-related data.
The PMD Workflow Store is a key component of this initiative. It serves as a repository where scientists and engineers can access, collaborate on,...
There is a large variety of types of research software at different stages of evolution. Due to the nature of research and its software, existing models from software engineering often do not cover the unique needs of RSE projects. This lack of clear models can confuse potential software users, developers, funders, and other stakeholders who need to understand the state of a particular...
Computational methods are in full swing in communication science. Part of their promise is to make communication research more reproducible. However, how this plays out in practice has not been systematically studied. We verify the reproducibility of the entire cohort of 30 substantive and methods papers published in the journal Computational Communication Research (CCR), the official...
‘Personas’ are widely used within traditional software contexts during design to represent groups of users or developer types by generating descriptive concepts from user data.
‘Social coding’ practices and version control ‘code forges’ including GitHub allow fine-grained exploration of developers’ coding behaviours through analysis of commit data and usage of repository and development...
We present the GitHub organization WIAS-PDELib, which provides an ecosystem of free and open source solvers for nonlinear systems of PDEs written in Julia. WIAS-PDELib comprises a finite element package (ExtendableFEM.jl), a finite volume package (VoronoiFVM.jl), grid managers (e.g., ExtendableGrids.jl), and other related tools for grid generation and visualization.
The...