The Research Software Quality toolkit (RSQKit, https://everse.software/RSQKit/) is a knowledge hub developed within the EVERSE project (https://everse.software/) that aims to become a permanent, community-driven resource for research software quality expertise. This interactive workshop provides hands-on experience with collaborative content development, where participants will have a chance...
Understanding EU data and digital legislation is crucial for research software engineers, as they are constantly faced with the legal implications of developing, licensing, and reusing software. The growing popularity of AI models in research and of generative AI tools in software development has further entangled an already complex legal situation.
**The aim of the...
This meet-up is for anyone teaching HPC skills, and anyone who is interested in community-led training.
HPC Carpentry is an open source community of RSEs, facility operators, developers, and others, who want to empower researchers through better, more inclusive training in HPC skills.
The project is currently in incubation to join [The...
While the FAIR principles provide some guidelines for research artifacts to be findable, accessible, interoperable and reusable, the FAIR Digital Objects (FDOs) add layers so implementations are more machine-actionable, e.g., guidelines for identifiers, typing and operations. Using web-based technologies makes it easier for researchers to implement FAIR and FDO guidelines as it reuses...
Simulation software packages are fundamental for advancing modern scientific
research. These tools vary widely in scale, from a few thousand lines of code
to millions, demanding significant human expertise and computational resources
for their development and long-term maintenance. Yet, despite this critical
role, both the developers and the process of scientific software development
are...
The aim of this presentation is to demonstrate the benefits and constraints of using Continuous Integration and Continuous Delivery (CI/CD) for the testing, documentation, build and release of numerous GitLab projects (science modules) as well as the desktop GUI application for the "Modelling software for quantum sensors in space" (MoQSpace) project at DLR (German Aerospace Centre)...
Research software development is a fundamental aspect of modern academic research, and it has now been acknowledged that the FAIR (Findable, Accessible, Interoperable, Reusable) principles, historically established for research data, should also be applied to research software.
As software is by nature executable and evolving over time, the FAIR principles had to be adapted to this...
The teachingRSE project is a community of interest formed around the idea of finding structures for the most effective education of new and developing RSEs in the academic landscape.
In this working group meeting we plan to work on our forthcoming publication detailing how to create structures for the future training of both new and practising RSEs.
In both research and industry, significant effort is devoted to the creation of standardized data models that ensure data adhere to specific structures, enabling the development and use of common tools. These models (also called schemas) enable data validation and facilitate collaboration by making data interoperable across various systems. Tools can assist in the creation and maintenance...
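To make the idea of schema-based validation concrete, here is a minimal sketch. The field names and types are invented for illustration; real schema languages such as JSON Schema or LinkML are far more expressive than this toy validator.

```python
# A toy record schema and validator, illustrating schema-based data validation.
# Field names ("sample_id", "temperature_k") are invented for this sketch.

SCHEMA = {
    "sample_id": str,
    "temperature_k": float,
}

def validate(record: dict, schema: dict) -> list:
    """Return a list of human-readable violations (empty means valid)."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"{field}: expected {expected.__name__}, "
                          f"got {type(record[field]).__name__}")
    return errors

print(validate({"sample_id": "A1", "temperature_k": 293.15}, SCHEMA))  # []
print(validate({"sample_id": 42}, SCHEMA))  # two violations
```

Because the validator returns structured error messages rather than raising, it can be used both for interactive feedback and for batch quality reports, which is the role schemas play in the interoperability scenario described above.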
Scientific research at the FAIR accelerator facility spans a wide range of fields, including Nuclear Physics, Atomic Physics, and Heavy Ion Physics. Workflows for simulations and data analysis in FAIR experiments range from High Throughput Computing to OpenMPI calculations and traditional batch processing. Operating a shared computing cluster that is scalable enough to meet the diverse needs...
At the end of 2023, we started building an RSE community at ETH Zurich. After receiving some funding, we and others have started similar activities at other Swiss research institutions. The presentation will give an overview of our activities so far and the lessons learned. We will present our ideas for the future of a Swiss-wide RSE community.
The comprehensible, collaborative creation and FAIRification of research data is becoming increasingly important in the Citizen Science community as a way to become part of an interdisciplinary knowledge graph and to enrich the already interconnected data network with qualified data. Only in this way can these data be linked to other data and actively integrated into international initiatives (e.g. NFDI)...
Metadata have proven to be one of the success factors for the so-called FAIRification of research software, especially in improving its findability and reusability [1], [2]. Creating high-quality metadata can be resource-intensive [3]. Moreover, users often find it challenging to utilize metadata effectively for retrieval [4], [5]. To support researchers from various...
Software license management is a critical but often overlooked aspect of Research Software Engineering (RSE). For both open-source and proprietary software projects, proper license management is increasingly important for sustainability, compliance, and collaboration. Our talk presents three key lessons learned from our experiences in license management, based on interdisciplinary projects and...
The concept of software management plans (SMPs) is similar to that of data management plans (DMPs), but focuses on the research software lifecycle, aligned with the FAIR Principles for Research Software (FAIR4RS). DMPs consist of a series of questions and answers outlining how data will be handled during and after a research project. Similarly, an SMP helps us outline some important elements to handle and share...
Large Language Models (LLMs) have revolutionized the field of artificial intelligence, offering numerous new applications in natural language processing, such as text generation, translation, sentiment analysis and conversational interfaces. Early studies show that LLMs have not only been utilized in everyday life but have also found their way into the daily work of researchers, for example, in...
How do you build online workshops that are engaging throughout the event, accessible to both novices and experts, and effective in helping students apply tools to their work? This talk introduces a proven approach to teaching Research Software Engineering (RSE) through learner-centered methodologies. Our workshops, designed around 90-minute teaching units, use concise lectures, small social...
The field of empirical software engineering faces a significant gap in standardized tools for conducting rapid and efficient Test-Driven Software Experiments (TDSEs). These experiments involve executing software subjects and observing their runtime behavior (i.e., dynamic program analysis). To address this gap, I present LASSO, a general-purpose software code analysis platform that provides a...
Research software engineering (RSE) plays a pivotal role in advancing science, yet its integration and funding vary significantly across different research programs and disciplines. In this talk, we will explore how RSE expertise has been crucial in three distinct contexts, revealing the diverse ways funding structures support—or overlook—the critical need for sustainable software...
SUS is a new HDL under development at the Paderborn Center for Parallel Computing. At its core, SUS is an RTL language intended to be used side-by-side with existing SystemVerilog and VHDL codebases. SUS has many interesting features, ranging from compile-time metaprogramming to IDE information about clock domains and pipelining depths, and metaprogramming debugging. Though this talk will...
Young researchers often are highly dependent on research software for their work. While some software skills are a basic
necessity in many, if not most, scientific fields today, the skill set so acquired is usually not sufficient to effectively develop, maintain, and design the larger software projects upon which much of modern scientific collaborative work is built.
This deficit not only...
The RSE community has created and contributed to many high quality, Open Source training materials (Code Refinery, HiDA, The Carpentries Incubator, UNIVERSE-HPC, etc). Taken in isolation, these are valuable resources. But learners and project contributors would benefit from increased findability and interoperability of individual lessons and curricula.
Simultaneously, it remains a challenge...
In a modern research organisation, the recognition, career paths and visibility of RSEs depend on their integration into the organisational structure. In this talk we present our approach at the DLR Institute of Networked Energy Systems for integrating RSE roles into a knowledge hierarchy. We created a role-skill matrix with different RSE focus areas, which helps RSEs in identifying...
OpenLB is one of the leading open source software projects for Lattice Boltzmann Method (LBM) based simulations in computational fluid dynamics and beyond. Developed since 2007 by an international and interdisciplinary community, it not only provides a flexible framework for implementing novel LBM schemes but also contains a large collection of academic and advanced engineering examples. It...
Automatic Differentiation (AD) is an important technique for both scientific computing and machine learning. AD frameworks from the machine learning world often lack the ability to differentiate programming patterns common in scientific computing, such as mutation and parallelism.
In my talk, I will cover the AD framework in Enzyme and how it can be used to differentiate scientific codes in...
Data are now recognised as an essential research output. Data Management Plans (DMPs) have therefore become an integral part of research project planning, and are usually required by funding organisations. Research software (ranging from data-specific scripts to standalone software products) plays a crucial role in the reproducibility of scientific results, and, similar to research data, is...
After a first successful meeting of RSEs within the Leibniz Association at deRSE24, we would like to use the opportunity again and invite all RSEs who are either working at a Leibniz institute or interested in the work of RSEs in Leibniz to join us and discuss the current state of affairs within Leibniz and how we can further grow and raise awareness as RSEs in Leibniz.
Before looking into...
Buildings and their energy systems account for 16 % of global greenhouse gas emissions. These emissions may be reduced during planning and operation. However, most buildings are unique in terms of architecture, protocols used, energy carriers, etc. Thus, the deployment of optimal planning and operation does not scale as well as in, e.g., the automotive industry. Therefore, both digital planning and...
With the rise of cloud computing in many areas of industry, commercial services, or science, data privacy is a growing concern for researchers and practitioners alike. In addition, with more data being processed in the cloud, the impact of a potential data breach increases as well, especially when sensitive information such as engineering, financial, or medical data is concerned. The use of...
Reliably deploying binary dependencies to users on various architectures is a non-trivial problem for package authors. More often than not this task is delegated to the user or automated using assumptions that don't always hold.
The Julia programming language built a tool-chain for robust deployment of binaries and binary dependencies called BinaryBuilder.jl that is useful well outside the...
Design decisions for research software and IT infrastructure must reflect the unique needs of academia.
Deviations from conventional best practices may be necessary to meet the requirements of academic work environments and scientific purposes.
We present our lessons learned and best practice guidelines derived from building a new specialized software environment for a large-scale...
Since the inception of the discipline at the NATO Software Engineering Conferences in the late 1960s, software engineering research and practice have primarily concentrated on business and embedded software, particularly in industrial sectors like finance and automotive. Research software that is designed and developed to facilitate research activities in various fields of science or...
Software Engineering Researchers (SERs) and Research Software Engineers (RSEs) can potentially benefit from each other: SERs can provide RSEs with state-of-the-art research knowledge, methods and tools from software engineering that can help create better software for better research. RSEs can help SERs understand the specific challenges they face in research software engineering, and thus...
Analysing data typically consists of several steps with dedicated tools, chained one after another. In theory, all this can be achieved with well-written scripts. However, workflow managers help developers increase the reproducibility of their pipelines and results by providing features for workflow and data provenance, portability, readability, and fast prototyping.
Nextflow is a...
The rapid increase of Machine Learning (ML) models and the research associated with them has created a need for efficient tools to discover, understand, and utilize these resources. Researchers often need help traversing the large collection of ML repositories and finding models that align with their specific requirements, such as open-source availability, FAIR principles, and performance...
The use of research software in digital health is becoming increasingly vital, particularly in the remote monitoring of neurological and psychiatric conditions. My work focuses on the development and implementation of the JTrack platform, an open-source solution designed for continuous data collection from smartphones, which serves as a scalable and privacy-compliant tool for digital biomarker...
Collaborative software development demands rigorous code review processes to ensure maintainability, reliability, and efficiency. This work explores the integration of Large Language Models (LLMs) into the code review process, with a focus on utilizing both commercial and open models. We present a comprehensive code review workflow that incorporates LLMs, integrating various enhancements...
To address the urgent need to understand changes in greenhouse gas
(GHG) emissions, there has been dramatic growth in GHG measurement
and modelling systems in recent years. However, this growth has led to
substantial challenges; to date, there has been little standardisation of data products, and the interpretation of GHG data requires combined information from numerous models.
OpenGHG is...
Recent advances in large language models (LLMs) like chatGPT have demonstrated their potential for generating human-like text and reasoning about topics with natural language. However, applying these advanced LLMs requires significant compute resources and expertise that are out of reach for most academic researchers. To make scientific LLMs more accessible, we have developed Helmholtz...
When the release of ChatGPT focused increased attention on Large Language Models, it was clear that much would change, yet the concrete effects were not foreseeable. With our talk, we want to show for the historical sciences what this new technology can concretely deliver for our discipline. ...
An investigation of the intricacies of the human brain is contingent upon the ability to encompass the diverse array of its structural and functional organization within a common reference framework. Despite the substantial advancements in brain imaging and mapping, a significant challenge persists in using information from different scales and modalities in a coherent manner within the...
With the role of the Research Software Engineer in the academic landscape now better defined, it is time to ask a broader audience how we can adapt traditional Software Engineering practices to most effectively fit the needs of the research community. What differentiates researchers who write code from RSEs? How do their aims, drivers and motivations differ, and how does this...
Consider you are a reviewer checking the correctness of a research artifact or a data scientist searching for a data cleaning step or visualization to reuse.
Either way you are confronted with hundreds of lines of code, usually involving various datasets and several different plots, making it difficult to understand the code's purpose and the data flow within the program.
Addressing this...
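The comprehension problem described above can be approximated with static analysis. As a hedged illustration (not the actual approach of this contribution), the sketch below uses Python's `ast` module to list, per top-level statement, which names are defined and which are used; the example code and its function names (`load`, `dropna`, `plot`) are invented.

```python
import ast

# For each top-level statement, collect the names it defines (Store context)
# and the names it reads (Load context). This only handles plain assignments;
# real data-flow analyses are far more involved.
def dataflow(source: str) -> list:
    result = []
    for stmt in ast.parse(source).body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        defined = {n.id for n in names if isinstance(n.ctx, ast.Store)}
        used = {n.id for n in names if isinstance(n.ctx, ast.Load)}
        result.append((defined, used))
    return result

code = "data = load('input.csv')\nclean = dropna(data)\nplot(clean)"
for defined, used in dataflow(code):
    print(f"defines {sorted(defined)}, uses {sorted(used)}")
```

Chaining the per-statement define/use sets already yields a rough data-flow graph: a statement depends on whichever earlier statement last defined a name it uses.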
Plant breeding and genetics demand fast, exact and reproducible phenotyping. Efficient statistical evaluation of phenotyping data requires standardised data storage ensuring long-term data availability while maintaining intellectual property rights. This is state of the art at phenomics centres, which, however, are unavailable for most scientists. For them we developed a simple and...
Research software developers are usually very familiar with the functional requirements of the software they are developing, as it is often closely linked to their research discipline. Quality requirements are often less well understood due to a lack of background in computer science [1]. Usability is one of these quality requirements that may be critical for the research software to support...
Research software is attracting increasing attention from both society and funding agencies, such as the German Research Foundation (DFG). There are lots of exciting opportunities for research into how software engineering practices can be best applied to help the people who develop research software. However, many potential research projects in this area falter because of either...
In the field of extragalactic astronomy we typically have two groups: the observers and the theorists. The nature of the data these two groups work with is very different: observers count photons with instrument detectors, while theorists work with particles that have specific physical properties. This results in rather limited scientific exchange between the two groups.
Generally, there are two...
Numerical modeling has a long history in climate and weather forecasting, with advancements being made continually over the last century due to technological progress. In the early 2000s, the development of ICON as an icosahedral grid-based, nonhydrostatic model started. It is Germany's primary model for weather predictions and climate studies (https://www.icon-model.org/). ICON is a flexible,...
This presentation explores experimental platform and software engineering approaches for providing high-performance computing infrastructure to interdisciplinary research projects in the humanities and the applied sciences. Two research projects are presented that demonstrate very different use cases, both in terms of scale and functional requirements.
"[#Vortanz][1]" was a project running...
A rapidly emerging community is developing and publishing several software components and application cases on top of the coupling library [preCICE][1] for partitioned simulations. While several community-building measures have led to more users and contributors, the resulting contributions are often not readily findable, accessible, interoperable, and reusable. The DFG project [preECO][2]...
Qualitative research involves a large number of different analysis methods, which are increasingly supported by the use of qualitative data analysis software. In addition to larger closed commercial software products, there are a number of open source projects that implement individual analysis methods. A major problem for the sustainability of these fragmented software projects is the lack of...
Generative AI has generated enormous interest since ChatGPT was launched in 2022. However, adoption of this new technology in research has been limited due to concerns about the accuracy and consistency of the outputs produced by generative AI.
In an exploratory study on the application of this new technology in research data processing, we identified tasks for which rule-based or...
Building Information Modelling (BIM) is extensively used in the AEC (Architecture, Engineering,
and Construction) industry to optimize processes throughout the design, construction, and
operation of buildings and to promote collaboration among stakeholders. Despite long-standing
efforts to facilitate interoperability through open standards, most notably the Industry Foundation
Classes...
High resolution video recordings at high frame rates are necessary for a variety of research projects. This can pose a challenge for systems in terms of hardware and software, particularly if multiple streams need to be recorded simultaneously. The aim of this project was to design a setup that would allow for the recording of multiple streams at high frame rates and various image resolutions,...
Energy research software (ERS) plays a vital role because it enables and supports numerous tasks in energy research. The complexity of ERS ranges from simple scripts and libraries, e.g., for Python, to full software solutions. [1]
ERS is often developed by energy researchers with diverse backgrounds (e.g., physics, mechanical engineering, electrical engineering, computer science, social...
In the natESM sprint process, RSEs work closely with scientists to tackle technical challenges within a collaborative research environment, presenting unique interpersonal and communication challenges. In this meetup we want to highlight and discuss some hurdles we as natESM RSEs encounter, including bridging gaps in technical knowledge, adapting to diverse communication styles, and balancing...
Open-source software development has become a fundamental driver for innovation in both academia and industry, fostering transparency and enabling collaboration among individuals who may not have formal training in computer science. Academic researchers benefit from open-source collaboration in several aspects: (1) User engagement in feature development and interfaces, enhancing the...
Jupyter notebooks have revolutionized the way researchers share code, results, and documentation, all within an interactive environment, promising to make science more transparent and reproducible. In research contexts, Jupyter notebooks often coexist with other software and various resources such as data, instruments, and mathematical models, all of which may affect scientific...
Based on my experience as developer and maintainer of some numerical open-source libraries (libcerf, libkww, libformfactor), I will explain key concepts for writing code that computes a special function or integral with high accuracy and high speed.
- Choose different numerical algorithms for different argument regions.
- Don't be afraid of divergent series or ill-conditioned...
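The first bullet point can be made concrete with a classic textbook example: evaluating (exp(x) - 1) / x. The direct formula suffers catastrophic cancellation for tiny x, so a truncated Taylor series is used near zero instead. The threshold below is illustrative, not taken from the libraries mentioned above.

```python
import math

def naive(x):
    # Direct formula: catastrophic cancellation for tiny x, because
    # exp(x) - 1 loses almost all significant digits.
    return (math.exp(x) - 1.0) / x

def series(x):
    # Truncated Taylor series of (exp(x) - 1)/x around 0; accurate for small |x|.
    return 1.0 + x/2.0 + x*x/6.0 + x*x*x/24.0

def f(x):
    # Region switch: series near zero, direct formula elsewhere.
    # The 1e-4 threshold is illustrative, not tuned.
    return series(x) if abs(x) < 1e-4 else naive(x)

x = 1e-13
print(naive(x))  # several leading digits wrong due to cancellation
print(f(x))      # correct to machine precision
```

A side benefit of the region switch is that `f(0.0)` is well defined (the series gives 1.0), whereas the direct formula would divide by zero.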
The long-term preservation and accessibility of research data will accelerate future research. To reduce structural and financial risks in research data management (RDM) in Germany, the National Research Data Infrastructure (German acronym: NFDI) was established for “bundling expertise and creating universal access to services for research data management” (1).
NFDI-MatWerk is one of 26...
We'll explore the key lessons learned from planning and running several community workshops aimed at fostering the adoption and use of open-source tools, in this case the models REMIND and MAgPIE and their encompassing open source ecosystem. Our goals were to educate interested people who had no prior experience, deepen the understanding and capabilities of users already familiar with the...
Effective management of research data and software is essential for promoting open and trustworthy research. Structured methods are needed to ensure that research artifacts remain accessible and easy to locate, in line with the FAIR principles of making research data and software findable, accessible, interoperable, and reusable [1, 2]. However, fully implementing these principles remains...
Software discovery is a crucial aspect of research, yet it remains a challenging process due to various reasons: The lack of a centralized or domain-tailored search and publication infrastructure, insufficient software citations, the prevailing unavailability of software (versions) and many others. Researchers tend to utilize general search engines and their social network before considering...
Each NFDI consortium works on establishing research data infrastructures tailored to its specific
domain. To facilitate interoperability across different domains and consortia, the NFDIcore
ontology was developed and serves as a mid-level ontology for representing metadata about
NFDI resources such as individuals, organizations, projects, and data portals [1]. The NFDIcore
ontology has...
Have you developed an open source scientific software and it has now become popular? Congratulations! Your software has entered a new phase of its life cycle, and you are now a community manager. Your new role includes: training the next generation of users, identifying and converting power users into contributors, fostering networking opportunities, and making your software visible to a wider...
The reproducibility of scientific simulations is one of the key challenges of scientific research.
Current best practices involve version-controlled code, tracking dependencies, specifying hardware configurations, and sometimes using Docker containers to enable one-click simulation setups. However, these approaches still fall short of achieving true reproducibility. For example, Docker...
RSEs are required to publish reproducible software to satisfy the FAIR for Research Software Principles. To save RSEs the arduous labor of manual publication of each version, they can use the tools developed in the HERMES project. HERMES (HElmholtz Rich MEtadata Software Publication) is an open source project funded by the Helmholtz Metadata Collaboration. The HERMES tools help users automate...
This contribution is an extended abstract of the paper originally published in the proceedings of the 2024 IEEE 32nd International Requirements Engineering Conference (RE). The paper assesses the potential of requirements classification approaches to identify parts of requirements that are irrelevant for automated traceability link recovery between requirements and code. We were able to show...
It is certainly too early to herald the end of RSEng, but the future is in constant flux, and we should openly discuss how RSEng will change in response to the upcoming challenges.
Some questions to get the discussion started:
- How will RSE change in the face of the ongoing digitalization of society, for instance if people arrive better prepared from school?
- How will AI impact...
The MaterialDigital initiative represents a major driver towards the digitalization of material science. Next to providing a prototypical infrastructure required for building a shared data space and working on semantic interoperability of data, a core focus area of the Platform MaterialDigital (PMD) is the utilisation of workflows to encapsulate data processing and simulation steps in...
After a lively and productive meet-up “Building a community around your Open Source research software” at deRSE2024, we summarized and clustered all your input. Now at deRSE2025 we will present the key findings regarding the following questions:
- How to prepare research software for third-party users/developers?
- How to attract new third-party users/developers?
- ...
In this contribution, we introduce the German Reproducibility Network (GRN) to the Open Source and Research Software Engineering community. The GRN aims to increase trustworthiness, transparency, and reproducibility in scientific research in Germany and beyond, as part of a broader international network of reproducibility initiatives. Since its founding in 2020, the GRN has grown into a...
Recent developments in the open data policies of meteorological agencies have greatly expanded the set of up-to-date weather observation and forecast data that is publicly available for meteorological research and education. To improve the use of this open data, we have developed 3-D visualization products that extract and display meteorological information in novel ways. In this demo, we present...
The curation of software metadata safeguards its quality and compliance with institutional software policies. Moreover, metadata enriched with development and usage information can be used for the evaluation and reporting of academic KPIs. Software CaRD ("Software Curation and Reporting Dashboard"; ZT-I-PF-3-080), a project funded by the Helmholtz Metadata Collaboration (HMC), develops...
The Coccinelle project was established to ease maintenance of the Linux kernel driver code, written in the C programming language.
Nowadays Coccinelle belongs to the toolkit of the Linux kernel maintainers.
We are working to enable another ambitious goal -- that of large-scale code refactoring, with HPC and C++ in mind.
This poster presents the past year's progress in our collaboration, evidencing...
Code development and maintenance in a team can be a daunting process, especially when multiple modules are interconnected through varied dependencies, dispersed over several git repositories and/or developed in different versions of the software. Consequently, VENQS was established to set up an infrastructure and workflow for semi-automated version and dependency management. This is achieved...
Music-related projects dealing with complex metadata have a very long tradition in musicology and have produced a great variety of project-specific data formats and structures. This, however, hinders interoperability between data corpora and, ultimately, the full exploitation of the unprecedented potential of cutting-edge computer science. In this context, the schema defined within the Music...
Interdisciplinary collaborative scientific networks often rely on a multitude of different software systems for data storage and data exchange. Keeping data findable and in sync between different sites, working groups and institutes can be challenging. We developed a solution based on the open source software LinkAhead that combines metadata from different repositories into a single research...
cff2pages is a tool that generates HTML files from metadata collected in the Citation File Format (CFF). It can be used to create a static page on GitHub or GitLab Pages to showcase a project. This is particularly useful for small research software projects, offering an easy-to-use workflow that converts machine-readable metadata into human-readable formats for several purposes:
Enhancing...
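The core transformation such a workflow performs can be sketched as follows. This is an illustration only, not cff2pages' actual code: it renders metadata fields as they appear in a CITATION.cff file into a minimal HTML snippet, with example values; a real tool would parse the YAML file (e.g. with PyYAML) and use proper templating and escaping.

```python
# Metadata as it might be parsed from a CITATION.cff file.
# The project name, version, and URL below are invented examples.
meta = {
    "title": "my-research-tool",
    "version": "1.2.0",
    "authors": [{"given-names": "Ada", "family-names": "Lovelace"}],
    "repository-code": "https://example.org/my-research-tool",
}

def to_html(meta: dict) -> str:
    # Join author names and emit a minimal static page fragment.
    authors = ", ".join(f"{a['given-names']} {a['family-names']}"
                        for a in meta.get("authors", []))
    return (f"<h1>{meta['title']} v{meta['version']}</h1>\n"
            f"<p>Authors: {authors}</p>\n"
            f'<p><a href="{meta["repository-code"]}">Source code</a></p>')

print(to_html(meta))
```

The appeal of this approach is that the machine-readable file remains the single source of truth: regenerating the page after a release keeps the human-readable view in sync automatically.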
The Collaborative OPen Omics (COPO) is a data and metadata broker that advances open science by supporting the principles of Findability, Accessibility, Interoperability, and Reusability (FAIR). As reliance on shared data grows, COPO addresses metadata management challenges by using community-sanctioned standards, specifically Darwin Core (DwC) and Minimum Information about any Sequence (MIxS)....
The society would like to present a poster. Note that the society also has a 10-minute talk around the first keynote.
With the ever-increasing data sizes employed at large experiments and their associated computing needs, many applications can benefit from access to dedicated cluster resources, in particular server-grade GPUs for machine learning applications. However, computing clusters are more often tailored to batch job submission and not to online data visualisation. Infrastructure-as-a-Service (IaaS)...
The "Digital Edition Levezow Album" project is an interdisciplinary collaboration between the Hub of Computing and Data Science (HCDS), the Department of Art History at the University of Hamburg, and the State and University Library Hamburg. The project aims to digitally process and interactively visualize a previously unexplored sketchbook from the late 17th century, containing drawings on...
The relationship between methods, the tools (software) that implement them, and their usefulness for investigating a research question and its subject is of inherent interest to the computational humanities. As a consequence, the tool registry has by now established itself as a genre of its own in the Digital Humanities: from TAPoR (3.0)[^1] (Grant et al....
As scientific research increasingly relies on software to handle complex data, limited formal training in software development among researchers often leads to issues with documentation, code reliability, and reproducibility. In this study, we conducted an empirical analysis of 5,300 open-source research repositories, focusing on practices aligned with FAIR4RS recommendations. Python was the...
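A minimal version of this kind of repository check can be sketched as follows; the file list below is an illustrative subset of FAIR-aligned practices, not the study's actual criteria:

```python
from pathlib import Path

# Hypothetical checklist inspired by FAIR4RS-aligned practices; the real
# analysis in the study covers far richer criteria than file presence.
FAIR_FILES = {
    "license": ("LICENSE", "LICENSE.md", "LICENSE.txt"),
    "citation": ("CITATION.cff",),
    "readme": ("README.md", "README.rst", "README.txt"),
}

def check_repo(repo: Path) -> dict:
    """Return which FAIR-related files are present in a repository checkout."""
    names = {p.name for p in repo.iterdir() if p.is_file()}
    return {key: any(candidate in names for candidate in candidates)
            for key, candidates in FAIR_FILES.items()}
```

Run over a corpus of checkouts, such a function yields per-practice adoption rates that can then be broken down by language or domain.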
Students, postdocs, and other researchers continuously seek to develop beneficial skills for their work.
One traditional way to up-skill is through workshops, but scheduling conflicts and varied learning styles can be barriers to effective learning. To address these challenges, we propose a learning framework that leverages GitHub’s capabilities. The idea follows from a digital version of a...
In the realm of biomedical research, the ability to accurately assess document-to-document similarity is crucial for efficiently navigating vast amounts of literature. OntoClue is a comprehensive framework designed to evaluate and implement a variety of vector-based approaches to enhance document-to-document recommendations based on similarity, using the RELISH corpus as reference. RELISH is...
The [European Virtual Institute for Research Software Excellence (EVERSE)][1] is an EC-funded project that aims to establish a framework for research software excellence. The project brings together a consortium of European research institutions, universities, and infrastructure providers to collaboratively design and champion good practices for high-quality, sustainable research software. You...
In this demo, we present the [TIDO Viewer][1], a flexible application developed by SUB Göttingen, specifically designed for the interactive presentation of digital texts and objects. In combination with the [TextAPI][2], the TIDO Viewer enables the dynamic integration and visualization of digitized content. This synergy supports various use cases in research and library environments, offering...
Research software development is a fundamental aspect of academic research, and it has now been acknowledged that the FAIR (Findable, Accessible, Interoperable, Reusable) principles, historically established to improve the reusability of research data, should also be applied to research software. However, specific aspects of Research Software like executability or evolution over time require...
Being cross-disciplinary at its core, research in Earth System Science comprises divergent domains such as Climate, Marine, Atmospheric Sciences and Geology. Within the various disciplines, distinct methods and terms for indexing, cataloguing, describing and finding scientific data have been developed, resulting in a large number of controlled vocabularies, taxonomies, and thesauri. However,...
The literature describes many methods for conducting research. The research and transfer cycle for energy system research projects by Stephan Ferenz describes how to carry out a FAIR research project in six steps. However, these steps are very general and do not focus on research software. In energy research, simulation software in particular is a vital research artifact. Therefore, we are...
Researchers from a broad spectrum of scientific fields use computers to aid their research, often starting on their own laptop or institutional workstation. At some point in the research, additional help in the form of algorithmic or software engineering consultancy, or even additional computational resources in the form of access to high-performance computing (HPC) systems, may become necessary....
Earth System Modeling (ESM) involves a high variety and complexity of processes to be simulated, which has resulted in the development of numerous models, each aiming at the simulation of different aspects of the system. These components are written in various languages, using different High-Performance Computing (HPC) techniques and tools, and they overlap in or lack functionality.
To use the national...
Managing projects with external collaborators sometimes comes with the burden of ensuring that inbound contributions respect legal obligations. Where a low-level 'Developer Certificate of Origin (DCO)' approach only introduces certain checks, 'Contributor License Agreements (CLAs)', on the other hand, rely on documenting signed CLAs and thus require dedicated book-keeping.
In this poster, we showcase...
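The DCO side of this comparison is mechanically simple, which is part of its appeal: it reduces to a check on commit-message trailers. A minimal sketch (illustrative only, not the actual tooling shown in the poster):

```python
def dco_check(commit_message: str) -> bool:
    """A DCO check in its simplest form: the commit message must carry a
    'Signed-off-by: Name <email>' trailer, as produced by `git commit -s`.
    Real CI checks additionally validate the name/email against the author.
    """
    lines = [line.strip() for line in commit_message.strip().splitlines()]
    return any(
        line.startswith("Signed-off-by:") and "<" in line and ">" in line
        for line in lines
    )
```

A CLA workflow, by contrast, cannot be reduced to a per-commit predicate like this, because it requires tracking signed agreements per contributor.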
The relevance of Open Science and Open Data is becoming increasingly obvious in modern-day publications. Frequently, scientists write their own analysis code, as the complexity of analysis increases and combinations of methods become more relevant – from code conversion to measuring and comparing. These functions and methods are not stable, are subject to change, and are constrained to the use...
Particle accelerators are complex machines consisting of hundreds of devices. Control systems and commissioning applications are used to steer, control, and optimise them. Online models allow deriving characteristic parameters during operation.
These online models need to combine components that use different views of the same physical quantity. Therefore, appropriate support has to be...
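The notion of "different views of the same physical quantity" can be sketched in a few lines: a magnet whose strength appears to the control system as a power-supply current and to the online model as a focusing strength. Class name, calibration constant, and units below are illustrative assumptions, not the project's actual code:

```python
class QuantityView:
    """One physical quantity, two views: the control system sets a
    power-supply current [A], while the online model works with the
    focusing strength K [1/m^2]. The linear calibration used here is
    purely illustrative; real machines use measured excitation curves.
    """

    def __init__(self, k_per_ampere: float):
        self.k_per_ampere = k_per_ampere

    def current_to_k(self, current: float) -> float:
        """Control-system view -> model view."""
        return self.k_per_ampere * current

    def k_to_current(self, k: float) -> float:
        """Model view -> control-system view (for writing setpoints back)."""
        return k / self.k_per_ampere

view = QuantityView(k_per_ampere=0.02)
assert abs(view.k_to_current(view.current_to_k(150.0)) - 150.0) < 1e-9
```

The support mentioned above amounts to making such conversions explicit and consistent wherever components exchange the quantity.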
Research in linguistics is increasingly data-driven and requires access to language corpora, i.e. “collection[s] of linguistic data, either written texts or a transcription of recorded speech, which can be used as a starting-point of linguistic description or as a means of verifying hypotheses about a language” (Crystal 2003). Here, language itself is the object of study, and not just an...
The growing volume of high-resolution time series data in Earth system science requires the implementation of standardised and reproducible quality control workflows to ensure compliance with the FAIR data standards. Automated tools such as SaQC[1] address this need, but lack the capacity for manual data review and flagging. It is therefore the intention of this project to develop a...
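The automated half of such a workflow boils down to rule-based flagging. A minimal sketch of a range test (illustrative only, not SaQC's actual API) whose flags a later manual-review step could override:

```python
def flag_range(values, lo, hi):
    """Automated QC: flag samples outside [lo, hi] as 'BAD', else 'OK'.

    In a combined workflow, a manual-review tool would let a domain expert
    inspect and overwrite individual flags before the data are published.
    """
    return ["BAD" if (v < lo or v > hi) else "OK" for v in values]

# -99.9 is a typical sensor fill value and falls outside the valid range.
flags = flag_range([2.0, 3.5, -99.9, 4.1], lo=0.0, hi=50.0)
```

The project's contribution is precisely the missing second half: a standardised way to record and preserve those manual flag revisions.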
Effective monitoring of (computing) infrastructure, especially in complex systems with various dependencies, is crucial for ensuring high availability and early detection of performance issues. This poster demonstrates the integration of Prometheus and GitLab CI/CD to modernize our existing infrastructure monitoring methods. As infrastructure checks increase, our legacy monitoring system faces...
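Prometheus scrapes metrics in a simple text exposition format, which is what makes custom infrastructure checks easy to integrate. A minimal sketch of serializing gauge values into that format (metric and label names here are made up for illustration):

```python
def to_prometheus(metrics: dict, labels: dict) -> str:
    """Serialize gauge values into Prometheus' text exposition format,
    e.g. 'check_duration_seconds{host="node1"} 0.42'.

    Keys are sorted so repeated scrapes produce stable output.
    """
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return "\n".join(
        f"{name}{{{label_str}}} {value}"
        for name, value in sorted(metrics.items())
    )

print(to_prometheus({"check_duration_seconds": 0.42}, {"host": "node1"}))
```

A GitLab CI job can run each check, write such lines to an endpoint or pushgateway, and let Prometheus handle storage and alerting.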
Increasing energy demand and the need for sustainable energy systems have initiated the global and German energy transition. The building and mobility sectors promise high potential for savings in final energy and greenhouse gas emissions through renewable energy technologies. NESSI was developed to reduce the complexity of decisions for an efficient, resilient, affordable, and low-emission...
We present a new cohort-based training program by OLS (formerly Open Life Science). OLS is a non-profit organisation dedicated to capacity building and diversifying leadership in research worldwide (https://we-are-ols.org/). Since 2020, we have trained 380+ participants across 50+ countries in Open Science practices, with the help of 300+ mentors and experts.
**The...
Neuroscience is a multi-disciplinary field that involves scientists from diverse backgrounds such as biology, computer science, engineering, and medicine. These scientists work together to understand how the brain operates in health and disease. The areas of application in neuroscience that require software are as diverse as the scientific backgrounds and programming skills of the scientists,...
Effective management of research data and software is essential for promoting open and trustworthy research. Structured methods are needed to ensure that research artifacts remain accessible and easy to locate, in line with the FAIR principles of making research data and software findable, accessible, interoperable, and reusable [1, 2]. However, fully implementing these principles remains...
The Institute of Neuroscience and Medicine: Brain and Behavior (INM-7) at the research center Jülich combines clinical science with open source software development in different areas: Individual groups independently develop open software tools for data and reproducibility management (DataLad; https://datalad.org; Halchenko et al. 2021), mobile health applications (JTrack;...
OpenLB is one of the leading open source software projects for Lattice Boltzmann Method (LBM) based simulations in computational fluid dynamics and beyond. Developed since 2007 by an interdisciplinary and international community, it not only provides a flexible framework for implementing novel LBM schemes but also contains a large collection of academic and advanced engineering examples. It...
Helmholtz-Zentrum Hereon operates multiple X-ray diffraction (XRD) experiments for external users, and while the experiments are very similar, their analysis is not. The variety in data analysis workflows is challenging for creating FAIR analysis workflows, because a lot of the analysis is traditionally done with small scripts and is not necessarily easily reproducible.
Pydidas [1, 2] is a...
Particle accelerators are widely used around the world for both research and industrial purposes. The largest facilities consist of synchrotron light sources, high energy physics colliders and nuclear physics research facilities. These are essential tools for scientists in a broad range of fields from life sciences to cultural heritage and engineering, and use significant national or...
Research Software Engineering is fundamental to the German National Research Data Infrastructure (NFDI). Accordingly, the "deRSE Arbeitskreis NFDI" serves as a connection point for RSEs in the NFDI inside deRSE e.V.
Within the NFDI e.V., several "sections" are dealing with overarching topics, e.g., the "Sektion Common Infrastructures" with its working groups on "Data Integration (DI)",...
Background:
Research associates at our institute frequently develop methods for investigating building systems and indoor climate technology. While these researchers excel in their domains and create valuable computational methods, they often lack formal software development training. This leads to challenges in code maintainability and accessibility, particularly when sharing research...
The traceable and collaborative acquisition and FAIRification of research data is becoming increasingly important in the Citizen Science community, in order to become part of, e.g., an archaeological knowledge graph and to enrich the already interlinked data network with qualified data. Only in this way can these data be linked with other data and feed into international initiatives such as...
The model proposed in this study aims to prevent the loss of key elements within the Scrum framework, commonly used in software development and management processes, and to facilitate their reuse. Software developers handle numerous tasks, and over time, these tasks are completed. New tasks arise, while existing tasks accumulate issues (bugs) or performance improvements. When this historical...
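One way to picture the kind of reuse the model targets is a searchable archive of completed tasks and their resolutions. The structure below is a hypothetical sketch of that idea, not the study's actual model:

```python
from dataclasses import dataclass, field

@dataclass
class ArchivedTask:
    """A completed Scrum item kept for later reuse (hypothetical structure)."""
    title: str
    resolution: str          # how the task was ultimately solved
    tags: set = field(default_factory=set)

class TaskArchive:
    """Stores finished tasks so recurring problems can reuse past solutions."""

    def __init__(self):
        self._tasks = []

    def add(self, task: ArchivedTask) -> None:
        self._tasks.append(task)

    def find(self, tag: str) -> list:
        """Retrieve past tasks by tag when a similar issue resurfaces."""
        return [t for t in self._tasks if tag in t.tags]
```

The point of such an archive is that bug fixes and performance work done long ago stay discoverable instead of being buried in a closed sprint.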
More and more natural history collections publish the data about their collection objects in digital catalogues or portals. These data include, for example, taxonomic information such as species, genus, and so on, or information on the collecting locality or on persons. These data are of interest to scientists and to the general public alike. However, taxonomic information in particular, drawn from...
The German National Research Data Infrastructure (NFDI) and its Base4NFDI initiative have introduced the role of Service Stewards to drive the development and integration of NFDI-wide basic services. They support the service developer teams, acting as a crucial interface connecting the teams and NFDI consortia and ensuring basic services are known and meet the communities’ needs. Especially...
The qualitative data analysis software OpenQDA¹ is already available as a free public beta for anyone to use. In this DEMO we will showcase the upcoming 1.0 release with real-world live coding, involving an entirely redesigned user interface as well as a set of fundamental AI plugins for preparation, coding and data analysis.
1 https://github.com/openqda
The MaterialDigital Platform (PMD) project, launched in 2019, aims to advance digitalization in material science and engineering in Germany. The project focuses on creating a robust infrastructure for managing and sharing material-related data.
The PMD Workflow Store is a key component of this initiative. It serves as a repository where scientists and engineers can access, collaborate on,...
There is a large variety of types of research software at different stages of evolution. Due to the nature of research and its software, existing models from software engineering often do not cover the unique needs of RSE projects. This lack of clear models can confuse potential software users, developers, funders, and other stakeholders who need to understand the state of a particular...
‘Personas’ are widely used within traditional software contexts during design to represent groups of users or developer types by generating descriptive concepts from user data.
‘Social coding’ practices and version control ‘code forges’ including GitHub allow fine-grained exploration of developers’ coding behaviours through analysis of commit data and usage of repository and development...
We present the GitHub organization WIAS-PDELib, which provides an ecosystem of free and open source solvers for nonlinear systems of PDEs written in Julia. WIAS-PDELib is a collection of a finite element package (ExtendableFEM.jl), a finite volume package (VoronoiFVM.jl), grid managers (e.g., ExtendableGrids.jl), and other related tools for grid generation and visualization.
The...
Research outputs in general require certain qualities to facilitate reuse as described by the FAIR Principles. For research software specifically, software engineering methods can help realize these goals. However, the desired qualities may differ between commercial and research software or even software in HPC environments. The focus on performance introduces challenges such as additional...
Musicological research is challenged with the complexities of analyzing multiple revisions and variants of musical compositions [2]. The need for systematic tools to handle this variability has become increasingly important as musicologists rely more on computational methods for analysis. This talk presents an approach that introduces feature-based versioning known from software engineering to...
This skill-up series, promoted by the EVERSE project, consists of two 45-minute sessions designed to help the audience understand how to improve their software's sustainability and FAIRness. The sessions combine presentations with demonstrations of key tools.
Session 1: Software Metadata
- Introduction to metadata for research software
- Exploring key metadata standards (Citation File...
Which software tools do we rely on?
Which tools do we take for granted?
What are their properties?
Are they legacy? Obsolete? Free? Freeware? Proprietary?
Do we contribute to them? Is their cost justified?
Are they maintainable, extensible, repairable?
What is astroturfing? What is a monopoly, and how does it affect us?
Is open source better than freeware? Better than free software?
What's the...
For interdisciplinary research, software engineering has to take into consideration the different scientific perspectives on interacting processes, non-matching terminologies, and the coordination of research teams from multiple institutions. This contribution presents an example from the field of water quality modelling in rivers, which requires the coupling of a complex biological model to a...
What defines an "RSE unit"? Where does it fit into the German academic research environment? What are typical tasks of such a unit? What could its structure look like? What are typical challenges the units face?
These are only some of the questions that the upcoming (at the time of talk submission) paper with the same name (https://github.com/DE-RSE/2023_paper-RSE-groups) addresses and which...
In a digitalized world, the use and development of research software is fundamental for research. Reusing research software can improve the quality and efficiency of research. Therefore, Chue Hong et al. defined the FAIR principles for research software [1], which describe what FAIR research software looks like. Ideally, making research software FAIR is not the last step in the research process....
In December 2024, roughly 35 members of the German RSE community will follow an invitation by the VolkswagenStiftung to Hannover for a symposium on "Code for Science or: Better Research Software through better research software competencies". The symposium is co-located with three other symposia as part of a larger event on "Digital Competencies in the Academic System". Our idea is to bring...
For the vast majority of researchers across disciplines, software use is an everyday practice, and data analysis is not the only way of scientific sensemaking with software. The talk presents survey results showing that research pipelines are populated with diverse types of software – among them software tailored for research purposes ("research software") as well as software covering broader...
Places, as the main access to everyday environments (Cresswell 2004), are not fixed entities but are in constant change. Practice theory (Schatzki 2002) describes how places are composed of social and material arrangements influencing how people interact with them and alter them according to their needs. How people read places depends on a number of factors, including their current need and their...
The Scientific Software Center (SSC), established in fall 2020 at the Interdisciplinary Centre for Scientific Computing of the University of Heidelberg, provides institutional support to researchers of all faculties in software development and software engineering best practices. The SSC promotes reproducibility and sustainability of research software. The support offered by the SSC is based...
The increasing demand for accessible Natural Language Processing (NLP) tools, as well as the continuing growth of data and the associated processing time, in the Digital Humanities (DH) community has highlighted the need for platforms that lower the barrier to advanced textual analysis across various research fields in the humanities. MONAPipe, short for “Modes of Narration and Attribution...
Researchers often come to us for RSE help with quite specific technical questions, such as "how can I parallelise this Python code?", or "why does this MATLAB function use all my RAM?", and it can seem natural to dive straight into directly answering their question.
Sometimes this is the best approach to help them, but in many cases it is worth asking some more questions about what it is...
With the retirement of a colleague, we were handed the Fortran source code for a computational software package. At that stage the software was feature complete, offering a large variety of options for the simulation of semiconductors. Along with the implementation of advanced physical models for semiconductors and optoelectronic devices, key features at the time of development were a custom scripting...
In this hands-on workshop, we introduce the open source software LinkAhead, which promotes agility in semantic data management: LinkAhead is a semantic research data management system, facilitating enhanced data findability and reusability through the embedding of data into context. Its flexible data model (the data structure can be changed without migration of existing data) makes it possible to leverage...
Research software has been categorized in different contexts to serve different goals. We start with a look at what research software is, before we discuss the purpose of research software categories. We propose a multi-dimensional categorization of research software. We present a template for characterizing such categories. As selected dimensions, we present our proposed role-based,...
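Such a characterization template can be pictured as a record with one field per dimension; the dimension names and values below are illustrative stand-ins, not the proposed taxonomy itself:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CategoryAssignment:
    """Hypothetical sketch of a multi-dimensional category record.

    Each field is one independent axis of the categorization; a given piece
    of research software gets one value per axis rather than a single label.
    """
    role: str       # e.g. "analysis script", "library", "service"
    maturity: str   # e.g. "prototype", "maintained", "archived"
    team_size: str  # e.g. "single developer", "distributed team"

tool = CategoryAssignment(role="library", maturity="maintained",
                          team_size="distributed team")
```

The benefit of a multi-dimensional record over a single label is that stakeholders can read off exactly the axis they care about, such as maturity for funders or role for prospective users.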
As research software is becoming increasingly fundamental in almost all scientific domains, its development and maintenance are of significant importance for scientists. Currently, scientists often lack the profound knowledge and tools to develop and maintain this software throughout its often long-lasting life cycle. To promote high-quality software, adequate support, and appropriate...
As part of the Incubator Initiative, the Helmholtz Association has promoted the field of research software engineering. Among other activities, the Helmholtz Research Software Directory (RSD) was developed and the Helmholtz Software Award was launched.
But these great ideas have raised questions:
• How exactly do you find the great software?
• How do you encourage the development...
The Semantic Web is a treasure trove of interconnected knowledge graphs, providing access to datasets that are invaluable for research in cultural heritage and archaeology. Resources such as triplestores (e.g., the NFDI4Objects Knowledge Graph), Wikibase instances (e.g., Wikidata and FactGrid), and Solid Pods housing geoscientific data open new avenues for interdisciplinary exploration....
Imagine: a 30-year-old Fortran code. 10K lines of three-letter variables, sparse comments of varying correctness, and no one left who remembers how it works. Amazingly, it is still in use - even though it is unclear how exactly it calculates what it calculates…
Somewhere buried in these dusty bits and bytes supposedly lies an algorithm that promises to be better than the tools that a...
Precision psychiatry faces significant challenges, including limited sample sizes and the generalizability of findings, variability in clinical phenotyping, and the need for robust biomarkers to guide personalized treatment approaches. Additionally, the integration of diverse data sources—such as multi-omics, electrophysiology, neuroimaging, clinical records, and cognitive-behavioural...
Research software plays a pivotal role in the Helmholtz Association (HGF), so the HGF decided to include software in its research evaluation as well. A dedicated task group has proposed a new evaluation approach that recognizes software quality through multiple dimensions rather than a single score. This multi-faceted framework considers different factors (like the FAIR4RS criteria), acknowledging the...
The earth system modelling framework MESSy (Modular Earth Submodel System: https://messy-interface.org/) consists of around 3.5 million lines of code, most of it written in Fortran, and is mainly used on large HPC clusters. Here, users as well as developers usually have to configure and build the software package on their own with the help of a build system, which is an essential part of...
Visualization is an important and ubiquitous tool in the daily work of weather forecasters and atmospheric researchers to analyse data from simulations and observations. The domain-specific meteorological research software Met.3D (documentation including installation instructions available at https://met3d.readthedocs.org) is an open-source effort to make interactive, 3-D, feature-based, and...
Research Software Engineering is fundamental to the German National Research Data Infrastructure (NFDI). Accordingly, the "deRSE Arbeitskreis NFDI" serves as a connection point for RSEs in the NFDI inside deRSE e.V. Within the NFDI e.V., several "sections" are dealing with overarching topics, e.g., the "Sektion Common Infrastructures" with its working groups on Data Integration (DI),...
Julia is a friendly, fast and flexible programming language for scientific computing (and beyond). In this tutorial, we will introduce Julia for high-performance computing with a focus on performance portability.
Using various examples, we will show how to write parallel programs for GPUs, shared-memory and distributed parallelism. We will use both high-level array-based programming and a...
How researchers discover new software, which systems they use and how these systems must be designed to improve the process of software discovery - these are driving questions in the area of software discovery.
There is a wide range of options for software discovery, such as code and publication repositories, domain, geographic or institution specific catalogs, classical search engines,...
SustainKieker is a software sustainability research project that aims to improve the reusability and maintainability of research software. Our project employs the Kieker Observability Framework, which started in 2006, to monitor and analyze software systems. The Kieker framework provides monitoring, analysis, and visualization support for performance evaluation and reverse engineering of the...
From pioneering work on intelligent code completion to large language models, AI has had a significant impact on software engineering over the past two decades. This keynote presentation traces the evolution of AI-assisted programming, highlighting advancements and outlining future directions.
The talk is structured in three parts. First, we’ll journey back to 2000-2010,...
Sustainability in research software is crucial to ensure others can understand, reproduce, and build upon your work effectively, potentially extending its functionality with new algorithms and its applications to new domains. In this tutorial, you will learn about tools you can leverage in the MATLAB ecosystem to effectively enhance the maintainability and reusability of your research...
During our 2-hour visit of the ZKM, you can freely explore the exhibitions and/or join one of the 25-minute tours at 16:00, 16:30, 17:00, and 17:30.
As a participant of SE, you can sign up for one of the tours here:
https://terminplaner6.dfn.de/b/6929c2654c818b42711582002ec91c79-1056327
More details about the ZKM and its current exhibitions are available at https://zkm.de
Building and maintaining industrial systems comes with a unique set of challenges. They have to live over extended periods of time, meet stringent reliability and safety standards, have to deal with budgets in low-margin industries, and need to adapt to modern expectations on ease-of-use and fast innovation.
In this keynote, we will elaborate on how ABB deals with these competing forces....
As electric vehicles become more software-centric, AI-driven features increasingly shape the driving experience—from adaptive navigation to proactive diagnostics—yet they often rely on vast amounts of sensitive data. In this session, we will explore two complementary strategies to address these challenges: first, how synthetic data generated or augmented via Large Language Models and...
Artificial intelligence (AI) has made immense progress in recent years that nobody foresaw, and it is lastingly changing the world of work, including software development. Although this leap is actually already behind us, it has triggered investments of unimaginable scale, so further major advances are to be expected. An end will only be in sight when...
AI is revolutionizing product development, but what does that mean in concrete terms for product owners and product developers? In this talk, we explore how Large Language Models are changing the rules of the game: from the innovation process to entirely new products. Drawing on practical insights, we show which opportunities and challenges AI poses for the future of our work...