Mar 5 – 7, 2024
Julius-Maximilians-Universität Würzburg
Europe/Berlin timezone

Research Software Engineering in NFDI4Objects: Community building, implementation of FAIRification Tools and scripting in Digital Archaeology

Mar 6, 2024, 11:30 AM
20m
HS5

Talk (15min + 5min) Research Software for Digital Humanities RSE in Digital Humanities

Speakers

Florian Thiery (Leibniz-Zentrum für Archäologie (LEIZA)); Lutz Krister Schubert (University of Cologne)

Description

Research software plays an increasing role in the Humanities and, specifically, in Archaeology, where it supports the analysis of a vast and ever-growing volume of data. As more and more disciplines come together and perform advanced analyses (e.g., ancient DNA analysis), the demand for reproducible and testable results becomes more pressing. So far, most tools have been created ad hoc to test a single hypothesis, which does not comply with modern, objective research practices. Instead, well-designed and proven tools and methods are needed that yield reproducible and well-structured results. Tools must therefore be as accessible and FAIR as the data itself, in compliance with the standard right of access to and participation in culture.

NFDI4Objects (N4O) is a broad community dealing with material remains of human history, the FAIR and CARE principles as well as FAIR4RS. The goal is to integrate the community into so-called Community Clusters to strengthen Software as Research Data, Publication and Citation of Research Software as well as the RSE profile.

This paper presents the opportunities for community participation, the N4O FAIRification Tools (e.g., Alligator, AMT, SPARQL Unicorn) and examples from Computational Archaeology (e.g., R and Python scripts, AI) in the context of CAA-DE.

During the last conference of the German CAA chapter in Würzburg (September 2023), multiple software tools were presented and discussed, for example for modelling stratigraphy (implemented in Python), for cluster analysis of archaeological finds (implemented in R), and for applying AI techniques to the detection of archaeological sites on satellite images or the classification of Celtic coins. All these approaches were designed using different tools, programming methods and methodologies. Very few of them adhered to (Research) Software Engineering principles, which makes it difficult for anyone taking up the code to understand or re-use it. What is worse, few were published or made accessible, as the results (i.e., the data) were deemed more important than the means of generating them. This impairs reproducibility and therefore reduces the value and credibility of the results. Most tools were developed for a specific analysis and therefore began as "quick hacks" that grew into more complete programs as the research questions evolved during the analysis. This also led to a lack of re-use of existing tools and methods.

These challenges are not new in the context of IT and are representative of any scientific code development, but awareness of their relevance for good scientific work is rising only slowly. Areas new to IT are, however, more susceptible to these pitfalls. It is therefore all the more important to identify these issues from the beginning and to develop and teach good RSE principles together with these communities, to demonstrate the relevance of these principles, and to present them not as a burden but as an opportunity for continuing and improving research.

In this paper we want to highlight the challenges and approaches to engage the archaeological community in Research Software Engineering.

Primary authors

Fabian Fricke (Deutsches Archäologisches Institut); Florian Thiery (Leibniz-Zentrum für Archäologie (LEIZA)); Lutz Krister Schubert (University of Cologne)

Co-authors

Agnes Schneider (CAA Deutschland); Jürgen Landauer (CAA Deutschland)
