Description
Sustainability and the ecological impact of deep-sea mining operations are critical concerns that environmental monitoring is meant to address. Environmental DNA (eDNA) sequencing coupled with machine learning (ML) has proven effective for accurate monitoring, particularly in coastal environments. We aim to extend this fast and reliable approach to deep-sea environments.
Specifically, we seek to understand and predict changes in the environmental quality of the deep-sea ecosystem induced by nodule harvesting. Working with microbial communities identified through eDNA sequencing, we seek to uncover species interactions and responses to changes in environmental parameters. While supervised machine learning (SML) has proven effective in coastal settings, its applicability in the deep sea remains uncertain. Tree-based methods such as Random Forest are promising candidates for the deep sea, given the expected high dimensionality of ecological datasets derived from sequencing data. We also want to explore ML approaches such as k-means clustering and network analysis, which extract information without prior ecological knowledge of the microorganisms, a crucial property in the largely unexplored deep-sea environment. Overall, the classification and regression analyses offered by a range of ML algorithms enable several objectives: inferring microbial community interactions, predicting biological responses, and categorizing samples.
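To make the clustering idea concrete, the following is a minimal sketch of k-means applied to relative-abundance profiles, grouping samples without any prior ecological labels. It is not the MANIDE workflow: the read counts are invented for illustration, the function names are hypothetical, and the deterministic farthest-first initialization is an assumption made only to keep the toy example reproducible. In practice a library implementation would normally be used.

```python
def relative_abundance(counts):
    """Convert raw read counts for one sample into proportions."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(samples, k):
    """Deterministic seeding: start at the first sample, then repeatedly
    add the sample farthest from all centroids chosen so far."""
    centroids = [list(samples[0])]
    while len(centroids) < k:
        farthest = max(
            samples,
            key=lambda s: min(squared_distance(s, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(samples, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_first_init(samples, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(s, centroids[j]))
            for s in samples
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical read counts (rows: samples, columns: four taxa).
# The first three rows mimic one community state, the last three another.
counts = [
    [120, 30, 5, 45],
    [110, 35, 8, 47],
    [130, 25, 4, 41],
    [20, 15, 140, 25],
    [25, 10, 150, 15],
    [18, 20, 135, 27],
]
profiles = [relative_abundance(row) for row in counts]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

On this toy input the two groups of samples end up in separate clusters. Real eDNA tables have thousands of taxa and far noisier structure, so the choice of distance measure, data transformation, and number of clusters would all need to be validated.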
Here we present the MANIDE project, which is dedicated to this exploration and comprehensively tests sequencing approaches (metabarcoding, metagenomics, and metatranscriptomics) in combination with ML. We will present recent findings and discuss potential bottlenecks, for example spatial and temporal heterogeneity, which must be separated from the impact signal we aim to extract. Furthermore, the project is committed to transparency, making all data, workflows, and findings available to the scientific community.