21–23 Mar 2023
LaBRI
Europe/Paris timezone

More on I/O scheduling

23 Mar 2023, 14:00
20m
Salle Ada Lovelace (INRIA)

Project talk
I/O, storage and in-situ processing
Project Talks on I/O, Storage and Workflows

Speaker

Dr Frédéric Vivien (Inria)

Description

This talk reports on the project 'Optimization of Fault-Tolerance Strategies for Workflow Applications'.

Checkpoint operations are periodic, high-volume I/O operations and, as such, are particularly sensitive to interference. HPC applications execute on dedicated nodes but share the I/O system. As a consequence, interference arises when several applications perform I/O operations simultaneously: each I/O operation takes much longer than expected because each application is allotted only a fraction of the I/O bandwidth.
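
As a rough illustration of the slowdown described above, the following sketch computes completion times when several transfers start together and split the I/O bandwidth equally. Equal sharing is an assumption made only for this example; it is not one of the algorithms evaluated in the project, and the function name and units are hypothetical.

```python
def fair_share_finish_times(volumes, bandwidth):
    """Finish times of transfers that all start at t=0 and share bandwidth equally.

    volumes:   data volume of each transfer (e.g. in GB), hypothetical inputs
    bandwidth: total I/O bandwidth (e.g. in GB/s)
    Returns finish times ordered from the smallest to the largest transfer.
    """
    ordered = sorted(volumes)
    finish = []
    now = 0.0
    done = 0.0  # volume each still-active transfer has moved so far
    n = len(ordered)
    for i, volume in enumerate(ordered):
        active = n - i  # transfers still competing for bandwidth
        # Each active transfer gets bandwidth / active, so finishing the
        # smallest remaining one takes (volume - done) * active / bandwidth.
        now += (volume - done) * active / bandwidth
        done = volume
        finish.append(now)
    return finish


# Three 10 GB checkpoints on a 10 GB/s file system: 1 s each in isolation,
# but 3 s each when they run concurrently.
print(fair_share_finish_times([10, 10, 10], 10))
```

Under this assumption, each of the three concurrent checkpoints takes three times as long as it would alone, which is the effect that bandwidth-sharing algorithms aim to mitigate.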

This is the motivation for our study of I/O interference. We design and evaluate several new algorithms for bandwidth sharing, which we compare with existing work. We assume neither knowledge of the applications nor any regular pattern in their I/O operations.

Overall, this project talk is NOT about resilience, even though concurrent checkpoints were the initial motivation.

JLESC topic I/O

Primary author

Mr Lucas Perotin (Inria)
