Helmholtz AI Consultant Retreat

Europe/Berlin
Brauhaus zum Löwen Felchtaer Str. 2-4 99974 Mühlhausen/Thüringen
Adeniyi Mosaku, Antony Zappacosta, Florian Kofler (HAI), Helene Hoffmann, Lisa Barros de Andrade e Sousa (Helmholtz AI), Marie Weiel (Karlsruhe Institute of Technology), Sabrina Benassou
Description

This is the event page for the internal Helmholtz AI consultant retreat!

Dear fellow AI Consultants,

we are happy to officially announce the 2024 Helmholtz AI Consultants Retreat! This retreat will be a mixed one, ranging from actual science stuff over community building to organizational matters of our consultant work.

Location: Hotel Brauhaus zum Löwen, Felchtaer Str. 2-4, 99974 Mühlhausen / Thüringen

Start: Mon, Jan 29, 12:00-13:30 (arrival & lunch), welcome starts at 14:00.
End: Wed, Jan 31, 14:00

If you plan to attend, please register by Fri, Dec 15.

On the Indico page , you also find a separate form to submit contributions to parallel focus sessions (approx. 1h) on a topic of your choice: https://events.hifis.net/event/1172/abstracts/ 
These sessions provide you the opportunity to present your ongoing research project to interested peers, discuss challenges and problems you are currently facing, or spread the word about your favorite tool. You are completely free in what and how you present. Please remember: This is (y)our retreat, i.e., no contributions – no retreat :heavy_heart_exclamation_mark_ornament:

There should be at least one contribution from each center.

Make sure to join the retreat Mattermost channel where you can ask questions and get the latest info from the organizers.

We look forward to seeing you soon!

All the best, 
Marie, Adeniyi, Antony, Florian, Helene, Lisa, and Sabrina

Registration
Registration for HAI Consultants Retreat Winter 2024
    • 12:00 14:00
      Arrival and Lunch 2h
    • 14:00 15:00
      Welcome and Icebreaker 1h

      Get in touch with each other!

    • 15:00 16:15
      Contributions 1

      Your contributions here!

      • 15:00
        Gaussian Processes in the current AI era 1h 15m

        Today we mainly use neural nets to solve many of our tasks. Although they give great results, they do have their shortcomings, such as the limited ability to quantify uncertainty for example. In this open session we can focus on Gaussian Processes/GPs as an alternative to neural nets in certain situations, and discuss some of the recent trends and work that has been carried out on this topic. One framework of interest is gpytorch, that can be used in conjunction with PyTorch to create hybrid models, such as Deep Gaussian Process models. We can address Gaussian Processes for the problem of image in-painting, or for other tasks that you've had experience with. Let's discuss tasks and settings for which Gaussian Processes are more appropriate than neural nets, as well as promising mature tools that are the state of the art when dealing with GPs nowadays.

      • 15:00
        Helmholtz AI Consulting Outreach 1h 15m

        Let's discuss outreach activities that help us finding cool voucher projects. I'd like to address indiviual interests of different centers as I think they are quite diverse. I'm planning to send around a survey to all consultants before the retreat to obtain an overview of the status quo. Potential questions and subtopics may be for example:

        • Satisfaction with currently running voucher projects (What makes a voucher more or less intersting? Is it solely the prospect of publication?)
        • Number of voucher proposals and selection criteria (Are there enough requests? Are there too many requests? How can you identify interesting projects in the beginning?)
        • Goal of outreach activities for each center (Stay in your own research field? Foster collaboration between centers?)
        • Different kinds of outreach activities and how to organize and execute them (roadshows, networking activities, follow-up vouchers, ...)
      • 15:00
        Uncertainty Quantification 1h 15m

        Uncertainty quantification (UQ) methods enable us to equip machine learning model predictions with "error bars". These are useful in many areas from active learning to out-of-distribution detection and in general to tackle trustworthiness issues (think driving, health, legal applications).

        We'll also touch on recent methods to make generative models (hello ChatGPT) express confidence in their answers (i.e. is the answer likely to be true or a hallucination?).

        We'll give an introduction to the main ideas and most common methods, followed by a relaxed open discussion.

        This won't be a talk or hands-on workshop, but rather a casual meetup for people who apply and/or are interested in uncertainty quantification (UQ) for machine learning models. Whether you are an expert or have never heard of UQ, let's talk!

        List of notes and papers from a previous version of this format: https://github.com/elcorto/37c3_uq_meetup

    • 16:15 16:45
      Coffee Break 30m
    • 16:45 18:00
      Contributions 2

      Your contributions!

      • 16:45
        Helmholtz Blablador: An Inference Server for Scientific Large Language Models 1h

        Recent advances in large language models (LLMs) like chatGPT have demonstrated their potential for generating human-like text and reasoning about topics with natural language. However, applying these advanced LLMs requires significant compute resources and expertise that are out of reach for most academic researchers. To make scientific LLMs more accessible, we have developed Helmholtz Blablador, an open-source inference server optimized for serving predictions from customized scientific LLMs.

        Blablador provides the serving infrastructure to make models accessible via a simple API without managing servers, firewalls, authentication or infrastructure. Researchers can add their pretrained LLMs to the central hub. Other scientists can then query the collective model catalog via web or using the popular OpenAI api to add LLM functionality in other tools, like programming IDEs.

        This enables a collaborative ecosystem for scientific LLMs:

        • Researchers train models using datasets and GPUs from their own lab. No need to set up production servers. They can even provide their models with inference happening on cpus, with the use of tools like llama.cpp.
        • Models are contributed to the Blablador hub through a web UI or API call. Blablador handles loading models and publishing models for general use.
        • Added models become available for querying by other researchers.

        A model catalog displays available LLMs from different labs and research areas.

        Besides that, one can train, quantize, fine-tune and evaluate LLMs directly with Blablador.

        The inference server is available at https://helmholtz-blablador.fz-juelich.de
        Blablador is an inference server for LLM for Scientists of the Helmholtz Foundation. Anyone with Eduroam access can use it at https://helmholtz-blablador.fz-juelich.de

        Speaker: Alexandre Strube (Helmholtz AI)
      • 16:45
        Research Software Engineering Practices @ KIT 1h

        Scientific software is becoming increasingly important for the success of research and to generate new knowledge. However, the importance of codes for science has not yet been fully recognized in all research groups, labs and scientific consulting units. Some of the practical challenges are that people are lacking formal education in software engineering approaches or that the conventional approaches from industrial contexts cannot be directly mapped to academia. Therefore, we would like to give a little insight on Research Software Engineering (RSE) approach we are using at the local unit Energy@KIT. We attempt to show case each of the techniques we are using at hand of a real-world software package (co-)developed by us.

    • 18:00 19:00
      Dinner 1h
    • 09:00 12:30
      Communication Workshop 3h 30m
    • 10:45 11:15
      Coffee Break 30m
    • 12:30 13:30
      Lunch 1h
    • 13:30 18:00
      Dicussions in Interest Groups 4h 30m
    • 18:00 19:00
      Dinner 1h
    • 19:30 21:00
      Pub Quiz 1h 30m
    • 09:00 09:15
      Warm Up 15m
    • 09:15 10:45
      Contributions 3

      You contributions!

      • 09:15
        AI Ethics (in the life sciences) 1h 30m

        In today's rapidly evolving technological landscape, the ethical implications of emerging innovations are a matter of paramount concern. In this session, we will discuss current challenges regarding AI ethics, and how they matter to us as AI consultants.

        As a basis for discussion, I will give a brief input about AI ethics (in healthcare / the life sciences). We will then open the floor to questions and share knowledge and experiences of how we (should) incorporate ethical thinking in our consulting day to day work.

        If there is time and folks are interested I could
        a) can give an overview of some methods used in AI ethics in healthcare and/or the life sciences to assess and mitigate ethical challenges
        b) talk about responsibility gaps in handling AI applications in clinical practice
        c) summarize the particular biases related to single-cell sequencing modeling

        (we can decide together during this session)

        I look forward to this!

      • 09:15
        Reviewing and creating content for a dimensionality reduction course 1h 30m

        We are currently starting to devise a course entitled "A practical guide to dimension reduction and feature selection". The idea is to showcase dimensionality reduction methods, feature selection and stability analysis not on toy data but on a real-world high-dimensional dataset. We already selected a gene expression dataset as an example to discuss advantages of and issues with dimension reduction approaches. During the retreat, we would like to invite our fellow consultants to comment on the planned content, test and extend some preliminary jupyter notebooks, or add methods explanations and further contributions. All kind of input is welcome, no matter whether it is hands-on, theoretical, or conceptual, and whether you have expertise in the topic or approach it from a learner's perspective.

    • 10:45 11:15
      Coffee Break 30m
    • 11:15 12:30
      The Future of Helmholtz AI - News and Discussion 1h 15m
    • 12:30 13:30
      Lunch 1h
    • 13:30 14:00
      Wrap-up and feedback 30m