25 February 2025 to 1 March 2025
Building 30.95
Europe/Berlin timezone

Modernizing Legacy Infrastructure Monitoring: Enhancing Performance with Prometheus and GitLab CI/CD

26 Feb 2025, 18:00
2h
Audimax Foyer (Building 30.95)

Str. am Forum 1, 76131 Karlsruhe
Poster
best practices for research software development
Poster and Demo Session together with Reception

Speaker

Benjamin Bruns (FZJ)

Description

Effective monitoring of (computing) infrastructure, especially in complex systems with many dependencies, is crucial for ensuring high availability and detecting performance issues early. This poster demonstrates how we integrate Prometheus and GitLab CI/CD to modernize our existing infrastructure monitoring. As the number of infrastructure checks grows, our legacy monitoring system faces increasing challenges such as performance bottlenecks, limited scalability, and maintenance difficulties. Prometheus, with its real-time monitoring and alerting capabilities, offers a scalable and flexible solution. It supports both horizontal and vertical scaling, efficient data storage, and a modular architecture that facilitates the integration of existing monitoring tools, such as specialized exporters.
Using Prometheus as our backend involves setting up a containerized system, creating data sources and targets, and configuring (custom) metrics and alerts. GitLab's CI/CD pipeline further automates building, deployment, and testing. Grafana, used alongside Prometheus, visualizes statistics and reports, such as CPU and GPU usage or file quotas. This approach improves efficiency, ensures timely alerts for potential issues, and keeps the monitoring system up to date and resilient, while giving users valuable statistics through a modern and flexible backend. Furthermore, containerizing the new monitoring system offers advantages including portability, scalability, and modularization.
The poster presents selected infrastructure systems, directly comparing the usability and performance of our legacy script-based monitoring system and the new Prometheus-based monitoring system.
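As an illustration of the "(custom) metrics" step mentioned above, the following is a minimal sketch of how a script-based check could emit its result in the Prometheus text exposition format, so that Prometheus can scrape it. It uses only the Python standard library. The metric name `filesystem_quota_used_ratio`, the `project` label, and the sample value are hypothetical and do not come from the poster; the actual exporters used by the author may look quite different.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline, as the text format requires."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_gauge(name: str, help_text: str, samples) -> str:
    """Render one gauge metric in the Prometheus text exposition format.

    `samples` is a list of (labels, value) pairs, where `labels` is a dict
    of label names to string values.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"'
            for key, val in sorted(labels.items())
        )
        # Series without labels are written as the bare metric name.
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {float(value)}")
    return "\n".join(lines) + "\n"


# Hypothetical example: one project's file quota is 75 % used.
output = render_gauge(
    "filesystem_quota_used_ratio",
    "Fraction of file quota in use.",
    [({"project": "demo"}, 0.75)],
)
print(output, end="")
```

Serving such text over HTTP (conventionally at a `/metrics` path) is what lets Prometheus collect it as a scrape target. In practice, an official client library or an existing exporter would usually be preferable to hand-written formatting.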

I want to participate in the youngRSE prize: no
