Data Visualization for Biological Sciences
In this practical, interactive course, you'll learn to create engaging, scientifically accurate visualizations specifically tailored for biological research. Using programming tools like R and Python, along with graphic editing software such as Inkscape, you'll gain essential skills for clearly and effectively visualizing genomic and biological data.
Some prior programming experience is required for this course. Our aim is not to dive deeply into every coding detail, but rather to equip you with practical skills to adapt existing visualization scripts for your own data. Participants will collaborate in small groups (3-4 participants per group, about 10 groups), creating visualizations from provided datasets that cover topics like allele frequencies, mutation patterns, and other genomic data.
Additionally, participants are encouraged to explore AI tools like ChatGPT and Gemini for coding support, troubleshooting, and inspiration as they work on their visualizations. We'll discuss effective and responsible ways to use such tools, addressing both their advantages and potential pitfalls.
Each session concludes with group discussions and instructor-led feedback to help refine visualizations and deepen understanding of best practices, visualization design principles, and publication standards.
This course is designed for PhD students, postdocs and early-career researchers in biology, bioinformatics, and related fields. While the concepts and skills taught are broadly applicable across the life sciences, our use cases and datasets are primarily drawn from genomics.
What to Bring:
- A laptop (Windows, macOS, or Linux). Please have R and Python installed. Alternatively, you can use Google Colab for the Python- and R-based exercises.
- Access to an AI assistant for coding support and troubleshooting: either your preferred one (e.g., ChatGPT, Gemini, or another system you already use) or a Google account.
- Inkscape (free, open-source graphic editing software).
Course Schedule Overview
This course will take place from 18 to 20 November 2025 (6 hours per day) online via Zoom.
Tue, Nov 18, 10 am - 5 pm: Foundations and Basic Plotting
Begin by exploring fundamental plot types essential to biological research, such as scatter plots, bar plots, violin plots, histograms, and boxplots. You’ll become familiar with core data handling functions (head(), View()), data formats (TSV vs. CSV, numeric formatting), and critical concepts like adding annotations to your plots. By the end of the day, you'll confidently create and modify essential visualizations for biological datasets.
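As a rough illustration of the TSV vs. CSV and numeric formatting points above, here is a minimal Python sketch that uses only the standard library. The table contents, the column names, and the helper names `read_table` and `head` are hypothetical and are not course material; `head` only mimics the idea behind R's head().

```python
import csv
import io

# Hypothetical allele-frequency table, written once with commas and once with tabs.
CSV_TEXT = "sample,allele,frequency\nS1,A,0.12\nS2,A,0.30\nS3,G,0.58\n"
TSV_TEXT = CSV_TEXT.replace(",", "\t")


def read_table(text, delimiter):
    """Parse delimited text into a list of dicts, converting frequency to float."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        # Values are read as strings, so numeric columns need explicit conversion.
        row["frequency"] = float(row["frequency"])
        rows.append(row)
    return rows


def head(rows, n=2):
    """Return the first n rows, similar in spirit to head() in R."""
    return rows[:n]


csv_rows = read_table(CSV_TEXT, ",")
tsv_rows = read_table(TSV_TEXT, "\t")
print(head(csv_rows))
```

The only difference between the two formats is the delimiter, so both calls produce the same rows once the correct delimiter is passed.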
Wed, Nov 19, 10 am - 5 pm: Advanced Visualization Techniques
Progress to more sophisticated visualization methods commonly used in genomics and molecular biology. You'll learn dimensionality reduction techniques, particularly principal component analysis (PCA), and how to create detailed, annotated heatmaps using tools like ComplexHeatmap in R. Additionally, you'll understand the importance of color choices and effective layouts that highlight critical biological insights.
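For readers unfamiliar with PCA, the following is a minimal sketch of the idea in Python, assuming NumPy is installed. It uses a small random matrix as a stand-in for real data; the matrix shape, the random seed, and the function name `pca` are illustrative assumptions, and the course itself may use different tools.

```python
import numpy as np

# Hypothetical data: 6 samples (rows) x 4 genes (columns) of random values.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))


def pca(X, n_components=2):
    """Project samples onto the leading principal components using SVD."""
    Xc = X - X.mean(axis=0)  # center each gene (column) at zero
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # sample coordinates
    explained = (S ** 2) / np.sum(S ** 2)  # fraction of variance per component
    return scores, explained[:n_components]


scores, explained = pca(X)
print(scores.shape)
print(explained)
```

Each row of `scores` gives one sample's position in a two-dimensional PCA plot, and `explained` is the kind of value usually shown as a percentage on the axis labels.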
Thu, Nov 20, 10 am - 5 pm: From Plot to Publication
This final session will guide you through transforming your visualizations into polished, publication-ready graphics. You'll learn basic vector-graphics editing techniques in Inkscape and explore critical details like DPI settings, appropriate file formats, and best practices for assembling multi-panel figures. This day emphasizes clarity, readability, and visual impact, ensuring your research graphics effectively communicate your scientific findings.
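To make the DPI point concrete: for raster formats such as PNG or TIFF, the pixel dimensions follow from the printed size and the resolution, whereas vector formats such as SVG or PDF are resolution-independent. The sketch below shows the arithmetic in Python. The figure size and the 300 DPI value are illustrative assumptions, not requirements of any particular journal, and `pixels_for_print` is a hypothetical helper name.

```python
MM_PER_INCH = 25.4


def pixels_for_print(width_mm, height_mm, dpi):
    """Return the (width, height) in pixels needed to print at the given size and DPI."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


# Hypothetical figure: 127 mm x 101.6 mm (5 in x 4 in) at 300 DPI.
size_px = pixels_for_print(127, 101.6, 300)
print(size_px)
```

Doubling the DPI doubles each pixel dimension, which quadruples the pixel count of the exported image.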