Description
Genetic analysis of medical imaging traits can help uncover disease mechanisms. Although digital pathology datasets offer vast opportunities for such analyses, methods for their genetic analysis are lacking. To fill this gap, we introduce HistoGWAS, a toolkit for genome-wide association studies (GWAS) of histological images. HistoGWAS uses semantic autoencoders to generate image embeddings and pairs them with a scalable variance component test for GWAS of the extracted embeddings. We apply HistoGWAS to eleven tissue types from the GTEx cohort and identify four genome-wide significant loci after correction for multi-trait testing (P < 1.6 × 10⁻¹⁰). A key advance of HistoGWAS is its ability to visualize the histological traits associated with the discovered variants, which we use alongside traditional post-GWAS analyses to study the identified loci. Finally, we provide a comprehensive power analysis, laying the groundwork for the design of future histology cohorts tailored to genetic investigation with HistoGWAS.
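
To make the idea of testing one variant against a whole embedding vector concrete, the following is a minimal sketch of a multi-trait score statistic with a permutation p-value. It is not the HistoGWAS implementation: the function name `embedding_score_test`, the identity residual covariance, the absence of covariates, and the use of permutations in place of a scalable analytic null are all assumptions made for illustration.

```python
import numpy as np


def embedding_score_test(g, Y, n_perm=999, seed=0):
    """Hypothetical sketch: test one variant against a multivariate embedding.

    g : (n,) array of genotype dosages for a single variant.
    Y : (n, d) array of image embeddings (one row per individual).

    The statistic Q = sum_j (g^T y_j)^2 = g^T Y Y^T g aggregates the
    per-dimension association signal. Under the assumption of independent,
    unit-variance residuals, this is the form a variance component score
    statistic takes. The null distribution is approximated by permuting
    genotypes, which is simple but far less scalable than an analytic test.
    """
    rng = np.random.default_rng(seed)
    g = np.asarray(g, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Center genotype and embeddings (stands in for covariate adjustment).
    g = g - g.mean()
    Y = Y - Y.mean(axis=0)

    def stat(gv):
        # Sum of squared genotype-embedding cross-products over dimensions.
        return float(np.sum((gv @ Y) ** 2))

    q_obs = stat(g)
    q_null = np.array([stat(rng.permutation(g)) for _ in range(n_perm)])

    # Add-one correction keeps the permutation p-value strictly positive.
    p_value = (1 + np.sum(q_null >= q_obs)) / (n_perm + 1)
    return q_obs, float(p_value)
```

With 999 permutations the smallest attainable p-value is 0.001, so a threshold such as the one reported above would require an analytic or otherwise scalable null distribution rather than permutations.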