The sheep pangenome project

Pangenomes, represented as variation graphs, have become valuable genomic resources for investigating the breadth of genetic diversity across species. These graphs capture several types of genetic variation, including single nucleotide variants (SNVs), structural variants (SVs), and copy number variants (CNVs), all of which contribute significantly to phenotypic diversity within and between species. The Ovis genus exhibits extensive biodiversity and phenotypic variation. In this study, we use a pangenome constructed from 25 sheep breeds, comprising 22 domestic breeds and three wild relatives, to investigate the genetic drivers of trait variation within Ovis. We compare two tools for constructing variation graphs: Pangenome Graph Builder (PGGB) and Minigraph-Cactus (MC). PGGB captures a broader spectrum of genetic variation and therefore produces more complex graphs, whereas MC simplifies graph structure through pruning, which makes its graphs better suited to downstream genotyping. We discuss the computational challenges of processing PGGB graphs, particularly their higher resource demands. We also present examples of SVs identified through these graphs that are linked to known biological traits in sheep. Although variation graph-based approaches offer powerful insights into SVs within species, the computational burden of genotyping short reads against complex graphs remains a significant limitation.