Cell line-matched reference enables high-precision functional genomics
Colantoni A.;
2025-01-01
Abstract
Comparative analyses of newly available human genome assemblies highlight extensive variation that peaks at centromeres. Reliance on a single generic reference genome can thus hinder whole-genome analysis of sequencing data derived from laboratory cell lines and limit their accurate genomic manipulation. Here, we demonstrate that using an “isogenomic” diploid reference genome, specific for the experimental cell line, substantially improves the accuracy of genomic, epigenomic, and transcriptomic analyses and of genome editing compared to a non-matched reference. Using our recently generated reference genome of the widely used diploid human cell line RPE-1, we uncover haplotype-specific genetic and epigenetic divergence across all centromeres. Mapping quality of RPE-1 DNA- and RNA-seq reads improves both genome-wide and at highly divergent loci when the matched RPE1v1.1 reference is used, resolving haplotype-specific enrichment. For genome engineering experiments, optimal centromeric CRISPR guide RNA efficiency and chromosome specificity are achieved using the RPE-1 reference. Leveraging high-confidence CUT&RUN read mapping to the matched reference, we define the site of the human kinetochore and identify wide variation in its position, size and structural organization between haplotypes and chromosomes. This work establishes matched-reference genomics as a powerful framework for high-precision cell biology, calling for the systematic assembly of experimentally relevant cell line genomes.

