TY - JOUR
T1 - A bioinformatics pipeline for identifying homoplasmic and heteroplasmic mitochondrial DNA SNVs in single-cell RNA-Seq datasets
AU - Guan, Zhiling
AU - Lindsey, Patrick
AU - Kamps, Rick
AU - Smeets, Hubert J. M.
PY - 2025/11/1
Y1 - 2025/11/1
N2 - Mitochondrial DNA (mtDNA) single nucleotide variants (SNVs) are associated with various pathologies, predominantly in energy-demanding tissues like muscles and brain. Characterizing these SNVs at the single-cell level is crucial for understanding their mechanism and clinical manifestation. Publicly available single-cell RNA sequencing (scRNA-seq) data could be an invaluable resource, but existing pipelines fall short in reliable detection of mtDNA SNVs from scRNA-seq data. Therefore, we developed a novel bioinformatics pipeline that includes quality control, alignment to the mitochondrial genome, SNV calling, and annotation, and that filters out sequencing errors. Coverage-dependent thresholds are customizable for detecting heteroplasmic SNVs. Duplicate reads can be retained as the majority were valid biological duplicates. Strand bias errors (exceeding a 1:3 ratio), RNA modification-induced errors (identified by the presence of multiple alternative alleles at the same position), and overrepresented SNVs were removed. Our data demonstrated that this pipeline effectively detects homoplasmic and heteroplasmic mtDNA SNVs in scRNA-seq data.
AB - Mitochondrial DNA (mtDNA) single nucleotide variants (SNVs) are associated with various pathologies, predominantly in energy-demanding tissues like muscles and brain. Characterizing these SNVs at the single-cell level is crucial for understanding their mechanism and clinical manifestation. Publicly available single-cell RNA sequencing (scRNA-seq) data could be an invaluable resource, but existing pipelines fall short in reliable detection of mtDNA SNVs from scRNA-seq data. Therefore, we developed a novel bioinformatics pipeline that includes quality control, alignment to the mitochondrial genome, SNV calling, and annotation, and that filters out sequencing errors. Coverage-dependent thresholds are customizable for detecting heteroplasmic SNVs. Duplicate reads can be retained as the majority were valid biological duplicates. Strand bias errors (exceeding a 1:3 ratio), RNA modification-induced errors (identified by the presence of multiple alternative alleles at the same position), and overrepresented SNVs were removed. Our data demonstrated that this pipeline effectively detects homoplasmic and heteroplasmic mtDNA SNVs in scRNA-seq data.
KW - Mitochondrial DNA
KW - Single-cell RNA sequencing
KW - SNVs calling
KW - RADIATION
U2 - 10.1016/j.ygeno.2025.111122
DO - 10.1016/j.ygeno.2025.111122
M3 - Article
SN - 0888-7543
VL - 117
JO - Genomics
JF - Genomics
IS - 6
M1 - 111122
ER -