Targeted RNA next generation sequencing analysis of cervical smears can predict the presence of hrHPV-induced cervical lesions

Karolina M Andralojc, Duaa Elmelik, Menno Rasing, Bernard Pater, Albert G Siebers, Ruud Bekkers, Martijn A Huynen, Johan Bulten, Diede Loopik, Willem J G Melchers, William P J Leenders*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


BACKGROUND: Because most cervical cancers are caused by high-risk human papillomaviruses (hrHPVs), cervical cancer prevention programs increasingly employ hrHPV testing as the primary test. The high sensitivity of HPV tests is accompanied by low specificity, resulting in high rates of overdiagnosis and overtreatment. Targeted circular probe-based RNA next generation sequencing (ciRNAseq) allows for the quantitative detection of RNAs of interest with high sequencing depth. Here, we examined the potential of ciRNAseq testing on cervical scrapes to identify hrHPV-positive women at risk of having or developing high-grade cervical intraepithelial neoplasia (CIN).

METHODS: We performed ciRNAseq on 610 cervical scrapes from the Dutch cervical cancer screening program to detect gene expression from 15 hrHPV genotypes and from 429 human genes. Differentially expressed hrHPV and host genes in scrapes from women with outcome "no CIN" or "CIN2+" were identified, and a model was built to distinguish these groups.

RESULTS: Apart from increasing percentages of hrHPV oncogene expression from "no CIN" to high-grade cytology/histology, we identified genes involved in cell cycle regulation, tyrosine kinase signaling pathways, immune suppression, and DNA repair that were expressed at significantly higher levels in scrapes with high-grade cytology and histology. Machine learning using random forest on all the expression data resulted in a model that distinguished "no CIN" from "CIN2+" in an independent data set with sensitivity and specificity of 85 ± 8% and 72 ± 13%, respectively.
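As a reminder of what the two reported metrics measure, the following minimal Python sketch computes sensitivity and specificity from binary labels. It is illustrative only: the function name, the label encoding (1 for "CIN2+", 0 for "no CIN"), and the toy labels are assumptions made for this example and are not taken from the study's data or code.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (hypothetical, for illustration): 1 = "CIN2+", 0 = "no CIN".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # CIN2+ correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # CIN2+ missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # no CIN correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # no CIN wrongly flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # fraction of true CIN2+ cases detected
    specificity = tn / (tn + fp)  # fraction of true no-CIN cases cleared
    return sensitivity, specificity


# Toy labels, invented for this example (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In a screening context, lower specificity means more women without CIN are flagged for follow-up, which is the overdiagnosis concern raised in the background.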

CONCLUSIONS: CiRNAseq on exfoliated cells in cervical scrapes measures hrHPV (onco)gene expression and host gene expression in a single assay and, in the process, identifies the HPV genotype. By combining these data and applying machine learning protocols, the risk of CIN can be calculated. Because ciRNAseq can be performed at high throughput, which makes it cost-effective, it can be a promising screening technology to stratify women at risk of CIN2+. Further increasing specificity by model improvement in larger cohorts is warranted.

Original language: English
Article number: 206
Number of pages: 12
Journal: BMC Medicine
Issue number: 1
Publication status: Published - 9 Jun 2022


  • Early Detection of Cancer/methods
  • Female
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Papillomaviridae/genetics
  • Papillomavirus Infections/complications
  • RNA
  • Uterine Cervical Neoplasms/diagnosis
  • Vaginal Smears
  • Targeted RNA sequencing
  • HPV
  • High risk human papilloma virus
  • Screening
  • Cervical intraepithelial neoplasia
  • Machine learning
