Use of deep learning methods to translate drug-induced gene expression changes from rat to human primary hepatocytes

Shauna D O'Donovan*, Kurt Driessens, Daniel Lopatta, Florian Wimmenauer, Alexander Lukas, Jelmer Neeven, Evgueni Smirnov, Michael Lenz, Gokhan Ertaylan, Danyel G J Jennen, Natal A W van Riel, Rachel Cavill, Ralf L M Peeters, Theo M C M de Kok

*Corresponding author for this work

Research output: Contribution to journalArticleAcademicpeer-review


In clinical trials, animal and cell line models are often used to evaluate the potential toxic effects of a novel compound or candidate drug before progressing to human trials. However, relating the results of animal and in vitro model exposures to relevant clinical outcomes in the human in vivo system still proves challenging, relying on often putative orthologs. In recent years, multiple studies have demonstrated that the repeated dose rodent bioassay, the current gold standard in the field, lacks sufficient sensitivity and specificity in predicting toxic effects of pharmaceuticals in humans. In this study, we evaluate the potential of deep learning techniques to translate the pattern of gene expression measured following an exposure in rodents to humans, circumventing the current reliance on orthologs, and also from in vitro to in vivo experimental designs. Of the applied deep learning architectures applied in this study the convolutional neural network (CNN) and a deep artificial neural network with bottleneck architecture significantly outperform classical machine learning techniques in predicting the time series of gene expression in primary human hepatocytes given a measured time series of gene expression from primary rat hepatocytes following exposure in vitro to a previously unseen compound across multiple toxicologically relevant gene sets. With a reduction in average mean absolute error across 76 genes that have been shown to be predictive for identifying carcinogenicity from 0.0172 for a random regression forest to 0.0166 for the CNN model (p < 0.05). These deep learning architecture also perform well when applied to predict time series of in vivo gene expression given measured time series of in vitro gene expression for rats.

Original languageEnglish
Article numbere0236392
Number of pages20
Issue number8
Publication statusPublished - 11 Aug 2020



Cite this