High-throughput single cell data analysis - A tutorial

G.H. Tinnevelt*, K. Wouters, G.J. Postma, R. Folcarelli, J.J. Jansen

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

White blood cells protect the body against disease but may also cause chronic inflammation, auto-immune diseases or leukemia. There are many different white blood cell types, whose identity and function can be studied by measuring their protein expression. High-throughput analytical instruments were therefore developed to measure multiple proteins on millions of single cells. The resulting information-rich biochemical data can be fully exploited only with multivariate statistics. Here we give an overview of the most essential steps in the multivariate data analysis of single-cell data. We use white blood cells (immunology) as a case study, but a similar approach may be used in environmental or biotechnology research. The first step is to analyze the study design and then formulate a research question. The three main designs are immunophenotyping (finding different cell types), cell activation and rare cell discovery. When preparing the data, it is essential to consider the design and to focus on the cell type of interest by removing all unwanted events. After pre-processing, the tens of thousands to millions of single cells per sample need to be converted into a cellular distribution. For immunophenotyping, a clustering method such as Self-Organizing Maps is useful; for cell activation, a model that describes the covariance, such as Principal Component Analysis, is useful. In rare cell discovery, it is useful to first model all common cells and remove them to find the rare cells. Finally, discriminant analysis based on the cellular distribution may highlight which cell (sub)types differ between groups.
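The step of converting the single cells of each sample into a cellular distribution can be illustrated with a minimal sketch. It is not the authors' implementation: the node positions, marker values and sample names below are hypothetical, and a simple nearest-node assignment stands in for mapping cells onto a trained clustering model such as a Self-Organizing Map.

```python
from collections import Counter
from math import dist


def nearest_node(cell, nodes):
    """Index of the node (cluster centre) closest to one cell in marker space."""
    return min(range(len(nodes)), key=lambda i: dist(cell, nodes[i]))


def cellular_distribution(cells, nodes):
    """Fraction of a sample's cells assigned to each node; the fractions sum to 1."""
    counts = Counter(nearest_node(cell, nodes) for cell in cells)
    return [counts[i] / len(cells) for i in range(len(nodes))]


# Hypothetical node positions for two markers. In practice these would come
# from a clustering model fitted to the data (assumption: e.g. a trained SOM).
nodes = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]

# Toy samples: each cell is a tuple of already pre-processed marker intensities.
samples = {
    "control": [(0.1, 0.2), (0.3, -0.1), (4.8, 5.1), (0.2, 4.9)],
    "patient": [(5.2, 4.9), (4.7, 5.3), (5.1, 5.0), (0.1, 0.0)],
}

# One row of cluster fractions per sample.
distributions = {
    name: cellular_distribution(cells, nodes) for name, cells in samples.items()
}

for name, fractions in distributions.items():
    print(name, fractions)
```

Each sample is reduced from a list of cells to one row of cluster fractions. Rows of this kind, stacked over all samples, are the sort of input a discriminant analysis could use to compare groups.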

Original language: English
Article number: 338872
Number of pages: 14
Journal: Analytica Chimica Acta
Volume: 1185
Publication status: Published - 15 Nov 2021

Keywords

  • ARTIFICIAL NEURAL NETWORKS
  • AUTOMATED IDENTIFICATION
  • COMPENSATION
  • FLOW-CYTOMETRY DATA
  • FLUORESCENCE
  • MASS CYTOMETRY
  • ORGANIZING MAPS
  • PARTIAL LEAST-SQUARES
  • SOLVING CHEMICAL PROBLEMS
  • VISUALIZATION
