Multi-set Pre-processing of Multicolor Flow Cytometry Data

R. Folcarelli; G.H. Tinnevelt; B. Hilvering; K. Wouters; S. van Staveren; G.J. Postma; N. Vrisekoop; L.M.C. Buydens; L. Koenderman; J.J. Jansen

doi:10.1038/s41598-020-66195-3

R. Folcarelli^*, G.H. Tinnevelt^*, B. Hilvering, K. Wouters, S. van Staveren, G.J. Postma, N. Vrisekoop, L.M.C. Buydens, L. Koenderman, J.J. Jansen

^*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Flow Cytometry is an analytical technology that simultaneously measures multiple markers per single cell. Tens of thousands to millions of single cells can be measured per sample, and each sample may contain a different number of cells. All samples may be bundled together, leading to a 'multi-set' structure. Many multivariate methods have been developed for Flow Cytometry data, but none of them considers this structure in its quantitative handling of the data. The standard pre-processing used by existing multivariate methods yields models that are mainly influenced by the samples with more cells, whereas such a model should provide a balanced view of the biomedical information within all measurements. We propose an alternative 'multi-set' pre-processing that corrects for the difference in the number of cells measured, balancing the relative importance of each multi-cell sample in the data while using all data collected from these expensive analyses. Moreover, one case example shows how multi-set pre-processing may benefit the removal of undesired measurement-to-measurement variability, and another shows how class-based multi-set pre-processing enhances the studied response upon comparison to the control reference samples. Our results show that adjusting data analysis algorithms to consider this multi-set structure may greatly benefit immunological insight and classification performance of Flow Cytometry data.
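The abstract does not state the formulas behind the proposed pre-processing, so the following is only a minimal sketch of the weighting problem it describes, not the authors' method. It assumes that "standard" centering uses the mean over all pooled cells, and it contrasts that with a hypothetical sample-balanced mean in which every sample contributes equally regardless of its cell count. The function names and the toy data are illustrative assumptions.

```python
from statistics import fmean

# Each sample is a list of cells; each cell is a list of marker intensities.
# All names and numbers here are illustrative assumptions, not taken from the paper.

def pooled_mean(samples):
    """Per-marker mean over all cells pooled together (samples with more cells dominate)."""
    cells = [cell for sample in samples for cell in sample]
    n_markers = len(cells[0])
    return [fmean(cell[j] for cell in cells) for j in range(n_markers)]

def balanced_mean(samples):
    """Per-marker mean of the per-sample means (every sample has equal weight)."""
    n_markers = len(samples[0][0])
    per_sample = [
        [fmean(cell[j] for cell in sample) for j in range(n_markers)]
        for sample in samples
    ]
    return [fmean(means[j] for means in per_sample) for j in range(n_markers)]

def center(samples, mean):
    """Subtract a per-marker mean from every cell, keeping the sample grouping."""
    return [[[x - m for x, m in zip(cell, mean)] for cell in sample] for sample in samples]

# Toy data: one marker, a large sample (9 cells) and a small sample (1 cell).
sample_a = [[1.0] for _ in range(9)]
sample_b = [[11.0]]
samples = [sample_a, sample_b]

print(pooled_mean(samples))    # [2.0]  pulled towards the 9-cell sample
print(balanced_mean(samples))  # [6.0]  midway between the two sample means
centered = center(samples, balanced_mean(samples))
```

In this toy case the pooled mean sits close to the larger sample, while the balanced mean treats both samples alike and still uses every measured cell. The same idea could be extended to scaling, under the same assumptions.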

Original language	English
Article number	9716
Number of pages	12
Journal	Scientific Reports
Volume	10
Issue number	1
DOIs	https://doi.org/10.1038/s41598-020-66195-3
Publication status	Published - 16 Jun 2020

Keywords

heterogeneity
neutrophils
reveals
visualization

Access to Document

10.1038/s41598-020-66195-3 (Licence: CC BY)

Cite this

@article{e55c67535b6d48709c0e443b6c36ae98,

title = "Multi-set Pre-processing of Multicolor Flow Cytometry Data",

abstract = "Flow Cytometry is an analytical technology that simultaneously measures multiple markers per single cell. Tens of thousands to millions of single cells can be measured per sample, and each sample may contain a different number of cells. All samples may be bundled together, leading to a 'multi-set' structure. Many multivariate methods have been developed for Flow Cytometry data, but none of them considers this structure in its quantitative handling of the data. The standard pre-processing used by existing multivariate methods yields models that are mainly influenced by the samples with more cells, whereas such a model should provide a balanced view of the biomedical information within all measurements. We propose an alternative 'multi-set' pre-processing that corrects for the difference in the number of cells measured, balancing the relative importance of each multi-cell sample in the data while using all data collected from these expensive analyses. Moreover, one case example shows how multi-set pre-processing may benefit the removal of undesired measurement-to-measurement variability, and another shows how class-based multi-set pre-processing enhances the studied response upon comparison to the control reference samples. Our results show that adjusting data analysis algorithms to consider this multi-set structure may greatly benefit immunological insight and classification performance of Flow Cytometry data.",

keywords = "heterogeneity, neutrophils, reveals, visualization",

author = "R. Folcarelli and G.H. Tinnevelt and B. Hilvering and K. Wouters and {van Staveren}, S. and G.J. Postma and N. Vrisekoop and L.M.C. Buydens and L. Koenderman and J.J. Jansen",

note = "Funding Information: This research received funding from the Netherlands Organization for Scientific Research (NWO) in the framework of the Technology Area COAST of the Fund New Chemical Innovations. Publisher Copyright: {\textcopyright} 2020, The Author(s).",

year = "2020",

month = jun,

day = "16",

doi = "10.1038/s41598-020-66195-3",

language = "English",

volume = "10",

journal = "Scientific Reports",

issn = "2045-2322",

publisher = "Nature Publishing Group",

number = "1",

}

TY - JOUR

T1 - Multi-set Pre-processing of Multicolor Flow Cytometry Data

AU - Folcarelli, R.

AU - Tinnevelt, G.H.

AU - Hilvering, B.

AU - Wouters, K.

AU - van Staveren, S.

AU - Postma, G.J.

AU - Vrisekoop, N.

AU - Buydens, L.M.C.

AU - Koenderman, L.

AU - Jansen, J.J.

N1 - Funding Information: This research received funding from the Netherlands Organization for Scientific Research (NWO) in the framework of the Technology Area COAST of the Fund New Chemical Innovations. Publisher Copyright: © 2020, The Author(s).

PY - 2020/6/16

Y1 - 2020/6/16

AB - Flow Cytometry is an analytical technology that simultaneously measures multiple markers per single cell. Tens of thousands to millions of single cells can be measured per sample, and each sample may contain a different number of cells. All samples may be bundled together, leading to a 'multi-set' structure. Many multivariate methods have been developed for Flow Cytometry data, but none of them considers this structure in its quantitative handling of the data. The standard pre-processing used by existing multivariate methods yields models that are mainly influenced by the samples with more cells, whereas such a model should provide a balanced view of the biomedical information within all measurements. We propose an alternative 'multi-set' pre-processing that corrects for the difference in the number of cells measured, balancing the relative importance of each multi-cell sample in the data while using all data collected from these expensive analyses. Moreover, one case example shows how multi-set pre-processing may benefit the removal of undesired measurement-to-measurement variability, and another shows how class-based multi-set pre-processing enhances the studied response upon comparison to the control reference samples. Our results show that adjusting data analysis algorithms to consider this multi-set structure may greatly benefit immunological insight and classification performance of Flow Cytometry data.

KW - heterogeneity

KW - neutrophils

KW - reveals

KW - visualization

U2 - 10.1038/s41598-020-66195-3

DO - 10.1038/s41598-020-66195-3

M3 - Article

C2 - 32546713

SN - 2045-2322

VL - 10

JO - Scientific Reports

JF - Scientific Reports

IS - 1

M1 - 9716

ER -