Combinatorial Order Pre-processing Search (COPS): A new pre-processing strategy for large-scale interpretable data analysis in process analytical technologies

Wilson Cardoso, Jussara V. Roque, Jeroen J. Jansen, Sin Yong Teng*, Reinaldo F. Teófilo*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

This work proposes Combinatorial Order Pre-processing Search (COPS), a novel approach for optimizing data pre-processing. Unlike simultaneous hyperparameter optimization, COPS employs a priori optimization to reduce computational time while refining the search space of pre-processing sequences and combinations. It allows a maximum number of pre-processing methods to be set and efficiently searches through combinations of methods using chemically relevant knowledge. In this work, 67 calibration datasets were evaluated across various analytical techniques, including fluorescence spectroscopy, gas chromatography (GC), near-infrared spectroscopy (NIR), mid-infrared spectroscopy (MIR), visible-near-infrared spectroscopy (Vis-NIR), Raman spectroscopy, nuclear magnetic resonance (NMR) spectroscopy, and voltammetry. COPS outperformed existing methodologies based on design of experiments and compounded pre-processing approaches, reducing the root mean square error of prediction (RMSEP) by 31.7% on average while also reducing model complexity (the number of latent variables), which allows for easier interpretation. This underscores the importance of combinatorial order set theory in searching for pre-processing method combinations (without fixing the sequence of pre-processing methods) to enhance model performance and interpretation. The COPS approach can be employed in process analytical technology (such as inline, online or at-line chemical sensing analytics) to enhance predictive accuracy and operational efficiency, fundamentally transforming the quality and reliability of chemical process monitoring and control.
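
To make the idea of searching ordered pre-processing combinations under a maximum number of methods concrete, the sketch below enumerates every ordered sequence of distinct methods up to a chosen length and picks the one with the lowest score. It is only an illustration of the search space, not the COPS algorithm itself: the abstract does not describe how COPS prunes or ranks candidates, so the brute-force enumeration, the method labels, and the helper names (candidate_sequences, rmsep, best_sequence) are all assumptions introduced here. In practice the score function would fit a calibration model on data pre-processed with the given sequence and return its RMSEP.

```python
from itertools import permutations
from math import sqrt


def candidate_sequences(methods, max_steps):
    """Yield every ordered sequence of distinct methods with 0..max_steps steps.

    Order matters: ("SNV", "derivative") and ("derivative", "SNV")
    are treated as different candidates.
    """
    for k in range(max_steps + 1):
        yield from permutations(methods, k)


def rmsep(y_true, y_pred):
    """Root mean square error of prediction for paired reference/predicted values."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def best_sequence(methods, max_steps, score):
    """Brute-force search (an assumption, not the published procedure):
    score every candidate sequence and return the one with the lowest score."""
    return min(candidate_sequences(methods, max_steps), key=score)


# Hypothetical method labels, chosen only for illustration.
METHODS = ("SNV", "detrend", "derivative", "smoothing")

sequences = list(candidate_sequences(METHODS, max_steps=2))
print(len(sequences))  # 1 empty + 4 single + 12 ordered pairs = 17
```

Even in this toy setting the number of candidates grows quickly with the number of methods and the maximum sequence length, which is why refining the search space ahead of model fitting matters.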
Original language: English
Article number: 108892
Journal: Computers and Chemical Engineering
Volume: 192
Publication status: Published - 1 Jan 2025

Keywords

  • Combinatorial sets
  • Optimization
  • Pre-processing
  • Process analytical technology
  • Chemical process monitoring
