Abstract
The proposed method, XPCA Gen, introduces a novel approach to synthetic tabular data generation that utilises relevant patterns present in the data. It does so using principal components obtained through XPCA decomposition (a probabilistic interpretation of standard PCA) of the original data. Since new data points are obtained by synthesising the principal components, the generated data is an accurate and noise-redundant representation of the original data with a good diversity of data points. The experimental results obtained on benchmark datasets (e.g. CMC, PID) demonstrate the method's performance on ML utility metrics (accuracy, precision, recall), showing its ability to capture inherent patterns in the dataset. Along with the ML utility metrics, a high Hausdorff distance indicates diversity in the generated data without compromising statistical properties. Moreover, unlike complex neural networks, the method is not data hungry. Overall, XPCA Gen emerges as a promising solution for data privacy preservation and robust model training with diverse samples.
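The general idea described in the abstract, generating new rows from principal components and checking diversity with the Hausdorff distance, can be illustrated with a minimal sketch. This is not the authors' XPCA Gen implementation: plain PCA (via SVD) stands in for XPCA, the Gaussian sampling of component scores is an assumption made for illustration, and the function names `pca_generate` and `hausdorff` and the toy data are hypothetical.

```python
import numpy as np


def pca_generate(X, n_components, n_samples, seed=0):
    """Sample synthetic rows by drawing Gaussian scores along the top principal components.

    Assumption: plain PCA stands in for XPCA; the sampling scheme is illustrative only.
    """
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    # SVD of the centred data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]
    # Standard deviation of the original data's scores along each kept component.
    score_std = s[:n_components] / np.sqrt(X.shape[0] - 1)
    scores = rng.normal(size=(n_samples, n_components)) * score_std
    # Map the sampled scores back to the original feature space.
    return scores @ components + mean


def hausdorff(A, B):
    """Symmetric Hausdorff distance between two point sets (rows are points)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())


rng = np.random.default_rng(42)
# Hypothetical toy data: 200 rows, 4 correlated numeric columns plus small noise.
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 4))
X = X + rng.normal(scale=0.05, size=(200, 4))

X_syn = pca_generate(X, n_components=2, n_samples=200)
print("Hausdorff distance:", hausdorff(X, X_syn))
```

In this sketch, keeping only the leading components discards the low-variance directions, and the Hausdorff distance between the original and generated sets gives one rough indication of how far the synthetic points spread from the originals.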
Original language | English |
---|---|
Title of host publication | Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods |
Editors | Modesto Castrillon-Santana, Maria De Marsico, Ana Fred |
Publisher | Science and Technology Publications, Lda |
Pages | 141-151 |
Number of pages | 11 |
Volume | 1 |
ISBN (Print) | 9789897586842 |
Publication status | Published - 2024 |
Event | 13th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2024 - Rome, Italy; Duration: 24 Feb 2024 → 26 Feb 2024; Conference number: 13; https://icpram.scitevents.org/NeroPRAI.aspx?y=2024 |
Conference
Conference | 13th International Conference on Pattern Recognition Applications and Methods, ICPRAM 2024 |
---|---|
Abbreviated title | ICPRAM 2024 |
Country/Territory | Italy |
City | Rome |
Period | 24/02/24 → 26/02/24 |
Internet address | https://icpram.scitevents.org/NeroPRAI.aspx?y=2024 |
Keywords
- ML Utility
- Privacy Preservation
- Tabular Data Generation
- XPCA Decomposition