Weight-of-evidence through shrinkage and spline binning for interpretable nonlinear classification

J. Raymaekers; W. Verbeke; T. Verdonck

doi:10.1016/j.asoc.2021.108160

J. Raymaekers, W. Verbeke, T. Verdonck*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

In many practical applications, such as fraud detection, credit risk modeling or medical decision making, classification models for assigning instances to a predefined set of classes are required to be both precise and interpretable. Linear modeling methods such as logistic regression are often adopted since they offer an acceptable balance between precision and interpretability. Linear methods, however, are not well equipped to handle categorical predictors with high cardinality or to exploit nonlinear relations in the data. As a solution, data preprocessing methods such as weight of evidence are typically used for transforming the predictors. The binning procedure that underlies the weight-of-evidence approach, however, has been little researched and typically relies on ad hoc or expert-driven procedures. The objective in this paper, therefore, is to propose a formalized, data-driven and powerful method. To this end, we explore the discretization of continuous variables through the binning of spline functions, which allows for capturing nonlinear effects in predictor variables and yields highly interpretable predictors that take only a small number of discrete values. Moreover, we extend the weight-of-evidence approach and propose to estimate the proportions using shrinkage estimators. Together, this method offers an improved ability to exploit both nonlinear and categorical predictors to achieve increased classification precision while maintaining the interpretability of the resulting model and decreasing the risk of overfitting. We present the results of a series of experiments in fraud detection and credit risk settings, which illustrate the effectiveness of the presented approach.
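To make the idea of estimating weight-of-evidence proportions with shrinkage concrete, here is a minimal Python sketch for a single categorical predictor and a binary target. It is an illustration under stated assumptions, not the estimator from the paper: the function name `woe_shrunk`, the `strength` parameter, and the pseudo-count form of shrinkage (pulling each class-conditional proportion toward the pooled share of the category) are all hypothetical choices made for this example. The sign convention (positive class in the numerator) is likewise an assumption.

```python
import math
from collections import Counter


def woe_shrunk(categories, labels, strength=1.0):
    """Weight of evidence per category, with pseudo-count shrinkage.

    Illustrative only: this is NOT the paper's estimator. Each
    class-conditional proportion is pulled toward the pooled share of
    the category, so the WoE of sparse categories is pulled toward 0.
    `strength` must be > 0 so that every proportion stays positive.
    """
    if strength <= 0:
        raise ValueError("strength must be positive")

    # Count occurrences of each category within each class.
    pos = Counter(c for c, y in zip(categories, labels) if y == 1)
    neg = Counter(c for c, y in zip(categories, labels) if y == 0)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    n_all = n_pos + n_neg

    woe = {}
    for c in set(pos) | set(neg):
        pooled = (pos[c] + neg[c]) / n_all  # overall share of category c
        p_pos = (pos[c] + strength * pooled) / (n_pos + strength)
        p_neg = (neg[c] + strength * pooled) / (n_neg + strength)
        woe[c] = math.log(p_pos / p_neg)
    return woe


# Toy data (made up): "c" is a rare category seen once, in the positive class.
cats = ["a"] * 4 + ["b"] * 4 + ["c"]
ys = [1, 1, 1, 0, 0, 0, 0, 1, 1]

weak = woe_shrunk(cats, ys, strength=1.0)
strong = woe_shrunk(cats, ys, strength=1000.0)
```

Without shrinkage, the rare category "c" would have an empty negative count and an undefined (infinite) weight of evidence. With the pseudo-counts it stays finite, and a larger `strength` moves it closer to zero.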

Original language	English
Article number	108160
Number of pages	12
Journal	Applied Soft Computing
Volume	115
DOIs	https://doi.org/10.1016/j.asoc.2021.108160
Publication status	Published - 1 Jan 2022

Keywords

Feature engineering
Interpretability
Fraud detection
Credit risk
MODELS
AREA
PERFORMANCE
ALGORITHMS

Access to Document

10.1016/j.asoc.2021.108160

Full text: Final published version, 1.45 MB. Licence: Taverne

Cite this

@article{b5a7ca81a0d94de5b1dfbce515f2ef12,
  title = "Weight-of-evidence through shrinkage and spline binning for interpretable nonlinear classification",
  abstract = "In many practical applications, such as fraud detection, credit risk modeling or medical decision making, classification models for assigning instances to a predefined set of classes are required to be both precise and interpretable. Linear modeling methods such as logistic regression are often adopted since they offer an acceptable balance between precision and interpretability. Linear methods, however, are not well equipped to handle categorical predictors with high cardinality or to exploit nonlinear relations in the data. As a solution, data preprocessing methods such as weight of evidence are typically used for transforming the predictors. The binning procedure that underlies the weight-of-evidence approach, however, has been little researched and typically relies on ad hoc or expert-driven procedures. The objective in this paper, therefore, is to propose a formalized, data-driven and powerful method. To this end, we explore the discretization of continuous variables through the binning of spline functions, which allows for capturing nonlinear effects in predictor variables and yields highly interpretable predictors that take only a small number of discrete values. Moreover, we extend the weight-of-evidence approach and propose to estimate the proportions using shrinkage estimators. Together, this method offers an improved ability to exploit both nonlinear and categorical predictors to achieve increased classification precision while maintaining the interpretability of the resulting model and decreasing the risk of overfitting. We present the results of a series of experiments in fraud detection and credit risk settings, which illustrate the effectiveness of the presented approach.",
  keywords = "Feature engineering, Interpretability, Fraud detection, Credit risk, MODELS, AREA, PERFORMANCE, ALGORITHMS",
  author = "J. Raymaekers and W. Verbeke and T. Verdonck",
  note = "data source: publicly shared datasets for illustration",
  year = "2022",
  month = jan,
  day = "1",
  doi = "10.1016/j.asoc.2021.108160",
  language = "English",
  volume = "115",
  journal = "Applied Soft Computing",
  issn = "1568-4946",
  publisher = "Elsevier Science",
}

TY - JOUR
T1 - Weight-of-evidence through shrinkage and spline binning for interpretable nonlinear classification
AU - Raymaekers, J.
AU - Verbeke, W.
AU - Verdonck, T.
N1 - data source: publicly shared datasets for illustration
PY - 2022/1/1
Y1 - 2022/1/1
AB - In many practical applications, such as fraud detection, credit risk modeling or medical decision making, classification models for assigning instances to a predefined set of classes are required to be both precise and interpretable. Linear modeling methods such as logistic regression are often adopted since they offer an acceptable balance between precision and interpretability. Linear methods, however, are not well equipped to handle categorical predictors with high cardinality or to exploit nonlinear relations in the data. As a solution, data preprocessing methods such as weight of evidence are typically used for transforming the predictors. The binning procedure that underlies the weight-of-evidence approach, however, has been little researched and typically relies on ad hoc or expert-driven procedures. The objective in this paper, therefore, is to propose a formalized, data-driven and powerful method. To this end, we explore the discretization of continuous variables through the binning of spline functions, which allows for capturing nonlinear effects in predictor variables and yields highly interpretable predictors that take only a small number of discrete values. Moreover, we extend the weight-of-evidence approach and propose to estimate the proportions using shrinkage estimators. Together, this method offers an improved ability to exploit both nonlinear and categorical predictors to achieve increased classification precision while maintaining the interpretability of the resulting model and decreasing the risk of overfitting. We present the results of a series of experiments in fraud detection and credit risk settings, which illustrate the effectiveness of the presented approach.

KW - Feature engineering
KW - Interpretability
KW - Fraud detection
KW - Credit risk
KW - MODELS
KW - AREA
KW - PERFORMANCE
KW - ALGORITHMS
U2 - 10.1016/j.asoc.2021.108160
DO - 10.1016/j.asoc.2021.108160
M3 - Article
SN - 1568-4946
VL - 115
JO - Applied Soft Computing
JF - Applied Soft Computing
M1 - 108160
ER -