Acta Univ. Agric. Silvic. Mendelianae Brun. 2013, 61, 2709-2716
Published online 2013-12-24

Indeterminate values of target variable in development of credit scoring models

Martin Řezáč1, Lukáš Toma2

1Department of Mathematics and Statistics, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
2Polytech Nantes, Université de Nantes, 1 quai de Tourville, BP 13522, 44035 Nantes Cedex 1, France

At the beginning of every modelling procedure, the first question to ask is what the model is meant to predict. In credit scoring the most frequent case is modelling the probability of default; however, other outcomes, such as fraud, revolving of the credit, or success of collections, can be predicted as well. In every case, the first step is to define the target variable.
The target variable is generally an 'output' of the model. It holds the information, known for the available data, that we want to predict for future data. In credit scoring it is commonly called the good/bad definition. In this paper we study the effect of using an indeterminate value of the target variable in the development of credit scoring models. We explain the basic principles of logistic regression modelling and of selecting the target variable. Next, we introduce some of the widely used statistics for model assessment. The main part of the paper is devoted to the development and assessment of 27 credit scoring models on real credit data, which are built and assessed according to various definitions of the target variable. We show that, for some target definitions, there is a valid reason to include the indeterminate value in the modelling process, as doing so provided convincing results.

