Authors: Christos Grammatikos, Spyros Papathanasiou, Spyros Missiakoulis
Title: Methods Comparison for Credit Risk Evaluation – Estimating the Probability of Default in a Peer-to-Peer Loans Portfolio
Abstract
This study focuses on the probability of default (PD) and the quantitative methods used to estimate it in retail banking, collectively known as credit scoring. These methods include traditional algorithms such as the benchmark method of logistic regression, alternative methods such as survival analysis and, especially in recent years with the growth of computing power and of available information (Big Data), techniques from the machine learning field such as random forests, neural networks, support vector machines (SVM) and gradient boosting machines, among others. At the same time, supervisory requirements on credit risk have increased, with a focus on the transparency and interpretability of the models used. Following the literature review, the empirical part of our study applies five basic models to estimate the PD for a peer-to-peer (P2P) loans portfolio. The results are then assessed in terms of discriminative power and accuracy using the corresponding performance metrics. They indicate that the random forest and XGBoost methods achieve by far the best estimates in the development sample compared to the other methods. The difference, though, narrows considerably when the models’ estimates are applied to the two test samples.
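As a rough illustration of what assessing PD estimates "in terms of discriminative power and accuracy" can look like, the sketch below computes the area under the ROC curve (AUC), the derived Gini coefficient, and a threshold-based accuracy for a set of PD estimates. The abstract does not name the specific metrics used in the study, so the choice of AUC, Gini and accuracy is an assumption. The function names, the 0.5 cut-off and the toy data are likewise hypothetical and are not taken from the study.

```python
from typing import Sequence


def auc(y_true: Sequence[int], pd_hat: Sequence[float]) -> float:
    """AUC via the pairwise (Mann-Whitney) formulation.

    Returns the share of (default, non-default) pairs in which the
    defaulted loan received the higher PD estimate; ties count as 0.5.
    """
    pos = [p for y, p in zip(y_true, pd_hat) if y == 1]  # defaults
    neg = [p for y, p in zip(y_true, pd_hat) if y == 0]  # non-defaults
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(y_true: Sequence[int], pd_hat: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of loans classified correctly at an assumed PD cut-off."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, pd_hat))
    return hits / len(y_true)


# Toy data (illustrative only): 1 = default, 0 = non-default.
y = [0, 0, 1, 0, 1, 1, 0, 1]
pd_estimates = [0.05, 0.20, 0.35, 0.40, 0.60, 0.80, 0.10, 0.55]

auc_value = auc(y, pd_estimates)
gini = 2 * auc_value - 1  # Gini coefficient derived from AUC

print(auc_value)                  # 0.9375
print(gini)                       # 0.875
print(accuracy(y, pd_estimates))  # 0.875
```

Computing the same metrics separately on the development sample and on each test sample is one way to expose the kind of gap the abstract describes, where a model's apparent advantage on the data it was fitted to shrinks on unseen loans.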

