Credit Risk Evaluation using Credit Card data

A credit lending agency has approached us for help in evaluating the credit risk of its customers and has provided its customer database. Let’s evaluate.

00. Project Overview
01. Data Overview
02. Modelling Overview
03. Logistic Regression
04. Random Forest
05. XGBoost Classifier
06. Modelling Summary
07. Predicting Missing Loyalty Scores
08. Growth & Next Steps

Project Overview

Context

The client, a credit lending agency, has asked for help in evaluating the credit risk across its customer database. The dataset contains information on default payments, demographic factors, credit data, payment history, and bill statements of credit card clients.

The aim of this work is to predict whether a customer will default on their next payment.

To achieve this, a predictive model is built that captures the relationship between default and age, education, marital status, and delays in previous payments.

Actions

First, the necessary data needed to be compiled from tables in the database, gathering key customer metrics that may help predict whether a customer will default.

To predict the outcome, three modelling approaches are considered, namely:

Logistic Regression
Random Forest
XGBoost
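
As a rough illustration, the three approaches could be fitted side by side as sketched here. This is a minimal sketch under stated assumptions, not the project's actual pipeline: the data is synthetic (generated with scikit-learn's `make_classification`) rather than the client's database, scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that only one library is needed, and all variable names are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the customer data: 1 = default, 0 = no default.
# Roughly 20% of rows are defaults (an assumption, to mimic class imbalance).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=42
)

# Hold out 20% of rows for testing, keeping the class ratio the same.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    # Stand-in for XGBoost; xgboost.XGBClassifier could be swapped in here.
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Fit each model on the training split and record its test-set accuracy.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
```

Because defaults are usually the minority class, accuracy alone can be misleading, which is why precision, recall, and F1 score are worth reporting alongside it.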

Results

Testing found that XGBoost has the highest predictive accuracy.

Metric 1: Precision

Logistic Regression =
Random Forest =
XGBoost =

Metric 2: Recall

Logistic Regression =
Random Forest =
XGBoost =

Metric 3: F1 score

Logistic Regression =
Random Forest =
XGBoost =
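
For reference, the three metrics above can be computed from true and predicted labels as in the sketch here. The function name and the labels are made up purely for illustration; they are not results from this project.

```python
def classification_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for the positive class (1 = default)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Precision: of the customers predicted to default, how many did.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the customers who defaulted, how many were caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, f1 = classification_metrics(y_true, y_pred)
```

In a credit risk setting recall is often the metric to watch, since a missed default (false negative) is typically more costly than a wrongly flagged customer.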

As the

Growth/Next Steps

While predictive accuracy was relatively high, other modelling approaches could be tested, especially those similar to Random Forest and XGBoost, such as LightGBM, to see whether accuracy could be improved further.

From a data point of view, further variables could be collected and further feature engineering undertaken, to ensure that we have as much useful information as possible for predicting whether a customer will default.
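
As one possible direction for feature engineering, the sketch here derives a few summary features from a customer's bill statements and payment history. The field names (`credit_limit`, `bill_amounts`, `payment_delays`) and the derived features are hypothetical and would need to be mapped to the actual columns in the dataset.

```python
def engineer_features(customer):
    """Derive summary features from one customer's record (hypothetical fields)."""
    bills = customer["bill_amounts"]      # monthly bill statement amounts
    delays = customer["payment_delays"]   # months of delay for each payment
    limit = customer["credit_limit"]

    avg_bill = sum(bills) / len(bills)
    return {
        # Share of the credit limit used on average.
        "avg_utilisation": avg_bill / limit if limit else 0.0,
        # Number of months in which a payment was late.
        "months_delayed": sum(1 for d in delays if d > 0),
        # Longest single delay, in months.
        "max_delay": max(delays),
    }


# Made-up customer record for illustration only.
example = {
    "credit_limit": 10000,
    "bill_amounts": [2000, 3000, 4000],
    "payment_delays": [0, 2, 1],
}
features = engineer_features(example)
```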

Suraj Karyamapudi

Data Science Portfolio