Project 4: Credit Risk Assessment
A project for finance industry or a hypothetical bank. Using machine learning models to predict whether a particular customer will default on their loan or not, based on different demographic indicators collected for the individual.

Project code:
Github Repository: credit-risk-assessment
Project Hosted on:

Importance:
- Banks want to reduce their credit risk. The recession in 2008 in the US happened because mortgage loans were given to people with bad credit scores resulting in a lot of people defaulting due to which a lot of banks and investors had to face huge financial losses.
Glossary:
- Credit Risk Modeling : Using data about a person to predict the probability of that person paying back the loan.
- Default risk : A person’s probability to not pay back their loan. Banks use this to calculate the interest rate for a person. They charge people interest rates that are proportional to their default risk.
About the Dataset:
Data Source: Credit Risk Dataset, Kaggle
Column | Description | |
---|---|---|
loan_status | Loan status | 0 is non default 1 is default |
person_age | Age | numerical |
person_home_ownership | home ownership status | RENT, MORTGAGE, OWN, OTHER |
person_emp_length | Employment length (in years) | numerical |
loan_intent | Whether they own a home or not | PERSONAL, EDUCATION, MEDICAL, VENTURE, HOMEIMPROVEMENT, DEBTCONSOLIDATION |
loan_grade | Loan grade | ‘D’, ‘B’, ‘C’, ‘A’, ‘E’, ‘F’, ‘G’ |
loan_amnt | Loan amount | numerical |
loan_int_rate | Interest rate | numerical |
loan_percent_income | Loan to income ratio | numerical |
cb_person_default_on_file | Historical default | ‘Y’, ‘N’ |
cb_person_cred_hist_length | credit history length | numerical |
Models and Technologies:
-
Azure Data Studio: Data in csv file loaded to SQL database created in Azure Data Studio. SQL server started on a Docker container.
-
Tableau: Connected to SQL database and built a dashboard for data visualization.
-
Python Jupyter Notebook: Built a machine learning pipeline using sklearn, pandas and XGboost to develop a classifier for predicting whether a person will default on their loan or not.
-
Flask Web application: Developed a python application using flask that takes certain inputs and used the already trained model in the backend to predict whether this new person will default on their loan or not.
-
Google Cloud Platform: Hosted the Flask app on GCP.

References:
[1] Credit Risk Modelling in Python by Rahul Sisodia
[2] Credit Risk Modelling in Python by Paul Bananzi
[3] How to Deploy Machine Learning Model using Flask (Iris Dataset) | Python by Aswin S.
[4] API Series #3 - How to Deploy Flask APIs to the Cloud (GCP) by James Briggs
[5] Azure Data Studio Essential Training by Adam Wilbert