
Final Group Project

Group #1

Price College of Business, University of Oklahoma

FIN 4433-001 FinTech & Applications

Mandy Chan, MBA

Started on April 12th, 2025

Last Modified April 27th, 2025


Introduction

Credit risk, the probability that a borrower fails to meet contractual obligations, remains a core concern for lenders. Recent advances in machine learning, particularly gradient-boosting methods, allow institutions to model non-linear feature interactions and improve early detection of high-risk applicants. Replicating peer-reviewed code from public repositories aligns with best practices in open science and accelerates innovation in FinTech education.

The Challenges of Credit Risk Analysis before FinTech

Before automated scoring systems and networked databases became commonplace, credit-risk assessment was hampered by a chronic lack of reliable information. Borrower data were scattered across individual bank branches and local credit bureaus, often locked away in paper files. Because lenders could not instantly access a customer's full payment history, they made decisions with glaring information gaps, increasing the odds of both approving high-risk applicants and turning away creditworthy ones.

In the absence of large, consistent datasets, underwriting leaned heavily on human judgment. Loan officers relied on "character" interviews, letters from employers, or the applicant's reputation in the community. Although personal insight sometimes revealed nuances that numbers missed, it also introduced subjectivity and bias. Lending standards varied from branch to branch, and discriminatory practices could creep in unnoticed, exposing institutions to compliance and reputational risks.

Quantitative tools were rudimentary. Basic ratio analysis, such as comparing debt to income or assets to liabilities, was calculated by hand or on simple spreadsheets. Multivariate statistical models were rare outside the largest banks, largely because the computing power and skilled analysts needed to build them were expensive. As a result, lenders struggled to price risk accurately: interest rates and reserve cushions were either set too high, discouraging good borrowers, or too low, encouraging future defaults.

Operationally, the entire process was slow and costly. Collecting pay stubs, tax returns, and bank references required physical mail or in-person visits, so underwriting a single file could take days. High manual workloads limited a lender's ability to scale and made it difficult to handle spikes in application volume. Customers often waited weeks for funding decisions, eroding satisfaction and driving some to faster competitors.

Even after a loan was approved, monitoring remained largely static. Accounts were reviewed only at fixed intervals or once payments became delinquent. Because portfolio data were not linked in real time to economic indicators such as unemployment or regional downturns, credit quality could deteriorate rapidly before management noticed. By the time remedial action was taken, losses were often much larger than they would have been with earlier warning signals.

Some Applications of FinTech in Credit Risk Analysis

Modern credit-scoring engines

The most visible use of data analysis in credit risk is building credit-scoring models that predict the likelihood a new or existing borrower will repay on time. By feeding statistical or machine-learning algorithms with thousands of historical loan records (payment histories, utilization ratios, income stability, and even alternative data such as utility-bill payments or mobile-phone top-ups), lenders translate raw attributes into a single score or probability of default (PD). Because the score is automated and continuously recalibrated, it lets banks offer instant credit decisions, set credit-line limits, and comply with "fair-lending" regulations that demand objective, data-driven rules rather than subjective judgment.
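
As a minimal sketch of such a scoring engine, the snippet below trains a gradient-boosting classifier to output a PD per applicant. The file name loans.csv and the default column are hypothetical placeholders, not artifacts of the notebook discussed later.

```python
# A minimal PD-scoring sketch: gradient boosting on historical loan records.
# "loans.csv" and the "default" label are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

loans = pd.read_csv("loans.csv")          # payment histories, utilization, income, ...
X = loans.drop(columns=["default"])       # numeric features only, for simplicity
y = loans["default"]                      # 1 = defaulted, 0 = repaid

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)
pd_scores = model.predict_proba(X_test)[:, 1]   # probability of default per applicant
print("Holdout AUC:", roc_auc_score(y_test, pd_scores))
```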

Limit assignment and dynamic line management


Once an account is opened, lenders still need to decide how much exposure to grant. Data analysis powers optimization frameworks that set initial credit limits and periodically adjust them up or down. By linking historical utilization patterns, macroeconomic indicators, and forward-looking loss forecasts, banks can expand limits for customers who show rising incomes and responsible usage while trimming or freezing limits for those who exhibit early warning signs such as maxing out, higher minimum payments, or deteriorating external scores. This fine-tuning balances revenue growth against loss containment more effectively than static "one-size-fits-all" policies.
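
To make the idea concrete, here is a deliberately simplified, rule-based sketch of dynamic line management; the thresholds and multipliers are invented for illustration, whereas production systems derive them from optimization models.

```python
# A toy dynamic-limit rule. Thresholds and multipliers are illustrative only;
# real line-management frameworks tune them with optimization models.
def adjust_limit(current_limit: float, utilization: float, pd_score: float) -> float:
    """Return an updated credit limit from utilization and a model PD score."""
    if pd_score > 0.10 or utilization > 0.95:      # early warning signs: trim exposure
        return round(current_limit * 0.8, 2)
    if pd_score < 0.02 and utilization < 0.50:     # low risk, responsible usage: expand
        return round(current_limit * 1.2, 2)
    return current_limit                           # otherwise hold the line steady

print(adjust_limit(5_000, utilization=0.30, pd_score=0.01))   # -> 6000.0
```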

Responsible-lending and model-explainability analytics

Modern credit-risk teams pair machine-learning power with fairness and explainability techniques to meet regulatory and ethical standards. Tools such as Shapley values and counterfactual analysis quantify how each input feature influences an individual prediction, enabling clear adverse-action notices to consumers who are declined. Bias-detection algorithms scan for disparate impact across protected classes, prompting data scientists to retrain models or add constraints that preserve accuracy while reducing unfair outcomes. This analytical layer turns raw predictive power into transparent, compliant, and socially responsible credit decisions.
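
As a hedged illustration, the snippet below computes Shapley-value contributions for a single applicant, assuming the open-source shap package and reusing the hypothetical model and X_test objects from the PD-scoring sketch above.

```python
# Per-applicant Shapley explanations, assuming the third-party `shap` package
# and the hypothetical `model` / `X_test` from the PD-scoring sketch above.
import shap

explainer = shap.TreeExplainer(model)          # fast explainer for tree ensembles
shap_values = explainer.shap_values(X_test)    # one contribution per feature per row

# Rank the features that most influenced applicant 0's score, e.g. to draft
# an adverse-action notice for a declined consumer.
drivers = sorted(zip(X_test.columns, shap_values[0]), key=lambda t: -abs(t[1]))
for feature, contribution in drivers[:3]:
    print(f"{feature}: {contribution:+.4f}")
```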

Stress testing and portfolio loss forecasting

Regulators and risk committees require lenders to show how their consumer portfolios will perform under adverse economic scenarios. Analysts build panel datasets that marry internal loan-level performance with unemployment rates, interest-rate paths, and regional house-price indices. Econometric or machine-learning models then project charge-offs and net losses under baseline, adverse, and severely adverse scenarios. These forecasts drive capital-adequacy planning and loan-loss-reserve calculations (CECL/IFRS 9), and they inform management on whether to tighten underwriting standards or raise pricing before a downturn hits.
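
A toy version of scenario loss forecasting follows; the portfolio rows and PD multipliers are invented, and the expected-loss formula (PD x LGD x balance) is the standard simplification rather than a full econometric projection.

```python
# Toy scenario loss forecast: expected loss = PD x LGD x balance per account.
# Portfolio values and PD multipliers are invented for illustration.
import pandas as pd

portfolio = pd.DataFrame({
    "balance": [12_000, 4_500, 30_000],   # exposure at default
    "pd":      [0.02, 0.05, 0.01],        # baseline probability of default
    "lgd":     [0.60, 0.90, 0.40],        # loss given default
})

scenarios = {"baseline": 1.0, "adverse": 1.8, "severely_adverse": 3.0}
for name, multiplier in scenarios.items():
    stressed_pd = (portfolio["pd"] * multiplier).clip(upper=1.0)
    expected_loss = (stressed_pd * portfolio["lgd"] * portfolio["balance"]).sum()
    print(f"{name}: expected loss ${expected_loss:,.0f}")
```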

Example Notebook of a FinTech Application in Credit Risk Analysis (with Walkthrough)

Business Problem

This app predicts whether an applicant will be approved for a credit card. Because every hard inquiry negatively affects a credit score, the app instead estimates the probability of approval without triggering one, letting prospective applicants gauge their chances before formally applying.

Method

The methodology used for this project includes (1) exploratory data analysis, (2) bivariate analysis, and (3) multivariate correlation.

Summary of the Notebook


This notebook serves as the exploratory-data-analysis (EDA) stage for a credit-card approval project. Its goal is to answer two early-stage questions:
1. What does each variable look like on its own?
Univariate analysis cells profile every feature (categorical and numeric) through value-frequency tables, histograms, box plots, pie charts, and summary statistics. These visuals expose skew, outliers, dominant categories, and missing-value patterns so that later cleaning and encoding choices are evidence-based.
2. How does a single feature interact with the target or with another feature?
In the bivariate analysis section, the notebook contrasts each variable against the binary label Is high risk (default flag). Side-by-side box plots, risk-segmented bar charts, and grouped means reveal which characteristics, such as short employment history or certain dwelling types, are disproportionately represented among bad applicants. These insights help shortlist promising predictors and highlight relationships (e.g., high-risk applicants having older accounts yet shorter job tenure) that modeling should capture.

Taken together, the univariate and bivariate explorations give stakeholders an intuitive, data-driven picture of applicant demographics, financial attributes, and early risk signals before the project moves on to multivariate modeling and machine-learning steps.

Notebook Walkthrough

0. Import the required packages

The first code cell, titled "0. import the necessary packages," is the notebook's staging area: it loads every library required for the credit-card-approval project so that later sections can focus entirely on data cleaning, modeling, and evaluation. General-purpose data wrangling is handled by NumPy and pandas, while exploratory assistants such as missingno and pandas-profiling make it easy to visualize gaps or anomalous values in the dataset. Matplotlib and Seaborn, reinforced by scikit-plot and Yellowbrick, give the author a full palette of plotting utilities, from quick histograms to ROC curves and feature-importance bars, rendered inline through the %matplotlib inline magic.

Statistical testing and path management come next. SciPy's statistical tools (for example, chi-square tests) support hypothesis checks on categorical variables, and pathlib.Path offers OS-agnostic file handling. A comprehensive slice of scikit-learn components then enters: splitters such as train_test_split and cross_val_score, preprocessing helpers like ColumnTransformer, OneHotEncoder, and MinMaxScaler, plus virtually every mainstream classification algorithm, from logistic regression and support-vector machines through decision-tree ensembles and neural networks. Calibration, cross-validation, permutation-based feature importance, and rich reporting utilities (classification_report, ConfusionMatrixDisplay, ROC functions) are also pulled in so the notebook can judge model quality on balanced grounds.

1. Import and process data


In [2] – Load the two raw data files
The cell reads application_record.csv into cc_data_full_data, which holds the applicant-level
features, and credit_record.csv into credit_status, which contains month-by-month repayment
information for those same customers. Bringing both tables into memory is the foundation for
every transformation that follows.
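A sketch of this load step, using the file and variable names given in the walkthrough (the relative paths are assumptions):

```python
# Load the two raw tables named in the walkthrough (paths are assumed).
import pandas as pd

cc_data_full_data = pd.read_csv("application_record.csv")   # applicant-level features
credit_status = pd.read_csv("credit_record.csv")            # month-by-month repayment history
```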
In [3] – Engineer risk labels and merge account age
First, it determines each borrower’s oldest account by grouping credit_status and taking the
minimum MONTHS_BALANCE, then merges that “Account age” back onto the main
application data. Next, it flags any serious delinquency: statuses “2”, “3”, “4”, or “5” are marked
“Yes” in a temporary dep_value column. By re-aggregating credit_status at the customer level,
the cell collapses multiple monthly rows into a single “Yes/No” high-risk indicator, merges it
into cc_data_full_data, converts the text label to numeric (1 = high risk, 0 = low risk), and drops
the helper column. A chained-assignment warning is also suppressed for neatness.
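A condensed sketch of that labeling logic is below; it assumes both tables share an ID key and that credit_status has a STATUS column, which matches the walkthrough's description but may differ in detail from the original cell.

```python
# Sketch of the In [3] logic: account age from the minimum MONTHS_BALANCE,
# plus a customer-level high-risk flag for any month with status 2-5.
# Assumes both tables share an "ID" key and credit_status has "STATUS".
account_age = (credit_status.groupby("ID")["MONTHS_BALANCE"]
               .min().rename("Account age").reset_index())
cc_data_full_data = cc_data_full_data.merge(account_age, on="ID", how="left")

credit_status["dep_value"] = credit_status["STATUS"].isin(["2", "3", "4", "5"])
is_high_risk = (credit_status.groupby("ID")["dep_value"]
                .any().astype(int).rename("Is high risk").reset_index())
cc_data_full_data = cc_data_full_data.merge(is_high_risk, on="ID", how="left")
credit_status = credit_status.drop(columns=["dep_value"])   # drop the helper column
```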
In [4] – Make column names human-readable
To improve clarity, this cell renames cryptic bureau codes such as CODE_GENDER or
DAYS_BIRTH into plain English equivalents like “Gender” and “Age.” It also renames the
previously added “Account age,” so the entire DataFrame now reads like a business-friendly
table.
In [5] – Define a reusable train/test split function
A small helper called data_split wraps train_test_split, taking the DataFrame and a test-size
fraction (here 0.2) and returning reset-index copies of the train and test subsets. Encapsulating
the logic keeps later code tidy and reproducible.
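The helper likely resembles the following sketch (the fixed random_state is an assumption added for reproducibility; the walkthrough does not specify one):

```python
# Sketch of the data_split helper described above. The random_state is an
# assumption added for reproducibility; the notebook may not fix one.
from sklearn.model_selection import train_test_split

def data_split(df, test_size):
    """Split a DataFrame and return reset-index copies of train and test."""
    train, test = train_test_split(df, test_size=test_size, random_state=42)
    return train.reset_index(drop=True), test.reset_index(drop=True)

cc_train_original, cc_test_original = data_split(cc_data_full_data, test_size=0.2)
```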
In [6] – Create the working train and test sets
Using the function above, the full application data is split 80%/20% into cc_train_original and cc_test_original. From this point onward, modeling work proceeds on the training set while performance will later be checked on the held-out test set.
In [7] – Quick sanity check: training-set shape
Simply prints cc_train_original.shape, letting the analyst verify that roughly 80% of the original rows landed in the training partition.
In [8] – Quick sanity check: test-set shape
Likewise prints cc_test_original.shape, confirming that the remaining 20% of records are in the test split.
In [9] – Persist the training data to disk
Saves cc_train_original as dataset/train.csv so that downstream notebooks or production
pipelines can load the identical training sample without rerunning the earlier preprocessing steps.
In [10] – Persist the test data to disk
Does the same for cc_test_original, writing it to dataset/test.csv for future out-of-sample
evaluation or model-comparison experiments.
In [11] – Protect the raw splits with working copies
Creates cc_train_copy and cc_test_copy, duplicates of the saved splits. Subsequent cleaning,
encoding, or feature-engineering operations can now proceed on the copies, while the pristine
originals remain untouched as a reference point.

2. Basic analysis of the dataset


In [12] — Build and save an automated EDA report
This cell runs Pandas Profiling on the cleaned training set (cc_train_copy). The ProfileReport
object scans every column, computes descriptive statistics, correlation matrices, and missing-
value charts, and then renders them into a self-contained HTML file. A Path check ensures the
report isn’t regenerated if it already exists; otherwise it is written to
pandas_profile_file/income_class_profile.html. The result is a point-and-click exploratory
dashboard that can be opened in any browser for a deep dive into feature distributions and data-
quality issues.
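The cell plausibly looks like the sketch below, assuming the pandas-profiling package (since republished as ydata-profiling) and the output path quoted in the walkthrough:

```python
# Sketch of the automated EDA report, assuming the pandas-profiling package
# (published today as ydata-profiling) and the walkthrough's output path.
from pathlib import Path
from pandas_profiling import ProfileReport

report_path = Path("pandas_profile_file/income_class_profile.html")
if not report_path.exists():                  # skip regeneration if it already exists
    report_path.parent.mkdir(parents=True, exist_ok=True)
    ProfileReport(cc_train_copy, title="Credit applicants profile").to_file(report_path)
```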
In [13] — Quick visual peek at the data
Calling cc_data_full_data.head() displays the first five rows of the full, feature-engineered
application table. This gives the analyst a sanity check that the earlier merges and renaming
produced sensible, human-readable columns and that the new “Is high risk” target is present.
In [14] — Structural overview of the DataFrame
cc_data_full_data.info() prints each column’s dtype, the number of non-null observations, and
overall memory usage. It confirms that missing values have been handled as expected and shows
which variables are numeric vs. object (categorical), guiding later encoding and scaling
decisions.
In [15] / Out [15] — Numeric summary statistics
cc_data_full_data.describe() returns a table (shown as Out [15]) of count, mean, standard
deviation, and the 25th, 50th, and 75th percentiles for every numeric feature—including income,
age, employment length, and account age. Analysts use this snapshot to spot unreasonable
ranges, skewed distributions, or potential outliers before moving on to modeling.
3. Define the functions used to explore each feature/pillar
In [18] – Value-count helper
This cell adds a utility called value_cnt_norm_cal. Given a DataFrame and a column name, the
function returns a tidy two-column table that shows the absolute Count of each distinct value and
its Frequency (%) expressed as a percentage. It is the workhorse behind most categorical plots
and summary printouts that follow, sparing the author from rewriting the same value_counts() /
normalization logic over and over.
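Based on that description, the helper is likely close to this sketch:

```python
# Sketch of value_cnt_norm_cal: absolute counts plus percentage frequencies
# for one column, returned as a tidy two-column table.
import pandas as pd

def value_cnt_norm_cal(df: pd.DataFrame, feature: str) -> pd.DataFrame:
    counts = df[feature].value_counts()
    freqs = df[feature].value_counts(normalize=True) * 100
    return pd.concat([counts, freqs], axis=1, keys=["Count", "Frequency (%)"])

# Example usage: value_cnt_norm_cal(cc_train_copy, "Education level")
```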
In [19] – Quick feature profiler
gen_info_feat is a Swiss-army knife for on-the-fly exploration of a single feature. Using Python 3.10's match … case syntax, the function tailors its behavior to each variable: for Age it converts the raw negative "days" into positive years before printing descriptive stats and plotting a histogram; for categorical fields such as Education level or Dwelling it prints the value-frequency table from In [18] and shows a bar chart; for numeric money variables it draws box plots and histograms with scientific notation turned off. In short, one call delivers a concise textual and visual profile of whichever column is under investigation.
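A compressed sketch of that dispatch pattern (Python 3.10 or newer; it reuses the value_cnt_norm_cal sketch above, and the branches shown simplify the behavior described):

```python
# Simplified sketch of gen_info_feat's match/case dispatch (Python >= 3.10).
import matplotlib.pyplot as plt
import seaborn as sns

def gen_info_feat(df, feature):
    match feature:
        case "Age":
            years = -df[feature] / 365.25            # raw values are negative days
            print(years.describe())
            sns.histplot(years, kde=True)
        case "Education level" | "Dwelling":
            print(value_cnt_norm_cal(df, feature))   # helper from In [18]
            df[feature].value_counts().plot(kind="bar")
        case _:
            print(df[feature].describe())
            sns.histplot(df[feature])
    plt.show()
```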
In [20] – Pie-chart generator
create_pie_plot builds an "at-a-glance" pie chart for selected categorical attributes (e.g., Dwelling, Education level). It first grabs the percentage distribution via the helper from In [18], then feeds those percentages into plt.pie, formats the legend, enforces equal aspect so the circle isn't distorted, and titles the figure. Because credit datasets often have imbalanced classes, seeing the relative share of, say, "Rented" vs "Owned" housing in one picture can be more intuitive than a bar chart.
In [21] – Bar-chart generator
Complementing the pie routine, create_bar_plot produces vertical bar charts that show raw
counts for high-cardinality or business-critical categoricals—marital status, dwelling type, job
title, employment status, education level, and others. Tick labels are rotated and right-justified
for readability, and the same function falls back to a generic branch (case _:) so it will sensibly
plot any categorical column passed to it.
In [22] – Box-plot generator
create_box_plot focuses on the spread and outliers of numeric features. It again branches on the
feature name so that each variable is rendered with units users understand (e.g., Age converted to
years, Employment length converted from negative days to positive years, and incomes shown
with thousands separators). For discrete counts such as number of children it sets integer y-ticks,
while for money values it disables scientific notation. The result is a clean, vertically oriented
boxplot that highlights skewness and extreme observations.
In [23] – Histogram generator
create_hist_plot provides a complementary look at distribution shape. Like the box-plot helper, it
converts and formats special variables (Age, Income, Employment length) before calling
sns.histplot, overlays a kernel-density estimate, and allows the caller to specify the number of
bins. This is the go-to tool for diagnosing normality, skew, or multimodality in any numeric
column.
In [24] – High-risk vs low-risk boxplot
low_high_risk_box_plot dives deeper by splitting the numeric variable of interest (currently Age
or Income) into two groups—borrowers flagged Is high risk = 1 vs those flagged 0. It prints the
mean of each group for quick reference, then draws a side-by-side boxplot so analysts can see if,
for example, higher incomes coincide with fewer defaults or if older applicants have a different
risk profile.
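A sketch of the risk-segmented boxplot, assuming the renamed columns from In [4] ("Is high risk" plus a numeric feature such as "Income"):

```python
# Sketch of low_high_risk_box_plot: compare a numeric feature across the
# two risk groups. Column names follow the walkthrough's renamed schema.
import matplotlib.pyplot as plt
import seaborn as sns

def low_high_risk_box_plot(df, feature):
    print(df.groupby("Is high risk")[feature].mean())   # group means for reference
    sns.boxplot(data=df, x="Is high risk", y=feature)
    plt.title(f"{feature} for low-risk (0) vs high-risk (1) applicants")
    plt.show()
```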
In [25] – High-risk vs low-risk bar chart
Finally, low_high_risk_bar_plot serves the categorical analogue: it groups the data by a chosen
categorical feature, sums the Is high risk indicator to count how many risky customers sit in each
category, sorts those counts in descending order, prints the underlying dictionary, and renders a
bar chart. This immediately spotlights, say, which employment statuses or dwelling types
harbour the largest share of delinquent applicants, guiding subsequent feature engineering or
policy rules.
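And a matching sketch of the categorical counterpart:

```python
# Sketch of low_high_risk_bar_plot: count risky customers per category of a
# chosen feature, print the underlying numbers, and plot them.
import matplotlib.pyplot as plt

def low_high_risk_bar_plot(df, feature):
    risky_counts = (df.groupby(feature)["Is high risk"]
                      .sum()
                      .sort_values(ascending=False))
    print(risky_counts.to_dict())                       # underlying dictionary
    risky_counts.plot(kind="bar", title=f"High-risk count by {feature}")
    plt.show()
```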
Together, cells 18-25 equip the notebook with a full exploratory-data-analysis toolkit—tables,
pies, bars, histograms, and risk-segmented boxplots—that can be invoked repeatedly without
cluttering the main narrative of the credit-risk project.
4. Run Univariate Analysis
Core variables:

• Gender
• Age
• Marital status
• Children count
• Income
• Employment status
• Employment length
• Education level
• Has a property (yes/no)
• Account age
5. Bivariate Analysis (Correlation Test)
• Correlation of Age vs Features
• Correlation of Income vs Features
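One way such a correlation test can be run, as a sketch, is with SciPy's point-biserial correlation between a numeric feature and the binary default flag; the notebook's exact test may differ.

```python
# Sketch of a bivariate correlation test: point-biserial correlation between
# a numeric feature and the binary "Is high risk" flag. The notebook's exact
# test may differ; column names follow the walkthrough.
from scipy import stats

r, p_value = stats.pointbiserialr(cc_train_copy["Is high risk"], cc_train_copy["Age"])
print(f"Age vs default flag: r = {r:.3f}, p = {p_value:.3f}")
```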
Key Findings of this Notebook

A representative customer in this dataset is a woman around 40 years old who is married or cohabiting and has no children. She has worked for roughly five years, earns about $157k annually, and finished secondary school. While she does not own a car, she does possess residential real estate (such as a house or flat), and her credit account has been open for about 26 months.

Statistical tests indicate that neither age nor income shows a meaningful correlation with the default flag. Borrowers classified as high-risk generally have shorter job tenures and longer-standing accounts, yet they make up less than two percent of all observations. In contrast, the bulk of applicants are aged 20–45 and hold accounts that have been active for 25 months or less.

Implications for the Future

The disciplined, pillar-by-pillar EDA framework used in this notebook lays a foundation for explainable, regulator-ready credit-risk models. Because every variable is first profiled on its own, then contrasted with the default flag, analysts can trace exactly why a feature is included, how it behaves across segments, and whether it introduces bias. Embedding those checks as reusable functions means the same diagnostics can run automatically when new data arrive or when a model drifts, turning what is often a one-off exploratory step into a living set of controls that satisfy both internal risk committees and external auditors.

Looking ahead, the modular design also accelerates feature expansion and alternative-data experimentation. New signals (for example, utility-payment history or mobile-device metadata) can be dropped into the univariate/bivariate template and instantly subjected to the same scrutiny as traditional bureau fields. That consistency shortens the path from raw idea to production model while guarding against "black-box" pitfalls. As lenders move toward real-time approvals and dynamic credit limits, a methodology that couples rapid exploration with transparent governance will be crucial for scaling machine-learning risk engines without sacrificing trust or compliance.

Acknowledgement

The notebook used for this report is a derived and simplified version of the project "Credit-card-approval-prediction-classification" by Stern Semasuka (username: @semasuka) on GitHub.
