Credit & Lending
Assess risk and make fair lending decisions • 52 papers
Credit Scoring & Risk Models
Predict loan defaults and assess creditworthiness
Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy
The Z-score model, built with multiple discriminant analysis (MDA); 10,000+ citations and still the benchmark for bankruptcy prediction.
Financial Ratios and the Probabilistic Prediction of Bankruptcy
Introduced logistic regression to default prediction; established logit as industry workhorse.
On the Pricing of Corporate Debt: The Risk Structure of Interest Rates
Structural model treating equity as a call option on the firm's assets; foundation for KMV/Moody's EDF.
Credit Rationing in Markets with Imperfect Information
Explains equilibrium credit rationing from adverse selection; essential for lending market theory.
XGBoost: A Scalable Tree Boosting System
Widely used gradient-boosting algorithm for credit scoring; often outperforms logistic regression in benchmark comparisons.
Benchmarking State-of-the-Art Classification Algorithms for Credit Scoring: An Update of Research
Definitive benchmark comparing 41 classifiers across 8 datasets; establishes that ensemble methods outperform logistic regression.
Consumer Credit-Risk Models Via Machine-Learning Algorithms
First major paper showing transaction data + ML dramatically improves default prediction; estimates 6-25% cost savings.
Risk and Risk Management in the Credit Card Industry
Uses account-level data from 6 major US banks; finds substantial heterogeneity across banks, so no single model works for all.
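To make the Z-score entry at the top of this section concrete, here is a minimal sketch of the commonly quoted form of the model for publicly traded manufacturing firms. The function name, argument names, and sample figures are illustrative assumptions; the coefficients and the 1.81 / 2.99 cutoffs are the widely cited ones, and other variants of the model use different weights.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_equity, sales, total_assets, total_liabilities):
    """Commonly quoted Z-score for public manufacturing firms (ratios as decimals)."""
    x1 = working_capital / total_assets        # liquidity
    x2 = retained_earnings / total_assets      # cumulative profitability
    x3 = ebit / total_assets                   # operating profitability
    x4 = market_equity / total_liabilities     # leverage (market value based)
    x5 = sales / total_assets                  # asset turnover
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def z_zone(z):
    """Map a score to the usual three zones."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"


# Hypothetical balance-sheet figures, for illustration only.
z = altman_z(working_capital=200, retained_earnings=300, ebit=150,
             market_equity=800, sales=1200,
             total_assets=1000, total_liabilities=500)
print(round(z, 3), z_zone(z))
```

The score is a fixed linear combination of five ratios, which is why it remains an easy baseline to compare against logit and tree-based models.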
Fair Lending & Disparate Impact
Ensure lending decisions are fair across groups
Mortgage Lending in Boston: Interpreting HMDA Data
Seminal Boston Fed study documenting lending discrimination; foundational for fair lending enforcement.
Consumer-Lending Discrimination in the FinTech Era
Latinx/African-American borrowers pay 7.9 bps more in interest on purchase mortgages; FinTech reduces but doesn't eliminate discrimination.
Predictably Unequal? The Effects of Machine Learning on Credit Markets
ML models create distributional impacts favoring advantaged groups even without using race.
How Costly is Noise? Data and Disparities in Consumer Credit
Credit scores are noisier for minority borrowers; quantifies how data disparities translate to lending disparities.
Inherent Trade-Offs in the Fair Determination of Risk Scores
Proves that calibration and error rate parity across groups cannot both hold except in degenerate cases (equal base rates or perfect prediction); a foundational fairness impossibility result.
Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments
Independently derives impossibility result with explicit disparate impact focus; addressed ProPublica/COMPAS controversy.
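The impossibility results in the last two entries can be illustrated with a short calculation. The sketch below uses the identity that ties a group's false positive rate to its base rate p, positive predictive value (PPV), and false negative rate (FNR): FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR). The function name and the example numbers are assumptions chosen for illustration.

```python
def implied_fpr(base_rate, ppv, fnr):
    """False positive rate implied by base rate, PPV, and FNR.

    Follows from PPV = TP / (TP + FP) with
    TP = base_rate * (1 - fnr) and FP = (1 - base_rate) * fpr.
    """
    return (base_rate / (1.0 - base_rate)) * ((1.0 - ppv) / ppv) * (1.0 - fnr)


# Two hypothetical groups scored by the same model, with identical PPV and FNR
# but different default base rates.
fpr_low = implied_fpr(base_rate=0.3, ppv=0.7, fnr=0.3)
fpr_high = implied_fpr(base_rate=0.5, ppv=0.7, fnr=0.3)
print(round(fpr_low, 4), round(fpr_high, 4))
```

Holding PPV and FNR equal across the two groups forces their false positive rates apart whenever base rates differ, which is the core of the trade-off.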
Pricing Credit Products
Set interest rates based on risk
The Failure of Competition in the Credit Card Market
Documented sticky credit card interest rates; proposed adverse selection explanations that shaped later research in the field.
Adverse Selection in the Credit Card Market
First direct evidence of adverse selection using randomized solicitations.
Estimating Welfare in Insurance Markets Using Variation in Prices
Demand-and-cost curve framework for analyzing selection; widely applied to credit markets.
Selection in Insurance Markets: Theory and Empirics in Pictures
Intuitive graphical framework for understanding selection and welfare in credit/insurance.
Time to Default in Credit Scoring Using Survival Analysis: A Benchmark Study
Benchmark comparing survival methods for predicting when default occurs; critical for IFRS 9 and lifetime expected credit loss.
Collections & Recovery
Optimize strategies for collecting overdue payments
An Empirical Analysis of Personal Bankruptcy and Delinquency
Landmark study finding increased default propensity independent of risk composition; suggests declining stigma.
What Do We Know About Loss Given Default?
Comprehensive LGD estimation review; go-to reference for Basel II/III recovery models.
LossCalc: Model for Predicting Loss Given Default
Industry-standard LGD model using debt type, seniority, and macro factors.
Measuring LGD on Commercial Loans: An 18-Year Internal Study
JPMorgan Chase's 18-year study of 3,761 defaults establishing key LGD drivers.
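The LGD entries in this section all feed the standard expected-loss decomposition, EL = PD x LGD x EAD. As a rough sketch, a workout-style LGD can be computed from discounted recovery cash flows. The function names, the discounting convention, and the clamping to [0, 1] are assumptions made for illustration and are not drawn from any of the listed papers.

```python
def workout_lgd(ead, recoveries, discount_rate=0.0):
    """Loss given default from recovery cash flows.

    ead           -- exposure at default
    recoveries    -- iterable of (years_after_default, cash_recovered)
    discount_rate -- annual rate used to discount recoveries (assumed convention)
    """
    pv = sum(cash / (1.0 + discount_rate) ** years for years, cash in recoveries)
    lgd = 1.0 - pv / ead
    return min(max(lgd, 0.0), 1.0)  # clamp to [0, 1]; a simplifying assumption


def expected_loss(pd, lgd, ead):
    """Expected loss as probability of default x loss given default x exposure."""
    return pd * lgd * ead


# Hypothetical loan: 100 exposure, 30 recovered after one year and 30 after two.
lgd = workout_lgd(ead=100.0, recoveries=[(1, 30.0), (2, 30.0)], discount_rate=0.10)
print(round(lgd, 4), round(expected_loss(pd=0.02, lgd=lgd, ead=100.0), 4))
```

Discounting matters because recoveries that arrive late are worth less, so the same nominal recovery produces a higher LGD as the workout period lengthens.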
Alternative Data
Use non-traditional data for credit decisions
Behavior Revealed in Mobile Phone Usage Predicts Credit Repayment
Mobile behavioral data outperforms credit bureaus for thin-file borrowers; foundational fintech paper.
Invisible Primes: Fintech Lending with Alternative Data
Alternative data identifies 'invisible primes' overlooked by traditional scores.
Predicting Poverty and Wealth from Mobile Phone Metadata
Mobile metadata predicts socioeconomic status; opened mobile-based credit scoring in developing countries.
Use of Alternative Data in Credit Process
Documents roughly 45 million Americans who are 'credit invisible' or unscorable; foundational regulatory reference.
On the Rise of FinTechs: Credit Scoring Using Digital Footprints
Digital footprints (device, email domain, typing) match credit bureau accuracy; foundational fintech credit study.
Fraud Detection & Anomaly Detection
Identify fraudulent transactions and anomalous patterns in credit applications
SMOTE: Synthetic Minority Over-sampling Technique
The foundational imbalanced-learning method, widely used in fraud detection systems.
Statistical Fraud Detection: A Review
Authoritative taxonomy of fraud detection methods; still cited for conceptual foundations.
Anomaly Detection: A Survey
Definitive survey covering statistical, ML, and proximity-based anomaly methods.
Credit Card Fraud Detection: A Realistic Modeling and a Novel Learning Strategy
Addresses realistic fraud detection challenges: class imbalance, concept drift, and delayed feedback.
Feature Engineering Strategies for Credit Card Fraud Detection
Transaction aggregation features improve fraud detection by 40%; widely adopted in industry.
Graph-Based Anomaly Detection and Description: A Survey
Survey on using graph structure to detect fraud rings and coordinated attacks.
APATE: A Novel Approach for Automated Credit Card Transaction Fraud Detection Using Network-Based Extensions
Network propagation improves fraud detection by leveraging transaction graph structure.
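The oversampling idea behind the SMOTE entry in this section can be sketched in a few lines: new minority-class points are created by interpolating between a minority sample and one of its nearest minority neighbours. This is a simplified, standard-library-only illustration rather than the reference algorithm; the function name, the random choice of base points, and the toy data are assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Simplification: base points are drawn at random rather than looping over
    every minority sample a fixed number of times. Needs at least two points.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy two-feature fraud cases, for illustration only.
fraud = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
print(smote_sketch(fraud, n_new=3, k=2))
```

Because every synthetic point lies on a segment between two real minority points, the method fills in the minority region instead of duplicating records; it should be applied to training folds only, never to evaluation data.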
Insurance Claims & Actuarial ML
Apply ML to insurance pricing, claims prediction, and reserving
Nesting Classical Actuarial Models into Neural Networks
Embedding GLMs into neural networks improves insurance pricing while maintaining interpretability.
Data Driven Binning for Insurance Tariffs
Evolutionary trees optimize premium segmentation while respecting regulatory constraints.
Detecting Insurance Fraud Using Supervised and Unsupervised Machine Learning
Comprehensive comparison of fraud detection methods for insurance claims.
Claims Frequency Modeling Using Telematics Car Driving Data
Telematics data (speed, braking) improves claims prediction; foundational usage-based insurance paper.
Neural Networks Applied to Chain-Ladder Reserving
Neural networks improve reserve estimation over traditional chain-ladder methods.
Safety & Trust Scoring on Platforms
Build and analyze reputation and trust systems for marketplace participants
The Dynamics of Seller Reputation: Evidence from eBay
Foundational empirical study of reputation dynamics and their impact on seller behavior.
The Limits of Reputation in Platform Markets: An Empirical Analysis and Field Experiment
Field experiment showing reputation inflation and limits of feedback systems.
Reputation and Feedback Systems in Online Platform Markets
Comprehensive survey of reputation system design and effectiveness in platforms.
The Value of Reputation Information: Evidence from a Natural Experiment
Natural experiment measuring the causal impact of reputation visibility on market outcomes.
Explainability & Regulatory ML
Build interpretable models that satisfy regulatory requirements
A Unified Approach to Interpreting Model Predictions (SHAP)
SHAP values unify feature attribution methods; now standard for credit model explanations.
'Why Should I Trust You?': Explaining the Predictions of Any Classifier (LIME)
Local interpretable explanations for any black-box model; widely used for adverse action notices.
Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead
Argues interpretable models often match black-box accuracy for credit; influential regulatory perspective.
Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission (GA²M)
Generalized additive models with interactions; template for interpretable credit scoring.
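The attribution idea behind the SHAP entry in this section can be shown with a brute-force Shapley computation on a tiny model. This is a sketch against a single baseline point that enumerates every feature ordering, so it only scales to a handful of features; it is not the SHAP library or its approximation algorithms. The function name, the toy scoring function, and the inputs are assumptions.

```python
from itertools import permutations


def shapley_values(f, x, baseline):
    """Exact Shapley attributions of f(x) relative to f(baseline).

    Features are switched from their baseline value to their actual value one
    at a time; each feature's marginal effect is averaged over all orderings.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = f(current)
        for i in order:
            current[i] = x[i]
            value = f(current)
            phi[i] += value - prev
            prev = value
    return [p / len(orders) for p in phi]


# Hypothetical score with one additive term and one interaction term.
def toy_score(v):
    return 2.0 * v[0] + 3.0 * v[1] * v[2]


phi = shapley_values(toy_score, x=(1.0, 1.0, 1.0), baseline=(0.0, 0.0, 0.0))
print(phi)
```

The attributions sum to the difference between the score at x and the score at the baseline, and the interaction term is split evenly between the two features that share it.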
Real-time Decisioning & Deployment
Deploy and maintain credit models in production with concept drift handling
A Survey on Concept Drift Adaptation
Comprehensive survey on detecting and adapting to changing data distributions; directly applicable to credit models in production.
Selection Bias in Credit Scorecard Evaluation
Identifies sample selection bias in scorecard validation; essential for production monitoring.
Reject Inference Methods in Credit Scoring: A Systematic Review and New Approaches
Reviews and advances reject inference methods for handling missing data from declined applications.
Streaming Active Learning Strategies for Real-Life Credit Card Fraud Detection
Active learning reduces labeling costs for streaming fraud detection systems.