How We Reduced a Lending Platform's NPL Rate by 44 Points in 3 Months
A digital lending platform was approving loans it couldn't recover. Their NPL rate had climbed to 90% — not because their customers were bad, but because their risk scoring was broken. Here's exactly what we built to fix it.
The Problem
The client was a Nigerian digital lending platform processing roughly $30K/month in loan volume. Their non-performing loan rate had reached 90% — meaning 9 out of every 10 loans they made were not being repaid. At that rate, the business was functionally insolvent on its lending book.
The existing process: a loan officer reviewed a form submission, checked a credit bureau score if one existed (most applicants had thin or no credit history), and made a gut-call decision. There was no consistent scoring logic. No feature engineering on applicant data. No feedback loop from past loan outcomes to future decisions.
The data existed — repayment history, loan amounts, device metadata, timing patterns, behavioral signals from the app — but none of it was being used systematically.
The Diagnosis
Before writing a single line of model code, we spent two weeks on an architecture and data audit. This is the step most teams skip. They see a high NPL rate and immediately think "we need a better model." They're usually wrong.
What we found:
- Loan outcome data was in a different database than applicant feature data — no joined training set existed
- The app collected 40+ behavioral signals at application time, none of which were stored in a queryable format
- KYC verification was manual and took 48–72 hours, creating selection bias: only the most patient applicants completed it
- There was no feature store or pipeline — any model would need to be retrained from scratch on each update
- The loan decision API had no versioning — deploying a new model meant a full redeploy of the monolith
The NPL problem was as much an infrastructure problem as a modeling problem. A better model deployed on broken infrastructure would degrade within weeks.
What We Built
We ran the engagement in three overlapping phases: the infrastructure work started immediately while the data pipeline was being built, and the model was trained on the first clean dataset before the full pipeline was complete.
Phase 1: Data Infrastructure (Weeks 1–3)
We built a unified feature pipeline in Python that joined applicant data, behavioral signals, and historical loan outcomes into a single training-ready dataset. Key decisions:
- PostgreSQL feature store with point-in-time correctness — no label leakage
- Event capture for 40+ behavioral signals via a lightweight SDK added to the mobile app
- Backfilled 18 months of historical data from three separate source systems
- Automated daily pipeline refresh with data quality checks before model re-scoring
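Point-in-time correctness means every training row sees only feature values observed at or before that loan's application time; joining on applicant ID alone would leak post-outcome behavior into the labels. A minimal sketch of the idea using sqlite3 (table and column names are illustrative, not the client's schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE behavioral_signals (applicant_id INT, observed_at TEXT, sessions_completed INT);
CREATE TABLE loans (loan_id INT, applicant_id INT, applied_at TEXT, defaulted INT);
""")
conn.executemany("INSERT INTO behavioral_signals VALUES (?,?,?)", [
    (1, "2024-01-01", 2),
    (1, "2024-02-01", 5),   # observed AFTER the loan below: must be excluded
])
conn.execute("INSERT INTO loans VALUES (1, 1, '2024-01-15', 0)")

# For each loan, take the latest signal observed on or before application time.
# Joining without the observed_at filter would leak future information.
row = conn.execute("""
SELECT l.loan_id, b.sessions_completed
FROM loans l
JOIN behavioral_signals b
  ON b.applicant_id = l.applicant_id
 AND b.observed_at = (
     SELECT MAX(observed_at) FROM behavioral_signals
     WHERE applicant_id = l.applicant_id AND observed_at <= l.applied_at
 )
""").fetchone()
print(row)  # prints (1, 2): the later value (5) is correctly excluded
```

The same constraint applies to every feature in the store; a single leaked column can make an offline model look far better than it will perform in production.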
Phase 2: The Risk Scoring Model (Weeks 3–7)
With a clean dataset, we trained a gradient boosting classifier (XGBoost) on repayment outcomes. The most predictive features were not what the team expected:
- Time between app install and first loan application (shorter = higher default risk)
- Device type and OS version combination — proxy for income stability
- Application completion rate on previous sessions (incomplete = higher risk)
- Time of day of application — midnight applications defaulted at 3x the rate of morning applications
- Number of distinct phone numbers used across sessions
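These features are simple aggregations over raw app events; the predictive power came from capturing and joining them correctly, not from exotic transformations. A sketch of what the derivation looks like (event format and field names are assumptions for illustration):

```python
from datetime import datetime

def engineer_features(events: list) -> dict:
    """Derive application-time risk features from raw app events.

    Each event is a dict: {"type": ..., "ts": ISO timestamp, plus
    type-specific fields}. Shapes here are illustrative, not the
    client's actual event schema.
    """
    installs = [e for e in events if e["type"] == "app_install"]
    applications = [e for e in events if e["type"] == "loan_application"]
    sessions = [e for e in events if e["type"] == "session"]

    install_ts = datetime.fromisoformat(installs[0]["ts"])
    first_app_ts = datetime.fromisoformat(min(a["ts"] for a in applications))

    return {
        # Shorter install-to-application gap correlated with higher default risk
        "hours_install_to_first_application": (first_app_ts - install_ts).total_seconds() / 3600,
        # Incomplete prior sessions correlated with higher risk
        "session_completion_rate": (
            sum(s["completed"] for s in sessions) / len(sessions) if sessions else None
        ),
        # Midnight applications defaulted at roughly 3x the morning rate
        "application_hour": first_app_ts.hour,
        # Multiple distinct phone numbers across sessions was a risk signal
        "n_distinct_phone_numbers": len({s["phone"] for s in sessions}),
    }

events = [
    {"type": "app_install", "ts": "2024-03-01T09:00:00"},
    {"type": "session", "ts": "2024-03-01T09:05:00", "completed": 1, "phone": "+2348000000001"},
    {"type": "session", "ts": "2024-03-01T23:50:00", "completed": 0, "phone": "+2348000000002"},
    {"type": "loan_application", "ts": "2024-03-02T00:10:00"},
]
print(engineer_features(events))
```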
The model output a repayment probability score (0–1). We set a threshold of 0.65 for automatic approval and deployed the scoring API as a standalone microservice, decoupled from the lending monolith. This meant the model could be updated without a full platform redeploy.
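Once the model produces a probability, the decision logic itself is deliberately simple. The 0.65 cut-off is from this engagement; the write-up doesn't specify what happens below the threshold, so routing those applications to manual review here is an assumption, and the function names are illustrative:

```python
APPROVAL_THRESHOLD = 0.65  # repayment probability required for auto-approval

def decide(repayment_probability: float, threshold: float = APPROVAL_THRESHOLD) -> str:
    """Map a model's repayment probability to a lending decision.

    In production this logic sat behind a standalone scoring microservice,
    so the threshold or model version could change without redeploying
    the lending monolith. Below-threshold routing is assumed, not from
    the source.
    """
    if not 0.0 <= repayment_probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return "auto_approve" if repayment_probability >= threshold else "manual_review"

print(decide(0.72))  # auto_approve
print(decide(0.40))  # manual_review
```

Keeping the threshold a configuration value rather than baking it into the model lets the business tune approval volume against risk appetite without retraining.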
Phase 3: KYC Automation (Weeks 5–10)
The 48–72 hour manual KYC process was killing conversion and introducing selection bias. We integrated an automated identity verification pipeline using OCR + liveness detection, reducing KYC turnaround from 48 hours to under 3 minutes for 80%+ of applications. The remaining 20% — edge cases requiring human review — were flagged and routed to a compliance queue with full context already assembled.
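The 80/20 split falls out of confidence scores on the automated checks: applications that pass OCR extraction and liveness detection with high confidence are verified in minutes, and anything ambiguous goes to the compliance queue with the evidence already attached. A sketch of the routing logic (the thresholds and field names are assumptions, not the client's actual values):

```python
from dataclasses import dataclass

@dataclass
class KycResult:
    ocr_confidence: float    # confidence the ID document was read correctly
    liveness_score: float    # confidence the selfie is a live person matching the ID
    fields_extracted: bool   # all required KYC fields parsed from the document

# Illustrative cut-offs; real values would be tuned against
# fraud rates and false-reject rates.
OCR_MIN = 0.90
LIVENESS_MIN = 0.85

def route(result: KycResult) -> str:
    """Auto-verify clear cases; send edge cases to human review with context."""
    if (result.fields_extracted
            and result.ocr_confidence >= OCR_MIN
            and result.liveness_score >= LIVENESS_MIN):
        return "auto_verified"
    return "compliance_queue"

print(route(KycResult(0.97, 0.93, True)))   # auto_verified
print(route(KycResult(0.97, 0.60, True)))   # compliance_queue
```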
The Results
Measured at the 90-day mark after production deployment:
- NPL rate: 90% → 46% (44 percentage point reduction)
- KYC completion time: 48–72 hours → under 3 minutes for 82% of applicants
- Loan volume: maintained at $30K/month without relaxing credit standards
- Manual KYC reviews: reduced by 80%+ (staff redeployed to higher-value work)
- 15% improvement in customer onboarding completion rate
The NPL rate continued to improve after 90 days as the model accumulated more labeled data and the feedback loop tightened. At six months, they reported the rate below 35%.
What Made It Work
Three things that mattered more than the model itself:
1. We fixed the data before touching the model. A gradient boosting model trained on clean, joined, point-in-time-correct data will outperform a sophisticated deep learning model trained on dirty data every time. The infrastructure work was not a prerequisite — it was the work.
2. The model was deployed as a microservice. Decoupling the scoring API from the monolith meant we could iterate on the model weekly without coordination overhead. By week 12 we had shipped four model versions. On the old architecture, each would have been a two-week release cycle.
3. We transferred the knowledge. The client's data team was trained on the feature pipeline, the retraining process, and the scoring threshold logic before we exited. At six months, they had shipped two model updates without our involvement. That outcome — independence — was the goal from day one.
Tech Stack
Python · PostgreSQL feature store · XGBoost · standalone scoring microservice · OCR + liveness detection for KYC · lightweight mobile event SDK
Work With Us
Is your platform AI-ready?
If you're dealing with a similar problem — high error rates, broken data pipelines, AI initiatives that stall before production — the first step is understanding what's actually blocking you. A 30-minute call is enough to find out. That's what the AI Readiness Audit is for.
AI Readiness Audit from $8,000 · No commitment required