CortexData
Platform · ML

Explainable ML for Indian lending — your data, your weights, your audit trail.

A 5-model fraud ensemble and a calibrated probability-of-default model, with full feature attribution per decision.

CortexData's ML decisioning is built around a single principle: every decision has to be defensible to your credit committee, RBI, and the Banking Ombudsman. We've engineered the stack — feature engineering, ensemble architecture, calibration, attribution, drift monitoring — to deliver predictive lift without losing explainability.

  • 5-model fraud ensemble (XGBoost / LightGBM / RF / GB / Isolation Forest)
  • Calibrated probability-of-default model with isotonic regression
  • 32 hand-engineered features tuned for CIBIL / CRIF / Experian / Equifax
  • Per-decision feature attribution stored in immutable audit chain
  • Trained on your data, on your infrastructure; your data never leaves your perimeter
  • PSI / CSI drift monitoring with automated retraining gates
The ML stack

Predictive lift you can defend.

Most ML lending platforms promise 'AI-powered'. We promise 'audit-ready'. The architecture, features, and evaluation harness are designed around what an RBI inspector or your credit committee will actually ask.

5-model fraud ensemble

Weighted voting: XGBoost (30%), LightGBM (30%), Random Forest (20%), Gradient Boosting (15%), Isolation Forest (5%). Each model contributes complementary signal. The ensemble outperforms any single model on Indian fraud patterns.
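
As a rough illustration of the weighted vote, here is a minimal sketch using the weights listed above. The model keys, the function name, and the assumption that each model emits a fraud score in [0, 1] are illustrative only, not CortexData's actual API.

```python
# Weights as stated above; the model keys are illustrative names (assumption).
ENSEMBLE_WEIGHTS = {
    "xgboost": 0.30,
    "lightgbm": 0.30,
    "random_forest": 0.20,
    "gradient_boosting": 0.15,
    "isolation_forest": 0.05,
}


def ensemble_fraud_score(model_scores):
    """Weighted average of per-model fraud scores, each assumed to lie in [0, 1]."""
    missing = set(ENSEMBLE_WEIGHTS) - set(model_scores)
    if missing:
        raise ValueError(f"missing model scores: {sorted(missing)}")
    return sum(ENSEMBLE_WEIGHTS[name] * model_scores[name] for name in ENSEMBLE_WEIGHTS)


# Hypothetical per-model scores for one application.
scores = {
    "xgboost": 0.80,
    "lightgbm": 0.70,
    "random_forest": 0.60,
    "gradient_boosting": 0.50,
    "isolation_forest": 1.00,
}
combined = ensemble_fraud_score(scores)
```

Because the weights sum to 1, the combined score stays on the same 0-to-1 scale as the individual models.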

Calibrated PD model

Probability-of-default model with isotonic regression calibration, so that a 4.2% PD output corresponds to an empirical default rate of about 4.2% in your portfolio. Critical for risk-based pricing and IRACP provisioning calculations.
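
To show what isotonic calibration does mechanically, here is a small standard-library sketch of the pool-adjacent-violators algorithm, which fits a non-decreasing step function from raw score to observed default rate. The function names and sample data are hypothetical; a production setup would more likely use a library implementation.

```python
from bisect import bisect_left


def fit_isotonic(scores, defaults):
    """Fit a non-decreasing step function (raw score -> default rate) via pool-adjacent-violators.

    Returns (upper_score_bound, calibrated_pd) blocks in ascending score order.
    """
    # Pool tied scores first so each distinct score has one observed default rate.
    totals = {}
    for score, outcome in zip(scores, defaults):
        entry = totals.setdefault(score, [0.0, 0])
        entry[0] += outcome
        entry[1] += 1

    blocks = []  # each block: [sum_of_outcomes, count, max_score_in_block]
    for score in sorted(totals):
        blocks.append([totals[score][0], totals[score][1], score])
        # Merge backwards while the previous block's rate exceeds the current one.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def calibrate(model, raw_score):
    """Return the calibrated PD of the block containing raw_score (clipped at the top end)."""
    uppers = [upper for upper, _ in model]
    index = min(bisect_left(uppers, raw_score), len(model) - 1)
    return model[index][1]


# Hypothetical raw scores and observed outcomes (1 = default).
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
outcomes = [0, 0, 1, 0, 1, 1]
model = fit_isotonic(raw_scores, outcomes)
pd_estimate = calibrate(model, 0.35)
```

In this toy data the out-of-order pair at 0.3 and 0.4 is pooled into a single block with a 50% default rate, which is how the fitted curve stays monotonic while tracking observed frequencies.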

Per-decision attribution

Every approve/reject/refer decision returns the top-N features that drove it, with their weights. Defensible to your credit committee, RBI inspection, and the Banking Ombudsman.
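
A minimal sketch of top-N attribution, assuming each decision already comes with signed, additive per-feature contributions (for example scorecard points or SHAP-style values). The feature names and numbers below are hypothetical, and the source does not specify which attribution method is used.

```python
def top_n_attributions(contributions, n=3):
    """Return the n features with the largest absolute contribution, strongest first."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:n]


# Hypothetical signed contributions for one decision (positive = pushes towards reject).
contributions = {
    "bureau_score": -0.42,
    "dpd_30_plus_count": 0.35,
    "credit_utilisation": 0.18,
    "enquiries_6m": 0.05,
}
top_features = top_n_attributions(contributions, n=2)
```

Ranking by absolute value keeps both the factors that hurt and the factors that helped the applicant, which is usually what a reviewer wants to see.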

32 banking-tuned features

Feature engineering tuned for Indian credit bureaus (CIBIL, CRIF, Experian, Equifax): bureau score, days-past-due (DPD) bands, credit utilisation, enquiries, account vintage, write-offs, settled accounts, employment tenure, fixed-obligation-to-income ratio (FOIR), loan-to-income (LTI), plus device, velocity, address, and email/phone risk signals.
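
For a sense of how a few of these features might be derived, here is a sketch using common definitions: FOIR as monthly fixed obligations over monthly income, LTI as loan amount over annual income, and utilisation as revolving balance over limit. These definitions and the field names are assumptions; the exact formulas in the feature set are not stated here.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is missing or not positive."""
    if denominator is None or denominator <= 0:
        return None
    return numerator / denominator


def affordability_features(applicant):
    """Derive ratio features from raw application and bureau fields (hypothetical field names)."""
    monthly_income = applicant["monthly_income"]
    return {
        "foir": safe_ratio(applicant["monthly_obligations"], monthly_income),
        "lti": safe_ratio(applicant["loan_amount"], 12 * monthly_income),
        "credit_utilisation": safe_ratio(
            applicant["revolving_balance"], applicant["revolving_limit"]
        ),
    }


# Hypothetical applicant, amounts in rupees.
applicant = {
    "monthly_income": 100_000,
    "monthly_obligations": 30_000,
    "loan_amount": 1_200_000,
    "revolving_balance": 45_000,
    "revolving_limit": 150_000,
}
features = affordability_features(applicant)
```

Returning None for a zero or missing denominator keeps "no revolving credit" distinct from "0% utilisation", which matters for thin-file applicants.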

Hybrid: scorecard + ML

WOE-based scorecard (banker-readable, regulator-friendly) layered with ML PD model (predictive lift). Either alone is workable. Together they give you regulatory defensibility AND model performance.
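
For readers less familiar with weight of evidence (WOE), the sketch below computes it per bin as the log of the bin's share of good accounts over its share of bad accounts. The bin labels and counts are made up for illustration, and sign conventions vary between teams.

```python
import math


def weight_of_evidence(bins):
    """Compute WOE per bin from {label: (good_count, bad_count)}.

    WOE = ln(share of all goods in the bin / share of all bads in the bin).
    """
    total_good = sum(good for good, _ in bins.values())
    total_bad = sum(bad for _, bad in bins.values())
    woe = {}
    for label, (good, bad) in bins.items():
        if good == 0 or bad == 0:
            # Real scorecards typically smooth or merge such bins; this sketch just refuses.
            raise ValueError(f"bin {label!r} has a zero count")
        woe[label] = math.log((good / total_good) / (bad / total_bad))
    return woe


# Hypothetical bureau-score bins: (good accounts, bad accounts).
score_bins = {
    "<650": (100, 50),
    "650-749": (400, 40),
    "750+": (500, 10),
}
woe = weight_of_evidence(score_bins)
```

With this convention a positive WOE marks a bin that holds proportionally more good accounts than bad ones, which is what makes the resulting scorecard readable line by line.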

Anomaly detection

Isolation Forest outlier detection, alongside a 3-sigma rule, across application velocity, device risk, and address consistency. Flags applications that don't match any prior pattern, including ones the supervised models in the ensemble haven't seen.
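
Isolation Forest itself is a tree-based model and needs a library implementation, but the 3-sigma rule mentioned above is simple enough to sketch directly: flag any signal whose current value sits more than three standard deviations from its historical mean. Signal names and history values here are hypothetical.

```python
from statistics import mean, pstdev


def three_sigma_flags(history, current):
    """Return the names of signals whose current value is more than 3 sigma from the mean."""
    flagged = []
    for name, past_values in history.items():
        mu = mean(past_values)
        sigma = pstdev(past_values)
        if sigma > 0 and abs(current[name] - mu) > 3 * sigma:
            flagged.append(name)
    return flagged


# Hypothetical baselines: applications per device per day, and a device risk score.
history = {
    "application_velocity": [2, 3, 2, 4, 3, 2, 3, 3],
    "device_risk": [0.10, 0.20, 0.15, 0.10, 0.20, 0.15, 0.10, 0.20],
}
current = {"application_velocity": 15, "device_risk": 0.18}
flags = three_sigma_flags(history, current)
```

A per-signal rule like this only catches extremes in one dimension at a time; a multivariate model such as Isolation Forest can also flag unusual combinations of individually normal values.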

Retraining pipeline

End-to-end retraining: data versioning, feature store, train/validate/test split, threshold optimisation via F1, model registry, A/B-deployable versions. Your bank's data, your training cadence, your control.
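
The threshold-optimisation step can be sketched as a scan over candidate cut-offs, keeping the one with the highest F1 on a validation set. The labels and scores below are hypothetical, and this is an illustration of the idea rather than the pipeline's actual code.

```python
def f1_at_threshold(y_true, y_score, threshold):
    """F1 when scores at or above the threshold are predicted positive."""
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, y_score) if s < threshold and y == 1)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def best_f1_threshold(y_true, y_score):
    """Try every distinct score as a cut-off; return (threshold, f1) with the highest F1."""
    candidates = sorted(set(y_score))
    best = max(candidates, key=lambda t: f1_at_threshold(y_true, y_score, t))
    return best, f1_at_threshold(y_true, y_score, best)


# Hypothetical validation labels (1 = fraud) and model scores.
y_true = [0, 0, 1, 0, 1, 1]
y_score = [0.10, 0.30, 0.35, 0.40, 0.70, 0.90]
threshold, f1 = best_f1_threshold(y_true, y_score)
```

The cut-off should be chosen on the validation split and then confirmed on the held-out test split, so the reported F1 is not inflated by the search itself.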

Drift detection

Population stability index (PSI) on inputs, characteristic stability index (CSI) on features, and rank-order stability on scores. Drift alerts gate model promotion automatically.
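
PSI compares a baseline distribution with a recent one, bin by bin, as the sum of (actual share minus expected share) times ln(actual share / expected share). CSI is commonly computed with the same formula applied to a single feature. The sketch below uses hypothetical bin counts, and the 0.25 gate is a widely quoted rule of thumb, not a documented CortexData threshold.

```python
import math


def population_stability_index(expected_counts, actual_counts, floor=1e-6):
    """PSI = sum over bins of (a - e) * ln(a / e), where a and e are bin shares."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both distributions must use the same bins")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Floor the shares so an empty bin does not produce log(0) or division by zero.
        e = max(expected / expected_total, floor)
        a = max(actual / actual_total, floor)
        psi += (a - e) * math.log(a / e)
    return psi


# Hypothetical score-band counts at training time and in the latest month.
training_counts = [100, 200, 400, 200, 100]
recent_counts = [50, 150, 350, 250, 200]

psi = population_stability_index(training_counts, recent_counts)
promotion_allowed = psi < 0.25  # illustrative gate; the threshold is an assumption
```

Here the recent population has shifted towards the higher bands, which gives a PSI of roughly 0.14: noticeable, but below the illustrative gate.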

Decision pipeline

From application to decision, end to end.

01
Application
Customer completes the onboarding flow; KYC and documents are captured.
02
Bureau pull
CIBIL / CRIF / Experian / Equifax with consent ledger.
03
Feature store
32 features engineered from app + bureau + device + velocity signals.
04
Models
Fraud ensemble + PD model run in parallel; outputs combined.
05
Decision + trace
Approve / refer / reject + top-N feature attribution into audit chain.
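
One common way to make a decision log tamper-evident is a hash chain, where each record's hash covers the previous record's hash. Whether the audit chain here works this way is an assumption; the sketch below only illustrates the general idea, with hypothetical field names.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first record


def _record_hash(prev_hash, decision):
    """Hash the previous hash together with a canonical JSON form of the decision."""
    payload = json.dumps(decision, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_decision(chain, decision):
    """Append a decision record linked to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {
        "prev_hash": prev_hash,
        "decision": decision,
        "hash": _record_hash(prev_hash, decision),
    }
    chain.append(entry)
    return entry


def verify_chain(chain):
    """Recompute every hash; return False if any record or link was altered."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _record_hash(prev_hash, entry["decision"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical decisions with their top-N attributions.
audit_chain = []
append_decision(audit_chain, {
    "application_id": "APP-001",
    "outcome": "refer",
    "top_features": [["bureau_score", -0.42], ["dpd_30_plus_count", 0.35]],
})
append_decision(audit_chain, {
    "application_id": "APP-002",
    "outcome": "approve",
    "top_features": [["foir", -0.20], ["bureau_score", -0.15]],
})
chain_is_valid = verify_chain(audit_chain)
```

Editing any stored decision after the fact changes its hash and breaks every later link, so tampering is detectable by re-running the verification.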

See decision attribution on your portfolio.

Bring us a labelled sample (approved + rejected, last 12 months). We'll train the ensemble on your data and show you per-decision feature attribution alongside your current process — side by side.