Traditional credit scoring has long excluded millions from the formal credit ecosystem, particularly first-time borrowers, gig workers, and informal sector employees. But what if we could determine a person’s creditworthiness not from old bank records, but from the way they interact with digital technology?

This is where AI-powered credit scoring based on alternative data is changing how we think about financial inclusion, by offering fair, accurate, real-time, and explainable credit scores.

The Issue with Traditional Scores

Traditional systems such as CIBIL or Experian depend on:

  • History of loan repayment
  • Use of credit cards
  • Existing loan accounts

But:

  • 350 million people in India have no credit history.
  • New-to-credit (NTC) borrowers are automatically categorized as high-risk.

We need something better: AI plus alternative data.

What Is Alternative Data?

Alternative data refers to non-traditional but digitally accessible behavioral signals, such as:

  • UPI transaction history
  • Payments on utility bills (electricity, water)
  • Mobile data top-ups & SMS patterns
  • Contributions to PF/EPFO by employers
  • Shopping behavior in e-commerce

This information is available in real time, is highly relevant, and is usually more indicative of a borrower's ability and willingness to pay.

AI Model Design for Alternative Credit Scoring

We utilize a layered architecture to build our scoring engine:

1. Data Ingestion Layer

  • Sources: UPI apps, telecom providers, PF APIs, UPI APIs
  • Format normalization: JSON schemas
  • Privacy: masking and tokenization
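
To make the ingestion step concrete, here is a minimal sketch of normalizing one provider payload and protecting the identifier in it. The field names, the key, and the use of keyed hashing (HMAC) as a stand-in for tokenization are all assumptions for illustration, not the actual pipeline.

```python
import hashlib
import hmac

# Hypothetical demo key; a real system would fetch this from a key management service.
TOKEN_KEY = b"demo-only-secret"


def tokenize(value: str, key: bytes = TOKEN_KEY) -> str:
    """Replace an identifier with a stable, keyed, non-reversible token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask_phone(phone: str) -> str:
    """Keep only the last four digits visible."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    return "*" * max(len(digits) - 4, 0) + digits[-4:]


def normalize_record(raw: dict) -> dict:
    """Map a provider payload onto one common schema (field names are illustrative)."""
    return {
        "user_token": tokenize(raw["phone"]),
        "phone_masked": mask_phone(raw["phone"]),
        "source": raw["source"],
        "amount_inr": round(float(raw["amount"]), 2),
    }


record = normalize_record(
    {"phone": "+91 98765 43210", "source": "upi", "amount": "1499.5"}
)
print(record)
```

Because the token is deterministic for a given key, records from different sources can still be joined per user without the raw phone number travelling downstream.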

2. Middleware Connect Platform (MCP)

  • Serves as an orchestration API and an abstraction layer
  • Routes traffic to providers and handles rate limiting, authentication, and retries
  • Normalizes and enriches the data before scoring
  • Makes the scoring engine enterprise-ready and accessible to CRMs
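
As an illustration of the retry handling mentioned above, the sketch below wraps a provider call in exponential backoff. `ProviderError`, `call_with_retries`, and the delay values are hypothetical names and numbers chosen for the example; the platform's real interface is not described here.

```python
import time
from typing import Callable


class ProviderError(Exception):
    """Raised by a hypothetical provider client on a transient failure."""


def call_with_retries(
    fetch: Callable[[], dict],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except ProviderError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...


# Usage: a fake provider that fails twice, then succeeds.
calls = {"n": 0}


def flaky_provider() -> dict:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ProviderError("timeout")
    return {"status": "ok"}


delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky_provider, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the helper easy to test, since the backoff schedule can be inspected without waiting.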

3. Feature Engineering Layer

  • Examples of features:
    • Consistency in the payment of bills
    • UPI inflow/outflow ratio
    • Stability of employment (from EPFO records)
    • Recharge frequency
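
The features listed above could be computed along these lines. This is a sketch under assumed input shapes (lists of days late, transaction dicts, monthly contribution flags); the exact definitions used in a production engine may differ.

```python
def bill_payment_consistency(days_late: list[int]) -> float:
    """Share of bills paid on or before the due date (0 or negative = on time)."""
    if not days_late:
        return 0.0
    return sum(1 for d in days_late if d <= 0) / len(days_late)


def inflow_outflow_ratio(transactions: list[dict]) -> float:
    """Total UPI inflow divided by total outflow."""
    inflow = sum(t["amount"] for t in transactions if t["direction"] == "in")
    outflow = sum(t["amount"] for t in transactions if t["direction"] == "out")
    # 0.0 is an arbitrary placeholder when there is no outflow to divide by.
    return inflow / outflow if outflow > 0 else 0.0


def employment_stability(contributed: list[bool]) -> int:
    """Longest run of consecutive months with an EPFO contribution."""
    longest = current = 0
    for month_has_contribution in contributed:
        current = current + 1 if month_has_contribution else 0
        longest = max(longest, current)
    return longest


def recharge_frequency(recharge_count: int, months: int) -> float:
    """Average number of mobile recharges per month."""
    return recharge_count / months if months > 0 else 0.0


features = {
    "bill_consistency": bill_payment_consistency([0, -2, 3, 0]),
    "inflow_outflow_ratio": inflow_outflow_ratio([
        {"direction": "in", "amount": 20000},
        {"direction": "out", "amount": 5000},
        {"direction": "out", "amount": 3000},
    ]),
    "employment_stability": employment_stability(
        [True, True, False, True, True, True]
    ),
    "recharge_frequency": recharge_frequency(6, 3),
}
print(features)
```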

4. Modeling Layer

  • Model: XGBoost (Gradient Boosting Trees)
  • Explainability: SHAP (SHapley Additive exPlanations) to show the impact of each feature
  • Training data: 500K+ labeled records with known loan repayment outcomes
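
For intuition about what SHAP reports, the sketch below computes exact Shapley values by brute force for a toy risk function. It does not use XGBoost or the SHAP library: the linear `toy_model`, its weights, and the baseline applicant are made up for illustration, and brute-force enumeration only works for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x: dict, baseline: dict) -> dict:
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(x)
    n = len(names)

    def value(coalition: set) -> float:
        point = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(point)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_model(p: dict) -> float:
    """Made-up linear default-risk model standing in for a trained booster."""
    return 0.5 - 0.3 * p["bill_consistency"] - 0.1 * p["inflow_outflow_ratio"]


applicant = {"bill_consistency": 0.9, "inflow_outflow_ratio": 1.5}
baseline = {"bill_consistency": 0.6, "inflow_outflow_ratio": 1.0}

phi = shapley_values(toy_model, applicant, baseline)
print(phi)
```

The contributions sum to the difference between the applicant's prediction and the baseline prediction, which is the property that makes per-feature explanations add up to the score.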

5. Inference & API Layer

  • Real-time scoring: response within about 2 seconds
  • Score breakdown: risk bands (low/medium/high)
  • Output: JSON containing the score, an explanation, and reason codes
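
A response of this kind might be assembled as follows. The band cut-offs, the JSON field names, and the rule for picking reason codes (the features that raised risk the most) are assumptions for the sake of the example, as is treating the score as a default probability between 0 and 1.

```python
import json

# Hypothetical cut-offs on a 0-1 default probability; anything above is "high".
BANDS = [(0.10, "low"), (0.30, "medium")]


def risk_band(p_default: float) -> str:
    """Map a default probability to a low/medium/high band."""
    for upper, label in BANDS:
        if p_default < upper:
            return label
    return "high"


def build_response(
    applicant_id: str, p_default: float, contributions: dict, top_n: int = 2
) -> str:
    """Build the JSON payload: score, band, explanation, and reason codes."""
    # Reason codes: up to top_n features with the largest risk-increasing effect.
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    reason_codes = [name for name, value in ranked[:top_n] if value > 0]
    return json.dumps({
        "applicant_id": applicant_id,
        "score": round(p_default, 4),
        "risk_band": risk_band(p_default),
        "explanation": contributions,
        "reason_codes": reason_codes,
    })


response = build_response(
    "app-001",
    0.18,
    {
        "bill_consistency": -0.09,
        "recharge_frequency": 0.04,
        "inflow_outflow_ratio": 0.02,
    },
)
print(response)
```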

Risk Mitigation

Even with AI, risks of bias and data leakage remain. Here’s how we address them:

  • Bias: Regular audits test the models across gender, geography, and occupation.
  • Model Governance: Version control, rollback, test logs
  • User consent: Data is collected only with the user's explicit permission
  • Security: All communication goes over HTTPS, and PII is encrypted in transit.
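
One simple form a bias audit can take is comparing approval rates across groups. The sketch below is illustrative only: the decision records are invented, and the ratio of lowest to highest approval rate is just one of several fairness measures an audit might track.

```python
from collections import defaultdict


def approval_rates(decisions: list[dict], attribute: str) -> dict:
    """Approval rate for each value of a protected attribute."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for d in decisions:
        group = d[attribute]
        total[group] += 1
        approved[group] += int(d["approved"])
    return {group: approved[group] / total[group] for group in total}


def disparity_ratio(rates: dict) -> float:
    """Lowest group approval rate divided by the highest (1.0 = parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Invented sample decisions for illustration.
decisions = (
    [{"gender": "F", "approved": a} for a in (True, True, False, True)]
    + [{"gender": "M", "approved": a} for a in (True, True, True, True)]
)

rates = approval_rates(decisions, "gender")
ratio = disparity_ratio(rates)
print(rates, ratio)
```

An audit could flag any attribute whose ratio falls below a chosen threshold and trigger a closer review of the features driving the gap.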

Continuous Learning & Feedback Loop

We incorporate loan repayment results to:

  • Retrain the model every 3 months
  • Fine-tune thresholds for each region or product type
  • Modify the weights of features dynamically (e.g., UPI vs. utility weight)
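
As one possible reading of per-segment threshold tuning, the sketch below picks, for each segment, the largest score cut-off that would have kept the observed default rate among approved applicants within a target. The segments, outcomes, and the 25% target are invented for illustration, and the scores are assumed to be default probabilities.

```python
def tune_threshold(
    outcomes: list[tuple[float, bool]], max_default_rate: float
) -> float:
    """Return the largest cut-off whose approved set meets the target.

    outcomes: (predicted default probability, defaulted?) pairs for one segment.
    Applicants with probability <= cut-off are treated as approved.
    Assumes distinct probabilities; ties would need to be grouped first.
    """
    best = 0.0
    approved = defaults = 0
    for p_default, defaulted in sorted(outcomes):
        approved += 1
        defaults += int(defaulted)
        if defaults / approved <= max_default_rate:
            best = p_default
    return best


# Invented repayment outcomes per segment.
by_segment = {
    "north": [(0.05, False), (0.10, False), (0.20, True),
              (0.30, False), (0.40, True)],
    "south": [(0.08, False), (0.12, True), (0.50, True)],
}

thresholds = {
    segment: tune_threshold(rows, max_default_rate=0.25)
    for segment, rows in by_segment.items()
}
print(thresholds)
```

Rerunning this on each retraining cycle lets the cut-offs follow the latest repayment data instead of staying fixed.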

Business Impact

  • 30% more loan approvals for first-time borrowers
  • 50% reduction in manual underwriting time
  • 25% lower default rate for the AI-based scoring group

Final Thoughts

AI isn’t just replacing outdated methods. It’s making credit faster, fairer, and more accessible. By using alternative data and delivering it through the Middleware Connect Platform (MCP), fintechs and lenders can offer real-time credit services to underserved segments without taking on more risk themselves.
