Fintech#4 - Why Alternative Credit Score models are BS
Also, Elon now owns Twitter. Will he now be in charge of reining himself in?
Hello.
How are we doing today?
A slight nip is in the air as Delhi prepares for the winter and for the annual settling-in of air pollution over the low-lying bowl that is the National Capital Region. I am planning on decamping to the Maldives to escape the worst. I’ll need to sell a few livers and kidneys to pay for this holiday though. Planning is ongoing. In the meanwhile, the government will do nothing to ask the farmers to kindly stop burning the stubble and will instead ask us to put up with hare-brained schemes like odd-even and the phasing out of perfectly well-running vehicles. Basically: because Delhi folks complain about pollution, make life hard for them. That way the government can point to the inconvenience and claim that it has done something. Moving on….
Last month had some interesting things happen:
Elon finally bought Twitter, ruining it mostly for himself.
The British PM round-robin.
I attended Money 20/20 and had to tamp down on my urge to repeatedly say “Everything the previous speaker said is complete BS. Thank you.”
At Money 20/20, there were a lot of lenders. When a lot of lenders (especially consumer lenders) gather, they talk about the problems of life - like repayments. Repayments depend on efficient underwriting and good credit scoring. Credit scoring conversations devolve into “alternative data”.
Let’s examine exactly what this “alternative credit data” is and what we can (and more importantly, CAN’T) do with it.
What is alternative data?
Alternative data is apparently the panacea to all the travails of consumer lending. It is a mythical beast, the sorcerer’s stone, the bee’s knees, a holy grail that (paired with ML and AI) produces omniscient results (hopefully).
Or, to quote a random guy called Tom Hadley:
OK. I can get behind that. But let’s try to come up with a definition:
Alternative credit data is anything that can be used to evaluate someone for consumer lending and that:
Is not captured by a credit bureau (hence, alternative)
Gives behavioural insights into how a person discharges their obligations
Covers a recurring obligation (e.g. rent or a phone bill)
Or gives insights into ability to pay (like buying 4 Rolexes as an “investment”).
I think I have covered most bases here.
Another way to look at it is as a confidence distribution:
A good credit score can give a sense of two things - Ability to Pay[1] and Intent to Pay[2]. In the traditional model, ability-to-pay is evaluated using your salary / cash flow and intent-to-pay is evaluated by your past repayment performance. Now, with this in mind, let's look at the moats.
The first moat is what has historically been the best predictor - have you paid your loans in the past?
The next moat is recurring payments data which is not ordinarily captured. Do you pay your rent on time? Your utility bills? Have you been paying for your BNPL purchases on time? Companies that “help” you pay bills (e.g. Cred) or do BNPL lending are hoovering up this data.
The third moat is ability-to-pay. This loops in big spends like cars, expensive watches, Birkin bags and so on…
The last one is everything else. Literally, everything that could (in theory) be used to predict likelihood of repayment. This is the sounds-compelling-but-useless-in-real-life kind of stuff on which thousands of dead AI/ML dreams were built.
When it comes to loans, past performance is a predictor of future returns
So, here is a hypothesis:
“How a consumer has paid their loans (and other recurring bills / obligations) is the most useful data point for predicting if they will repay their loans in the future.”
You see, one of the advantages of being an active VC and someone who is known to have some minor expertise in fintech is that I discuss a lot of business plans and strategies with some very smart founders. My takeaways have mostly been negative on alternative data.
Lenders have tried everything else and nothing has performed as well as past repayment behaviour. Even when data scientists are allowed to go crazy and use any and all kinds of data - they do not come up with a better yardstick.
A great example is China - where Ant Financial tried to use purchasing data. Here is what happened:
When Ant Financial launched its credit scoring system, Sesame Credit, in January 2015, it said the data-driven product would “make credit more available to millions of consumers across China”….
But nearly four years later, Ant Financial, which is an affiliate of Chinese tech giant Alibaba, has never used Sesame Credit for lending decisions…
Sesame, which is an opt-in feature of the Alipay mobile payments app, draws upon the biggest pool of non-traditional ratings data in the world. It synthesises details from hundreds of sources — ranging from purchases on Alibaba’s Taobao marketplace to subway fares — into a single trustworthiness number for each user, called a “Sesame score”.
But one Ant Financial employee conceded there was a difference between “big data” and “strong data”, with big data not always providing the most relevant information for predicting behaviour, and analysts say the best predictor of whether someone will default on a loan in future is often their previous loan repayment history, rather than their likelihood of returning a rental car.
Dodgy correlations.
One of the other things that got people very excited at Money 20/20 is revenue-based underwriting[3]. Closer to home, it seems I run into yet another b-plan promising to lend based on a shop’s cash flows or sales every other day.
OK. I get it. It is an obvious ability-to-pay data point. More sales will likely be correlated with higher ability-to-pay. Logical. Right?
Wrong. High sales or cash flow numbers do nothing to help me understand if the debt will be prioritized in the cost stack. Without naming names, I can tell you that analyses of the last 3-5 years of cohort data by at least 3 unicorns[4] operating in this space have shown zero correlation between sales and likelihood of repayment issues. ZERO.
Now, you may ask - this correlation sounds so logical. What is the problem?
A little statistical concept called Orthogonality.
You see, maths doesn’t care that you raised money from your VCs claiming a <2% NPA rate when your actual NPA rate is closer to 15%. Ability-to-pay and Willingness-to-pay are orthogonal variables. They are not correlated. If you have credit history for a customer - sure, cashflow / sales data works as a complement and gives you more confidence. But if you DO NOT have credit history - the accuracy of the underwriting drops calamitously.
A high cash flow / revenue guy has money to pay. But will he pay the lender back or will he likely decamp to Belize and sip mai-tais on the beach? Sales data cannot tell us.
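To make the orthogonality point concrete, here is a deliberately idealised toy in Python. The four borrowers, their sales figures, and the rule that default depends only on willingness are all made-up assumptions for illustration, not data from any lender. It is a sketch of what "orthogonal" means, not a model of real underwriting.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical borrowers: every combination of high/low sales and
# willing/unwilling to repay appears exactly once.
monthly_sales = [100, 100, 20, 20]  # ability-to-pay proxy (made-up units)
willing = [1, 0, 1, 0]              # 1 = prioritises the loan, 0 = does not

# Assumption: in this toy world, default is driven purely by willingness.
defaulted = [1 - w for w in willing]

sales_vs_willing = pearson(monthly_sales, willing)
sales_vs_default = pearson(monthly_sales, defaulted)
willing_vs_default = pearson(willing, defaulted)

print("sales vs willingness:", sales_vs_willing)
print("sales vs default:", sales_vs_default)
print("willingness vs default:", willing_vs_default)
```

In this toy, sales carry no signal about default because sales and willingness are independent by construction, while willingness lines up perfectly with default. Real portfolios are messier, but the mechanism is the same: a variable that is orthogonal to the thing driving defaults cannot predict them on its own.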
In the same vein, Cred may capture my rental payments and mine my credit card spends[5] but it cannot assure a lender that I will prioritize repaying their loans over everything else. Not paying rent means losing my roof. Not paying my loans means legal troubles that may manifest down the line. The choice - if one is faced with it - is sometimes painfully clear.
It’s not hard to lend. It’s hard to get repaid.
Multiple generations of fintech companies seem to just forget this little statement. Lending businesses have to be learning businesses. They have to learn to ride the ebb and flow of amount lent vs. assurances required to lend out that amount. Mostly this is a linear function. You want to be around that line in the middle.
If we look at the evolution of how credit is underwritten - on the same chart:
For most of human history, relationship lending has dominated. If you had an in with the Medicis or the Rothschilds, you could become the next Holy Roman Emperor, or the Pope. For larger loans, folks asked for security. Secured lending has remained mostly unchanged and survives in much the same form today[6].
However, as the definition of relationship changed with the times, unsecured lending volume has increasingly shifted to the credit score based lending model.
It is important to note that these zones build on top of each other. Lenders that rely on credit scores for unsecured lending will still take their relationships with specific borrowers into their decisions - profitable, low-risk customers get VIP treatment (better pricing, streamlined experience, etc.) while customers with prior defaults will often be automatically declined or given additional screening. For secured lending, lenders will usually supplement their asset-based underwriting with credit scores in order to gauge a consumer’s willingness to repay.
Fintech cos have been pushing this space hard, with mixed success. Digital underwriting pipelines and instant disbursements have been great leaps forward. BNPL, on the other hand, is yet another form of subprime[7]. My feelings on fintech lending are decidedly mixed.
Now that the risk-on-grow-fast-everything-goes-up-and-right funding environment is cooling off, it is almost painful to see the non-performance levels. NPAs are often in the 10-15% range. That is wiping off the entire equity and then some.
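A rough back-of-the-envelope sketch shows why. Only the 10-15% NPA range comes from the text above. The 10% equity cushion, the loan book of 1,000 and the assumption that nothing is recovered on a bad unsecured loan are hypothetical numbers chosen purely for illustration.

```python
def equity_after_losses(loan_book, equity_ratio, npa_rate, loss_given_default):
    """Return (starting equity, credit loss, remaining equity) for a lender.

    equity_ratio: equity as a fraction of the loan book (the rest is borrowed).
    npa_rate: fraction of the book that stops performing.
    loss_given_default: fraction of a bad loan that is never recovered.
    """
    equity = loan_book * equity_ratio
    loss = loan_book * npa_rate * loss_given_default
    return equity, loss, equity - loss

# Hypothetical lender: a book of 1,000 funded with 10% equity (assumption),
# and nothing recovered on bad unsecured loans (assumption).
for npa in (0.02, 0.10, 0.15):
    equity, loss, remaining = equity_after_losses(1000, 0.10, npa, 1.0)
    print(f"NPA {npa:.0%}: equity {equity:.0f}, loss {loss:.0f}, left {remaining:.0f}")
```

Under these assumptions, the pitched <2% NPA rate leaves most of the equity intact, a 10% rate consumes all of it, and a 15% rate leaves the lender with negative equity. Different leverage or recovery rates move the numbers, but not the shape of the problem.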
Conclusion:
I wrote in an earlier essay that Fintech cos can do two things:
Reduce friction, and/or
Assume risk.
The reducing friction story has played out like gangbusters. A great example is Razorpay. Most of the time, when one thinks of a successful fintech co with a durable business, one thinks of a company reducing friction.
However, assuming risk has mostly been about shattered hopes and down-rounds. I remain unconvinced about those assuming risk and trading common sense for hypergrowth (and valuations) when their underwriting is weak or based on weird data points. That is not to say that all in this area are doomed to fail - there are some very, very valuable businesses built up with durability in mind. But by and large, I have been disappointed as an equity investor with lending fintechs.
To conclude - the chickens do come home to roost if you are a lender. And if you weren’t focusing on accurate underwriting, they often arrive as down-rounds.
Housekeeping
As always, I look forward to hearing from you. If you liked this post, pls feel free to share this or subscribe to this newsletter using the links below. I try to write a 1000-2000 word essay once every two/three weeks.
[1] i.e. do you make enough money to pay me back?
[2] Will you pay me back or use the money to have a good time?
[3] Companies like Capital Float and Khatabook have been built on this premise.
[4] With massive budgets for data science spends.
[5] Under the guise of “protection”.
[6] Being the only real way to lend money at scale safely.
[7] If you read between the lines of recent RBI actions on consumer and app-based lending, you can see that the stress is already in the system.