About Author

Nabila Md Saad
Product & Design Strategy
Nabila is a product and strategy professional working at the intersection of human behaviour and financial product design.
There's a particular kind of pressure that comes with building data-driven features when you don't have much data.
It doesn't show up in roadmaps or sprint reviews. It lives in the gap between what stakeholders want — actionable insights, intelligent recommendations, personalised experiences — and what your database actually contains: a thin, incomplete, early-stage record of users who haven't yet decided whether to trust you.
I've worked through this problem twice, both times with digital banks in their earliest stages. I've started calling them "baby banks" — not dismissively, but affectionately. They are genuinely new-born players in a financial ecosystem that rewards incumbency. And like most new-borns, they arrive with enormous potential and almost no history.
The loop nobody talks about
Here's the core tension: digital banks are launched because they promise innovation. But the very thing that would make them innovative — intelligent, personalised, data-driven features — requires the one thing they don't yet have. Usage data.
Low adoption means thin data. Thin data limits the features you can build. Limited features slow adoption. You can see the loop.
Most teams respond to this by waiting — building simpler features now and promising the intelligent ones later, once the data exists. This is reasonable. It's also slow, and it doesn't solve the underlying problem: by the time you have enough data to build the features that would differentiate you, a competitor may have already gotten there.
What thin data actually looks like in practice
Thin data environments have a specific texture. Irregular deposit patterns. Large one-off inbound transfers rather than consistent salary credits. Low transaction frequency. Long periods of inactivity punctuated by sudden activity.
The tempting interpretation is disengaged users. The accurate interpretation is often more interesting — these might be your most financially sophisticated customers, deliberately parking money for a promotional rate, testing the product before committing, or managing a multi-account strategy that your database simply doesn't have visibility into.
Thin data doesn't just mean less data. It means data that is systematically biased toward a partial view of the customer's financial life. Building features on top of that partial view — without acknowledging what's missing — is how you end up with insight engines that confidently produce bad advice.
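To make that texture concrete, here is a minimal sketch of how the signals described above (low frequency, long inactivity, one-off inbound transfers) might be measured. The `Txn` record, the `thin_data_signals` function, and the three output fields are hypothetical names of mine, not part of any real banking schema. The output describes what the record can support, not what kind of customer is behind it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Txn:
    # Hypothetical minimal transaction record (an assumption, not a real schema).
    day: date
    amount: float  # positive = inbound, negative = outbound


def thin_data_signals(txns, window_start, window_end):
    """Summarise how thin an account's history is over an observation window."""
    days = sorted(t.day for t in txns)
    inbound = [t.amount for t in txns if t.amount > 0]

    # Longest stretch with no activity, counting the edges of the window too.
    edges = [window_start, *days, window_end]
    longest_gap = max((b - a).days for a, b in zip(edges, edges[1:]))

    # Share of inbound value carried by the single largest credit.
    # A value close to 1.0 suggests a one-off transfer, not a regular salary.
    top_share = max(inbound) / sum(inbound) if inbound else 0.0

    weeks = max((window_end - window_start).days / 7, 1)
    return {
        "txns_per_week": len(txns) / weeks,
        "longest_gap_days": longest_gap,
        "largest_inbound_share": top_share,
    }


# Usage: one large deposit, one small spend, then two months of silence.
history = [
    Txn(date(2024, 1, 2), 5000.0),
    Txn(date(2024, 1, 3), -20.0),
    Txn(date(2024, 3, 1), 100.0),
]
signals = thin_data_signals(history, date(2024, 1, 1), date(2024, 3, 31))
```

A long gap or a dominant single credit here is a statement about the bank's visibility, which is why these numbers are better used to decide how much confidence a feature deserves than to label the customer.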
Synthesized data: hypothesis, not fabrication
This is where something shifted for me: the goal in thin data environments is not prediction. It's permission to still innovate despite scarcity.
One of the most useful tools in this context is synthesized data (often called synthetic data): carefully constructed datasets that simulate realistic user behaviour when real behavioural data doesn't yet exist. This isn't fabrication. It's hypothesis. You're not pretending you have data you don't. You're building a canvas large enough to test whether your logic works, before real users arrive to validate or disprove it.
Synthesized data enables two things that are otherwise impossible in early-stage products:
Testing insight logic without waiting. If you're building a feature that surfaces spending patterns, you need enough behavioural variation to know whether your logic handles edge cases. Synthesized data gives you that variation on day one.
Stress-testing wrong assumptions safely. Every product team has assumptions baked into their feature logic — about what "normal" spending looks like, what triggers are meaningful, what makes a customer "healthy" versus "at-risk." Synthesized data lets you pressure-test those assumptions before they cause real harm to real users.
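Both uses can be sketched in a few lines. The example below is an illustration under my own assumptions, not a description of any production system: the two personas (`salaried`, `rate_parker`), their amounts, and the `naive_at_risk` rule are all invented for the purpose. It generates synthesized histories and then measures how often a baked-in assumption ("few transactions means disengaged") fires for each persona.

```python
import random
from datetime import date, timedelta


def salaried(rng, start, months=3):
    """Hypothetical persona: a monthly salary credit plus frequent small spends."""
    txns = []
    for m in range(months):
        payday = start + timedelta(days=30 * m)
        txns.append((payday, 4000.0))
        for _ in range(20):
            spend_day = payday + timedelta(days=rng.randint(1, 28))
            txns.append((spend_day, -round(rng.uniform(5, 150), 2)))
    return sorted(txns)


def rate_parker(rng, start, months=3):
    """Hypothetical persona: one large deposit parked for a promotional rate."""
    deposit_day = start + timedelta(days=rng.randint(0, 10))
    return [(deposit_day, round(rng.uniform(20_000, 80_000), 2))]


def naive_at_risk(txns, min_txns=5):
    """The assumption under test: few transactions means a disengaged user."""
    return len(txns) < min_txns


def flag_rate(persona, rule, n=200, seed=7):
    """Fraction of n synthesized users of one persona that the rule flags."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    start = date(2024, 1, 1)
    flagged = sum(rule(persona(rng, start)) for _ in range(n))
    return flagged / n


parker_rate = flag_rate(rate_parker, naive_at_risk)
salaried_rate = flag_rate(salaried, naive_at_risk)
```

With these definitions, `parker_rate` is 1.0 and `salaried_rate` is 0.0: the rule flags every customer who parked a large deposit and none of the salaried ones. That is the wrong assumption surfacing on synthesized users, before it ever reaches a real one.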
Building for progressive data density
Data-driven features in baby banks need to be phased for the reality of low and irregular usage — not designed for a hypothetical future where everyone uses the app daily.
The framework I've found most useful is a three-tier approach based on data reliance:
Tier 1 — No reliance on behavioural data. Features that are useful from day one, regardless of how much account activity exists. General financial education, market context, product explainers. They build familiarity and trust without requiring any user history.
Tier 2 — Semi-reliance on behavioural data. Features that get better with more data, but still function meaningfully with less. A basic spending summary doesn't need a year of transactions to be useful — it needs a month. Start here early, and let the feature earn its depth over time.
Tier 3 — Unlocked by heavy behavioural data. Personalised predictions, income stability assessments, anomaly detection. These should only surface when there's enough data to make them accurate. Surfacing them too early is worse than not surfacing them at all — a feature that gives wrong answers once will not be trusted when it's finally right.
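The three tiers above can be sketched as a simple gate that decides which features to surface for a given account. Everything named here is hypothetical: the `DataDensity` measure, the feature names, and especially the thresholds. The 30 days for tier 2 echoes the "month" mentioned above. The transaction counts and the tier 3 numbers are placeholders that a real team would need to calibrate.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    NO_RELIANCE = 1     # useful with no behavioural data
    SEMI_RELIANCE = 2   # works with a little data, improves with more
    HEAVY_RELIANCE = 3  # only accurate with a lot of data


@dataclass(frozen=True)
class DataDensity:
    # Hypothetical measure of how much history an account has.
    days_of_history: int
    txn_count: int


# Illustrative thresholds only (assumptions, not recommendations).
MIN_DENSITY = {
    Tier.NO_RELIANCE: DataDensity(days_of_history=0, txn_count=0),
    Tier.SEMI_RELIANCE: DataDensity(days_of_history=30, txn_count=10),
    Tier.HEAVY_RELIANCE: DataDensity(days_of_history=180, txn_count=200),
}

# Hypothetical feature catalogue, one example per tier.
FEATURES = {
    "product_explainers": Tier.NO_RELIANCE,
    "spending_summary": Tier.SEMI_RELIANCE,
    "anomaly_detection": Tier.HEAVY_RELIANCE,
}


def unlocked_features(density):
    """Return the features whose tier requirements this account meets."""
    unlocked = []
    for name, tier in FEATURES.items():
        need = MIN_DENSITY[tier]
        enough_history = density.days_of_history >= need.days_of_history
        enough_activity = density.txn_count >= need.txn_count
        if enough_history and enough_activity:
            unlocked.append(name)
    return unlocked


new_user = unlocked_features(DataDensity(days_of_history=0, txn_count=0))
dormant_user = unlocked_features(DataDensity(days_of_history=400, txn_count=3))
```

Requiring both history and activity is a deliberate choice in this sketch: an account that is a year old but has three transactions (`dormant_user`) stays at tier 1, because age alone does not make a prediction accurate.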
This tiering prevents what I think of as feature collapse: the quiet failure mode where a feature exists in the product but stops being used, because it gave unreliable outputs too early and users never came back to give it another chance.
The baby bank data problem is real — but it isn't a reason to wait. It's a design constraint, and design constraints are what make product thinking interesting.
In the next piece, I'll get into the sequencing principle that changed how I think about building data-driven features — and why the order in which you prioritise accuracy, helpfulness, and commercial outcomes matters more than most teams realise.