Key Takeaways
DIY MDM in Fabric can be fine for a very small proof of concept, but most efforts expand quickly and should be planned as enterprise-scale from the start.
For medium and larger initiatives, buying a purpose-built MDM platform delivers faster time to production, smaller teams and substantially lower three-year TCO than building in Fabric alone.
As Fabric and lakehouse programs mature into multi-domain, governed data products, the operational overhead of custom MDM grows sharply, making early platform adoption the more sustainable path.
At Data Hero Summit 2025, I had the chance to talk about a research project my firm conducted for Profisee on a question I’ve seen teams wrestle with for years: Should you build master data management capabilities yourself in your data lakehouse or buy a purpose-built MDM platform?
Most organizations think they already know the answer. But belief isn’t a business case. So Profisee asked us to do what practitioners don’t always have time to do in the real world — put hard numbers behind the build-vs-buy decision across realistic MDM scenarios.
What we found was both unsurprising and, in one key way, more dramatic than even I expected. In this blog post, I’ll discuss the key findings from the report and share a little bit more about my personal perspective on the topic of build-vs-buy.
The Reality: Almost Everyone Starts by Building
In a Microsoft Fabric and lakehouse environment, the instinct to “just code it” is understandable. Fabric is a strong platform.
So teams start small. They script some matching. They build a few rules. They produce a “trusted” dataset and call it MDM.
And for a narrow proof of concept, that approach can work.
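To make "script some matching" concrete, here is a minimal sketch of what a first-pass DIY match might look like. It is my own illustration, not code from the benchmark: the record layout, the `find_matches` helper and the 0.85 threshold are all hypothetical, and it uses only Python's standard-library `difflib`.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Character-level similarity between two normalized names (0.0 to 1.0)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_matches(records, threshold=0.85):
    """Compare every pair of records and return likely duplicates.

    The threshold is an arbitrary, hypothetical cutoff for illustration.
    """
    matches = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            score = similarity(records[i]["name"], records[j]["name"])
            if score >= threshold:
                matches.append((records[i]["id"], records[j]["id"], round(score, 2)))
    return matches


# Hypothetical sample records, invented for this sketch.
customers = [
    {"id": 1, "name": "Acme Corp."},
    {"id": 2, "name": "ACME Corp"},
    {"id": 3, "name": "Globex Industries"},
]

print(find_matches(customers))
```

A script like this is easy to write and fine for a demo. It also compares every record against every other record, and it has no survivorship rules, no stewardship workflow and no audit trail. Those are the gaps that surface once the scope grows.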
But here’s the part too many teams miss early on: those projects almost never stay narrow. Teams that start with a proof of concept wind up building enterprise-scale MDM, whether they realize it or not.
This is an observed pattern across industries. Success breeds demand. More sources. More domains. More governance. More stewardship. More change. The “quick and dirty” build becomes a critical data product…and then a high-maintenance liability.
What We Benchmarked (So You Can Map It to Your World)
To get past anecdotes, we modeled and benchmarked four common implementation paths across small, medium and large project sizes:
- Proof of Concept
- Scaled Chaos
- Over-Engineered Data Silo
- Enterprise-Scale MDM
We used entity resolution as the measured use case because it’s a core MDM job and a common starting point. But we also accounted for the “drag-along” work that always shows up as maturity grows: stewardship, governance, workflows, DevOps, hierarchy handling, continuous improvement and production operations.
Then we compared two approaches:
- DIY in Fabric only
- Fabric plus Profisee MDM
Not theory. Not marketing. Realistic delivery models, factoring in the people costs, the time and the effort involved in total cost of ownership.
Key Results: Time, Staffing and TCO Favor Buying Earlier Than Most Think
1. Profisee Reached Production Faster, Consistently
DIY builds in Fabric took 28–52 weeks depending on size and complexity. Profisee implementations took about 12 weeks across all sizes.
That consistency matters. Enterprises don’t just want speed. They want predictability.
2. Staffing Requirements Were Dramatically Lower for ‘Buy’
To achieve the same outcomes:
- DIY needed 5–9 team members
- Profisee needed 1–3
That gap widens as you add domains. And in modern data environments, multi-domain is increasingly the norm.
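As a back-of-the-envelope illustration, the timeline and staffing ranges above can be combined into rough delivery effort in person-weeks. This is my own arithmetic, not a figure from the report. Pairing the low ends together and the high ends together is an assumption, and the result excludes licensing, infrastructure and ongoing operations, so it is not a TCO number.

```python
def person_weeks(team_size: int, weeks_to_production: int) -> int:
    """Rough delivery effort: people multiplied by weeks to reach production."""
    return team_size * weeks_to_production


# Ranges quoted above; pairing low with low and high with high is an assumption.
diy_low = person_weeks(5, 28)    # smallest DIY team, shortest DIY timeline
diy_high = person_weeks(9, 52)   # largest DIY team, longest DIY timeline
buy_low = person_weeks(1, 12)    # smallest Profisee team, about 12 weeks
buy_high = person_weeks(3, 12)   # largest Profisee team, about 12 weeks

print(f"DIY:      {diy_low} to {diy_high} person-weeks")
print(f"Profisee: {buy_low} to {buy_high} person-weeks")
```

Under those assumptions the DIY path works out to 140 to 468 person-weeks of delivery effort, against 12 to 36 for the Profisee path. Multiply by your own loaded weekly rate to get a first-pass people cost.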
3. Three-Year TCO Favored Profisee for Anything Beyond a Tiny PoC
Over three years:
- Profisee cost about 3.8x less than DIY for small full-maturity projects
- Profisee cost about 3x less than DIY for medium and large projects
- DIY costs ran 2–3.5x higher for large implementations
We also quantified the “break-even bar” for when buying makes sense. And here’s the thing that even I didn’t fully anticipate: the bar for when Profisee makes sense is remarkably low.
If you’re truly doing a small, throwaway proof of concept and you are absolutely sure it will never scale, DIY can be cheaper.
But for a medium-scope project? Buying pays off almost immediately. And for large or growing initiatives, it isn’t even close: buying an off-the-shelf MDM solution is your best bet.
Why Proofs of Concept Almost Always Grow
We mapped the specific forces that push organizations past “six-week PoC land”:
- Rising data volume and source diversity
- Unsustainable custom code maintenance
- Expanding governance and stewardship
- Greater hierarchy/roll-up complexity
- Pressure for faster time to value
- Bidirectional integration requirements
None of these are exotic edge cases. They are what success looks like in real enterprises. If your Fabric investments are working, demand for trusted master data will accelerate. Fabric drives the need for more MDM because it amplifies the value of clean, governed and consistent data across the lakehouse.
Without disciplined MDM, Fabric’s promise degrades under duplicated, inconsistent, poorly controlled master data. And then your analytics and AI stack starts quietly lying to you. No one wants that.
So, What Should You Do?
Here’s the plain guidance based on the research:
- If you’re testing a tiny, short-lived PoC and won’t scale, DIY in Fabric may be fine.
- If you’re at medium scope or larger, or you have any reasonable expectation of growth, buying MDM early delivers lower cost, faster production and a more sustainable path.
- If you’re unsure where you fall, use the sizing guidance in the report to map your domains, sources, complexity and stewardship needs to a scenario.
And yes, we’re happy to walk through your situation with you. In a half-hour call, we can plug in your numbers and show you what Profisee versus DIY would look like.
For the Best MDM Results, Start with the End in Mind
Most build-vs-buy debates in data come down to emotion, habit or optimism about future maintenance. This one doesn’t need to.
We now have benchmarks that reflect how MDM projects actually evolve in lakehouse environments. The takeaway is simple: start with the end in mind. If the business is going to rely on master data, architect for enterprise scale on day one. The numbers say you’ll get there faster, cheaper and with a lot less misery.
And if you want the full breakdown, including the sizing models and scenario math, watch the on-demand Data Hero Summit 2025 session and grab the report. It’s all there.
William McKnight
William McKnight has advised many of the world’s best-known organizations. His strategies form the information management plan for leading companies in various industries. He is a prolific author and a popular keynote speaker and trainer. He has performed dozens of benchmarks on leading database, data lake, streaming and data integration products. William is a leading global influencer in data warehousing and master data management and he leads McKnight Consulting Group, which has twice placed on the Inc. 5000 list. He can be reached at wmcknight@mcknightcg.com.