The CDO Matters Podcast Episode 02

Strategies & Tactics for a Successful MDM Implementation with Tobias Macey

Episode Overview:

In this episode of CDO Matters, Malcolm is a guest on the Data Engineering Podcast with Tobias Macey. This episode is a great fit for any non-technical data leader who is looking to gain a deeper understanding of some of the technical dependencies and concepts required for successful master data management (MDM) and data governance — but without getting too deep into jargon or software engineering concepts.

If you’re a business-centric CDO with limited technical experience or background, this podcast will help you to build your data literacy and allow you to have deeper and more compelling conversations with your technical staff — and it will help you make more informed technology-centric decisions.

Malcolm and Tobias cover some of the technical concepts involved in MDM and governance programs — in both relatable and understandable terms — including:

  • MDM systems architecture and typical implementation patterns
  • The connection between MDM, data engineering and systems architecture
  • Data modeling for MDM and data governance
  • The processes used in MDM platforms to support data quality requirements
  • Entity resolution, i.e., matching or deduplication
  • MDM team dynamics and roles, including the role of data stewardship

After listening to this podcast, any data leaders who may be new to the concepts of MDM or data governance will understand why their organizations need these foundational elements and how they can be used to drive business benefit.

Key Moments

  • 3:14 – Identifying ‘who is a customer’ to model and govern data
  • 7:11 – What is MDM and how does it add value?
  • 10:27 – Who needs MDM and how does new technology solve for data quality?
  • 15:11 – Limitations and considerations when searching for a “single source of truth”
  • 18:15 – Who is responsible for MDM within an organization and who comprises it?
  • 22:16 – What are the differences between analytics and operational MDM?
  • 29:15 – Top 4 reasons that so many MDM implementations fail?
  • 32:45 – Using a business perspective to identify the right outcomes
  • 37:40 – How MDM is evolving to use graph functionality in addition to relational databases
  • 42:32 – Why Customer Data Platforms (CDPs) fall short for enterprise-level management
  • 43:36 – Insights on novel MDM use cases: data sharing, graph databases, data fabrics
  • 50:08 – 3 ‘Watch-outs’ learned from years in the data management space
  • 54:36 – How small companies can implement MDM principles
  • 57:38 – The gap between data software and real business outcomes

Key Takeaways:

When is MDM relevant for an organization? (10:22-11:34) 

“The bigger and more complex you are and the more decentralized you are…where organizations are struggling to have a single view of the customer…the larger the company, the more they tend to have a need for MDM.” – Malcolm Hawker

Cloud-native data warehouses vs. MDM software (15:11-16:33) 

“There are many cloud-based data warehouse technologies that are saying we can enable a single version of the truth, and they absolutely can…but does it have all the flexibility and configurability to allow for all the things that MDM software can do? Typically, they don’t.” – Malcolm Hawker

What are the differences between analytics and operational MDM? (23:51-26:22)

“An analytical style of MDM is where the flow [of data] is one-way…[operational MDM] can actually turn around and syndicate that data back down into consuming systems.” – Malcolm Hawker

4 MDM pitfalls to avoid during your implementation (29:15-31:01) 

“If you’ve got a need for MDM and if you have been given a mandate by your management to come up with a single version of the truth…avoid the key pitfalls that often send so many MDM programs sideways.” – Malcolm Hawker

Companies of all sizes can benefit from MDM principles (57:05-57:26) 

“I would argue that most companies need MDM as a discipline…But chances are, you still have some use cases that need that consistent approach to the data management side…” – Malcolm Hawker

About the Guest

Tobias Macey is a dedicated engineer with experience spanning many years and even more domains. He currently manages and leads the Technical Operations team at MIT Open Learning where he designs and builds cloud infrastructure to power online access to education for the global MIT community. He also owns and operates Boundless Notions, LLC where he offers design, review, and implementation advice on data infrastructure and cloud automation.

In addition to the Data Engineering Podcast, he hosts Podcast.init where he explores the universe of ways that the Python language is being used. By applying his experience in building and scaling data infrastructure and processing workflows, he helps the audience explore and understand the challenges inherent to data management.

Episode Links & Resources:

Your host is Tobias Macey, and today I’m interviewing Malcolm Hawker about master data management strategies for the enterprise. So, Malcolm, can you start by introducing yourself?

Yes. Thank you. I’m Malcolm Hawker. I’m the head of data strategy for Profisee.

Profisee is a Magic Quadrant vendor of MDM, master data management software.

My primary job is to serve as an evangelist within the data management community and to bring awareness of the benefits of MDM, why you should care, how MDM fits into an overall data ecosystem, why you need to be thinking about governance as well. I’m sure we’ll talk about that in a lot more detail. But at a high level, that is my role with Profisee.

And do you remember how you first got involved working in the area of data?

Oh, gosh.

That’s a great question. Yes. And it was kind of the seminal moment as well. I mean, I had been kinda born and raised in the software development world, and there was always data there. Of course, there’s databases sitting under all applications.

And I had managed a number of engineering teams that were deploying and managing database software. So I suspect I’ve always been in data management going across my entire thirty year career. It sounds even crazy to say that out loud. But, yeah, almost thirty years.

But, really, I kind of dove into the data management space, the data and analytics space, when I got hired as a consultant for a billion dollar publicly traded company. My SOW, my statement of work as that consultant, was really simple. The statement of work said, solve the problem of answering how many customers we have. That was pretty much it.

That was the entire SOW.

And I was like, man, this is gonna be the easiest gig ever. Right? I’ll go in, and I’ll help them establish some analytics. We’ll stand up some dashboards.

I’ll connect a few databases. I’ll run a few integration scripts, and I’ll be off to the races. And the client will just be ecstatic.

And I figured out rather quickly that it wasn’t gonna be that easy, that there was a lot more involved in simply answering the question of how many customers do we have because, you know, what I found and what everybody inevitably finds is that, you know, in this case, it was b to b, but it could be b to c. It doesn’t really matter. Through a b to b lens, it was Acme, Acme Inc, Acme LLC, Acme Co, and we didn’t really know if that was one thing or four things. And that was really my appetizer, I should say, into the world of data management.

And it’s what, believe it or not, really kind of drew me into the space because I was just fascinated by the ultimate simplicity of the problem and the underlying complexity of the problem. That kind of that paradoxical notion of, in this case, MDM, master data management, because what we’re talking about is a customer master record. The underlying paradox there just drew me in. I just found it fascinating.

And from that point forward, I’ve really kind of been focused on data management, specifically governance and master data management.

Yeah. And even beyond the kind of entity resolution problem, there’s also the initial question of, well, what is a customer? How do you define whether somebody is a customer or not?

Tobias, I can’t tell you how many days of my life I’ve lost sitting in meeting rooms bickering about that very topic, about how do you define a customer. You think it’s so simple, but it’s just not because everybody’s got a different view. Right? And marketing will think one way. Finance will think another way. Legal may think an entirely other way, and they’re all right.

Every one of them is right through their operational lens. Right? Marketing tends to look at things through the lens more of kind of prospects and potential customers, maybe even past customers if they’re trying to bring old customers back into the fold. You know, fulfillment or operations or logistics tends to look at ship to addresses, like the customer is where the goods go.

Right? Legal tends to look at things through the lens of, well, okay. If things break, who’s gonna try to sue us? Right?

And finance will tend to look at things through the lens of rev rec. And, again, all valid views, but your comment is spot on, which is, you know, how do you define a customer? And the answer to that question is inherently a governance decision. It’s a policy.

What you’re defining is a policy, but it is intrinsic to a data modeling exercise as well. Right? So if you wanna set up a database and start modeling out customer, another thing that you’re gonna do is you’re gonna model out relationships that exist as well. Right?

So that’s a join. Right? I don’t need to tell anybody on this podcast how to model data, but it’s, again, simple but complex.

Right? And it’s those things together. How do I define a customer, and what does that actually mean? Well, that’s a governance decision. Governance sounds hard. And, also, it’s a data modeling and a data architecture question as well. So, yeah, it’s simple, but it’s often not.

Yeah. It’s funny because it’s one of the perennial problems of data. And one of the things that does make it such an interesting and complex space to work in is, you know, if you’re working as a software engineer, you know, building a web application, from the kind of mechanical perspective, it’s very rote. You know, you can say, this is the right answer because this is the way that the HTTP spec was written.

I know that I can do this. I can write some tests for it, and I am done. You work in data, and it’s like, oh, okay. I just want to, you know, build a customer model.

Okay. A customer has a name and an address, and you start to talk to the business. Say, okay. Here’s my customer model.

And they say, no. That’s not right because they’re not a customer yet, or we also need all this information to determine when they became a customer or how they became a customer or if they are, you know, a customer of our customer if you’re doing b to b to c kind of a thing.

Right. Right. It’s complex, and you just touched on it, and I touched on it before. It’s not just the individual entity.

It’s the relationship between those entities. Yep. And some of those relationships may be nested. Right?

In the case of a b to b customer, you know, if your customer is Berkshire Hathaway, does that mean you’re also doing business with Dairy Queen? Right? Which is a wholly owned subsidiary of Berkshire Hathaway. Right?

Or conversely, if you’re doing business with Dairy Queen, does that mean in theory that you’re also doing business with Berkshire Hathaway? So, you know, building out hierarchies and managing hierarchies of customer information is a key part of MDM. That’s a part of it. But, again, it traces right back into a modeling discussion.
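
Editor’s note: the hierarchy question above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular MDM product stores hierarchies. The `PARENT_OF` table, the function names, and the "Acme Inc" row are hypothetical; only the Berkshire Hathaway and Dairy Queen relationship comes from the conversation.

```python
# Hypothetical parent links: company -> immediate parent (None means top of the hierarchy).
PARENT_OF = {
    "Berkshire Hathaway": None,
    "Dairy Queen": "Berkshire Hathaway",
    "Acme Inc": None,  # illustrative, unrelated company
}


def ultimate_parent(company: str) -> str:
    """Walk up the parent links until the top of the hierarchy is reached."""
    seen = set()
    while PARENT_OF.get(company) is not None:
        if company in seen:  # guard against accidental cycles in the data
            raise ValueError(f"cycle in hierarchy at {company!r}")
        seen.add(company)
        company = PARENT_OF[company]
    return company


def same_corporate_family(a: str, b: str) -> bool:
    """Two companies are in the same family if they roll up to the same top-level parent."""
    return ultimate_parent(a) == ultimate_parent(b)
```

Whether "same corporate family" should mean "same customer" is the governance decision discussed above; the code only makes that policy explicit once it has been chosen.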

Right? Previous to what I do with Profisee, I was an analyst with Gartner for nearly three years. I mean, I got a lot of questions when I was a Gartner analyst about, okay, what’s the right way to model customer data?

And the right answer is that you need to understand your business strategy, and you need to understand your business goals, and you need to understand, you know, what you’re trying to get out of the data. But it was inevitably a fairly complex question because of the issues of nested relationships, complex relationships that may exist within a given company.

You’ve touched on it a little bit about sort of what master data management is and its role in an organization, but I’m wondering if you can just give your kind of elevator pitch definition of what master data management is when you’re talking to somebody who has never come across it before and some of the overall scope of the activities and functions that it embraces.

Yep. At a very high level, master data management is the people, process, and technologies that are required to provide the accuracy, consistency, semantic consistency, particularly, needed to optimize the value out of an organization’s shared data assets.

So the shared data asset part is really important because master data is the data that is used widely across the organization. When I was a Gartner analyst, I would usually visualize this with some form of a Venn diagram, like a three ring Venn diagram. And you can imagine right in the middle of that three ring Venn diagram, that’s master data because it’s used everywhere. Right?

So not everything is master data. Just because something may be important, important to a specific workflow, important to a specific report, doesn’t mean it’s master data. Master data is that data that is used consistently everywhere in the organization. So things like, we typically define them through the lens of what we would call a domain or maybe an object.

Right? Customer, product, location, contract, SKU. The list here can be fairly long, but I’ve worked with some ridiculously large companies where, for example, their customer master record is fewer than ten fields.

Right? So we’re not talking every field of customer data. We’re talking the fields that are used ubiquitously everywhere and that need consistent definition, that need consistent structure, that need consistent quality, that need consistent governance policies applied to them. So that if the CEO asks how many customers do we have, there can really only be one answer, at least at that level.

Right? We can get into more complex forms of MDM, more context driven forms of MDM where there are potentially multiple answers, where to a marketer, there could be one answer, and the CEO, there could be another answer. And those forms of MDM and those forms of governance exist, but they tend to be fairly rare because the kind of if then notion having multiple answers requires a fairly mature approach to data management that frankly most companies lack. Right?

For most companies, if they can just get to the point of there being one answer that everybody trusts and everybody agrees on, well, hey. Fireworks. Fantastic.

Yeah. Absolutely. And one of the interesting things about the overall concept of master data management, in the conversations that I’ve had, is that it tends to seem as though that idea comes primarily from enterprise organizations where you have these very complex reporting requirements and complex organizational structures.

And as the overall ecosystem of data management and big data and all of the associated technologies and trends have come about, you know, the conversation around MDM has faded a bit, and there are things like the metric store or the semantic layer that are, in some ways, kind of replacing those conversations. And I’m wondering what you see as the ways that the conversation around MDM has shifted along with the sort of shifts in technological and organizational capabilities.

So to the first part, and I’m essentially paraphrasing you now, you were asking about, you know, when is MDM relevant? Right? Is there a certain size or complexity of an organization where it becomes more relevant? Right?

And is it more relevant for extremely large enterprises? Answer, yes. Absolutely. Right? The bigger and more complex you are and the more decentralized you are from an operating perspective.

I gave the example earlier of Berkshire Hathaway. Maybe actually not a very good example because they are very, very decentralized.

They’re a holding company in essence, and each of the individual operations has complete autonomy to do whatever they wanna do. But there are other organizations where that’s not the case. Right? Where organizations are struggling to have a single view of the customer or a single definition of the customer. And through kind of natural growth or through mergers and acquisitions or through having a lack of governance or a lack of centralized data management approach or a lack of maybe even an office of the CDO, over time, companies have naturally just kind of evolved to different definitions.

But as a general statement, I would say the larger the company, the more they tend to have a need for MDM. Is there any sort of cutoff?

No. But generally, what I’ve seen in my experience is once companies hit two, three, four hundred million in revenue, that seems to be about where they start asking questions about MDM.

And almost always the first use case that pops up, almost always, is differences in CRM data and ERP data, marketing versus finance. And that’s almost always where splits tend to happen for the first time. And, you know, CEOs ask for a report, and then that report takes a week to run because somebody in IT has to manually try to reconcile Acme versus Acme Inc. Right? And what that leads to almost inevitably on the IT side is, like, you used the term entity resolution before.

Completely correct, AKA matching. Right? And kind of rudimentary forms of MDM almost always take shape in these, you know, IT organizations where somebody generally has some sort of a data loading script or a data transformation script moving data from a to b, where b is a single bucket of data, where somebody’s given the task of, okay, you know, figure out if Acme Inc and Acme LLC are the same thing or a different thing.

Right? And which leads to things like, okay. Well, if they share the first three letters of the string or the first four letters of the string, chances are pretty good. It’s the same thing.

Oh, well, wait a minute. Look. I need to look across fields. It’s not just the name field.

It’s also the address field. And, oh, well, wait a minute. Hold on a second. You know, there’s natural differences...

And, you know, the vendors of MDM software, their phone starts to ring when IT organizations are like, oh, wait. This is way harder than I thought it was gonna be. The stuff that we did with load scripts or, you know, ETL scripts to try to resolve for this just doesn’t work or isn’t scalable.

Or we did it two years ago and the guy who wrote the script left, and now we can’t unravel it. We don’t know how the thing works. So maybe we should be looking at software. So there’s no hard and fast rule of when, you know, MDM becomes needed, but it’s often born out of IT frustration around entity resolution.
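
Editor’s note: the "first three or four letters" rule described above is easy to write and easy to break. The sketch below is a hypothetical version of such a load-script rule, shown only to illustrate why it does not scale. The function name, the default prefix length, and the company names other than Acme Inc, Acme LLC, and Dairy Queen are made up for the example.

```python
def naive_same_entity(name_a: str, name_b: str, prefix_len: int = 4) -> bool:
    """Treat two names as the same entity if their first few letters agree."""
    a = name_a.strip().lower()
    b = name_b.strip().lower()
    return a[:prefix_len] == b[:prefix_len]


# The case the script was written for: the prefixes agree.
assert naive_same_entity("Acme Inc", "Acme LLC")

# False positive: a different (hypothetical) company that happens to share the prefix.
assert naive_same_entity("Acme Inc", "Acme Bakery of Ohio")

# False negative: a (hypothetical) longer legal name for the same business.
assert not naive_same_entity("International Dairy Queen", "Dairy Queen")
```

Each fix for cases like these (more fields, address handling, special-case lists) adds another rule to the script, which is the maintenance problem described in the conversation.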

Now the second part of your previous question, you’d asked kind of about evolving technologies, and are technologies helping to solve for this. And yes and no is the answer. Right? Like, can you use new technologies like, you know, data virtualization technologies maybe or other new kinds of technologies to solve for this?

You can try. It’s awfully hard though. And the core logic here really comes back to that entity resolution. There’s a lot of things that make MDM different, at least from a software perspective.

And if you were talking to a Gartner analyst, they would show you something called critical capabilities of MDM, and there’s thirteen of them that make MDM software unique. There’s an ETL capability. There’s a workflow capability.

There is data governance. There’s data quality, right, where you build business rules into the software to say, well, this is when something is accurate. This is when it’s not accurate. There’s a UI component obviously to it as well to allow for data stewards to manually review data and on and on.

But, you know, the answer to your question is, can new technologies help? Yes. They can certainly help. MDM software is one of those technologies that makes these processes a little more scalable, a little more configurable for sure, where you can kind of turn some of the dials that are used for matching or that are used for some of the business rules around data management.

But when you start talking some of the whiz bangy stuff like AI and ML, data virtualization, other new technologies, they really have a hard time solving for some of those core problems that I was talking about, particularly entity resolution.

Yeah. It’s definitely interesting how every technology tries to obviate the ones that came before when they all need to work in conjunction instead of just saying, we’re the new shiny thing. This is all you’ll ever need.

Right. Well, a great example right now are, you know, data warehouses in the cloud. And I won’t name any vendors, but there are certain kind of many cloud based data warehouse technologies that are saying we can enable a single version of the truth. And they absolutely can enable a single version of the truth, but they’re gonna look at source data, and they’re going to see Acme Inc, Acme LLC, Dairy Queen, and say, okay.

Wait a minute. Those are probably three different things because they have three different source IDs. Right? They may go so far as to run some very basic data quality and, poof, there you go.

We’ve got a master record, a sort of single source of truth for Acme Incorporated. But then you’ve got another one for Acme LLC sitting right next to it and another one for Dairy Queen sitting right next to that. Where Dairy Queen and Acme LLC may actually be the exact same thing.

Maybe they changed names. Maybe they are part of the same corporate hierarchy where you’ve made a decision to say anybody in that hierarchy is part of the same corporate family or shares the same customer ID. Probably not as relevant use case, but you get my point. Right?

Which is a data warehouse can absolutely be used to establish a single source of the truth. Yes. It can.

But does it have all the flexibility and configurability to allow for all the things that MDM software can do? Typically, they don’t. Right? I had many clients when I was a Gartner analyst ask me, hey.

I’m standing up a data warehouse. You know, insert big name here. It doesn’t matter. Right?

AWS, Azure, it doesn’t matter. Any proprietary names, can I use that as my MDM?

And the answer almost always was, well, probably not. Probably not because they don’t have those thirteen capabilities that MDM software is purpose built to solve for, to support. So they have one or two of them. Right?

They’ve got integrations that are kind of hardwired in. They’ve got the single repository, and that’s a good thing, whether that is physical repository or virtual repository, one or the other. But do they have stewardship UIs? Do they have complex business rule management for governance policies and on and on?

No. They don’t. And the same is true with a lot of the source systems. So I used to get asked all the time, hey.

We’re spending a lot of money on Salesforce.com for CRM. Can I use that as my MDM? Well, again, probably not. Right?

Salesforce, awesome tool. It can be a great marketing source of truth.

But as an enterprise wide source of truth, you’re probably gonna run into a situation where people in finance or legal or operations or logistics or you name it probably don’t view that data the same way that people in sales view that data because Salesforce or any CRM system is purpose built for sales centric use cases, not enterprise wide use cases.

Absolutely.

And in terms of the overall MDM effort in an organization, who are the people who are typically responsible for that once they do say, okay. This is an actual endeavor that we have to support. We need to invest in it. You know? Maybe you bring in a vendor solution to be able to help with that. But who are the people who are actually responsible for making it work and maintaining it over the long run?

The textbook answer is that MDM is supposed to be a collaboration between IT and the business, where there is an active collaboration between systems and operations and software, which is the domain of IT, and business rule management. I will just loosely say business rule management, AKA requirements. You can call them requirements, but it’s all the business rules that would be configured into an MDM for things like, how do I define a customer? How do I define those customer relationships?

What are the data quality rules that this system will use? Right? When are two records the same and when are they not the same and on and on. So in a perfect world, there’s an active collaboration.

There are people on the business side of the house who are defining all the requirements, and they’re helping define some very basic data governance policies that would be configured into an MDM. Then on the IT side of the house, generally, MDM lives within some data and analytics function. Right? The same team that is gonna be deploying Tableau or Qlik or Birst, or the same team that has the data science function, the same team that typically would have, you know, data integration functions, data management, data modeling functions.

That’s generally where MDM lives.

Typically, ninety percent of the time, MDM is being deployed and implemented by consultants. This is a metric that Gartner published in its Magic Quadrant two years ago. I don’t see any reason why that would have changed in the last two years. So consultants are heavily involved here because there’s a shortage, if you ask me, of subject matter experts who really know MDM very, very well, particularly within various vendor solutions, whether that is Profisee, the company that I work for, or anybody else.

Consultants can play a very, very important role here because they will have experts who know that software and who can help get it up and running. An MDM team will typically involve some sort of program lead, the director of MDM, that would manage some form of a small team that would generally include some form of an analyst. Can be two forms of analyst: more of a business analyst, somebody that would know the business processes and dive into things like data lineage, business process, you know, how do customer records or product records or whatever records get created. There can be more systems analysts as well.

Right? The people that tend to understand the back end and who know, you know, how to build ERDs and may even know SQL and can help from a systems and operations and deployment perspective. Generally, some form of a data architect involved. Generally, some form of a systems architect involved.

And then getting back into the business side, inevitably, once you have the MDM software up and running, there is a need for what’s called data stewardship. Right? Human beings to manage exceptions. So you’ll code rules into an MDM that says, Acme Inc and Acme LLC are the same thing.

And you’ll say, you know, we’ll weight the name twenty percent, we’ll weight the address thirty percent. So the algorithms that are running in MDM programs are fairly advanced and are getting even more advanced thanks to the addition of graph and a few other cool technologies.

But the algorithms can only go so far. Right? Typically, you will have some humans involved for exceptions where the algorithm says, I can’t make a firm determination whether Acme Inc and Acme LLC are one thing or two things. And I keep using a b to b context here, but it could be g smith at g mail dot com and jeff smith at g mail dot com. Right? Whether that’s one thing or two things is equally relevant in the consumer and corporate, b to c and b to b spaces. Although, in the b to c world, you know, the prevalence of email addresses, I wouldn’t say it makes it necessarily easier.

But, you know, email address plus cell phone, there are some identifiers out there that are, I won’t say, you know, persistent, but are more often used.
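
Editor’s note: the weighting and stewardship flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Profisee’s or any vendor’s algorithm. The name and address weights (twenty and thirty percent) come from the conversation; the use of Python’s `difflib.SequenceMatcher` as the similarity measure, the normalization by total weight, and both thresholds are assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Field weights mentioned in the conversation; any further fields would be assumptions.
WEIGHTS = {"name": 0.20, "address": 0.30}

# Hypothetical thresholds: above AUTO_MATCH merge automatically,
# below AUTO_DISTINCT keep separate, anything in between goes to a data steward.
AUTO_MATCH = 0.90
AUTO_DISTINCT = 0.50


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (assumed measure, standard library only)."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted average of per-field similarity, normalized by the total weight."""
    total = sum(WEIGHTS.values())
    weighted = sum(w * similarity(rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())
    return weighted / total


def decide(rec_a: dict, rec_b: dict) -> str:
    """Return 'match', 'distinct', or 'steward_review' for a pair of records."""
    score = match_score(rec_a, rec_b)
    if score >= AUTO_MATCH:
        return "match"
    if score < AUTO_DISTINCT:
        return "distinct"
    return "steward_review"
```

The middle band is where the data stewards come in: the rules decide the clear cases, and a person reviews the pairs the score cannot settle.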

As far as the actual technology solutions, you mentioned that there are these vendors out there, Profisee being the one that you work for and the one that we’ll probably spend most of our conversation on today. And I’m wondering if you can just talk to some of the ways that that software integrates with the data systems of an organization to support and maintain that MDM solution and some of the capabilities that they bring in and some of the ways that you think about kind of selling into an organization and going through that integration and implementation phase?

There’s really kinda two high level what we would call styles of MDM. There are analytical styles of MDM, and then there are operational styles of MDM. A great way to think about MDM is that it will be a data hub. These are hubs of data. These are collection points of data where they will sit on top of, logically speaking, on top of source systems of data. Right? So to deploy an MDM, you’ll set up this hub.

Increasingly, these are cloud based. Right? And pick your cloud. Doesn’t really matter. We can run it in Azure.

We can run it in AWS, and on and on, where that hub will be collecting data from multiple source systems. Right? It’ll go collect customer data from a CRM. It’ll collect customer data from an ERP or often multiple ERPs.

Right? For a lot of bigger companies, they’ll have more than one, right, where it’ll go get data on customers from ERP one, ERP two, ERP three, often from, you know, IT service management type applications like a ServiceNow. Anywhere there’s customer data, the MDM will have an integration to those systems, right, where it could be a pull of data. It could be a push of data.

It could be a Kafka stream of data. It doesn’t really matter.

But there are certainly APIs on that MDM hub. It’ll be constantly polling for data on customers, new records or edits to data, and that customer data will kind of be streamed down. A light version of that customer record will persist in the MDM. Right? So, again, you know, I’d mentioned before, it’s not gonna be all two hundred fields of your customer record. It’s gonna be a limited number of fields in your customer record that are replicated into an MDM hub.

Now getting back to that notion of analytical versus operational, an analytical MDM really solves the question of a three sixty of something. I will loosely use the word three sixty of something. Right? If your only use case is to solve for a view of a single view of your customers, then an analytical use case is perfect.

And all you’re trying to do in that case is to tie all of those versions of Acme together under some unifying ID. Right? Using a set of business rules that are configured in the MDM, where all that data gets pulled into a centralized hub, and then you try to unify all those versions of Acme under some new master ID and potentially even a master record that just kinda serves as a stub, as a placeholder. Think of it more as a kind of a registry where each of those, let’s just call them child IDs, are linked to some parent ID in the MDM hub, which will allow you to then aggregate any sort of data associated to those child records in a consistent, accurate, trustworthy way so that you could run a report that says, here’s how much business we’re doing with Acme Incorporated.

So that’s an analytical style of MDM where the flow of information is one way from the source systems into the hub and it dead ends in the hub. And where you could be using that, you know, those IDs and those keys for reporting in your enterprise wide analytics platform, but where there isn’t a bidirectional flow between the MDM and the source and the contributing systems.

That bidirectional flow would be more akin to an operational pattern of MDM, where that first half that I described, MDM will pull in the customer records, the product records, the location records, whatever objects that you’re focusing on. It will create a new master record for Acme Incorporated or for a product or for a location, and then it can actually turn around and syndicate that data back down into consuming systems. In theory, you could start with five versions of Acme Incorporated in a CRM.

But once MDM has run its processes, you could merge those records down into a single record and propagate that back into a CRM or back into an ERP with one consistent record. So two different styles of MDM, kind of typified as analytics versus operations, with the first one being relatively simple to deploy (I’m air quoting simple here) because you’re not merging records. You’re not getting into the business of changing any workflows. You’re not getting into the business of changing how core applications work. That’s more the domain of operational MDM where you could be actually, you know, trying to inform or enforce business rules within contributing systems.

So you could even go so far as to use MDM with a real time connection to, say, a CRM system where somebody’s typing a new record into a CRM, Acme Inc, where, in real time, via an API connection into an MDM hub, the source application could say, wait a minute. This already exists. Right? Are you sure you wanna create this record for Acme because we know it already exists? Yes or no. That type of thing. So two different deployment styles.
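
To make that real time check a little more concrete, here is a minimal sketch, assuming a toy in-memory hub and a very naive name normalization. The `MdmHub` class, the suffix list, and the `on_create_request` helper are hypothetical names invented for this example. A real hub would sit behind an API and use far richer matching.

```python
import re

# Assumed list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation", "llc", "ltd", "co"}


def normalize(name):
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


class MdmHub:
    """Toy stand-in for an MDM hub lookup; not a real product API."""

    def __init__(self):
        self._by_key = {}

    def register(self, master_id, name):
        self._by_key[normalize(name)] = master_id

    def find_existing(self, name):
        return self._by_key.get(normalize(name))


def on_create_request(hub, name):
    """What the source application could ask before saving a new record."""
    existing = hub.find_existing(name)
    if existing is not None:
        return f"'{name}' looks like existing record {existing}. Create anyway?"
    return None  # no likely duplicate, proceed with the create


hub = MdmHub()
hub.register("M-ACME", "Acme Incorporated")
print(on_create_request(hub, "Acme Inc"))
```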

Generally, these are data hubs where the data is sitting and is being persisted in an MDM hub where it could be used in multiple downstream applications including, you know, BI platforms or even operational systems themselves.

As far as the strategy of going into an organization and saying, okay. We’ve got the technology. We have a way to be able to reconcile these different source records. We can figure out what the sort of combination is supposed to be, but now we need to actually figure out how we want to combine those. What are those canonical business objects, and how do we define them? What are the actual strategic elements of implementing MDM, figuring out those governance policies, and then translating that into the actual tactical elements and turning that into an ongoing process that gets maintained.

And then once we explore that, maybe feeding into the conversation about, okay. I’ve got my master data management. I’ve figured out how to map it across my entire organization. And then, oh, shoot. Now I just went and bought another company. I’ve gotta figure it out all over again.

Right. Right. The process you first described, you know, what are the business rules that you use to merge records? Right?

If you’ve got four versions of Acme Incorporated, you know, what rules do you use to merge them together? That’s what, you know, MDM and governance nerds like me call survivorship. What are the rules to do that? That’s the hard part of MDM.

That’s really the hard part of MDM because what you’re trying to decide is who wins. Right? Does marketing win or does finance win? Or is there some sort of compromise that happens between those two groups or the five groups or the ten groups or the ten divisions? It doesn’t matter. Everybody wants their version to win.
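
A survivorship rule set can be sketched in a few lines once the "who wins" decisions have been made. In the sketch below, the source names, the per-field priority order, and the fallback to the most recently updated value are all assumptions chosen for illustration. Agreeing on them is the governance work described above.

```python
from datetime import date

# Hypothetical ranking of sources per attribute, as agreed by the business.
SOURCE_PRIORITY = {
    "legal_name": ["finance_erp", "crm", "marketing"],
    "email": ["marketing", "crm", "finance_erp"],
}

records = [
    {"source": "crm", "updated": date(2023, 5, 1),
     "legal_name": "Acme Inc", "email": None, "phone": "555-0100"},
    {"source": "finance_erp", "updated": date(2022, 1, 15),
     "legal_name": "Acme Incorporated", "email": "ap@acme.example", "phone": None},
    {"source": "marketing", "updated": date(2023, 8, 9),
     "legal_name": "ACME", "email": "hello@acme.example", "phone": "555-0199"},
]


def survive(records, field):
    """Pick the surviving value for one field across matched records."""
    candidates = [r for r in records if r.get(field)]
    if not candidates:
        return None
    priority = SOURCE_PRIORITY.get(field)
    if priority:
        # Rule 1: the highest-ranked source with a non-empty value wins.
        def rank(record):
            source = record["source"]
            return priority.index(source) if source in priority else len(priority)
        return min(candidates, key=rank)[field]
    # Rule 2 (fallback): the most recently updated non-empty value wins.
    return max(candidates, key=lambda r: r["updated"])[field]


golden = {f: survive(records, f) for f in ("legal_name", "email", "phone")}
print(golden)
```

Here finance wins on legal name, marketing wins on email, and phone falls back to the most recent value.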

So that’s the real hard part of MDM. The technology of MDM, so, you know, putting on my analyst hat here again, MDM is both a discipline, a way to manage data, and a technology. So it’s both.

It’s a noun and a verb. And when it gets into operational styles of MDM, when you start talking about merging records, when you start talking about having a single version of the truth that is used operationally, then that’s when things get really, really hard. And if you don’t have some form of governance, if you don’t have strong executive support where you have an executive stakeholder who’s acting as an arbiter. Right?

Making sure everybody gets along and that the rules are being followed.

That’s kinda where MDM programs can go sideways. Right? If you talk to a lot of, you know, elder statesmen like me who’ve been in data management a while, chances are pretty good they will have some experience in a failed MDM program.

I hear it all the time, where that failed MDM deployment has a lot of scar tissue associated with it.

Right? The story kinda goes like this, which is, oh, well, you know, we needed a single version of the truth.

And, you know, maybe our regulator told us to do that in more of a banking use case, or maybe it was our auditor told us to do it, or maybe our CEO told us to do it because we were trying to work towards some kind of, you know, digital transformation type endgame. And we hired a consultant, and they came and did a due diligence. And they spent nine months on the as is. And they spent nine months kinda cataloging all of our data, talking to all of our stakeholders, and building a business glossary and, you know, detailed understanding of our lineage and on and on and on. And then by end of year one, we had a new CIO, and they got frustrated with the lack of progress on MDM, so we shelved it.

I mean, I hear that stuff all the time, and I used to hear it when I was an analyst all the time. And that kind of, like, typifies the what not to do on an MDM deployment. Right? Because when the consultants descend, they’re used ninety percent of the time.

So don’t get me wrong. You’re probably gonna get some value out of consultants. You’re probably gonna need their help. But there is a little bit of a vested interest there for the consultants to make this a pretty big program.

Right? Because when they make it a big program, they get paid more. Right? And so how do you find a balance?

Right? If you’ve got a need for MDM, if you have been given a mandate by your management to come up with a single version of the truth or a single view, a three sixty view of the customer, just be very, very careful that you avoid scope creep and that you avoid trying to take a kind of a big bang approach, that you avoid a lot of the key pitfalls that often send MDM programs sideways. And I touched on a couple of them. Right?

One is having too big of a scope. Right? Don’t try to break all of your data silos all at once. Just break a couple data silos.

Like, break two or three. Right? Be very focused. Take, like, an MVP, minimum viable product type approach, a limited scope approach here, and get something off the ground quickly.

So don’t try to boil the ocean from a scope perspective. That’s one kinda key success factor. Another is to have that executive support we were talking about. So, so, so critical.

You need somebody in your corner from an executive perspective. Third part is don’t cut corners on governance. You need the business engaged. You need them to help define customers and customer relationships.

You need them to help with stewardship. So don’t cut corners on that. Number four, know what your expected business outcomes are. I can’t tell you how often I would get on the phone when I was an analyst with IT leaders, and they would say, hey, Malcolm.

We wanna do MDM. We got executive support. And they signed the checks, and they’re excited, and we’re ready to go. And I’d ask the question, well, why?

Well, because our data is bad. Right? Our data is bad, and we’ve been given a mandate to fix the data. It’s a cold hard truth that I learned very, very early on in the data and analytics space.

Nobody cares really about bad data other than IT people. Right? For data people like us, it pains us to see bad data. Right?

Like, it causes, like, an antibody response. It’s like, we gotta fix the data. Right? What don’t you get?

We gotta fix the data. But then on the business side of the house, they’re like, hey. We’re hitting our sales quotas.

Right? We’re shipping the goods out. Right? We’re selling what we need to sell. We’re hitting our numbers, and you’re telling us the data is bad.

One of the hardest lessons I learned, I walked into a CFO’s office to try to get funding for an MDM program, and I put together what I thought was a business case. This is MDM dependency number four. I have a business case. I thought I had a business case, walked into the CFO’s office, and he looked at it.

And the business case was all about fixing the data. Right? Reducing our duplicate rate, eliminating null fields in customer records, eliminating malformed fields. And it was all data quality centric.

And those things are important, but the CFO looked at it and said, you know, we’re a publicly traded company. I got Deloitte in here four times a year auditing my books. They say our data is good. My chief revenue officer says the data is good.

Our logistics people say the data could be better, but we’re still delivering most of the goods and everything’s fine. And you’re telling me everything’s bad.

So who am I supposed to believe? Right? Am I supposed to believe all those other people, including my auditor? Or am I supposed to believe you?

Right? I went back to the drawing board and I said, okay. Wait a minute. I need to look at this a different way.

I need to look at this through the lens of, can we sell more? Right? Can we reduce our costs?

Can we improve our customer experience? Right?

And when I did that, I built a business case for MDM strictly on cross sell, upsell. That was it. That’s all I did. For scope number one, I forgot that there were other things that we needed to fix as well.

But all I did was I went to a third party data provider. I bought information about corporate hierarchies, and I loaded it into a limited stripped down data hub. And I did some entity resolution on that data, and I was able to show people in sales that we were selling to division a of a company, but we weren’t selling to division b. And I didn’t even use the word governance.

I didn’t use the word data quality. I didn’t even use the word MDM. I just said, hey. We ran this report that showed where you could be selling more.

Does this interest you? Of course, it interests me.

Eureka. I’ve got funding for MDM.
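
The report behind that story can be approximated with a small sketch: take a corporate hierarchy, mark which divisions are already customers, and list the sibling divisions that are not. The company names and the `whitespace` function are hypothetical, and the sketch assumes entity resolution has already mapped customer records onto the division names used in the purchased hierarchy.

```python
# Hypothetical third-party hierarchy data: division -> ultimate parent.
hierarchy = {
    "Acme Industrial": "Acme Holdings",
    "Acme Logistics": "Acme Holdings",
    "Globex Retail": "Globex Group",
    "Globex Energy": "Globex Group",
}

# Divisions we already sell to, after matching against our own customer list.
customers = {"Acme Industrial", "Globex Retail", "Globex Energy"}


def whitespace(hierarchy, customers):
    """For each parent we partly sell into, list the divisions we do not sell to."""
    by_parent = {}
    for division, parent in hierarchy.items():
        by_parent.setdefault(parent, set()).add(division)

    report = {}
    for parent, divisions in by_parent.items():
        sold = divisions & customers
        unsold = divisions - customers
        if sold and unsold:  # an existing relationship plus room to grow
            report[parent] = sorted(unsold)
    return report


report = whitespace(hierarchy, customers)
print(report)
```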

Absolutely. Yeah. It’s one of the hard lessons that we have to repeatedly learn as technologists is that technology for the sake of technology is pointless, and only technologists care about it. Nobody cares about your data until it causes a problem for them. You know? Nobody goes to your website because of your beautifully written software.

Right.

Right. Yeah. Exactly. I mean, I’m pretty active on LinkedIn, and I, you know, I’m pretty active in the industry.

And I just I keep scratching my head because I keep seeing these posts from pundits like me who keep talking about, hey. Focus on business value. You need to focus on business value. And it’s like, well, yeah.

Right? Like, everybody thinks they are through their metrics. Right? And if you’re in the analytics group and how you are incented is either the deployment of software or maybe it’s the speed at which your dashboards run.

Maybe you’re in the operation side and you’re making sure that nothing stops running. Right? That’s what being customer centric is to you. That’s what being business focused is to you because that is your business.

Right? Keeping the databases running is your business. Keeping the ETL scripts running is your business, and that’s how you’re incented. Right?

At the end of the year, your boss will sit down with you and say, okay. Did the ETL scripts run? Yes. Did the service keep running?

Yes. Right? Did you hit your twenty four hour SLA on that build? Yes. You’re customer centric.

I get it. It seems again, getting back to that theme, it seems so simple to say, focus on outcomes, focus on business outcomes and business value. But when you’re in IT and business value equals those things that I was just talking about, you think you’re hitting the mark. So I don’t know.

I think maybe as, you know, as data leaders, data management professionals, maybe we need a different lexicon. Maybe we need, you know, a different glossary to describe the stuff. I’m not sure. I don’t have any magic answers.

But, yeah, I get a chuckle out of a lot of this. Well, there’s a lot of finger waving on LinkedIn, you know, about, hey. Focus on business outcomes, and shame on you for not focusing on business outcomes. But if you ask most people, they’ll tell you they already are.

Taking that wonderful conversation and going in completely the opposite direction now, I’m interested in digging into some of the kind of technical and data modeling considerations of when you’re building an MDM solution. So you say, okay. I’ve got this platform. I’m going to be able to, you know, do the entity resolution for these attributes.

You know, how do I think about modeling these different domain objects and the data records? You know, is it primarily a relational exercise? Is it a graph exercise? You mentioned that graph technologies are something being brought into more of these MDM solutions.

So as somebody who is working at the data layer and, you know, I’m responsible for making sure that all the source systems are staying up to date and that I’m cleaning things appropriately and I’m managing my data quality. I’m feeding that into my MDM solution. How do I want to present all of those records to that MDM platform to make sure that I’m modeling things appropriately so that those business users can obtain that value from these, you know, beautifully modeled records that nobody else will ever care about.

Yep. Yeah. So there’s a few questions nested in your question.

Let’s start with the data modeling, and then there was another question really kind of about relational, nonrelational, and where are things going, you know, what are they? The answer to the question of, you know, relational, nonrelational is yes.

Increasingly, they are both, but kind of legacy old school MDM is a highly relational exercise because the legacy algorithms that run need structure. Right? So there is a well known set of algorithms for entity resolution. Without going into too much boring detail, there’s the Jaro-Winkler and the Levenshtein distance algorithms, and Soundex is another one. These algorithms, particularly the distance driven ones, measure the difference between characters.

And I’m at the boundary of my intellectual capabilities. But to make a long story short, traditionally, they run on kind of doing kind of value pair analysis, right? Every record is evaluated against every other record, which means that that process, entity resolution, requires a little bit of structure and is very compute intense. Right?

However, you can be very specific about the outputs. Right? You can configure the outputs. You can weight the algorithms.

You can say use address ten percent or use name ten percent or ignore this field or apply these other rules. So that’s kind of old school MDM classic. Right? And for the most part, that is still how most MDM platforms are running these days.
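
The pairwise, weighted style of matching described here can be sketched as follows. This is only an illustration under stated assumptions: the Levenshtein function is a plain textbook implementation, and the field weights, the 0.85 threshold, and the sample records are invented rather than taken from any MDM platform.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance: minimum single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Scale edit distance to a 0..1 score, where 1.0 means identical."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Assumed, configurable weights per field and an assumed match threshold.
WEIGHTS = {"name": 0.6, "address": 0.3, "city": 0.1}
THRESHOLD = 0.85


def match_score(r1, r2):
    return sum(w * similarity(r1[f], r2[f]) for f, w in WEIGHTS.items())


records = [
    {"id": 1, "name": "Acme Incorporated", "address": "12 Main Street", "city": "Springfield"},
    {"id": 2, "name": "Acme Incorporated", "address": "12 Main St", "city": "Springfield"},
    {"id": 3, "name": "Globex Corporation", "address": "400 Oak Avenue", "city": "Shelbyville"},
]

# Every pair of records is compared, so the work grows roughly as n * (n - 1) / 2.
matches = [(a["id"], b["id"])
           for a, b in combinations(records, 2)
           if match_score(a, b) >= THRESHOLD]
print(matches)
```

Because the weights and threshold are explicit, the same inputs always produce the same matches, which is the configurability and predictability discussed next.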

And there is still a pervasive belief in the business, at least within the user community, I should say, particularly for more legal driven use cases that that configurability is absolutely critical. Right? That you need control. You need auditability. You need even rollback, like, the ability to unmerge or unmatch or match as of a certain date, right, where the results of the match are persistent and consistent and predictable, particularly, again, for certain use cases.

This is still how most MDM processes are running these days, and most MDM platforms utilize that form of matching, which, again, was kind of born out of more legal or more well defined use cases. There is a growing use of graph in the space, with a growing focus on trying to merge kind of that relational driven compute intense world with non relational, graph driven, relationship driven, node driven approaches to matching. Now nobody has really kind of figured out how to bring these two worlds effectively together.

What’s being done now is really kind of more of a waterfall type approach where graph driven matching or graph driven entity resolution, which is still very, very early, by the way, from the perspective of kind of its technological maturity. You’re on pretty cutting edge ground here in terms of, like, graph based entity resolution. But where I see things going is that there could potentially be kind of a waterfall type approach to entity resolution where it’s a first pass or maybe a second pass, right, where you use Graph for a first pass to support more marketing centric use cases where close enough is good enough.

Right? Where you’re not gonna get sued if you get it wrong, you know, or you’re not gonna break any rules or regulations or be noncompliant if you get things wrong, where, you know, again, it’s a marketing driven use case where close enough is good enough. And if you send the wrong piece of mail or the wrong offer to the wrong person, well, not optimal, but you’re not gonna get sued. Right?

So I could easily see a world where Graph kinda gets integrated into this space where you have both nonrelational processes that are running and relational processes that are running in the background.
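
One common way to think about graph driven matching, offered here only as an assumption about what a "close enough" first pass could look like, is to link records that share an identifier and treat each connected group as one entity. The records, fields, and the `cluster` function below are invented for illustration and are not a description of any vendor’s approach.

```python
from collections import defaultdict

# Hypothetical contact records; all values are made up.
records = {
    "r1": {"email": "jo@example.com", "phone": "555-0100"},
    "r2": {"email": "jo@example.com", "phone": "555-0111"},
    "r3": {"email": "j.smith@example.com", "phone": "555-0111"},
    "r4": {"email": "sam@example.com", "phone": "555-0999"},
}


def cluster(records):
    """Group records connected, directly or transitively, by a shared value."""
    # Index records by each (field, value) they carry.
    by_value = defaultdict(set)
    for rid, attrs in records.items():
        for field, value in attrs.items():
            if value:
                by_value[(field, value)].add(rid)

    # Two records are neighbors if they share at least one value.
    neighbors = defaultdict(set)
    for rids in by_value.values():
        for rid in rids:
            neighbors[rid] |= rids - {rid}

    # Walk the graph to collect connected components.
    seen, clusters = set(), []
    for rid in records:
        if rid in seen:
            continue
        stack, component = [rid], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


clusters = cluster(records)
print(clusters)
```

Note that r1 and r3 share nothing directly and end up together only through r2. That transitive linking is cheap and flexible, but it is also why this kind of pass suits use cases where an occasional wrong grouping is tolerable.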

What I just described is kind of a bit of a next gen type approach to MDM. There’s not a lot of vendors that are really kind of doing that yet for a lot of different reasons, mostly because kind of graph based entity resolution is still pretty new, but that’s kind of where the industry is headed. Now from an analytics perspective, there’s a lot of graph that’s being used in the MDM space. Right?

Because it can do analytics, and it can help with relationship mapping and hierarchy management in ways that provide a lot of flexibility, and they’re very user friendly. Right? Where, using graph, you can kinda hit the hierarchy o matic button. Right?

It’s like, go build my hierarchy for me. Now if you trust the data a hundred percent, poof, you’re done.

Right? You could run those processes. And if you trusted your data, your source data, you’re good. But I don’t know anybody that does. Right?

So even in a world of AI assisted, graph assisted, you know, what the industry pundits would call more augmented forms of data management, even in that world, there is a role for kinda human driven oversight here. But, historically, you’re talking generally about relational databases that are running match processes in support of more legal or compliance or regulatory driven use cases where the cost of being wrong is generally pretty high. But there are new use cases and new applications that are being built to support other use cases that tend to be a little more marketing driven. And interestingly, what I just described is the difference between MDM and a new technology that is out there, been around about five years, called the CDP, a customer data platform.

So customer data platforms are the kind of the new shiny object in the space. They’re the new whiz bangy thing where if you do a search for, you know, single version of the truth or gold master record, what you’ll get back is a lot of customer data platforms that are very, very marketing centric where, again, they’re non relational in nature where that customer data is not typically persisted over time, where that hub does not persist over time, where even across campaigns or across loads of data, you could get different answers to the same question, which, again, may be good enough for a marketing use case. But, typically, enterprise wide use cases need more persistence and consistency in the answers.

In your experience of working in the space of master data management and helping organizations adopt the technologies and understand its utility and the ramifications and the ongoing maintenance that’s required, what are some of the most interesting or innovative or unexpected ways that you’ve seen MDM used or solutions that you’ve seen to address that need?

A lot of MDM use cases tend to lean towards kinda same old, same old. Right? Like, the customer master record or a product master record.

There’s some new things afoot in the area of what I would loosely call data sharing.

Now I happen to think that there’s a lot on the horizon here, and there’s some interesting stuff on the horizon, but where companies are getting together to share data. So think about everything I’ve described and what MDM is. It’s a single hub of your customer data, your product data, your location data, your asset data, your employee data. It doesn’t matter.

But think of that and start looking across multiple companies. Right? Company a, company b, company c, company d. Right?

You could, in theory, use MDM to sit on top of all of those other MDMs. You could build an MDM of MDM, where in theory, you know, two plus two could equal five, where you could start to drive some network effects by looking across multiple sources of data to create a single version of the truth where those companies are comfortable in sharing that data. Right? I would argue if you can Google search the name and address and headquarters of Verizon.

Right? You know, that’s not competitive data for you as a company. Right? If it’s publicly available out there, I would argue that’s highly commoditized and probably not a competitive differentiator to know where the headquarters of Verizon is or of any other company.

So that’s the type of data that I could see start to be used in more of a shared mode. Right? So there are going to be interesting economies of scale there because right now, and it could be a Verizon record, could be an AT and T record, it could be an Acme Incorporated record. It doesn’t matter.

But across companies across the globe right now, they’re all managing their customer data, their product data, their location data, and it’s all being managed largely the same way, and it is horribly duplicated. Meaning, company a, company b, company c, and if they’re all relatively large companies, they’re probably all doing business with Berkshire Hathaway. They’re probably all doing business with Verizon or AT and T. They’re probably all doing business with any of the Fortune one thousand or two thousand companies.

Right? And they’re all applying stewardship to that data. They’re all applying business rules to that data. They’re all applying storage and compute to that data.

And could we start to manage some of that data as more of a shared asset? So one of the more interesting approaches I’ve seen here, there is, you know, a company over in Europe that has kind of created a consortium for common data management. Consortiums of data have existed for a long time. A great example is, like, UPC codes, like barcodes and products.

Right? At the core of that is a consortium of companies that have got together that have agreed on a common set of data governance policies.

Right? Business rules. How do these UPC codes work? How do the barcodes work? What do they mean?

Right? That required a bunch of companies to get together, generally, in the form of some sort of industry group. Like, you know, they will create some sort of, like, an industry group that manages the standards. The same is true with any data standards organization, where I could see more and more and more of that evolving over time because there’s really cool economies of scale that could exist there where instead of a large company paying for four or five or six or maybe even ten data stewards... I was just at a conference two days ago in San Diego where I heard of a company that had a hundred and fifty data stewards.

Right? Hundred and fifty people, all they’re doing is managing data quality. That’s it. Right? To make sure that customer records were accurate.

It’s a hundred fifty people involved in customer record management.

If you started to manage some of this data as a shared asset, well, if you drove that hundred fifty people down to, like, two or three, right, where the maintenance of the data becomes more of kind of a shared pooled thing, that could be relevant. That could be interesting. Some other interesting MDM use cases. I mean, there’s certainly a lot of advances in AI and ML in the space.

Some vendors are focusing on increasing uses of AI and ML, particularly when it comes to data governance rules and data management where you could build algorithms to kind of train entity resolution processes over time where the decisions of data stewards could be used to help train match decisions, that kind of thing. We talked about graph, talked about data sharing. There’s a few interesting things afoot, maybe even this notion of a data fabric.

I was again, I was at a conference a couple of days ago in San Diego, and there was a lot of buzz about data fabric. Most of the buzz was data fabric versus data mesh. It appears there’s evolving tribes there where there’s a data fabric tribe and a data mesh tribe, and they’re rather animated in their belief that one may be better than the other. I don’t get into all of that.

But from the notion of a data fabric, if you ask me, data fabric is a world where data starts to inform its own classification, its own use, its own management, right, where the metadata out of a data catalog, with a semantic layer on top of it, an MDM layer on top of it, and a few other layers like a data quality layer on top of it, could start fueling and augmenting legacy decisions. Augmenting matters as a word because it’s not gonna replace. It’ll just make decisions better. It’ll augment legacy decisions around integration patterns, data quality rules. Right?

MDM business rules, matching rules. Right? Where metadata, plus transactional data, could be used to say, well, this transaction failed or it didn’t fail. Why?

Well, it’s because it had these record attributes that fit the successful transaction. This one was a failed transaction. What does the failed transaction look like? Well, all of a sudden, over time, you could evolve to start to see, okay.

Well, that’s what the data quality rule should be. And oh, and by the way, that actually could be considered master data because it seems to be used in a lot of other places where you didn’t know it was used before. Put all those things together where you have kind of a self informed I don’t wanna say self governing, but at least a more automated form of governance, more automated form of MDM, that’s the data fabric.

By the way, if you talk to the people at least at Gartner who are behind the creation and pushing kind of the data fabric narrative, many of whom I know very, very closely, these are friends of mine. They’ll tell you that there’s only, like, about five, six, or seven companies on the entire planet that truly have a real data fabric. So this is, you know, five to seven years from mainstream. There are vendors out there that are saying data fabrics exist. Most Gartner analysts and ex analysts, including myself, would disagree with that. But MDM will play a key role in data fabrics going forward.

In your own experience of working in the space of MDM and helping your customers now that you’re at Profisee and working with organizations when you were an analyst, I’m just wondering what are some of the most interesting or unexpected or challenging lessons that you’ve learned in the process?

Oh, boy. Never assume that your definition is the same as somebody else’s definition.

Right? Never assume that. Like, it’s a mistake to assume that the way that you look at customer, the way that you look at product, or that the way that you look at employees is the same as anybody else. That’s certainly a lesson I’ve learned.

Never underestimate the power of a strong business partnership.

I would take a partnership with a motivated, engaged senior director or VP who has budget over a CEO or a CIO saying we need to do something because we need to do something. Right? I’ve seen a lot of situations where checks get cut and a CIO or CEO says, we gotta go do this, fix your data quality, or do MDM, and then nine months later, everything has gone sideways.

But if you’ve got somebody in an operational role who is responsible for, you know, selling stuff or delivering stuff or making stuff, and if you’ve got a good partnership with them and they are motivated and they’ve got a lot of pain and they wanna work with you to solve for this, that’s worth its weight in gold. That is absolutely worth its weight in gold. So that’s another lesson. A third lesson is, you know, scope, scope, and scope.

And did I mention scope? When it comes to MDM, do not try to boil the ocean. Keep your scope limited for a first launch. Get something out the door that delivers some value quickly.

And break two or three silos, but don’t break fifteen. You may have fifteen ERP systems, and that hurts. I get it. But break two or three.

And again, go back to find those operational leaders who have acute pains, and you’ll tend to know who they are because they’re the ones that are complaining the loudest.

That’s certainly a lesson. It’s just to manage for scope. And the last one I learned is this is not about domains.

It is not about domains. Right? I used to ask when I was an analyst. What’s your focus?

Right? How are you gonna manage your scope? Well, we’re just gonna focus on customer data. That’s how we’re gonna limit our scope.

We’re gonna focus on the customer domain. And I’d take a deep breath, and I would say, you’re not limiting your scope. Customer data is everywhere. Right?

That’s a false sense of security when it comes to scope management.

And by the way, nobody within the organization has any sort of incentives. Nobody is paid on domain. People are paid on processes. Right?

Selling more, delivering it faster, making it faster, reducing our fuel costs. That’s how people get paid within the organization. That’s how they make their bonuses, and that’s what you need to attach MDM to. You need to attach MDM to one of those things, not a domain.

Because, again, just like nobody cares about data quality, nobody cares about domains. Most people don’t even know what that means in the organization, by the way. And you could change it to object if you want. They don’t know what that is either.

So focus on a business process and say, okay. I’m going to enable my chief revenue officer to cross sell. That’s it. Boom.

Right? And I’m not even gonna do it for every division or every product line or every SKU. I’m gonna do it for a very limited subset of our products.

Then what I always heard when I would say that to my clients is that’s multi domain. Right? Because then I have to master some product data, and I have to master customer data. I may even have to master contract and maybe even location.

I was like, yep. You will. But keep it limited to sources. Keep it limited to the systems that you’re integrating with.

And, yes, you’ll cross domains, but your chief revenue officer or your chief product officer will be able to tie right back to what you’ve done. They’ll be able to point a finger and say, you know what? We were selling x before, and now we’re selling y. And we’re pretty confident that the only reason why we’re doing that is because of this MDM thing.

But when you’re focused on domains, it’s like, I don’t know if it moved the needle or not. I don’t know. So if you focus on process, you’re in a good place.

Yeah. I’d say that’s a valid lesson regardless of whether you’re talking about MDM or just, you know, business analytics and data management writ large.

Exactly. Yeah. I mean, what I just said could be you’re responsible for a data quality project or you’re responsible for turning up analytics. You’re deploying a Tableau for the organization. Right? Everything I just said could be applied to all of those use cases.

Absolutely. And so for people who are trying to figure out how to gain better visibility or better understanding of their organization, what are the cases where MDM is the wrong choice and it’s actually too heavyweight? And maybe they should just go and, you know, buy the latest metrics layer or, you know, use some of those AI technologies that will do entity resolution for you.

It’s a good question. Sometimes I worry that I’m an MDM hammer.

Right? And all I see are MDM nails.

Right? I think, really, the question you just asked is not whether companies need MDM.

I think the question is whether they need MDM software because it’s not cheap. Right? Even though the price is really coming down and you can get enterprise class solutions like Profisee and others for, like, sub six figures. Right?

I’ll stop selling. But the prices have been coming down. But still, for some smaller companies, I think you could ask the question, okay. Wait a minute.

Am I gonna spend a hundred grand on MDM software? Plus, I’ll probably spend double that on consultants to get it up and running. So I’m all in for a quarter million probably, and I’ve got a problem. Right?

I don’t have a single view of the customer or my customer reports are inconsistent or, you know, whatever some of those pain points are. You know, even small companies can have those problems.

So I think the question is not whether you need MDM because let’s not forget MDM is a discipline first and foremost. It’s a way of managing data. It is a way of making sure you have consistency, quality, accuracy.

Right? The structure for your core shared data assets so they can be used widely. Chances are pretty good you need MDM as a discipline. But you can do MDM in Excel.

Yeah. I said it. You could do it. You know? Are you going to have real time back and forth integrations with source systems?

No. But if you wanted to build a customer three sixty, you know, you could bring data out of a few different source systems, drop it into even an Access database, and run some basic business rules against it to try to link everything together. Right? It’s gonna be a little brittle.

It won’t be that scalable.
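
As a rough illustration of the "basic business rules" idea described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the conversation: the field names (`source`, `name`, `email`, `zip`), the sample records, and the two matching rules. It is not how Profisee or any other MDM product matches records.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def match_key(record):
    """Hypothetical business rule: a shared email means the same customer;
    without an email, fall back to normalized name plus postal code."""
    email = record.get("email", "").strip().lower()
    if email:
        return ("email", email)
    return ("name_zip", normalize(record["name"]), normalize(record["zip"]))


def link_customers(records):
    """Group records from different source systems by their match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return list(groups.values())


# Made-up extracts from three hypothetical source systems.
records = [
    {"source": "crm", "name": "Acme Corp.", "email": "billing@acme.example", "zip": "30301"},
    {"source": "billing", "name": "ACME Corp", "email": "Billing@Acme.example ", "zip": "30301"},
    {"source": "support", "name": "Globex, Inc.", "email": "", "zip": "10001"},
    {"source": "crm", "name": "Globex Inc", "email": "", "zip": "10001"},
]

groups = link_customers(records)
for group in groups:
    print(sorted(r["source"] for r in group), "->", group[0]["name"])
```

The brittleness mentioned above shows up quickly in a sketch like this: a record that has an email will never link to a record for the same customer that lacks one, and any misspelled name defeats the fallback rule.
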

Maybe in time, you’ll migrate to some form of MDM software, but you may still be able to drive some significant business value by saying, here’s our customer three sixty report that we couldn’t do before. When you layer in, like, some third party data services out there, particularly in the b to b world, the b to c data providers tend to be fairly expensive. Well, so do the b to b providers. But if you’re in b to b mode and, you know, you’ve got, let’s say, fifty thousand customers and you don’t really feel good about the accuracy of the data and, you know, you’ve got some badly duplicated data, you lack a single version of the truth or a single customer identifier, you know, there are providers out there that you can bounce that data against, use their match engines.

Right? Use their, you know, what they call reference data. And pretty quickly and relatively affordably, using Excel as a source, have some form of a customer three sixty up and running pretty quickly. So I would argue most companies need MDM as a discipline.

Do you need MDM software? Is the MDM software too thick for you? It might be. If you’ve only got, you know, a few thousand customers, two thousand, three thousand, four thousand, five thousand, MDM software may be too thick.

Yep. But chances are, you still have some use cases that need that consistent approach to the data management side of it.

Are there any other aspects of master data management and the strategic and tactical elements or the ways that you’re approaching it at Profisee that we didn’t discuss yet that you’d like to cover before we close out the show?

Yeah. I mean, thank you. This is kind of the opportunity to sell a little bit. You know, Profisee is enterprise class MDM software.

We are on the Gartner MDM magic quadrant. We are a challenger in the quadrant and have been making significant progress over the last few years. We were one of the fastest growing last year.

Where we differentiate, where Profisee differentiates, is time to value and simplicity of deployment, which are two of the biggest challenges related to MDM. Going back to the story I told a while ago about, you know, paying a ton for consulting fees over a year and then getting nothing to show for it. So Profisee really focuses on being fast time to value and easy to deploy. Right?

So we run natively in any cloud, but we’re particularly good in the Azure cloud where we actually have an integration to Purview, which is Microsoft’s data cataloging solution. We’re in the marketplace, so you can be, like, up and running, you know, platform as a service, literally in minutes. Right? Where you don’t necessarily have to worry about deploying servers or procuring hardware, where, you know, if you’ve got your data in Azure cloud or any other cloud for that matter, Profisee is exceptional about getting up and running very, very quickly.

That is, like, so different than it was even just two or three years ago where, you know, getting MDM up and running was a challenge in and of itself. So our software can be up and running very, very quickly. Now the hard part of MDM is those business rules that we talked about, and those will take some time to configure. So it’s not like you just turn it on and it magically runs on its own, getting back to our very early conversations about how to define a customer.

So there are decisions that need to be made, and there are business rules that need to be configured into the software. It just doesn’t run magically on its own. But if you’re looking to be up and running in three months, right, if you have more of those analytical use cases that I was talking about earlier, if you want a fast time to value and if you want a lower total cost of ownership, Profisee is extremely price competitive.

Consistently, one of the more price competitive solutions out there. The key variable here is the amount of data that you’re pumping through the Data Hub. If you’ve got, you know, forty million customer records, you’re obviously gonna be paying more for a solution than you would be if you had twenty thousand customer records. But through our native integration to Azure, you know, what you’re paying for is just the software.

It’s really kind of a bring your own cloud type approach where, you know, the hosting fees and the compute fees will be yours to deal with Microsoft, where, you know, all you’re buying from us is the software. So lower total cost of ownership, speed to value, ease of use are areas where Profisee really, really excels. Right? So if you wanna be up and running in a few weeks and not a few months, Profisee is a fantastic solution.

Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you’re doing, I’ll have you add your preferred contact information to the show notes. And as the final question, I’d like to get your perspective on what you see as being the biggest gap in the tooling or technology that’s available for data management today.

It’s such a great question. And I think that the gap traces back to this notion of business outcomes.

If we are making MDM software or if we’re making data quality software, if we are making data integration software, warehouse software, right, I think we, as vendors, can do a better job to find ways to trace the impact of our software on the business.

Right? Meaning, if in your data warehouse, you have transactional data that shows details of your transactions. You’ve got metadata on every successful transaction. You’ve got data around every detail, every nugget of the customer experience.

You’ve got data around what people clicked on, what they didn’t click on, and on and on and on. I think we kinda fall into this trap where we say, okay. Those are the end user facing apps. Right?

Those are the ones like your CRM system, your digital marketing platforms, all of the apps that are running in marketing. Well, those are the ones that are gonna optimize the customer experience, and those are the ones that are going to optimize our product pricing or our messaging or the speed or the cadence of our marketing campaigns or all the stuff that touches customers. Right? And I think we, within the data management space, particularly software vendors, could do a better job to reach into that world. And the data’s there.

We’ve got it. Right? It’s sitting in data warehouses. It’s sitting in data lakes. It’s sitting in MDM hubs or data quality hubs. And we could find ways to link that data back to what’s being done from a data management perspective.

Right? To link that data back to a decision that was made from an integration perspective or data quality perspective or an MDM perspective. And to say, okay. Wait a minute.

We can trace a hard line back from that decision that was made about that data quality policy that was changed to an increase of sales. Right? If we could do that, that’s the holy grail. Right?

This is what we’ve always wanted in the data management space to be able to say, we have a hard impact on business outcomes. But we kinda wash our hands and we just say, okay. Well, that’s the domain of the business apps. CRM systems of the world, the list here is very, very long.

Right? They’re responsible for that. Right? But I think we could do more to find ways to link what we do in data management to actual business outcomes.

Because if we can do that, then we’re able to show the ROI of data management. Then we’re gonna get the business attention. Then we’re gonna get the investment. Then we’re gonna get people. For right now, where we are is, you know, hey.

Invest in us, please. Right? You need reporting, you know, and you need visibility, so you’d better invest in reporting.

Or you need better data, you should invest in MDM.

Right? Yeah. I think we can go farther. And I don’t have all the answers there, but the data’s there. That part I know.

All very true, and it definitely is reflected in kind of the way that the industry is trending where in the, you know, early to mid, you know, two thousands and twenty tens, it was all about big data. Just collect everything all the time because maybe it’ll be useful, and now being very much more intentional about what is being collected and how it’s being applied both because of regulatory risks, but also because of organizations being more cost conscious and more of the sort of small to mid sized businesses starting to adopt those capabilities, and they don’t have these massive, you know, cash troves to throw at this investment in the hopes that someday it will be valuable.

Yeah. Well, and the companies that did, you know, for a lot of them that spent millions and millions on standing up Hadoop clusters, many of which are still running, and they’re driving some value. Don’t get me wrong. But for a long time there, for a lot of companies, big investments in Hadoop were, you know, the glib metaphor that I use here is that they were, you know, creating a whole bunch of questions or a whole bunch of answers that were desperately seeking questions.

Absolutely.

Right? It’s like, hey. We found this anecdotal insight about a b c. Hey, business. Did you know?

Absolutely.

So how do we avoid that?

How do we go from we know you care because we’ve got the metadata to show that you care?

Awesome. Well, thank you very much for taking the time today to join me and share your experience and perspective on the overall space of master data management. It’s definitely a very fascinating area, and it’s always great to be able to dig into it with somebody who has such deep knowledge and experience in the space. I appreciate all of the time and energy that you’ve put into helping to support that ecosystem and to share it with us today. So, thank you again, and hope you have a good rest of your day.

Thanks, Tobias. Really enjoyed it. Same to you. Talk soon.

ABOUT THE SHOW

How can today’s Chief Data Officers help their organizations become more data-driven? Join former Gartner analyst Malcolm Hawker as he interviews thought leaders on all things data management – ranging from data fabrics to blockchain and more — and learns why they matter to today’s CDOs. If you want to dig deep into the CDO Matters that are top-of-mind for today’s modern data leaders, this show is for you.

Malcolm Hawker

Malcolm Hawker is an experienced thought leader in data management and governance and has consulted on thousands of software implementations in his years as a Gartner analyst, architect at Dun & Bradstreet and more. Now as an evangelist for helping companies become truly data-driven, he’s here to help CDOs understand how data can be a competitive advantage.