Master Data Management and Data Governance in Azure Pt. 2: Why Poor Data Quality is the Biggest Risk to Your Azure Investment

Organizations often invest in a cloud migration as part of a broader IT and business objective, including undergoing a digital transformation, developing a business intelligence (BI) or analytics program, increasing operational efficiency or meeting compliance requirements.

And Microsoft’s Azure cloud-computing stack is an attractive option with dedicated applications and services for data governance (Azure Purview), analytics (Azure Synapse) and data integration (Azure Data Factory).

But once data is combined from multiple source systems and databases in Azure, organizations often find that their enterprise data is of poor quality, missing, incomplete, duplicated or more complicated than they expected.

You may have identified the symptoms of poor data quality and have felt the impacts within your organization. But you may not know that poor data quality — and lack of trusted data in general — can be the biggest risk to your Azure investment.

And when it comes to poor data quality, you are not alone. In fact, 60% of organizations reported under-investing in their enterprise-wide data strategy, preventing valuable data from being broadly used, according to a 2021 survey by Harvard Business Review Analytic Services.

Harvard Business Review on the Path to Trustworthy Data

[Infographic of survey results; percentage figures not reproduced in this text]

  • Executives who say having a strong MDM program is important to ensuring their future success
  • Executives who believe their organizations are under-investing in their enterprise-wide data strategy
  • Executives who say their organizations rely on more than 6 data types that are essential to business operations
  • Respondents who have employed MDM and say their organizations’ approach to MDM is moderately or very effective

But there are proven steps you can take to improve data quality throughout your organization.

Read on to learn how you can define your business objectives and identify relevant data sources to measure the quality gap in your data and then use data governance and master data management to close the gap and build a strong foundation of trusted data.

Measuring Your Data Quality Gap

When identifying and remediating data quality issues, it is often helpful to start with the end in mind. First, define your objective.

Common examples of enterprise IT initiatives include:

  • Digital transformation: reimagining your business for the digital age by using technology to create new business processes or modify existing ones
  • Business Intelligence (BI) or analytics: implementing data management tools to understand business data and derive insights to drive the business
  • Increasing operational efficiency: increasing the ratio of your business output (e.g., manufacturing volumes, revenue, customer engagements) to your resource inputs (e.g., capital, person-hours, raw materials) — in other words, anything to make the business more efficient!
  • Maintaining compliance: the ability to act according to a market demand or regulatory requirement by implementing business processes and safeguards. Common examples include privacy and the safeguarding of sensitive information.

In today’s digital age, nearly all business objectives have enterprise data at their core. And your ability to accomplish these objectives is a function of the quality of that data across your organization. Even taking advantage of advanced technologies like machine learning and artificial intelligence requires data that is consistent and complete if they are to make the inferences and deliver the insights you are seeking.

Mapping Business Initiatives to Data Types or Domains

To better define your data needs for your specific objective, it is helpful to look at the data type, or domain, involved in the objective — bearing in mind there will usually be multiple:

  • Customer data: a full 360-degree customer view of all personal, behavioral and demographic data across sales, marketing and customer service interactions. ‘Customer’ data comes in several variations: people and businesses, as well as individuals and households. In other industries such as healthcare, a similar (but not identical) model might be used for patients and providers (doctors, nurses, etc.)
  • Product data: can vary significantly across industries and can include name, description, SKU, price, dimensions and materials as well as relevant taxonomies, hierarchies, relationships and metadata. Note that widely different sets of attributes may be required for different product families (e.g., shoes and shovels require different sets of attributes to be properly identified)
  • Location data: includes geospatial data like latitude-longitude, city, state or region as well as business-specific information like line of business (LOB), sales region, services provided and number of employees based at the location
  • Reference data: can take many forms, but typically consists of the ‘simple’ lookup tables included in many applications across the enterprise, from the approved list of state codes to sales regions, cost center hierarchies and standard diagnosis codes in healthcare. These often seem so ‘simple’ that they are overlooked, but they can cause issues when they are inconsistent across systems
  • Other: all other data types, including account certifications, contracts, financials, policies, etc.

| Primary Data Type / Domain | Examples by Organization | Sample Attributes |
| --- | --- | --- |
| Customer | Customers; Employees; Partners; Patients; Suppliers | Demographics; Purchase History; Contact Information; Preferences; Certifications |
| Product | Products; Bill of Materials; Assets; Equipment; Media | SKU; Price; Dimensions; Materials; Taxonomies/Relationships |
| Location | Agencies; Branches; Facilities; Franchises; Stores | Latitude & Longitude; Address; City, State, County; Sales Region; Line of Business |
| Reference Data | Standard codes; Cost center hierarchies | Lookup tables; Dimension tables |
| Other | Accounts; Certifications; Contracts; Financials; Policies | Account Approval Status; Expiration Date; Parties; Regulatory Authorization |

From here, you can begin mapping your primary business objective with the necessary data domains.

For example, to improve customer service you may need to understand which products are most frequently purchased and factor that into your customer service training. Or to effectively cross-sell and up-sell to existing accounts, you may need to understand whether customers more often start with a checking account or mortgage loan and build your targeted marketing campaigns around the products they have the highest propensity to buy.

What the above examples illustrate is the need for a multidomain data quality strategy. Indeed, both involve data from multiple domains, because no real business problem can be solved with a single, myopic view of the underlying data.

But don’t get discouraged. While all the domains interact with one another, it is still critical to prioritize your data and break up your data management initiative into phases. To avoid “boiling the ocean,” your project charter shouldn’t be to “govern and manage all company data.”

The data discussed above is master data: the core, non-transactional data used across your enterprise. Unlike transactions, which your business may log on the order of millions of times per day, master data is relatively slow-moving.

But while master data is smaller in volume than transactional data, it is almost always higher in complexity. So it is critical to review both the data itself and the relationships between data types. For example, a clothing retailer may sell dozens of types of running shoes throughout multiple regions that each can be configured in different sizes and colors.

More than just a dozen shoe products, you have myriad relationships among their attributes. You need to consider and define these complicated relationships and relevant metadata to design, execute and enforce a robust data governance program.

Scan and Collate Your Master Data by Source System

It isn’t uncommon for an enterprise to have its customer, product, location and other data spread across disparate systems. Perhaps you keep records of customer purchases and how customers interact with your marketing efforts in your CRM, while billing and payment terms are stored in a different system.

Each system then holds a duplicate record of a contact or account, and the records likely have both overlap and discrepancies. For example, you may have customer or organization names listed differently across systems (e.g., ‘Crete Carrier Corp’ in your CRM and ‘Crete Carrier Corporation’ in your ERP), preventing records from being matched in your analytics platform.
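To make the name mismatch concrete, here is a minimal Python sketch of one common remedy: comparing records on a normalized key instead of the raw string. The suffix map and the `normalize_name` helper are hypothetical and exist only for illustration; they are not part of Azure Purview or any MDM product, and a real program would maintain a governed, much longer list.

```python
import re

# Hypothetical abbreviation map, for illustration only.
LEGAL_SUFFIXES = {
    "corp": "corporation",
    "inc": "incorporated",
    "co": "company",
    "ltd": "limited",
}


def normalize_name(name: str) -> str:
    """Build a comparison key: lowercase, punctuation removed, suffixes expanded."""
    cleaned = re.sub(r"[^\w\s]", " ", name.lower())
    tokens = [LEGAL_SUFFIXES.get(token, token) for token in cleaned.split()]
    return " ".join(tokens)


crm_name = "Crete Carrier Corp"
erp_name = "Crete Carrier Corporation"

# The raw strings differ, but the normalized keys are identical.
print(crm_name == erp_name)
print(normalize_name(crm_name) == normalize_name(erp_name))
```

Exact matching on a normalized key is deliberately conservative. It catches formatting differences like this one, but misspellings or reordered words would need fuzzy matching, which is outside this sketch.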

Scale these duplicates and inconsistencies across systems, regions and departments, and enterprise data can quickly become siloed, duplicated and inaccurate: a shaky foundation for any business initiative! Suddenly, answering simple questions about a customer’s payment history, propensity to purchase another product or customer service interactions requires consulting a half-dozen systems or database administrators.

This example illustrates the importance of identifying and scanning source systems for types, or domains, of data and determining which systems contain the most reliable data for your most critical data types and which can potentially be helpful secondary sources to fill in gaps.
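One simple way to compare source systems is to score how completely each one populates the fields you care about. The sketch below assumes records have already been exported as dictionaries; the field names, sample records and the `completeness` function are illustrative assumptions, and completeness is only one of several reliability signals you might weigh.

```python
from typing import Optional

Record = dict[str, Optional[str]]


def completeness(records: list[Record], fields: list[str]) -> float:
    """Share of required field values that are populated (non-empty after trimming)."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for field in fields
        if (record.get(field) or "").strip()
    )
    return filled / total


# Hypothetical required fields and sample extracts from two systems.
REQUIRED = ["name", "email", "postal_code"]

sources = {
    "crm": [
        {"name": "Ada Lopez", "email": "ada@example.com", "postal_code": "68501"},
        {"name": "Ben Ito", "email": "", "postal_code": "68502"},
    ],
    "billing": [
        {"name": "Ada Lopez", "email": None, "postal_code": None},
        {"name": "Ben Ito", "email": "ben@example.com", "postal_code": ""},
    ],
}

# Rank systems from most to least complete for this domain.
ranked = sorted(sources, key=lambda s: completeness(sources[s], REQUIRED), reverse=True)
for system in ranked:
    print(f"{system}: {completeness(sources[system], REQUIRED):.0%}")
```

In this sample the CRM would be the candidate primary source for customer data, with billing as a secondary source that fills gaps such as a missing email address.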

Closing the Data Quality Gap with Data Governance and Management

Now that you have identified your critical data types, organized them by domain and determined which source systems house the records of that data, it is time to begin the work of data governance.

Though governance technically does not require a dedicated technology platform, data governance solutions like Azure Purview have made it easier for organizations to scale their governance efforts and leverage their accumulated data to support trusted analytics and new or improved business processes.

Establish Business Definitions, Determine Lineage and Create Data Quality Rules

Whether employing a dedicated governance application or simply using a data governance framework, you need to first define your data types and sources, specify your data quality requirements, assign ownership, measure effectiveness and ultimately enforce data standards.

At a high level, data governance often includes:

  • Resolving definitional conflicts between sources for each data type
  • Recording any subsequent technical and business definitions, lineage and mapping or transformation rules required
  • Creating and documenting data quality rules
  • Adjusting the data model to reflect these governance rules.

| Data Governance Tactic | Example | Business Impact |
| --- | --- | --- |
| Resolving definitional conflicts | CRM refers to B2B re-sellers as a ‘partner,’ but financial reporting considers them a ‘customer’ | Holistically measure revenue across channels |
| Defining and logging rules in business glossary | Define ‘region’ as one of six delineated U.S. sales territories | Business glossary serves as ‘single source of truth’ to begin rectifying data across sources |
| Creating and documenting data quality rules | CRM has no minimum character requirement and allows special characters for ‘business name’ while ERP requires 25 characters and only supports plain text | Ensure consistent, complete data throughout the enterprise through formal rules and validation mechanisms |
| Adjusting the data model to reflect governance rules | Marketing department does not realize the address field is separate from the ordering/shipping address | Data governance framework is visible to all stakeholders |
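A documented data quality rule becomes far more useful once it can be checked automatically. The following Python sketch expresses a ‘business name’ rule as data plus a small validator. The specific limits are assumptions made for illustration: the ERP constraint is read here as a 25-character maximum with plain letters, digits and spaces, and the 2-character minimum is invented. The `NameRule` class is hypothetical, not an API of any governance tool.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRule:
    min_length: int
    max_length: int
    allowed: str  # contents of a regex character class of permitted characters

    def violations(self, value: str) -> list[str]:
        """Return a list of rule violations; an empty list means the value passes."""
        problems = []
        if len(value) < self.min_length:
            problems.append("too short")
        if len(value) > self.max_length:
            problems.append("too long")
        if re.fullmatch(f"[{self.allowed}]*", value) is None:
            problems.append("disallowed characters")
        return problems


# Assumed enterprise-wide rule that satisfies the stricter (ERP) system.
BUSINESS_NAME_RULE = NameRule(min_length=2, max_length=25, allowed=r"A-Za-z0-9 ")

for name in ["Crete Carrier Corp", "Smith & Sons™ Logistics Holdings"]:
    print(name, "->", BUSINESS_NAME_RULE.violations(name) or "ok")
```

Keeping the rule as a value rather than hard-coding it into each system means the same definition can be recorded in the business glossary and applied wherever the data lands.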

While data governance as a strategic framework can help organizations define how best to manage their data, they still need to enforce these governance standards. Indeed, the previous examples only constitute the “policy creation and management” phase of data governance. Executing those policies and then monitoring their effectiveness is where IT leaders can truly demonstrate the value of their governance initiatives and unleash the value of their Azure investment.

Enforce Data Governance Standards with Master Data Management

Now that you have a formal data governance program in place — whether through a dedicated application or as a documented framework — you cannot let your hard work go to waste! Your rules, definitions and data modeling must be enforced on the data to ensure it is accurate, trusted and available across the enterprise.

In building and implementing your data governance program, you have already established that various source systems, departments and data types each have unique data quality and validation requirements: what is good for a CRM may not suffice in an ERP.

So rather than hiring data stewards to continually reconcile and correct data on an ad hoc basis, or trying to enforce data standards upon entry in each source system, holistically manage your enterprise data with master data management (MDM).

MDM is a hub for matching and merging records and implementing data quality standards so that the resulting data can be trusted. Just as a dedicated governance application can make your governance efforts easier, the right MDM platform makes enforcement easier: look for one that can be implemented quickly with a native Azure integration, supports multiple domains and scales across your organization.

MDM can handle data from multiple source systems, both those operating in Azure and external ones, including ERPs, CRMs, custom applications, cloud applications, legacy apps and more. Scanning these data sources with Purview can build a catalog of what data is available from each source, and applying MDM to analyze common data domains can quickly identify data quality issues like missing, conflicting, incomplete, duplicated and outdated information.

MDM corrects these problems by uniformly enforcing data quality standards (matching, merging, validating and correcting your data) and then syncing the results back to the source systems.
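The match-and-merge step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular MDM platform works: records are matched on a trimmed, lowercased email address, and conflicts are resolved by a hypothetical source priority in which the ERP outranks the CRM. All names and sample records are invented for the example.

```python
from collections import defaultdict

# Hypothetical survivorship rule: the higher number wins when values conflict.
SOURCE_PRIORITY = {"erp": 2, "crm": 1}


def match_key(record: dict) -> str:
    """Assumed match rule: case-insensitive, whitespace-trimmed email."""
    return record["email"].strip().lower()


def merge(records: list[dict]) -> dict:
    """Build one golden record, taking each field from the best source that has it."""
    ordered = sorted(records, key=lambda r: SOURCE_PRIORITY[r["source"]], reverse=True)
    golden = {}
    for record in ordered:
        for field, value in record.items():
            if field == "source":
                continue
            # Keep the first non-empty value seen, i.e. the highest-priority one.
            if value and not golden.get(field):
                golden[field] = value
    return golden


def master(records: list[dict]) -> list[dict]:
    """Group records that match, then merge each group into a golden record."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [merge(group) for group in groups.values()]


records = [
    {"source": "crm", "email": "Ada@Example.com ", "name": "Ada Lopez", "phone": "402-555-0100"},
    {"source": "erp", "email": "ada@example.com", "name": "Ada M. Lopez", "phone": ""},
    {"source": "crm", "email": "ben@example.com", "name": "Ben Ito", "phone": ""},
]

golden_records = master(records)
for golden in golden_records:
    print(golden)
```

Here the two Ada records collapse into one: the ERP supplies the name and email, while the CRM fills in the phone number the ERP lacks. Syncing golden records back to the sources is not shown.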

MDM also excels in multi-domain use cases like combining householding, product and contract information for risk-based pricing in finance, or coupling provider, patient and treatment data in healthcare. In fact, almost all real-world business use cases rely on data from more than one master data domain.

Business outcomes depend on multiple domains, and multi-domain MDM handles domains for any use case. Customer domain data must be combined with product data to properly identify cross-selling and upselling opportunities. In manufacturing, you need data about items, planners, and facilities for strategic procurement. The more cross-domain utility you get from MDM, the greater your ROI.

Maximize the Return on Your Azure Investment with Trusted Data

The single source of truth that MDM can provide is the best path for high-quality, trusted data in Azure because of its multi-source, multi-domain design.

Don’t let poor-quality data derail your diverse Azure ecosystem. Use MDM to complement your data governance program and work with your existing systems and databases to effectively manage your data.

To learn more about how MDM can complement your Azure investment, explore the Profisee platform and see how it integrates into Azure to deliver high-quality, trusted data across the Azure data estate.

Ready to see MDM in action? Schedule a live demo to see how the Profisee MDM platform can solve your unique business challenges.
