Data Quality Rules: Examples and Best Practices

Key Takeaways

  • Well-crafted data quality rules save data professionals time and resources when automating data ingestion.

  • Continued communication between business users and data implementation teams leads to successful projects.

  • Data quality is not a project but a process that the team revisits, revises and repeats to keep up with changing data needs.

Many businesses get tripped up on their data governance journey by trying to identify the characteristics of “quality data” without going directly to the source: the existing data. The information that the business currently uses — and the reasons it uses that data in particular — should drive the company’s definition of usable data.

Data quality rules are the guardrails a company puts in place to ensure data that enters and is used by the larger system meets the company’s definition of useful data. In this article, you’ll learn what data quality rules are, why they’re important and how you can use them to improve data quality at your organization.

What Are Data Quality Rules?

Data quality rules set parameters and thresholds that data inputs must meet to be deemed useful. Data quality rules play an important role in a data governance program by:

  • Keeping low quality data from being entered into a source system via data validation
  • Flagging poor quality data for manual review
  • Telling a master data management (MDM) tool how to correctly standardize master data

When implemented correctly, data quality rules let businesses automate more of their data governance.

In MDM projects, data quality rules define what does (and does not) constitute a golden record. These rules reduce duplicate and corrupted master data and increase the usability and reliability of the data products built on golden record data.

Why Are Data Quality Rules Important?

Data quality rules make life easier for data stewards by running data validation by exception, meaning stewards only need to manually review information that does not match the rules. Data quality rules may be set for every property in a data set, corresponding to every column in a spreadsheet.

For example, consider a customer table with three properties: first_name, customer_id and last_name.

  • A data quality rule for the customer_id property would require unique IDs for each customer. Any entries that match an existing ID in the database would be flagged for review.
  • The data quality rules for the first_name and last_name properties may define the capitalization structure and character limit or check for duplicates to avoid assigning two customer ID numbers to a customer.

Without data quality rules, the same customer might end up with multiple ID numbers, or one ID might be shared across different customers, causing billing, shipping and customer service difficulties. Data quality rules also keep data from different sources in the same format and prevent duplication, deletion or transformation that ultimately makes the data unusable for business teams.
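
To make this concrete, here is a minimal sketch of how these two rules could be expressed in code. It assumes the customer data lives in a pandas DataFrame with the columns from the example above; the capitalization pattern and 50-character limit are illustrative choices, not fixed requirements.

```python
import pandas as pd

# Hypothetical customer records using the properties from the example above.
customers = pd.DataFrame({
    "customer_id": [1001, 1002, 1002, 1003],
    "first_name": ["Ada", "grace", "Grace", "Alan"],
    "last_name": ["Lovelace", "Hopper", "Hopper", "T."],
})

# Rule 1: customer_id values must be unique. Entries that match an existing ID
# are flagged for review rather than loaded.
duplicate_ids = customers[customers["customer_id"].duplicated(keep=False)]

# Rule 2: names must start with a capital letter and stay within an assumed
# 50-character limit (the exact pattern is at the business's discretion).
NAME_PATTERN = r"^[A-Z][A-Za-z'\-]{0,49}$"
name_exceptions = customers[
    ~customers["first_name"].str.match(NAME_PATTERN)
    | ~customers["last_name"].str.match(NAME_PATTERN)
]

# Review-by-exception: stewards only look at the records that failed the rules.
print("Duplicate customer IDs:\n", duplicate_ids)
print("Name format exceptions:\n", name_exceptions)
```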

Data Quality Rules Best Practices

Because of the sheer number of tables and properties needed by business departments, the corresponding number and complexity of data quality rules could cause some heartburn for the data governance team. But following a few best practices (and taking a Pepcid) can ease the suffering. 

  • Let business use guide rules: Data must be useful to business teams, so ensuring that formatting and requirements meet their needs is paramount to the ultimate success of the master data project.
  • Automate as much as possible: Data quality rules help automate the ingestion and storage of data. But with so many rules to create, automation can speed the selection and calibration of rules you use.
  • Reuse rules across the business: Especially important with master data management, reuse rules that define the format and accuracy of data when importing from sources across the enterprise.
  • Calibrate rules: Implementing loose formatting rules may reduce review notifications, but it could also increase the number of formatting errors. It may take some trial and error to dial in the right guardrails for some properties.
  • Revisit and iterate: Is data work ever done? Probably not. Schedule consistent data checks to validate that data quality rules work as expected and revise them when necessary. 

These best practices will get you a long way, but be on the lookout for some common pitfalls in the next section.

Data Quality Rule Pitfalls to Avoid

Building a data governance program is complicated enough without falling prey to some of these common mistakes that trip up even experienced teams. 

  • Lack of communication: Consistent communication between business stakeholders, IT, data teams and end users can head off costly errors or oversights that might hold the program back.
  • Neglecting master data management: Because so many other data sets and projects rely on your master data, getting your golden record data right early in the process will significantly speed everything that follows.
  • Making it a project instead of an ongoing process: A project has an end. A process repeats and becomes cyclical with continued use. The entire team should understand that data governance integrates with and supports other ongoing business processes.
  • Forgetting the problem behind the problem: Sometimes we implement a rule to solve a problem but forget to investigate why that problem happens. Understanding root causes and behaviors that result in errors can simplify rule sets and the number of rules needed to cover exceptions.

Data Quality Rule Examples

To get started with data quality rules, it might help to look at some examples. Microsoft Purview comes with a set of pre-written data quality rules that users can set up in the tool. These rules cover some of the most common quality checks, including freshness, duplicate rows and accuracy.

Rule | Definition
Freshness | Confirms that all values are up to date.
Unique values | Confirms that the values in a column are unique.
String format match | Confirms that the values in a column match a specific format or other criteria.
Data type match | Confirms that the values in a column match their data type requirements.
Duplicate rows | Checks for duplicate rows with the same values across two or more columns.
Empty/blank fields | Looks for blank and empty fields in a column where there should be values.
Table lookup | Confirms that a value in one table can be found in the specific column of another table.
Custom | Create a custom rule with the visual expression builder.
Source: The information in this table is quoted from “Create data quality rules” in the Microsoft Purview documentation on Microsoft Learn.

Let’s look at some of these in detail:

Accuracy

An accuracy rule, like data type match or string format match, checks that the data within a property (column) matches what the rule expects.

In the case of data type, the rule checks that only numerical values appear in properties that call for numbers. The same approach applies to nearly any property, including product types, names or unique IDs.

String format rules check that values match the pattern defined in the rule. This type of rule may cover properties that mix letters and numbers, like a SKU or a street address.
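
The sketch below illustrates both checks. The SKU pattern (three uppercase letters, a hyphen and four digits) is an assumed format for illustration; real rules would use whatever format the business defines.

```python
import re

# Assumed SKU format for illustration: "ABC-1234".
SKU_PATTERN = re.compile(r"^[A-Z]{3}-\d{4}$")

def quantity_is_valid(value) -> bool:
    """Data type match: quantity must be an integer, not text like '12 units'."""
    return isinstance(value, int) and not isinstance(value, bool)

def sku_is_valid(value: str) -> bool:
    """String format match: the value must follow the assumed SKU pattern."""
    return bool(SKU_PATTERN.match(value))

records = [
    {"sku": "ABC-1234", "quantity": 12},        # passes both rules
    {"sku": "abc-12", "quantity": "12 units"},  # fails both rules
]
exceptions = [
    r for r in records
    if not (sku_is_valid(r["sku"]) and quantity_is_valid(r["quantity"]))
]
print(exceptions)
```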

Timeliness

A timeliness rule checks how frequently the data is updated. This type of rule may be used to alert data professionals when a business system has encountered an issue exporting data to the database. 

The freshness rule “confirms that all values are up to date” by checking the date of modification. If that date falls outside a threshold set at the business’s discretion (anywhere from milliseconds to years), the team receives a notification.
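
As a rough sketch, a freshness check can compare a record’s last-modified timestamp against a threshold. The 24-hour window below is an assumption; as noted above, the threshold can range from milliseconds to years.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold; the business sets this anywhere from milliseconds to years.
FRESHNESS_THRESHOLD = timedelta(hours=24)

def is_stale(last_modified: datetime) -> bool:
    """Freshness rule: the modification date must fall within the threshold."""
    return datetime.now(timezone.utc) - last_modified > FRESHNESS_THRESHOLD

# Example: the last successful export from an upstream business system.
last_export = datetime(2025, 1, 1, 8, 0, tzinfo=timezone.utc)
if is_stale(last_export):
    # In practice this would raise an alert so the team can investigate the export job.
    print("Freshness rule violated: upstream data may not be exporting correctly.")
```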

Completeness

A completeness rule checks whether data exists in a property cell and whether it contains all of the information needed. These rules call out blank cells, but more refined completeness rules may notify users if, say, a street number is missing from an address or someone enters only a last initial instead of a full surname.
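
Here is a minimal sketch of both levels of completeness checking: flagging blank fields, plus the two finer-grained checks mentioned above. The field names and patterns are illustrative assumptions.

```python
import re

def completeness_issues(record: dict) -> list[str]:
    """Return a list of completeness problems found in a single record."""
    # Basic check: call out blank or missing values.
    issues = [f"{field} is blank" for field, value in record.items()
              if value is None or str(value).strip() == ""]

    # Refined check: a street address should begin with a street number.
    address = record.get("address") or ""
    if address and not re.match(r"^\d+\s", address):
        issues.append("address is missing a street number")

    # Refined check: a surname consisting of a single initial is likely incomplete.
    last_name = record.get("last_name") or ""
    if re.fullmatch(r"[A-Za-z]\.?", last_name):
        issues.append("last_name looks like an initial, not a surname")

    return issues

print(completeness_issues({"last_name": "T.", "address": "Main Street", "email": ""}))
# ['email is blank', 'address is missing a street number',
#  'last_name looks like an initial, not a surname']
```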

What Industries Need to Pay the Most Attention to Data Quality Rules?

Every business needs data quality rules and business processes that review and confirm the accuracy of its data. But some industries rely more heavily on data quality rules to run effectively. These industries require high volumes of reliable and usable data:

  • Healthcare
  • Finance
  • Manufacturing
  • Logistics
  • Energy and utilities
  • Retail 

These industries all have complicated and quickly changing data inputs as well as a need for quick, trustworthy analysis to keep up with rapidly changing conditions. Companies within these industries would benefit from master data management software and data governance technologies that establish a solid data foundation for subsequent projects to build upon.

Putting Data Quality Rules to Work

Data quality rules are a foundational element of any effective data governance strategy. When thoughtfully designed and regularly refined, these rules ensure data remains accurate, complete and reliable across the organization. By aligning rule creation with business needs, automating where possible and staying vigilant against common pitfalls, organizations can build a scalable data quality framework that supports ongoing business success.

Boost Your Data Quality with Profisee

Profisee makes it easy to get your data clean and keep it that way. Explore how Profisee's data quality engine lets you create and manage data quality rules through its intuitive web interface.
