Garbage In Garbage Out: How to Improve Data Quality

As an avid consumer of IT and business news, you surely have heard the adage “garbage in, garbage out.” The concept is simple: the quality and usefulness of any analysis, analytics or business output is a direct function of the quality of the data feeding it.

Though the first recorded use of the phrase dates all the way back to 1957, it was popularized in the early days of computing and repeated to the point of cliché as data warehouses spread in the 1990s.

Yet this oft-recited phrase still holds an important truth today. Whether pursuing a digital transformation or taking advantage of nascent technologies like artificial intelligence (AI), machine learning (ML) or the Internet of Things (IoT), organizations need a strong foundation of trusted data to achieve their business goals.

After covering the data governance and master data management (MDM) markets for several years — as part of helping clients build business cases for governance and MDM — I took a step back to reflect on why the state of data in most organizations is as dismal as it is.

Read on to see why demonstrating the value of trusted data across an enterprise’s mission-critical operations and analytics is such a challenge — and how you can kick your own dirty data to the curb.

Garbage In: A History of Data Silos and ‘Technical Debt’

The succinct version of this story starts with the fact that most of us in any business environment have historically been compensated to optimize business processes in silos, whether we realized it at the time or not.

For those of us fortunate enough to become involved with information technology during the last half of the 20th century, we were allowed and encouraged to automate these silos.

This resulted in tremendous productivity gains in the aggregate, but very little attention was paid to the technical debt created as each new business application system brought its own set of data with it.

ERP and other application suites made significant strides towards at least co-locating logically similar data within their databases, but few, if any, capabilities were built in to enforce broad-based data quality and semantics across the supported business processes.

This resulted in a different set of “logical” silos within a single physical data store. Concurrently, specialized applications such as CRM arose, again increasing productivity in isolation but again complicating the issue of trusted and, therefore, reusable data.

Garbage Out: Data Warehouses, Data Analysts and Data Quality

The technical debt of poor data quality was first widely exposed by the advent of data warehouses and data marts in the 1990s and the first attempts to consolidate and reconcile source data from these various data silos for use in even basic reporting and analytics.

Nascent “data analysts” discovered that the data in these systems did not conform to the ostensible rules within each system. Worse, the meanings of seemingly similar data attributes across these systems bore little resemblance to each other.
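
To make that concrete, here is a minimal sketch in Python of the kind of mismatch those analysts ran into (the systems, field names and values are hypothetical): two sources describe the same customer, but a literal field-by-field comparison sees nothing but conflicts.

```python
# A minimal sketch of the reconciliation problem early data analysts hit:
# two hypothetical source systems hold "the same" customer, but the fields
# neither match literally nor share the same meaning.

erp_customer = {
    "cust_id": "C-1042",
    "name": "ACME Corp.",
    "status": "A",            # ERP convention: "A" means the account is active
    "country": "US",
}

crm_customer = {
    "customer_id": "1042",
    "name": "Acme Corporation",
    "status": "Active",       # CRM convention: spelled-out lifecycle stage
    "country": "United States",
}

def naive_compare(a: dict, b: dict) -> dict:
    """Compare shared fields literally, as an early data mart load might."""
    return {
        field: (a[field], b[field])
        for field in sorted(set(a) & set(b))
        if a[field] != b[field]
    }

# Every shared field "conflicts", even though a human sees one customer:
# {'country': ('US', 'United States'),
#  'name': ('ACME Corp.', 'Acme Corporation'),
#  'status': ('A', 'Active')}
print(naive_compare(erp_customer, crm_customer))
```

Resolving those conflicts is not a matter of fixing typos; it requires agreeing on shared semantics, such as what “active” means and which country codes to use.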

Today, as enterprises pursue strategic initiatives like digital transformation, they have increasingly discovered that the status quo of largely untrusted data can no longer be tolerated if they are to implement advanced capabilities and automation across their business processes and analytics.

The technical debt of poor-quality, mission-critical data must now be paid. Indeed, it is no surprise that 60% of organizations reported under-investing in their enterprise-wide data strategy, preventing valuable data from being broadly used, according to a 2021 survey by Harvard Business Review Analytic Services.

End Garbage In, Garbage Out with Master Data Management

As often happens with macro business trends and the broad availability of technology, it’s quite likely that some version of this effect will eventually trickle down to small- and mid-sized companies as well.

The good news is that the technology to resolve these issues, in the form of master data management (MDM) and data governance solutions, has been available and improving for several years now, and it empowers any company with the organizational discipline to put it to work.

Multidomain MDM platforms provide the flexibility for the business to develop a common data model that accurately reflects both its current state and its desired future state.
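
As a rough illustration (the field names, codes and mapping rules below are hypothetical, not any particular platform’s schema), a common data model gives every silo a single agreed-upon target shape to map into:

```python
# A rough sketch of a common data model: one agreed-upon "golden record" shape
# with shared semantics, plus a mapping from each silo into it.

from dataclasses import dataclass

@dataclass
class MasterCustomer:
    master_id: str
    legal_name: str
    is_active: bool
    country_iso2: str          # one agreed country standard (ISO 3166-1 alpha-2)

COUNTRY_TO_ISO2 = {"US": "US", "United States": "US"}

def from_erp(rec: dict) -> MasterCustomer:
    return MasterCustomer(
        master_id=rec["cust_id"],
        legal_name=rec["name"],
        is_active=rec["status"] == "A",          # ERP's single-letter status code
        country_iso2=COUNTRY_TO_ISO2[rec["country"]],
    )

def from_crm(rec: dict) -> MasterCustomer:
    return MasterCustomer(
        master_id=f"C-{rec['customer_id']}",     # align to the ERP identifier format
        legal_name=rec["name"],
        is_active=rec["status"] == "Active",     # CRM's spelled-out status
        country_iso2=COUNTRY_TO_ISO2[rec["country"]],
    )

# Both silo records now land in the same shape, with the same meaning:
print(from_erp({"cust_id": "C-1042", "name": "ACME Corp.", "status": "A", "country": "US"}))
print(from_crm({"customer_id": "1042", "name": "Acme Corporation", "status": "Active", "country": "United States"}))
```

Matching and survivorship rules (which source’s name wins, for example) still have to be governed, but a shared target model is the precondition for that work.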

Indeed, implementing MDM and data governance are critical steps in resolving your technical debt and fully enabling any digital business transformation.

Read the full Garbage In, Garbage Out article today to start your journey on the path to trusted data — so you can kick your dirty data to the curb.

