Are you confused by all the hype around data sharing? If so, you’re not alone.
I believe data sharing — and the inter-company sharing networks that will evolve over the next five to seven years — will disrupt several data management software markets and forever change how we manage much of our enterprise data. In time, I will provide more insights on how the data sharing evolution will unfold. Spoiler alert: it involves blockchain and even crypto.
However, between now and the data sharing revolution, we should expect an ongoing excess of hype around what the term actually means, because data sharing is not well defined. Today it covers everything from data-for-good to data monetization and everything in between. As a kid, I assumed “sharing” was an altruistic action that didn’t require receiving anything in return. But in the case of data sharing, that assumption would often be wrong.
Data Sharing Defined
As currently defined, the act of data sharing:
- Is done within a company
- Is done across companies
- Is done one-to-one, one-to-many or many-to-many
- Includes data marketplaces or other for-profit vehicles to distribute data
- Includes all data consortia and providers of industry-standard reference data
- Includes all instantiations of data as a product
- Involves a field, an object, a table, a database or several databases
- Can be one-way or bi-directional
- Is done by a person, group, business or government entity
And the list goes on!
I would argue that the current definition of data sharing is far too broad to be useful, so I’m working to make it more specific in the coming weeks as I share (pun intended) more insights on where data sharing is headed.
While the data sharing definition remains murky, software vendors are making a massive land grab to claim their space within the data sharing narrative.
Data Sharing through Data Governance
Leading the pack are the cloud hyperscalers and purveyors of data warehouse/lake solutions, where the ability to put data in a single physical or virtual location and allow sharing through permissions management is somehow being positioned as a revolution in data governance.
This makes me cringe, but I understand why they are doing it.
In the absence of any clear definition of what is required for effective (and value-added) data sharing beyond plumbing access to data, this overly simplified view of data sharing is just as valid as any other, at least for now.
To bring more clarity to this space, I advocate we start with a better definition. I don’t have all the answers, but I’m confident the definition of data sharing should include some notion of shared data governance.
The reason is that sharing data without any insight into the governance rules used to create that data significantly limits the value of what’s being shared.
If I don’t know how shared data is sourced or defined, can’t verify or validate its accuracy, or don’t know how old or relevant it is, how much value will I derive from it? Using this data as-is in any critical business process would require a massive leap of faith that most companies simply will not make.
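To make those questions concrete, here is a minimal Python sketch of the governance details a consumer might want to travel with a shared data set. Everything in it is hypothetical: the `GovernanceRecord` fields, the `governance_gaps` helper and the 90-day freshness threshold are assumptions chosen only to mirror the questions above, not a standard or any vendor's model.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class GovernanceRecord:
    """Hypothetical governance metadata shared alongside a data set."""
    source: Optional[str]         # where the data came from
    definition: Optional[str]     # what the data is meant to represent
    validated_by: Optional[str]   # who or what process verified its accuracy
    last_updated: Optional[date]  # when the data was last refreshed


def governance_gaps(record: GovernanceRecord, today: date,
                    max_age_days: int = 90) -> List[str]:
    """List the questions a consumer still cannot answer about the data.

    The 90-day default is an arbitrary illustrative threshold.
    """
    gaps = []
    if not record.source:
        gaps.append("unknown source")
    if not record.definition:
        gaps.append("no shared definition")
    if not record.validated_by:
        gaps.append("accuracy not verified")
    if record.last_updated is None:
        gaps.append("unknown age")
    elif today - record.last_updated > timedelta(days=max_age_days):
        gaps.append("stale data")
    return gaps
```

An empty list would suggest the consumer has what it needs to judge the data; each remaining gap is a piece of the leap of faith described above.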
Data Consortia for Business Value
Shared governance is why data consortia can drive massive value, while the failed data marketplaces are too numerous to list. Consortia require shared governance, while data marketplaces typically don’t.
Common uses for data consortia include:
- Creating credit scores
- Managing complex supply chains
- Managing industry-wide reference data sets, such as:
  - Product UPC codes
  - Banking and financial data
  - Medical treatment codes used for medical insurance
What do you think? How would you define data sharing? Join the conversation on LinkedIn.