
The CDO Matters Podcast Episode 58

Modern Data Platforms with Emerson Gatchalian


Episode Overview:

 Are you struggling to better understand what it means to implement a modern data platform, and why doing so is relevant to your business?

If yes, check out this week’s episode of the CDO Matters Podcast, where Emerson Gatchalian, the CDAO of the Blackbelt Team within Microsoft, shares his insights on how his largest clients and Microsoft are implementing more adaptable, scalable, and AI-ready data infrastructures. 

Episode Links & Resources:

Good morning. Good afternoon. Good evening. Good. Whatever time it is, wherever you are in the world. I am Malcolm Hawker, the host of the CDO Matters podcast.

Thank you for checking us out today. Maybe you are watching us on YouTube. Maybe you are downloading us through Spotify. In whatever way you’re consuming the content, thank you for being a part of this growing and vibrant community of chief data officers and people who want to be chief data officers.

Today, we are going to talk about a modern data platform. What does that actually mean? What do I need to be thinking about as a CDO if I’m making investments in technology and process and people? If I want to upgrade my systems, if I want to get ahead of the curve, if I want to be more AI ready, what do I need to be thinking about? I can’t think of a better person to be asking these questions than our guest today, Emerson Gatchalian, who is the CDAO of the Black Belt organization within Microsoft for the Americas. So, Emerson, thank you so much for joining us today.

Oh, thanks for having me, Malcolm. Pleased to be here.

Awesome. Emerson and I have some interesting shared background in that we are both from the amazing province of Alberta, Canada.

Emerson is in Calgary, which, frankly, folks, if you’ve never been there, is one of the most beautiful cities, I think, anywhere. Yes, Vancouver is pretty. Denver is pretty. But Calgary is awfully, awfully pretty.

If you’ve never had a chance to go to the Canadian Rockies, I would absolutely recommend it. Emerson’s from Calgary, and I’m from two hundred miles north, this place called Edmonton, which is not nearly as pretty. Let’s be honest. I mean, it’s not nearly as pretty as Calgary is, but there’s a great little rivalry there between Edmonton and Calgary, particularly our hockey teams.

Edmonton got the best of it this year, but who knows? Next year, maybe the Flames will come back around. But, Emerson, you’ve got a span for all of the Americas.

Right? Canada and the US. Both. Right?

That is correct. I cover LatAm, the United States, and Canada.

Okay. Well, help our listeners better understand what you do as a CDO for the Black Belt team. What does that entail?

Absolutely. Thanks, Malcolm. So I work with a lot of our clients: CXOs in general, CDOs more specifically and more often, and then, of course, CIOs.

My role is to help our clients modernize, if you will. Right? There are a lot of discussions around being ready for AI. There are discussions about how I should optimize my data operations. So I help clients with what modernization means, what a modern ecosystem or platform should look like, and what makes for a successful path forward if you’re thinking about modernizing.

Great. So you’re out there talking to CXOs all day, every day about what a modern platform looks like?

Absolutely. A lot of discussions, a lot of sharing of how we did it in Microsoft and how we did it with other clients. But most importantly, I think the one that is missing is how do you execute? So I work with clients on how you go from what we discuss, what we plan, and what we really think is the path forward, to how we make it real for you.

Yeah. I had a chance to attend a day long data governance workshop hosted by the Purview team from Microsoft, Karthik Ravindran, Alex Posar, De Mebuono, where what they shared through that governance workshop was Microsoft’s governance journey, which I found amazing and empowering because like listening to some of their challenges, biggest company in the world, right? But listening to some of their challenges was really empowering to me because it’s like, Oh, I’ve battled with that. I’ve wrestled with that.

And hearing Karthik and his team go through what they went through from a Microsoft journey perspective, again, I thought it was really, really useful. So what are some of the key takeaways from the Microsoft journey that you try to impart to your clients? What are some of the things that Microsoft has learned where you say, hey, we know this works, or we know this doesn’t work? What are some of those things that you’re sharing with your clients?

Perfect. And I would like to reinforce what you just said about Karthik and the team. They are the A team, the ace team of our transformation, the data transformation within Microsoft. So if you think about it, they lived it.

They solved our problem, and now they’re in Microsoft Purview making it real for clients. Right? So to answer your question about what worked for Microsoft (and, by the way, all of these things that I learned are through my collaboration with Karthik and the team as well): we came from a very monolithic setup ten, fifteen, twenty years ago and went all the way to a hyper-federated solution, which created so many more silos. Self-serve everything created a big problem of duplication.

So maybe the first learning I would share is how you create that balance: moving from being so monolithic and in control to being flexible at the edges, in the business units, while staying disciplined at the core. That means everyone is tied to a core of disciplines, of governance, of data management, and you still enable each business unit with the data and the tools it needs so that it can scale and build its own analytics.

Because if you think about it, a lot of the business units, as everyone knows, are still using Excel and what have you because they want to do analytics. Right? But how do you modernize that? How do you bring it closer to the core while they’re still able to do what they need to do?

Right? So maybe that’s the gist of all that learning.

Well, I love that story because, you know, when I was at Gartner, I saw a lot of seesawing back and forth between centralization and complete decentralization.

And what I really love so much about the Microsoft message was, okay. We kinda tried both, and where we landed was this more federated model where there is still individual autonomy within operating units.

And Karthik, like, he logged into Purview. He showed us that there are still some situations where, for example, a customer record is appearing multiple times, but at least there’s visibility over it. At least there is control over it. So what I love about the Microsoft story is that it seems very practical and very pragmatic. It didn’t try to go too far to one side or too far to the other; it’s this practical kind of middle ground. Do you agree?

I agree. Absolutely. And if you think about it, you mentioned centralization, and that was the discussion on building warehouses. And then you can take data from warehouses to build the data marts. Right?

That served its purpose for a lot of the reporting, but now we’re building for modern analytics, if you will, for AI, for example. So if you need to train on data for ML using a warehouse, does it really work? It’s a discussion to be had. Right? But if you think about open formats and open compute, now you’re opening up more capabilities where analytics can expand beyond warehousing, for example. Right?

And yeah, you mentioned duplication of that data. How do you solve duplication? First, by knowing whether there is duplication, how much duplication, and tracking where this data comes from. And then from there, track it back to the root as to why this created so many duplicates. And so, yeah, Purview can serve that. That’s why it’s amazing in how we can trace all of that. And then from there, also go beyond that into quality and all that stuff, for example.
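
To make the three questions above concrete (is there duplication, how much, and where does it come from), here is a minimal Python sketch. It is an illustration only and not how Purview works: the record fields (`name`, `email`, `source`), the matching rule, and the function names are all assumptions made for the example.

```python
from collections import defaultdict

def normalize(name, email):
    """Build a crude match key: lower-cased, whitespace-collapsed name plus trimmed email."""
    return (" ".join(name.lower().split()), email.strip().lower())

def find_duplicates(records):
    """Group records by match key and keep only the keys seen more than once."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize(rec["name"], rec["email"])].append(rec)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}

def duplication_report(records):
    """Answer: is there duplication, how much, and which source systems feed it."""
    dupes = find_duplicates(records)
    return {
        "total_records": len(records),
        "duplicate_groups": len(dupes),
        # Every record beyond the first in a group counts as one surplus row.
        "duplicate_rows": sum(len(recs) - 1 for recs in dupes.values()),
        "sources_by_key": {
            key: sorted({r["source"] for r in recs}) for key, recs in dupes.items()
        },
    }

# Hypothetical customer records pulled from two systems.
records = [
    {"name": "Ada Lovelace", "email": "ada@example.com", "source": "crm"},
    {"name": "ada  lovelace", "email": "ADA@example.com ", "source": "erp"},
    {"name": "Grace Hopper", "email": "grace@example.com", "source": "crm"},
]
report = duplication_report(records)
```

The `sources_by_key` entry is the "track it back to the root" step in miniature: it shows which systems contribute each duplicated record, which is where a root-cause conversation would start.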

So you just mentioned kind of traceability, visibility, lineage.

Let’s assume I’m a CDO with a heavy dependence on legacy.

Maybe it’s on prem, maybe it’s hybrid on prem and off prem, whatever.

But I’m looking to make that jump to a more modern platform, to something that I can leverage for multiple workloads. This maybe starts to get into the fabric conversation.

But there’s a lot out there.

Where do I start? What’s a good starting point? I assume this is something you probably hear from your clients. If yes, what are you telling them?

So there are multiple layers or multiple discussion points as to where you start. Should you start with culture? The answer is yes. Should you start with what a modern technology or solution should look like?

The answer is yes. Should I start with governance? Absolutely. So if you think about it, there are multiple layers as to where you should start.

So to me, the one that worked for Microsoft and for other clients, especially since we have hundreds and hundreds of these engagements, is to start with all of these layers. Start where you show value right away, and then rinse and repeat, rinse and repeat. Don’t think about the overhaul of everything you need to do. Start with something that will show value in three to six months across culture, technology, and governance, with data management on top of it. Now, culture is a totally different conversation.

It’s probably going to be hours, if not days, of conversation. Let’s say, yes, there is a culture topic to be discussed. You should start there because that will tie to what you’re implementing. Right?

That’s tied to your strategy.

But if you wanna start with, say, a platform, how should I kinda modernize my whole ecosystem?

The way I see it, and maybe something that I can share here, is I built the modern analytics and AI and governance at scale. We also have a white paper that maybe we can share through a link here. But there are layers to that. Right?

Number one, the foundational problem is, of course, governance: whether you know where your data is. An example from a lot of my engagements: I ask the CTO or data leaders, how many data assets do you have? And they say, great question. I think we have about two thousand data assets.

But once you get into the governance, you realize you have hundreds of thousands of data assets that you don’t know about. Right? So number one is governance. Start with understanding how much data you have.

Is this data classified? Start building policies around it so that you can start securing it. And that’s the key to data democratization, right? So let’s say you do all of that.
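
The governance starting point described here (count the assets, classify them, then attach policies) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of Microsoft Purview: the column-name patterns, the `DataAsset` class, and the two policy tiers are hypothetical, and a real classifier would inspect data values as well as names.

```python
import re
from dataclasses import dataclass, field

# Hypothetical classification rules keyed on column names only.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"e[-_]?mail", re.I),
    "phone": re.compile(r"phone|mobile", re.I),
    "national_id": re.compile(r"ssn|national[-_]?id", re.I),
}

@dataclass
class DataAsset:
    name: str
    columns: list
    labels: set = field(default_factory=set)

def classify(asset):
    """Attach a sensitivity label for every rule that matches a column name."""
    for column in asset.columns:
        for label, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(column):
                asset.labels.add(label)
    return asset

def access_policy(asset):
    """Simplest possible policy: any sensitive label restricts the asset."""
    return "restricted" if asset.labels else "general"

# A tiny inventory standing in for the thousands of assets a scan would find.
inventory = [
    classify(DataAsset("crm.customers", ["id", "Email", "phone_number"])),
    classify(DataAsset("erp.products", ["sku", "price"])),
]
policies = {asset.name: access_policy(asset) for asset in inventory}
```

Even at this scale the order of operations matters: the policy step has nothing to work with until the inventory and the labels exist, which is why the discussion puts governance first.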

And again, governance is probably, you know, days and days of the conversation, then get into how should I unify my data? Because building a new platform, building a new warehouse is just building a new silo, if you will. But if you think about unifying your data, your data could be coming from CRM, could be coming from ERP, could be coming from your operational databases, or could be coming from your environment.

So if you think about unifying that data, the first default is, how can I unify that data without having to duplicate or copy it? And so for that unification, if you look at Fabric, that’s OneLake, where you can now set up change data capture, meaning get that data flowing into OneLake without having to buy a third-party solution.

Yes, your data gets copied over into OneLake, but we take care of that cost, for example. But that data, at least, is in that layer now. Right? There are shortcuts in Fabric as well, where if you have data lakes in Azure or in other clouds, you can absolutely create a shortcut without duplicating that data. So that’s the underlying foundation of how I should think about my ecosystem. Right?

I don’t see copying into warehouses working in the future. Think about streaming. Think about training data. Think about gen AI, for example.

If you have to have all that data in a warehouse, would you be able to do that? Absolutely not. You have unstructured data that you need to use AI on, PDF files, for example. Right?

You have to find a way of storing it in open formats and what have you. And that’s usually the lakehouse, which we support using Fabric’s OneLake.

The second layer is data management. This data management is critical because this is automation, if you will: automation of how I should ingest data. I’ll give an example.

One of our clients, during the pandemic, was saying, Emerson, we have seven hundred data engineers, and yet we’re not able to catch up on the things we need to do to clean data, integrate data, and find that data. What should we do? And the answer I gave them is, what I see is you’re trying to solve a process problem with a people solution, which doesn’t work. But think about automation, which is, again, a lot of what we did within Microsoft. A lot of that is automating how you ingest: creating a metadata-driven ingestion where there are a thousand data sources that I need to pull, without having to create a thousand pipelines.

That’s pulling the data. That’s a framework to be implemented. That’s a solution for it. And then start on, like, data quality.

How do you deduplicate data? How do you implement some of the quality rules? And how do you register this data that you pulled together into the catalog? So that’s the automation.
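
The idea of metadata-driven ingestion, one generic pipeline driven by a table of source descriptions rather than one hand-built pipeline per source, can be shown with a small Python sketch. It is a simplified illustration, not Microsoft's framework: the `SOURCES` metadata layout, the in-memory `RAW` data standing in for real systems, and the catalog structure are all assumptions for the example.

```python
import csv
import io

# Hypothetical metadata table: one row per source instead of one pipeline per source.
SOURCES = [
    {"name": "crm_customers", "format": "csv", "key": "id", "required": ["id", "email"]},
    {"name": "erp_orders", "format": "csv", "key": "order_id", "required": ["order_id", "amount"]},
]

# Stand-in for real storage; a production reader would fetch from the source system.
RAW = {
    "crm_customers": "id,email\n1,a@example.com\n1,a@example.com\n2,\n",
    "erp_orders": "order_id,amount\n10,5.00\n11,7.50\n",
}

def read_csv(name):
    return list(csv.DictReader(io.StringIO(RAW[name])))

# Adding a new format means adding one reader here, not a new pipeline.
READERS = {"csv": read_csv}

def ingest(source, catalog):
    """Run the same ingest, deduplicate, validate, register steps for any source."""
    rows = READERS[source["format"]](source["name"])
    # Deduplicate on the configured key, keeping the first occurrence.
    seen, unique = set(), []
    for row in rows:
        if row[source["key"]] not in seen:
            seen.add(row[source["key"]])
            unique.append(row)
    # Basic quality rule: every required column must be non-empty.
    valid = [row for row in unique if all(row[col] for col in source["required"])]
    # Register what was loaded so the dataset is discoverable later.
    catalog[source["name"]] = {
        "rows_read": len(rows),
        "rows_loaded": len(valid),
        "rejected": len(unique) - len(valid),
    }
    return valid

catalog = {}
loaded = {source["name"]: ingest(source, catalog) for source in SOURCES}
```

Onboarding source number one thousand then amounts to appending one more entry to `SOURCES`, which is the sense in which this treats ingestion as a process problem rather than a staffing problem.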

And then the top layer is the consumption. This is now, how should I build these domains so that I have a domain specific to each business unit? If you enable them with these domains and give them access to the underlying data that we just unified, that becomes a game changer for everyone. Now you’re scaling your analytics. The business knows a lot about their data.

The business knows about the analytics and the insights they want to pull. If you empower them with this data and the tools, that’s a game changer. And then, of course, all of this is tied together by governance, security, and compliance, which Microsoft Purview is enabling for us. So that’s a very long explanation, but I hope that provided some context.

I love it. Persistence, management, and maybe what you could say on the top is activation, publication, syndication, publishing data products maybe.

Something that I like there, and this is something that Karthik touched on when he was describing this in his presentation in Boston a few weeks ago, is that there’s the data fabric for sure.

It’s a part of all of it, actually. It’s a part of all three layers. But there’s a little bit of a happy marriage here between a fabric and some of the concepts of a mesh as well. I’ve always been a little bit concerned about the mesh because I think it was overly conceptual and didn’t address cross-functional use cases. There were a few other things that I had concerns about, but I did like the idea of domain centricity. Right? I did like the idea of having specific business rules and specific quality rules for that domain.

Mhmm.

And what you’re talking about in the journey, in the lesson that Microsoft learned, I think, is that you can do that as long as you have some idea of oversight at a higher level. Would you agree with what I just said?

I agree. And to echo you on the mesh, right? We looked at it. I respect all the discussions around it.

But when it comes to the practical implementation, what I see working is the creation of domains. Right? And it’s a concept from years, even decades ago, from domain-driven design and all that stuff. Right?

So absolutely, yes. The federated governance, absolutely, yes. That’s what we’re advocating for, where governance is for everyone, for example.

Right? And Karthik, I’m not sure if he talked about this. So what we are advocating for is a hybrid mesh, if you will.

We have the mesh enable the business units. But the data management that we just discussed and all that stuff, that’s the data fabric in the academic sense, not Microsoft Fabric, but the data fabric that creates the intelligence and the automation. Right? So what I want to get to, in summary, is that you have to have the mesh concept together with a fabric and a hub-style approach to make it real for our clients.

That’s the key difference, which is that mesh has this real reluctance to take an approach of centralized infrastructure. In fact, you could argue that one of the core premises of the mesh is that centralized architectures are inherently less useful, maybe is a way to say it. But I fundamentally disagree, and this is one of the things that I’ve always disagreed with. You know, network architectures, network topologies have hub and spoke models for a reason.

The airlines have hub and spoke models for a reason, because there are economies of scale that can be realized through that. Now the best of breed is what you just described, which is let’s focus on data products. Let’s focus on domain centricity. Let’s even focus on a future where we start to automate governance.

Those things are all great.

Absolutely. Yeah. And the data product is the value realization, if you will. If you create a data product, an insight that generated, I don’t know, thirty million worth of efficiencies, that’s the value of that data product.

Right? And that data product can be used to build another data product for AI, for example, where the question is how I can predict all of this. So that data product’s value becomes double or triple as you build upon it, if you will. Right?

So a hundred percent data product, absolutely.

Well, it’s interesting, something that you said. You know, mesh purists would say that data warehouses are what Zhamak would call an anti-pattern. She’d also say that MDM is an anti-pattern. She actually says that in Data Mesh. But you actually asked, hey, is this centralized data warehouse a useful architecture?

And you suggested more open source approaches to this. I assume you’re talking Delta Parquet. It doesn’t necessarily need to be Delta Parquet. But you suggested that duplication of data into warehouses may not be useful as a part of this modern data platform.

Can you explain more?

I think when we’re making design decisions, we have to be very inclusive. For example, big companies have data warehouses. Does that mean we should stop using them? You can migrate them, but if it’s going to cost the company thirty million dollars to migrate into a federated approach, is it reasonable?

Is it practical? The answer is no. Right? So that’s why the hybrid approach is what we need, because we have to think about the previous investments. If you have to change them just to make this new framework work, then this new framework is not really of value to the company. Maybe I have a Hadoop environment, or, as I was saying, I have a mainframe still sitting in the other room here.

Does that mean I have to migrate all of that to make a framework work? It’s not practical. Right? It’s just not going to work.

So we’ve got to be very open in how we think about, yes, domain driven and all that stuff, which is, again, not just a mesh concept. It goes back years and years. That’s the concept by which data marts work, for example. Right?

Yes. We build a data mart for HR, a data mart for finance, and what have you. That’s the domain design, if you will, in its old form.

You’re absolutely right. I was around in the data mart days. I’ve built many.

It’s something interesting that you said, and I fully agree with it. You actually integrated the concept of, well, hey, what’s it going to cost for me to migrate off an AS/400 to support a mesh paradigm? What’s it going to cost for me to do all of these things just to conform to this one architecture that may or may not actually be the most efficient way to manage data? Because you could argue peer-to-peer interactions have really never worked that well.

Right.

And I think that’s a question a lot of data leaders need to be asking more, which is, okay, conceptually, I kind of like some of this stuff, but what’s it actually going to cost? It’s just not practical. I love your focus on the practicality.

Here’s a little advertising plug.

I should have mentioned this probably earlier on, but I did mention a day-long workshop hosted by Microsoft that happened in Boston a few weeks ago.

If you’re in Europe, there’s another one coming up the third week of September in Stockholm that is associated with Microsoft’s FabCon. So if you’re thinking about going to FabCon, if you want to go beyond just this hour-long conversation with Emerson and me and spend a full day deep diving on this stuff, I know that there’s another one of these events happening in Stockholm. I want to say it’s the twenty-third of September, but don’t quote me on that. Check it out. I’m pretty sure this episode will release before then. So, yeah, really exciting stuff.

Awesome. Let’s talk a little bit more about the fabric.

I’m a huge believer. I was a believer in the fabric when I was at Gartner, when we were talking about this stuff four years ago. Let’s talk specifically about Fabric and AI and how these things fit together, and maybe the automation piece as well. How do you see all of these things fitting together, particularly through the lens of AI enablement?

Absolutely.

So think about data, for example. Right? You need data coming from your PDF files.

Let’s say genomics data, PDFs, thousands of files, and you want to use AI on top of it. Right? Yes, it could be in a data lake, for example.

Right? Now what if you want to take this genomic data and merge it with some of the patient records, which could be in a warehouse or a lakehouse? Where would you then be able to build this AI solution?

That would be in Fabric, where the data could be mirrored from a data lake, for example. It could be in any other cloud. And some of that data could be in, for example, a data warehouse, which you can mirror. It could be in databases too.

Some of this is transactional data that is stored in databases, and you can start mirroring it. So the first value that Fabric can bring is that now you can unify data. If you think about AI, it could be on specific data assets, but the value it will bring is in how you can use AI on assets that go beyond one platform, for example, or one storage. Right? So, yes, PDF files in a data lake, data coming from databases or warehouses or what have you: unify it.

That makes it easier for you to get that data available for AI. Now let’s say that data is there, and you have a workspace in Fabric where you’re doing data engineering. We have Spark implementations there. You can even call OpenAI from Spark within Fabric and start, you know, do a prompt and do a line training, for example, to kind of sequence all of this.

And then from there, get the value out of your data, use some of the models that we offer within Fabric, and that becomes easier. Now if you want to implement some of these heavily customized AI solutions, you can use, for example, AI Studio and connect to OneLake within Fabric. Now it becomes available not just from within Fabric, but to other applications outside Fabric. So it’s open and it’s available for you to tap into.

That becomes easier, for example. Right? AI, you know, part of that is about being efficient. Right?

There are the Copilot capabilities in Fabric. So a lot of, for example, the CIOs want to start getting into AI. The first step for them is, how should I take advantage of these Copilots, for example? So there are different ways we can actually use Fabric for all that.

Use Copilot to start. Second, get some of the data from within Fabric through a Spark implementation and call some of the AI models. And third is from outside: any other solutions you have outside Fabric can use the data that you have within Fabric. That could be a third horizon.

I love it. Now if I were to put on my Gartner hat, I’m sure that there are probably people listening out there who are saying, okay, well, but I’m not a Microsoft client. Maybe I’m not a Microsoft client yet.

Can I do this on my own? My answer to that would be, yeah, you probably could. If you’re a Fortune one hundred company and you’ve got a lot of money, if you’re a data and analytics leader who is managing hundreds of millions of dollars of budget, could you build?

Could you hand roll your own fabric-like infrastructure? I think maybe you could, but I don’t know why you’d want to.

I mean, there’s an awful lot of work there. Is this a build or buy? What do you think, Emerson?

So, a very interesting question, because I’m working with so many clients. I’m seeing clients who are, you know, top one hundred clients, and clients who are not even in Azure right now. So, yes, there are the high rollers, where, you know, I have money to roll it out.

Absolutely. They’re using Power BI, by the way. And for a lot of these clients, it’s basically easier. I’m a business user. I’m using Power BI.

I just want to get some of my data, you know, using Excel still. How should I get into it, for example? Right? Or I have data warehouses that are in closed, proprietary storage.

I need to get it open, to start migrating some of these data warehouses into Fabric.

That’s what other clients are doing. Right? So, to your point, you have those with millions of dollars to use for this rollout of Fabric.

But we have clients who are not even in Azure, like I mentioned earlier, that are getting into Azure because of Microsoft Fabric. The reason for that is it’s easier for them to use. It’s simple. It is basically open, and it’s also governed.

So a lot of these clients are also taking advantage of Fabric, and we have a ton of those, which is very, very exciting for us as well.

Yeah. I mean, you know, can you hand roll a data quality solution? Yeah. Can you build an MDM from scratch?

Can you follow the discipline of MDM and build something from scratch? Yeah. You could, but it’s to the point now where I don’t know why you would, and I see something very, very similar happening with the fabric. Four years ago, there were a handful of companies that I was talking to at Gartner that were building fabric-like infrastructures, but we’re at the point now where I honestly don’t know why you would. In particular, you mentioned shortcuts, right?

If you’ve got data sitting in Redshift, if you’ve got data sitting in GCP, with that open approach that Microsoft is taking here, again, I don’t know why you would want to invest in building your own hyper data virtualization layer when there’s a leader in the space that is doing it.

Go ahead.

Yeah. I was going to say, because you mentioned Gartner, I was at both Gartner in London and in Orlando. Right? I’ll share one use case from those.

There was a CIO and an SVP who approached me after my talk about modernization. Right? And the commentary that I got from them is, now I have to change my strategy and how I should modernize. And the reason for that is, again, they have multi cloud, and it’s not by choice.

It’s because they purchased a new company. It’s on a different cloud, what have you. So again, going back to the practicality, how should I unify my environment? Right?

Moving from one cloud to the other is not a practical solution, but putting Fabric on top of these three different data lakes, if you will, in Azure, AWS, and GCP, that’s super practical.

Right. And it just makes it easier for them. So if you think about the practicality of using Fabric, it provides a lot of that: unifying your data, and making it simple to enable the different business units and personas.

Yes, if I’m a heavy coder, I can use the different capabilities in Spark, warehousing, and what have you. But if I’m a business user who just wants to get some dashboards with data that I bring into OneLake, that will basically take me hours.

I can do this without having to wait for a table to be created by someone. I can do it on my own now. Right?

If you are a Microsoft client or even if you’re just looking around, you know, at what the capabilities are here, what I would urge you to do is to find a way to have somebody to give you a demo about OpenAI Studio.

The first time I ever saw OpenAI Studio, I watched a Microsoft architect create a Copilot in literally minutes, right? So you’re talking about value realization from AI, particularly GenAI.

I watched somebody create a Copilot experience in minutes that was pretty darn good. Right? Was it perfect? No.

But it was pretty darn good. And what you’ll see as a data leader is how powerful this stuff can be, the role that internal data will play in grounding any of the queries that you’re sending to an LLM, and maybe even the future role that I hope structured data will play here. Because, Emerson, today, I see a little bit of a break between gen AI based solutions and structured data. Right?

A lot of the work we’re doing, everything we do around governance, is really related to structured data. And a lot of what these LLMs are consuming is, to your point, PDFs. It’s unstructured data. It’s text.

Do you see these worlds coming together?

Yeah. Absolutely. Absolutely. I’ll give you an example. Or maybe let me qualify the question.

Are you asking whether, if I have structured data and unstructured data, I can use this for Gen AI?

Right.

Yeah. Or is it more?

Or is it one feeding the other, or what have you?

Let’s say I’ve got a structured database of all of my previous customer transactions.

Yeah. And I wanna use a chatbot to help explain the history of those interactions. Right? To summarize all of those interactions. But today, all those interactions are sitting in, you know, rows and columns. How do I get at that stuff and make it meaningful within an LLM?

Okay. So if you have that information, it could be in logs, for example. Maybe not necessarily structured data. Right?

Right? So that’s one. And let’s talk about customer service, for example.

Right?

One client that I’ve worked with, they’re building this AI for customer service. And it’s about, you know, if the client calls, what would be the potential topics that they will talk about. Right? So they might have purchased a product, which is, you know, something in billing and what have you.

Or it might be something related to the network. Let’s say this customer is a telecom. Right? There are some disruptions with some of the service, and then they would call about that.

So if you think about the telecom business, if you will, there is network data that’s coming from, you know, streaming information, if you will. And then there’s some data coming from, how do you call this, a database or what have you, where it says that this person came to our store, or they purchased online, what have you. So all of that is structured, and some of it is streaming, somewhat structured, and then you put them together. And this client is calling because they purchased a phone from the store.

And now there was a disruption in the area through some of this, you know, streaming information. Now I convert that data. And then when this client calls, now I have the reason to say, hey, you know what? This client’s probably calling about the disruption.

They’re new to the service. And now how can I make sure this client is still happy and give them kind of like the confidence that, you know, this was just a glitch or what have you? That could be a solution for AI, for example.
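
The scenario Emerson describes can be sketched in a few lines of Python. This is only an illustration, not anything the client or Microsoft actually built: the record shapes (`Customer`, `NetworkEvent`), the `likely_call_reason` function, the 24-hour window, and the 30-day "new customer" threshold are all hypothetical assumptions. It shows the basic idea of combining a structured customer record with streaming network events to guess why a caller is phoning in.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Customer:
    """Structured record, e.g. from a sales or billing database (hypothetical shape)."""
    customer_id: str
    area: str
    service_start: datetime


@dataclass
class NetworkEvent:
    """Streaming record, e.g. from network telemetry (hypothetical shape)."""
    area: str
    kind: str
    at: datetime


def likely_call_reason(customer, events, now,
                       window=timedelta(hours=24), new_customer_days=30):
    """Guess the call reason by joining the customer's area with recent events."""
    # Was there a disruption in the caller's area within the look-back window?
    recent_outage = any(
        e.area == customer.area
        and e.kind == "disruption"
        and timedelta(0) <= now - e.at <= window
        for e in events
    )
    # Assumed rule: a customer counts as "new" for the first 30 days of service.
    is_new = now - customer.service_start <= timedelta(days=new_customer_days)

    if recent_outage and is_new:
        return "disruption (new customer)"
    if recent_outage:
        return "disruption"
    return "unknown"


now = datetime(2024, 6, 1, 12, 0)
events = [NetworkEvent(area="NW", kind="disruption", at=datetime(2024, 6, 1, 9, 0))]
new_caller = Customer("c-1", area="NW", service_start=datetime(2024, 5, 20))
other_caller = Customer("c-2", area="SE", service_start=datetime(2020, 1, 1))

print(likely_call_reason(new_caller, events, now))
print(likely_call_reason(other_caller, events, now))
```

In a real system the events would arrive from a stream rather than a list, and the result would feed whatever the agent or chatbot sees when the call comes in; the join on area and time is the part that matters here.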

That’s interesting. You mentioned log data. Right? That’s the stuff we just kinda pitch over to the side and don’t really worry about too much. But this explains why you’re so bullish on streaming.

Right? And, I assume, on time series data and a whole bunch of other data, because it’s more easily consumed by GenAI solutions.

Am I correct?

Absolutely. And so if you think about innovation, right, if we keep on innovating on structured data, we can do that. There’s so many potential use cases, what have you. But if you wanna innovate on things that were untapped, that would be the unstructured data, for example, and streaming data.

Right? Because now it becomes easier for us to kind of use Gen AI, like the genomics data that I just mentioned earlier. You have thousands and thousands of pieces of information in these PDF files. Now you can use AI to kind of go deeper and get more information out of these PDF files and then store it in a two-dimensional table, for example.

Right? So yes, structured, very important. Not gonna go away. You can innovate more into that.

But if you wanna innovate deeper as an organization, look at where you’ve never touched before. You know, maybe with unstructured and streaming data, I’ll give you five, ten percent; if you increase that by thirty percent, you gain millions and millions of dollars’ worth of information. Isn’t that amazing?

To me, what you just said is the most valuable thing for CDOs, arguably, maybe of this entire conversation, which is, if you really want to innovate, if you want to be in the driver’s seat when it comes to the adoption of Gen AI in your organization and driving transformational value, then you gotta look where you haven’t been looking in the past. And that’s all of that unstructured data. It’s the log files. It’s PDFs.

It’s your SharePoint server that maybe you’ve never even looked at in the past, and finding ways to bring that all into a common management layer, which brings us right back to the fabric or something that looks an awful lot like a fabric. I love it. In our last few minutes, to the degree that we are prognosticators and we can see the future, where do you see things going in the next two to three years? Like, what’s exciting around the corner that maybe Microsoft is working on, maybe some of your bigger clients are working on?

What’s some of the cool stuff you see coming down the pipe?

I think I’ll take it from a client perspective because I know clients are preparing for AI, preparing for how should I be AI ready, if you will, right, if I’m not already doing some of the AI. I think from a client perspective, and this is probably more valuable to the audience, think about foundational items that you should implement.

For example, to your point, if you want to do Gen AI, Gen AI is not just on two-dimensional models. It could be in your emails. It could be in SharePoint. It could be everywhere. Right?

So think about the foundation of governance. That’s why we have Microsoft Purview. Think about how you can enable, for example, your business users, because with a lot of the clients that I see becoming successful, it is the tight collaboration and partnership between IT and the business units. So that culture of collaboration and enabling the business units is allowing them to be successful.

From a platform perspective, of course, start with the data, open formats. If you’re building new things right now or, you know, building new capabilities, make sure it’s open from a storage format perspective and from a compute perspective. Then from a data management perspective, think about automating, automating, and automating a lot of these processes. And then, of course, the consumption that we discussed.

Right? When it comes to AI, and this is not an option anymore, look for where you will get more value out of AI, for example. There’s so many use cases out there, and we are innovating heavily on AI capabilities, on simpler integration, if you will, on how we can also govern all of this.

And that’s why Microsoft Purview is an important strategic move for us. All of that tied together is simplification. If you are gonna think about modernizing, yes, you need Microsoft Purview. And then if you need data unification, what have you, that’s Microsoft Fabric.

And when it comes to AI, we got the AI studio with all the capabilities that you just mentioned about copilots, if you wanna build your own and all that stuff. So simplified, there’s three kind of, like, solutions we’re putting forward.

I love it. That is a great place to tie off. Emerson, thank you so much for coming on today and for sharing your insights with the CDO Matters community. Really, really appreciate it. Thank you. I hope to do it again sometime soon.

Absolutely. And thanks for having me, Malcolm. And, again, as you mentioned, at FabCon, if you’re looking for deeper dives on this, we’re looking forward to having that discussion. Just let us know.

Yeah. That’s awesome. FabCon, September twenty-third, Stockholm, Sweden. Check it out if you’re over in Europe.

It would be some time well spent. Alright. To our community, thank you so much for tuning in to another episode of CDO Matters. We will see you on the next episode sometime very soon.

Thanks, all. Thanks, Emerson. Bye.

ABOUT THE SHOW

How can today’s Chief Data Officers help their organizations become more data-driven? Join former Gartner analyst Malcolm Hawker as he interviews thought leaders on all things data management – ranging from data fabrics to blockchain and more — and learns why they matter to today’s CDOs. If you want to dig deep into the CDO Matters that are top-of-mind for today’s modern data leaders, this show is for you.

Malcolm Hawker

Malcolm Hawker is an experienced thought leader in data management and governance and has consulted on thousands of software implementations in his years as a Gartner analyst, architect at Dun & Bradstreet and more. Now as an evangelist for helping companies become truly data-driven, he’s here to help CDOs understand how data can be a competitive advantage.