The Data Scientist's Dilemma: Why Good Models Die Before They Ship

Episode Overview:

Data science teams are delivering results — so why do so many projects never make it to production? Malcolm Hawker and Kristen Kehrer, founder of Data Moves Me and former data science leader, dig into the organizational failures behind the disconnect: governance that blocks data access, business stakeholders who hand scientists solutions instead of problems, and why product management may be the missing layer in your data org. They also get into what AI actually means for data science careers, whether junior roles have a future, and how staying relevant now means building things — fast.

Episode Links & Resources:

Episode Transcript

Good morning. Good afternoon. Good evening.

Groovy.

Good whatever time it is wherever you are in this amazing world of ours.

I am thrilled today to be joined by Kristen Kehrer. Kristen is one of those amazing people who does, like, everything. She’s a renaissance lady. She is a data scientist. I think in one of the promo pieces, I said recovering data scientist. I don’t think that’s correct. I think you were accurate.

No. I am a data scientist. Not recovering.

You’re a data scientist.

She has led a data science function. She is a published author of a children’s book. She has an amazing LinkedIn learning course on machine learning. She runs her own company.

She is a consultant in the data science space, founder, principal, and owner of datamovesme.com, which you should definitely check out. If you’re looking for somebody who knows what they’re doing, right, and who is not McKinsey or Deloitte or Accenture, and who can actually guide you through an AI-based implementation, help you understand how to stand up a data science function, and get your hands around machine learning, you found the right person. I’m talking to her today.

The bad part about talking to somebody like you, Kristen, is that I feel like an underachiever. When I look at your resume, I was like, wow. She’s doing a lot of really, really cool and fun stuff. And you were data science before it was cool to be data science. How’d you land that? How did you get ahead of the curve?

Oh, actually, I mean, that’s going way back. So I got a bachelor’s degree in math in two thousand and four, and then I found myself in a job not making a lot of money in two thousand and seven. And actually, I was working for Coldwell Banker Residential Brokerage as the housing bubble hit. And I knew I was going to get laid off and I knew that I wouldn’t be able to support a family someday on what I was making. And I did some research and decided that statistics looked like an area where, well, and I loved statistics in college, but I thought it was time to go pursue a master’s degree in statistics. And I got out and started doing econometric time series analysis and forecasting in utilities.

And then found out that actually you could make more money in general analytics. And then the data science term became popular and over time I rebranded and made sure to find opportunities for modeling in my roles and just sort of went from there and kept evolving.

Well, but it started with an education in STEM.

Oh yeah.

Would you agree? I mean, that was the genesis of it all, focusing on mathematics, which not a lot of people would do. So kudos to you. What’s the difference? Is there any difference between somebody who knows AI and somebody who knows data science?

It’s funny because I have that thing in my head when people, you know, are talking about AI and I feel like they could just use the term ML, but at the same time, there are differences. And I think it’s difficult for me because I’ve sort of, you know, I got into computer vision and then I learned a little bit about vector databases and was using those. And then LLMs became popular. And so I sort of slowly worked my way through. So for a traditional data scientist, there are going to be gaps. If you’re a data scientist, you have the skills to pick up the other areas and start building things for your job.

So there are absolutely differences and things to learn, but I almost feel like it’s similar to somebody who has built models for retention and acquisition, and then they say, hey, you know, we need you to build a recommendation engine. And you’re like, I’ve never done that before, but you have the skills to go in and learn and figure out how to do it.

And I’m sure, like, with AI, the text data... typically most of the people in the data science world, we’ve been hanging out with relational databases, tabular data. It is different. There are, you know, there are things to learn just like I had to learn some things when I started working with image data, but at its core, it’s data. Right? So it is and it isn’t.

Well, okay. So you know, you touched on AI, you touched on machine learning and, you know, is it correct to say, and I’m playing kind of like the unfrozen caveman lawyer here because I think I already know the answer but I want this to be heard for our listeners. Is it correct to say that when we talk about AI, it’s just another form of predicting something? Right? Whether it’s predicting an outcome, predicting a weather event, predicting something. Is prediction kind of at the core of this?

And if it’s not just prediction, what is it?

Yeah. I mean, so the way that the LLMs work is that it’s predicting the best next word. Right. And so it is, you know, when I think of AI, if you called your logistic regression model AI, which tons of people did in the 2010s, right? And people would sort of guffaw, but it is using machine learning models or statistical models even for the purpose of prediction.

And I think where the AI designation comes and I don’t. So this one’s difficult because I once gave a presentation about AI to my daughter’s third grade class when she was in third grade and the kids started asking questions and they said, well, is, you know, based on your definition that you gave, would that make spell check AI?

And I was like, well, based on the definition that I gave, Yes.

Would we consider spell check AI now? Right. Because it’s like using a computer to do something that a human traditionally had to do.

And the answer is no, we wouldn’t consider spell check AI now. And I feel like the designation sort of in the industry is when you get to these neural nets that are very deep. They have, you know, more than three layers. That’s sort of where we say, okay, that’s AI versus everything else is ML. And so you may just be using a more shallow neural net and we might not call that AI, but it’s really fundamentally very similar, especially since nowadays a lot of times, you know, we’re just doing XGBoost fit, you know, with our features and getting a model. We’re not really as intimately close to the real model infrastructure anymore.

What’s a neural net?

Unless you’re somebody working with it.

What’s a neural net?

I mean, a neural net is a model architecture that So the first time I used a neural net was in twenty ten, right? And the reason why we decided to use a neural net was because there was a nonlinear relationship between the weather and the, like, sales for electricity, right?

And so I was trying to model...

Okay, that’s interesting.

Yeah, I was trying to build a model to allow the supply chain people to manage electric load capacity during heat waves and stuff like that. And so if you’re using an ARIMA or a linear regression model, it’s not going to allow you to model those nonlinear relationships as well. Like, linear regression, linear relationships.
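A minimal sketch of the "linear regression, linear relationships" point, under loud assumptions: the temperature and load numbers are made up, the 65-degree comfort point is hypothetical, and a squared-distance feature stands in for the neural net from the conversation purely to keep the example small. It only shows that a straight line on raw temperature can miss a U-shaped weather-to-load relationship that a nonlinear term picks up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(xs, ys, a, b):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: load is lowest near 65F and rises on both sides
# (heating when it is cold, air conditioning during heat waves).
temps = list(range(35, 96, 5))
load = [50 + 0.5 * (t - 65) ** 2 for t in temps]

# Straight line on raw temperature: the two arms of the U cancel out.
a_linear, b_linear = fit_line(temps, load)
r2_linear = r_squared(temps, load, a_linear, b_linear)

# Same fit on a nonlinear feature: squared distance from the comfort point.
curved = [(t - 65) ** 2 for t in temps]
a_curved, b_curved = fit_line(curved, load)
r2_curved = r_squared(curved, load, a_curved, b_curved)

print(f"R^2 on raw temperature: {r2_linear:.3f}")
print(f"R^2 on squared distance from 65F: {r2_curved:.3f}")
```

In practice nobody hands you the right transformation, which is part of the appeal of a more flexible model: it can learn the shape from the data instead of having it specified up front.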

And so it... oh, I don’t know what angle to go at this.

Go in whatever angle. Like, go in whatever angle. Right.

Using the metaphor of the third grade class, you’re not too far off in terms of the moderator of this podcast. So continue.

Yeah. So you have the ability to define the layers. And so I’d have, like, two sigmoid layers.

And there’s just different ways that you can choose to, you have some like additional variable, like settings that you can use to try and optimize the fit of your data.

I can’t come up with a really good definition right now.

So, but what you’re saying is that a neural net would potentially describe a complex array of relationships that are not linearly correlated, and maybe correlate is the wrong word, but that are nonlinearly related and that help explain the behavior of something.

Right? Where it isn’t, you know, I do x so y happens. There may be x and y as a part of this equation. They may not necessarily be related, but the combined impacts of all of them are having some influence over the behavior of some complex system. I would assume, maybe?

Yeah, yeah, well and that’s how a lot of it is, right? Like in that same job I’d use something called seemingly unrelated regression where I’d build two models and then there’s randomness in the error of that model, right? Because models are never perfect. And it could look at between the two models, is there actually non random stuff going on there and use that information to make it better?

Or, and I think that this is sort of what I’m getting at too, because I’m not an expert on neural nets, but every time a data scientist goes to attack a problem, they need to think about the different assumptions that need to be met, right? Like, I was once building a model in e-commerce where they said, okay, we have this retention model, it’s pretty good, but we think that some of these businesses have a seasonal usage pattern. It was small micro businesses that we were modeling, those were the customers. And, like, how do we make use of that so that we’re not messaging people thinking that they’re a retention risk when actually they’re an ice cream shop and they’re closed for the winter?

Right. And so I go on trying to figure out what model I’m going to use and I’m not going to use something where, so some of these businesses are only going to have been with us for a couple of years, right? Like I’m not going to be able to model something where I need to understand their yearly pattern and I don’t have at least two years worth of data. But on top of that, I want to make sure that I’m using a model that doesn’t have like a ton of assumptions that I need to fit because I’m going to have issues with constant variance and other stuff.

So I end up choosing like a TBATS model. Right.
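A minimal sketch of the ice cream shop problem, not the TBATS model mentioned above: a much cruder same-month-last-year comparison, swapped in only to show why a seasonal view changes who gets flagged. The usage numbers, the function names, and the 50 percent threshold are all hypothetical.

```python
def naive_flag(usage, drop=0.5, window=3):
    """Flag the latest month if it fell below `drop` times the average
    of the `window` months immediately before it."""
    recent = usage[-1]
    baseline = sum(usage[-1 - window:-1]) / window
    return recent < drop * baseline


def seasonal_flag(usage, drop=0.5, period=12):
    """Flag the latest month only if it is also low compared with the
    same calendar month one period (here, one year) earlier."""
    if len(usage) <= period:
        raise ValueError("need more than one full period of history")
    recent = usage[-1]
    same_month_last_year = usage[-1 - period]
    return recent < drop * same_month_last_year


# Hypothetical monthly usage, January through December, for a seasonal shop.
one_year = [0, 0, 5, 20, 60, 90, 100, 95, 50, 15, 0, 0]
ice_cream = one_year + one_year[:10]  # 22 months, ending in October

# Hypothetical steady business whose usage collapses in the latest month.
churning = [40] * 21 + [5]

print("ice cream shop:", naive_flag(ice_cream), seasonal_flag(ice_cream))
print("churning shop: ", naive_flag(churning), seasonal_flag(churning))
```

The naive rule treats the shop's normal autumn slowdown as a retention risk, while the seasonal comparison does not, and both rules still catch the business that actually dropped off. A single prior year is a thin baseline, which is why the history requirement raised in the conversation matters.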

And so, you know, a lot of it when I’m attacking a problem is just trying to understand, like, what is the problem? What does the data look like? And what type of model is going to allow me to model this without, you know, having huge problems and weird output because the assumptions of the model aren’t valid for my use case.

So what you just... and this is valuable, especially for CDOs and senior data leaders that have not managed or led an AI-based function or decision science function in the past, because what you just described sounds almost like cooking.

Like, where I’m gonna throw a little pepper in, but where you’ve kinda got an idea of the recipe or at least you know what you wanna make. Right? Like, but where it’s this very organic process where you’re throwing in a little salt, a little oregano, oops, too much of this didn’t work. Dill didn’t work. Highly experimental, highly iterative, where you’re trying to, in essence, maybe create a recipe.

Right? You’ve got you’ve kind of got some idea of the building blocks, but ultimately what you need to get to is this recipe, which is the model. You need to get there, but there’s no hard and fast way of how you get there. And how you get there requires experimentation. It requires iteration. It requires collaboration of going back and forth with like everything that you just described is the exact opposite of building a dashboard.

Literally.

Like a dashboard, you know the output, you know what’s like, you know, you know the numerator, you know the denominator in that metric, you know where the data is, you know where it’s coming from.

Yeah. Maybe name one thing in one repository and name something else in another repository, but it’s not inherently creative. That process is not... well, there’s data people who are probably screaming at their computers right now, yelling at me: yes, it is.

No. No. Because otherwise, also too, you might have two different orgs, like, fighting over what is the actual definition of this feature, but it’s not you don’t even have ownership to populate. And when I model, I feel like I always had ownership over choosing the algorithm that I used.

Ah, well, so there. So okay. That’s something else. Right? That is also very different.

Right? In theory, if your organization was extremely mature from a data and analytics perspective, you had every imaginable data governance policy completely documented and nailed down, you had a semantic layer and you had all the tools and, like, you knew what things were defined. You had the structure. You had all the metadata.

You had everything.

I would argue that it wouldn’t be a very creative endeavor to create a dashboard. But what you just said, in addition to my cooking metaphor, which may be good or bad, I don’t know. But what you just said is you had freedom to do a lot of these things and you had freedom to experiment, which again, I don’t think most data engineers would say that they have, those degrees of freedom. So what I’m trying to get at here, Kristen, is the kind of fundamental differences in a data science function versus a traditional analytics function, and what I as a CDO would need to know about the differences in how these work, and maybe even the differences in recruiting, the differences in managing people that do these two different things. That’s kind of what I’m trying to get at. What do you think about my little rant there?

The differences in managing the people specifically?

Let’s start... Do you see a difference, or are these all just, like, math people that all kind of swim in the same pool, generally?

I mean, so...

Go ahead.

Yeah. Like with the experimental nature of the modeling, you do need to give people more room to actually experiment and it does become more difficult to manage the actual, like, flow of information. I think that one of the things that I’ve always sort of seen go poorly is, you know, I’d be on an established analytics team and maybe marketing isn’t getting what they want and, you know, we’re not able to staff them more. So they say, screw you analytics team, we’re going to go hire somebody on our own.

And typically that person is junior or maybe, like, a little bit above junior. And then they just get railroaded with ad hoc requests that aren’t the highest value, right? Because the position was born out of, I want somebody to be able to pull this data when I want it, or I need a dashboard for this, and it’s not coming from, like, the higher level.

And when you have somebody who’s a little bit more, you know, junior or analyst level, they’re not used to pushing back either. And they don’t have the... you know, when I’m a manager, I’m the gatekeeper, right? Like if you ask me for, you know, a test where, regardless of the decision, you’re going to roll it out anyways, I’m going to say we’re not doing that, right? But not the analyst that is placed on a team that doesn’t have leadership, like analytics leadership, flowing up the pipeline.

Yeah, I see when people ask me about differences and stuff like that, that’s one thing that I like to call out because I have seen it numerous times and I don’t know that I’ve ever seen it done well. And I think, you know, because all of the teams that I’ve been on, they did try to think about, you know, strategy, the ROI of different product, different data products or different data projects that we do and what we should be supporting and how we should prioritize and how we think about it. And then, I don’t know, man, when I see people, when I see these onesie twosies get placed on the, you know, oh, we need an analyst. It just, it becomes like an, like a, I don’t know. I’d almost want to say like an ineffective business analyst position and business analysts are not ineffective. I’m saying that positioning is ineffective.

Well, you just touched on something that I’ve seen over and over and over again, which is this inherent tension: marketing wants to do their own thing, right? Versus some idea of, we’ll just say, for lack of a better word, centralized oversight. Right? A centralized controller or centralized data and analytics function. And you’re absolutely right. If you as a data leader don’t give marketing what they want, they’ll just go hire somebody.

They’ll just go hire somebody. That’s true from a data science perspective or a classic analytics perspective, a governance perspective, you name it. They’ll go do what they need to do.

But I’ve had a lot of conversations with CDOs over the last couple of years, especially after the explosion of AI, where a lot of them are trying to find ways to build more scalability into their data science functions.

And for them a lot of the answer to that drives them towards more of these centralized approaches and some of these centralized models, or centralized operating models, I should say, where I do see this tension happening. I’ll give you an example where, you know, data scientists kind of naturally just wanna have root access to every source database, and they wanna go and get all the source data on their own, and they don’t want to rely on traditional ETL. They don’t want to rely... exactly. Right? You’re nodding.

Right?

Like Give me access.

Exactly.

Yes. Exactly.

But I think a lot of data leaders hear that and the response is, oh, heck no. No. We’re gonna put some controls around this because my governance policies require controls or my auditors require controls or whatever it is. What I’m trying to get at here is that the pushback is not because the data scientist doesn’t want to play nice or because the data scientist is a curmudgeon. It’s because of the inherently creative nature of what you’re doing and the inherently experimental nature of what you’re doing. And if you can’t explain why something looks the way that it looks in Databricks, because there was some ETL process along the way that you don’t know existed or weren’t involved in creating, or there could be some transformation in the data that you didn’t even know about, that’ll affect the performance of your model.

If I don’t have access, what am I doing? Are you going to give me a flat file and I’m going to build a model on that? And then when that model goes to production, what happens? The data needs to flow, you know?

And so...

Well, I’m trying to touch on this tension because I do know I’ve had this specific conversation where the data science function comes across, and I’m not saying you. I’m just saying this is how it works.

Maybe me.

Comes across as, like, high needs and doesn’t play well with others because they want to go and do all of this experimentation and they want access to the source data, and they don’t wanna have to deal with data that has been touched in any way, that may be sitting in Snowflake or Databricks and gone through transformations that they don’t know about or don’t control.

And my perspective is increasingly... like, four years ago, five years ago, I’d be like, heck, yeah. Get them in line.

Yep. Get them in line. You know what? This is the reason why data science is seen as this, you know, R&D thing where, you know, ninety five percent of the stuff never makes it to production, and that’s a problem and we need to get our hands around this and provide more scalability and more repeatability and more control.

And I’m not entirely sure that’s the right answer to that. Cause I think if it is that experimental and if it is that iterative and if it is that creative, if it’s like the cooking I described, maybe trying to exert more control over it won’t do the right thing. I don’t know. What do you think?

Well, I mean, so the POCs that I’ve had that have failed, it’s been because I need to go over to operations to ask them for data around whatever it is that fits my use case and they give me that data. That’s great. I build a model with that. And then I say, here’s my beautiful presentation with my results.

This thing works.

Let’s build it out. And then like, it just dies because no one’s going to like build out the data to make it work.

I don’t know. I’m not exactly sure why it stops there, but it is absolutely and it’s and it’s been more than once or.

You know, it’s that we use third party data as part of, you know, a behavioral customer segmentation or something like that. And then it’s like, Okay, we’re going to, you know, move forward with this where I mean, maybe that is something that doesn’t you don’t need to have constant data. Maybe you could get like a refresh of that or whatever.

But, you know, then they decide not to buy the data and then it’s dead. And it’s like, we I spent a couple of months on this. Did you know you weren’t going to buy the data? Did we know we weren’t going to set up the data from operations? Right. Because now at that point, I don’t have access. So another team has to build it out and like no one’s talking to them about what their prioritization looks like.

Well, okay, this starts to get into conversations around product management. And you and I had a great offline chat about the pluses and minuses of product management. I’m a believer, and I talk a lot about this in my book, about how product management I think is necessary in our field, but you haven’t had a lot of good experiences with it. And what you just described was bad product management all day long, I think.

So that was no product management. That was like twenty eighteen. That was before I feel like around then was when product management started getting as common. Do you know when product management sort of had its rise for data projects?

Well, yes, I do. And I need to be clear that what we’re talking about in terms of the more recent craze is actually data products.

I would argue it’s data products, not product management. I would argue these are two very, very different things.

Okay.

One is a shift left mindset, get as close to the source as possible. Right? It’s all about scale. It’s all about repeatability.

That’s for data products. Data product management, I would argue, is more of a shift right, where it’s get close to customers, get used to the use case, get close to the business. But to answer your question, the popularization of data products was very much an outcome of the data mesh and all of the hype and fervor around the data mesh that started in late twenty twenty one, early twenty twenty two with the publishing of Zhamak Dehghani’s book, Data Mesh, by O’Reilly. And I would argue that’s exactly when things started to take off. So yeah.

So you were doing something in twenty eighteen where if you didn’t understand the limitations of some third party data, you didn’t buy the data or you didn’t anticipate what you would need to go to production, you didn’t better anticipate customer needs. I would argue that is entirely a failure of product management, whether you had the function or not. Whether that was a business analyst who was doing that stuff or a leader or senior manager who the heck knows. I would argue that is a failure of product management because that’s the kind of stuff you should know before you start tinkering in a POC, don’t you think?

Yeah, no. And when you tell me to go build a model and to go get data from operations, I assume that, you know, and like, you can’t assume, but at the same time as like a senior data scientist and someone’s telling you to do it, you just assume that they’re going to like build out the infrastructure required to support it if you build it. I don’t know.

Well, but this comes back to the pushback issue, right? And do you have the chops to push back?

I suspect if you’re a data scientist making a good salary more than likely six figures or even several six figures, you probably have the chops to push back and say, and ask the question of, okay, instead of telling me the model you want, why don’t you tell me the problem you’re trying to solve?

And then we can collaborate on what the best way to do that might be.

What I just described that, hey, tell me the problem you’re trying to solve. Instead of giving me a solution, right? Give me a problem and then we’ll collaborate on the solution. That is, I would argue that’s product management. But far too often the business comes to us with solutions. They don’t come to us with problems. And then we go build them and then they’re like, ah, no, that wasn’t what I needed.

I’ve had that happen to me like a zillion times.

Which is why I love the idea of product management. So putting somebody in the middle, whether it’s a business analyst, a product manager, a senior manager, who is asking the question of what problem are you trying to solve. Right? So, you know, that’s where I think we need to go.

But I mean, I am very big on what is the problem that we’re trying to solve.

I have historically preferred to speak to the product owner directly because they understand. So if I’m building a fraud model, what does fraud look like? What are, when you see some, but what do you actually see in the data? What behavior do you see so that I can understand what features I’m going to need to try and build this up, define what fraud is. Like, what does, how do I know that it’s actually fraud?

And the people that I’ve worked with in a product management capacity historically just didn’t have the deep product knowledge, because they weren’t the end user for this product. They weren’t the person who, day in and day out, is seeing this behavior come through, who is dealing with the issues when they happen. And they also don’t have the chops to, you know, start thinking about, like, how am I going to create my dependent variable? What is that going to mean in terms of what algorithm I’m going to use?

What algorithms can I not use? And so I can see a world where product management is awesome. But what it has historically looked like to me is we have fifteen people in a room and the product manager is leading the meeting and he’s talking to the business and I haven’t even had a chance to talk and I’m the head of data science. And then he says something like, We’re going to start with a linear regression model as a baseline.

And the dependent variable’s binary, that’s not even an option. You know, and I’m just like, what is going on over here? And this is just, these are just my experiences that I’ve had.
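A minimal sketch of why a linear regression baseline is an awkward fit for a binary dependent variable, using made-up data and a hand-rolled gradient-descent logistic fit (in practice you would reach for a library). The feature values, labels, learning rate, and step count are all assumptions chosen to keep the example self-contained.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Logistic regression p = sigmoid(w * x + c), fitted by plain
    gradient descent on the average log loss; returns (w, c)."""
    w, c = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + c) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        c -= lr * sum(errors) / n
    return w, c


# Hypothetical data: one risk feature and a 0/1 outcome (say, fraud or not).
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

# Linear regression treats the 0/1 label as an ordinary number.
a, b = fit_line(xs, ys)
linear_at_0 = a + b * 0
linear_at_9 = a + b * 9

# Logistic regression models the probability of the label being 1.
w, c = fit_logistic(xs, ys)
prob_at_0 = sigmoid(w * 0 + c)
prob_at_9 = sigmoid(w * 9 + c)

print(f"linear 'probabilities' at x=0 and x=9: {linear_at_0:.3f}, {linear_at_9:.3f}")
print(f"logistic probabilities at x=0 and x=9: {prob_at_0:.3f}, {prob_at_9:.3f}")
```

The straight line produces values below zero and above one at the ends of the range, which cannot be read as probabilities, while the logistic outputs stay between zero and one by construction.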

Well, and sadly, I think that’s almost the norm. Yeah. Right? Almost the norm, particularly within the data field. And I’ve had this conversation, like, many times with many data folks. I think if you asked Joe Reis the same question, he’d say the exact same thing that you just said.

I think I’ve heard him. Yeah. Like No. No. No.

I’ve had this conversation with him recently. And yeah, where it’s like this caricature of the product manager from the movie Office Space. Right?

Where it’s like, tell me what it is you do here. Well, you know, I take the requirements from the business and I give them to the engineers. Okay. So you build the product.

No. No. I don’t.

Right?

Like it’s literally the caricature of the product manager in Office Space.

I won’t go into a lot more detail. I don’t know why that’s been on my mind today. But I think that’s almost the norm. Right?

Which is unfortunate for a product management guy like me, because that is product management done poorly. That is product management done poorly all day long. And it feels like that to any engineer, whether you’re a data engineer or whether you’re a software engineer. Like, I’ve managed software engineers for years and years and years where software engineers are saying the exact same thing.

I’m trying to code a solution here. I’m trying to build a solution, and I can’t get a straight answer from this person. They can’t define the word fraud.

Right? Like what you just said. Right? Like, they can’t define what fraud means, and I should just intrinsically know what it means. And then the engineer gets put in this untenable position of do nothing, push back, or take a... Right?

And then they come back to the business, and the business is like, you completely missed the mark. And they’re like, but I didn’t know what it was.

Well, but this is exactly what leads to this belief that, well, the business doesn’t know what they want.

Right? And I have managed teams of product managers, and every time I’ve had product managers actually tell me that, I said, BS, they know exactly what they want. They just can’t communicate it very well. Right? And this, what we just described, this thing, whatever it is, call it product management, call it the gap between the business and technology, the business and the AI function, the data science function, the analytics function.

That whole thing is, I think, exactly where we need to be heading.

Because in a world where AI is doing more and more of the lifting on the back end, I think there’s this screaming need to solve the problem that you and I just, I think, well articulated.

So what do you think about that? Are you concerned about the impacts of AI in our world from a career perspective? Are you concerned about job security in an AI-driven world? Are you concerned about some of these things or are you more of an optimist? Where do you fall in all that?

I think I’m both because right now we are seeing layoffs and you know, I think last time I checked, salaries looked like they might have been a little depressed, but that’s not, you know, that’s not statistical, but I also think it’s misguided. And I think that leadership doesn’t necessarily understand what is involved and the statistical understanding and the understanding of the business that is required to be effective as a data scientist.

And so when you don’t have junior data scientists, you have no one to come up, because the tenure of data scientists is not always so good. They do a couple of years and sometimes bounce. And that’s, like, very, very normal in data. I mean, maybe people will hold onto their jobs more now that hiring isn’t so easy as it was five years ago.

But I honestly think that companies are going to feel some pain of their decisions and that it’s going to come back, right? Because over the last fifteen years, we’ve always seen advancements where the data science pipeline, parts of it become a little bit easier. The pipeline starts to feel a little bit smaller. I remember when we got Adobe Analytics for doing, you know, web analytics and I was like, oh, are they even going to need as many people anymore? Because this is really slick. This gives me, you know, site data, which is huge.

And, you know, helps me synthesize it really quickly. So what are we going to do? And the answer was always that there were higher value problems to work on. There is always a pipeline of work.

No business ever says, Wow, we’re done with data science. There are no more problems to solve. Right? You have problems.

And the places where I’ve actually seen the most cutting of roles because of AI have been things like: our chatbot is ready to go live, so we’re going to get rid of these customer agents, because, you know, the way we said we were going to get ROI on this project was through cost cutting instead of doing something better and helping the customer experience and getting brand loyalty and stuff. I don’t know.

And I have literally, as a member of that leadership team, watched it go poorly. They laid off people, and then the model isn’t great, and people wait on the phone forever to get help, and they’re frustrated.

I don’t think the answer is through cost cutting. It might be a slightly smaller workforce. Maybe we don’t need quite as many people, but there are always problems to solve. You can always do better. You can always make your brand better. You can always sell more products, you know, and you can identify the data projects that will help you get there and optimize your ad spend and optimize everything else.

That’s my rant.

Well, what you just described, all that cost cutting, is gonna be fine if everybody does it.

But if two or three of your top competitors don’t, and they’re using AI...

If they’re using AI and they’re keeping those people, and they are realizing a net twenty, thirty, forty percent gain in overall productivity while keeping all of those people, they will by definition have a higher level of productivity than you would, because you let go of the people. Yes. Because people are inherently productive. It’s what we do. Right? So if you’re out there and you’re letting go of twenty, thirty, forty percent of your workforce because you’re saying, well, I can keep rowing the boat at the same speed.

Yeah. Like, are you keeping the lights on? Are you trying to skate?

Whatever. Well, right. Exactly. But let’s just say that’s a reasonable argument to make for whatever reason. I don’t think it happens to be.

But let’s say that with AI, I can keep going the same speed and cut costs by thirty percent. And that’ll go straight to earnings per share and my stockholders are really happy. I can do share buybacks, hip, hip, hooray, whatever.

But if your competitors don’t and they’re thirty percent more productive than you are, you’ll be behind rather quickly. So I see that being, thankfully, how our market-based economy continues to work, today at least. And this idea that, you know, there won’t continue to be a demand for humans?

It just doesn’t pass the smell test, as long as humans are productive and can add value. Now, maybe wages are depressed, I think. I think maybe there’ll be some downward pressure on wages. I think there’ll be some other kinds of massive disruptions around what we do and how we do it. I think we’ll need to learn how to reskill and do things differently.

But I’m not entirely sure that companies are just gonna go and whack large swaths of people.

And if they are right now, the ones that are making the news, what was the company that just laid off like forty percent of this Yeah.

No. There was another one.

It was in the news recently, but to me, like, if you think it’s AI, it’s probably a broken business or a broken business model or something else. Right? I think, because we’re just not there yet. Right? To say I could lose forty percent of my workforce and still maintain the same output, I just have a hard time seeing that.

So what you’re telling me is that you’d be comfortable steering your kids towards a career in STEM.

That’s really, and I’ve spent a ton of time thinking about that because when other people ask me like, hey, and I get this question a lot because I’m hanging out over here and people want to know, you know, would you send your kids to school for computer science? I don’t necessarily feel comfortable telling other people.

Because it’s a bet I’m making, but would I send my kid to school for computer science? Absolutely. Absolutely. And I think that a lot of people are going to go into trades and I think a lot of people are going to avoid college and supply and demand. And I think that there will be more room for STEM people.

Because when you think of the complexity of the systems that software engineers are managing, AI is not going to take that over. You need a human that understands your architecture.

And so, yeah, I think those jobs are going to be around.

And if other people decide not to go to college, great, I’m sending my kids to college for STEM.

I appreciate your candor. I appreciate your candor and maybe there’s a few thousand people that listen to this and now know your secret.

Now, I actually agree with you, and I would do the exact same thing for the exact same reasons. However, I do think that there need to be some things that kind of change.

And one of the things that I’m most concerned about is how are kids gonna... and what I mean by kids, showing my age, is somebody coming out of a four-year university who is green as green can be and who has no real kind of practical experience. You know, and maybe they’ve done all the things. Right? Maybe they’ve done extracurricular work.

Maybe they were part of the Python club online or at college. Maybe they’ve done, you know, maybe they vibe coded their own CRM. Who the heck knows? Maybe they’ve done all the things, right, that you’re supposed to do, and who knows what that’s gonna look like twenty years from now.

But I’m a little worried about how will they get the experience they need when AI probably can do all the things that an entry level person can do.

And then I think about it. And then I think about, like, big consultants.

Right? Like the Deloittes and the Accentures and the EYs. All of those businesses are built around having deep subject matter expertise.

And the models are getting better, yeah.

Well they are. However, what those companies do is they will hire people straight out of school and then make a major investment in grooming them and teaching them. Yeah. And bringing them up to that level. So maybe, I don’t know, maybe I worry needlessly. I’m not sure, but one thing I know for sure is that things are changing quickly.

No, I’m not worried about the same thing. And you know, if my kid needs a cool portfolio, guess what? I’m gonna be coding with her. You know, that’s how we’re gonna get over some of that, you know.

How are you working differently? What are you doing differently to stay on the edge? Right? To maintain relevancy and to be at the bleeding edge of where all these changes are? What are you doing in your career or your day to day that maybe you weren’t doing a couple of years ago because of this pace of change?

Well, I mean, it’s just gotten so much easier to build stuff. So I get a weird idea. So my daughter, my sixth grader, is a competitive dancer. And so for the last couple of years I’ve gotten to choose her music.

And her outfit. And so I wanted a song similarity search.

And so I built an app to do it. Now there are some that exist online. I wanted to do something better. Then I found out it was really hard because Spotify changed their API last November. Nope, the November before, I think it is now.

And so you’d only get thirty seconds of music and that wasn’t enough to get, you know, I really wanted the essence of some things, because I didn’t know which piece I was getting, which thirty-second clip.

And if you try and download data illegally from YouTube, it’s garbage. So I ended up leaving that project, right? Like I stopped working on it. But like the thing is, is I spun up an app that worked in like three days where I had all these filters for tempo and how much of it was vocals and, you know, all this stuff. And I had never worked with audio data before, but I was like, oh, I’ve done image data. I’ve done text data, like, I’m going to try audio data for the first time.

And so that’s sort of how I stay up on things: I build stuff. I will just say, you know what, I’m going to try this thing, and I go do that.

And then also, the influencer work, because when somebody comes to me and says like, Oh, hey, you know, we’d like you to chat about this product. I have to get in there and try it and see what it’s about and learn about it.

And then, you know, with the Mavens of Data podcast, I’d intentionally invite people on my show that, you know, could talk about things that I wasn’t so sure about. It’s like, you know, why do you like knowledge graphs?

Which I’ve used before, like I had modeled data in a graph back in like twenty thirteen. It wasn’t like a brand new idea, but like, yeah, like, no, like, really tell me, like, where do you see the benefits?

So yeah, a bunch of different ways, and like all over the place, like a, you know, true ADHD person.

Was it Claude Code? What’d you use to build?

So that might’ve been the time where I started going back and forth between ChatGPT and Claude. I’ve definitely used Copilot.

Okay.

Claude and ChatGPT. I prefer Claude. Claude is totally better. I was getting frustrated because I’d always run out of credits right in the middle of something good.

And so I had gone back and forth for a while, but now if I’m coding, I am a solid Claude person.

Well, I just spun up Claude Bot two days ago. So I don’t know if you’re following all of the fervor around Claude Bot and all the things related to Claude Bot. Yes, no? Anyway, it’s amazing.

Somebody branched off Claude and Claude Code, and it’s this agentic network that is just insane. And I can ask my bot to basically go build anything. It’s nuts. I’ve got it running on a Mac mini, and I’ve got it, like, sequestered in its own private room, you know, completely walled off from the rest of the world, with just a contained blast radius.

And this thing is insane. It is insane. I’ll send you the URL if you want to take a look. It’s out there. I mean, anybody can go download it.

It’s it’s crazy.

Anyway. Alright. With that, Kristen, thank you so much. I really appreciate you taking the time to share your insights and your knowledge with our community. If people wanted to reach you, to connect with you, to hear you more, what would they do?

Find me on LinkedIn. That’s where I hang out.

You got a hundred thousand followers on LinkedIn. I mean, you got a big network.

I hang out there a lot.

Alright. Well, I hope our paths cross at an event sometime very soon. The last time was in New York almost two years ago now, which is kinda crazy, but I hope sometime in twenty twenty six we get to see each other at an event. That would be fun.

With that, hey, CDO Matters community. You just listened to the ninety-ninth episode of this podcast. That is crazy. I cannot believe we are knocking at the door of one hundred.

If you’ve got this far and you listen to the podcast, please do the like, please do the subscribe, the things that keep the algorithms happy because we all need to keep the algorithms happy. And stay tuned for episode one hundred where we’re gonna do like a bro show. I didn’t mean it to be that way, but I’m bringing back some of the returning champions of the podcast. Our shared friend Scott Taylor will be on the one hundredth.

I love him.

My buddy Samir Sharma, my buddy Juan Sequeda. Those three guys will be on the one hundredth. We’re gonna have a data rant. So you don’t want to miss that. Stay tuned for the one hundredth coming up. But with that, Kristen, thanks again.

Oh my god. Thank you so much for having me.

Alright. With that, thanks and we will see you on another episode of the CDO Matters podcast sometime very soon. Bye for now.

ABOUT THE SHOW

How can today’s Chief Data Officers help their organizations become more data-driven? Join former Gartner analyst Malcolm Hawker as he interviews thought leaders on all things data management – ranging from data fabrics to blockchain and more — and learns why they matter to today’s CDOs. If you want to dig deep into the CDO Matters that are top-of-mind for today’s modern data leaders, this show is for you.

Malcolm Hawker

Malcolm Hawker is an experienced thought leader in data management and governance and has consulted on thousands of software implementations in his years as a Gartner analyst, architect at Dun & Bradstreet and more. Now as an evangelist for helping companies become truly data-driven, he’s here to help CDOs understand how data can be a competitive advantage.

The CDO Matters Podcast Episode 99

The Data Scientist’s Dilemma: Why Good Models Die Before They Ship with Kristen Kehrer

Kristen Kehrer
