Data Operations vs. Data Analytics
Speaker 1: This is Catalog and Cocktails presented by data.world.
Tim Gasper: Hello, everyone. Welcome once again to Catalog and Cocktails presented by data.world. It's your honest, no-BS, non-salesy conversation about enterprise data management with tasty beverages in hand. I'm Tim Gasper, longtime data nerd, product guy, customer guy at data.world, joined by Juan Cicada.
Juan Cicada: Hey everybody, I'm Juan Cicada, the principal scientist of data.world, and as always, it's Wednesday, middle of the week, end of the day, and time to have drinks, and time to chat about data, and what is working, and what is not working, all that stuff. Today we have a guest, and a lot of things are happening today for our guest. It's Bethany Lyons. I have been following Bethany on LinkedIn for a while now. She was just changing jobs and just kind of starting her company, and all the questions that she was asking, all the content that she was generating, all the conversations that she was engaging in just got me so excited. I remember the first time Bethany and I chatted. It was on a Sunday and we chatted for a couple of hours-
Bethany Lyons: It was on a Sunday.
Juan Cicada: ...with my baby and everything. Anyways, Bethany Lyons, who today is announcing that she's the Chief Product Officer at KAWA Analytics. Bethany, it is a pleasure to finally have you on the show. How are you doing?
Bethany Lyons: I'm super excited. I'm so excited to be on the show, and especially on International Women's Day. Woo-hoo!
Juan Cicada: Yes, love this. This has all worked out and you have a lot of announcements today that you started announcing, but first of all, let's kick it off. What are we drinking? What are we toasting for? There's a lot to toast today.
Bethany Lyons: We are drinking red wine. It's 10:00 PM in London here, so it's kind of an appropriate drink for the hour, and we're toasting a new product that is launching a new revolution, the next frontier in analytics. That is what KAWA is building, so self-service operations. We'll talk more about it on the show.
Tim Gasper: All right. Awesome.
Juan Cicada: Tim, how about you? What are you drinking? What are you toasting for today?
Tim Gasper: Today I am drinking some Old Forester King Ranch, which is a very strong whiskey, so I've got to be careful how fast I drink this here, and I will cheers to new things. Very excited for you, Bethany.
Bethany Lyons: Thank you.
Juan Cicada: Yeah. And I'm having just a classic gin and tonic, but this is a gin. I think it's the sweetest gin somebody left at my house. I'm like, " I don't know where this is"... This is actually pretty good and let's cheer to all the new things that are happening, so cheers, Bethany.
Bethany Lyons: Cheers.
Juan Cicada: All right, so our warmup question of today: operations. So Operation is a game where you've got to do surgery on a guy, right? What's your favorite board game?
Bethany Lyons: So my favorite game is the Settlers of Catan. I really love kind of collaborative competitive games, and I used to play. So I had this friend who was a mathematician and a game designer at that, and he'd do these Sunday games afternoons, and he could never beat me with all the math and logic in the world because I was just better at collaborating with other people, and that turns out to be the strategy for winning that game, not math and logic.
Tim Gasper: I love Settlers of Catan. I have so many good memories of when I first started to get into startups and we would always have a Settlers evening where we'd eat pizza, drink beer, and play Settlers, so good memories there.
Bethany Lyons: Nice.
Juan Cicada: How about you, Tim?
Tim Gasper: For me, I love a good game of Risk. That's always a fun strategy game for everyone. What about you, Juan?
Juan Cicada: So one that we play a lot in my house is Sequence. I don't know if that's one of all the cards, and you have to get pats and stuff, but a game I've played a couple times is called Wizard, and it's an interesting game because you want to win, but you also want to lose, so you want to say how much of it basically, " Do I think I'm going to lose this game?" And I do want to lose because if I don't lose, then I actually am losing points. Right? It's an interesting kind of game on winning and losing dependent. Losing is also winning and winning can be losing, so I think a lot of interesting stuff to be done here, but all right. Let's kick it off on a discussion. Honest, no bs. Bethany, why do you say that we're still doing data analytics wrong?
Bethany Lyons: So I guess, yeah, I was talking specifically about self-service analytics. I guess self-service analytics is a very data-centric, data-team-centric view of the world. It's sort of like we have this backlog of requests and it's accruing, and we don't want our backlog to accrue. We want it to go down; ergo, self-service analytics. It's kind of like, to resolve our request problem, we want to push the requests back to the business, but actually, the business doesn't care about self-service analytics. They care about being able to operate, design, and measure their business processes, and those things happen to require analytics, so I guess it's about shifting the focus of the goal to supporting the operation, design, and measurement of business processes, rather than facilitating the business doing their own analytics. Right?
Tim Gasper: But Bethany?
Bethany Lyons: Yeah?
Tim Gasper: I've been told that self-service analytics is the answer to all problems in the world. I don't understand.
Bethany Lyons: Yeah, I mean, no.
Juan Cicada: Okay, so how did this all start? Because I would agree that, what is it, five, seven years ago? I mean, 2015 or so would probably be the kind of timeframe where you would be hearing so much about self-service analytics, and then you're like, "Oh, that's a solution. We need to empower the people who need to answer questions to answer their own questions on their own." So that was the wrong thing to go do?
Bethany Lyons: Well, no. I think self- service analytics got taken out of context. I think self- service analytics was really designed for analysts. It was Tableau, and I was at Tableau at the time, and Tableau pioneered self- service analytics. I think they literally went to Gartner and made it a term that we now use, but it was really about shifting the power from technical teams to analytical teams, and it kind of got taken out of context because there's this need to go the next mile, which is facilitating business operations and self- service analytics wasn't designed for that. So I guess if you take a step back and look at what are the three types of people within the business, there's sort of the business operators who are trying to move cases through a process, essentially, like sales people moving prospects through a sales process; support agents moving cases through to resolution; accounts receivable agents moving invoices through to collection, those types of frontline business operators, and then you have the people who design processes, mostly the job of management, and then you have the people who measure the performance of the processes, like your leadership team. And so all of those goals require data. I think analysts typically work to serve the latter two, the designing and measurement of processes. That's the user that was enabled by self- service analytics, and we don't have a technology platform yet that supports that first use case of frontline business operations, and so we've been extending self- service analytics software, overextending it into something it fundamentally wasn't designed for.
Tim Gasper: Interesting. I think I agree with your premise here, and I guess just trying to unpack that a little bit. Is this why we see some of this recent obsession with things like reverse ETL? How do we push the analytics back into things like Salesforce, and Gainsight, and sort of more of the line of business tools because of some of this gap between, " Oh, we have these self- service analytics tools," but then it hasn't quite fulfilled the needs of these business operators like we might have hoped.
Bethany Lyons: Exactly. Yeah, because these business operators, their use of data... So the way I kind of see it is the use of data for a business operator is basically to fill the gaps in between their systems, so they have a CRM or an ERP or an HR management software, pick your business process, to manage the really standardized parts of their job and of their business process, but then there's all this stuff that falls in the gaps between systems, and that's kind of what gets labeled as needing self-service analytics, and it's actually not analytics at all. It's facilitating processes that don't have an application that's designed to manage and simplify and automate them. I think if data teams took more of a view of like, "What are the processes that span multiple systems and how do we enable those processes?" we could chip away at this backlog in a... Well, not chip away. Right now, we're incrementally chipping away at it one request at a time, but if we took this view of like, "How do we facilitate these cross-system business processes?" we could just ax off a huge set of these incremental requests that are coming in.
Juan Cicada: All right, so-
Bethany Lyons: One thing that always really surprised me about... When I worked at Tableau, I had a lot of insight into how Tableau is deployed. And in 90% of cases it was not for exploration and insights at all, which is what it was designed for. It's being used in banks as the glue between systems to do stuff like mortgage remediation, and it's a way to enable business users to encode business logic that they need to apply to data to run a process that's maybe a little bit ad hoc, and there doesn't exist a Salesforce for that.
Juan Cicada: This is fascinating. This is fascinating. I mean, you're just giving the world a lot of insight right now. It's like, "Well, this is what actually has happened at Tableau." I think we've seen... You said something really important there, which is the business logic. There is this logic embedded inside of Tableau where that logic should have been embedded in some sort of application or something else, but that logic is actually helping those users operate, get their job done-
Bethany Lyons: Exactly, yes.
Juan Cicada: ...in analytics. So maybe an aha moment I'm having right now is that the moment you realize that there is a lot of business logic tied to your analytics tool, big red flag.
Bethany Lyons: Yes, huge red flag, and that's actually the main value of analytics for business users is not the analytics at all. It's the business logic that enables them to operate.
Tim Gasper: So when you start to double down on self-service analytics and doing it much more in the traditional way, we'll call this the old way or whatever, right?
Bethany Lyons: Yeah.
Tim Gasper: What goes wrong there? You started to talk about how you had a lot of one-off requests and things like that. Is that kind of the big thing, or what else goes wrong with the more traditional approach around self-service analytics?
Bethany Lyons: So I think we're living in the Google era where a business person is used to being able to just type a question into their laptop and get an answer, and so they expect that when they ask an analytical question, it should be Google-search easy, and it turns out that it's not. It's incredibly complex to answer very simple questions just because the data's in totally different systems that don't speak to each other. It's not modeled correctly for the question. It can take an 18-step process to answer a fairly simple business question, so I think we don't yet have this appreciation that data engineering is in many ways more complex than software engineering, and so there's just this expectation of, "I have a question. Give me an answer." And so we failed to apply product management principles to filtering out some of those requests, because every time a customer has a feature request, you don't just pipe it into your product team and build it. You say no to about 97% of the things because it doesn't have significant enough value or it's not broadly applicable enough; whereas we haven't adopted that mindset in data. It's just kind of like, "Oh, my God. This person has a question. We have to give them an answer." And it's like, "But maybe you don't because maybe it's not important enough to be..."
Tim Gasper: Right. There's very much this conveyor belt mentality around data where it's like, "Build the report. I need another report."
Bethany Lyons: Yeah. And I spoke to this VP of data at an e-commerce company last week and he said at his company, they have 8,000 dashboards for 2,000 employees, and I'm like, "But why? What are people doing with it? That's crazy. That's insane. Put a product manager in between the business and the data people to figure out what's actually valuable and what's worth doing."
Tim Gasper: Yeah. One of our customers at data.world has I think 45,000 Tableau dashboards. It's nuts when you get into that level, right?
Bethany Lyons: It's nuts. It's absolutely nuts. Yeah.
Juan Cicada: So I'm actually realizing here, this is a great takeaway already, that you put in a product manager, and one of the things that the product manager must be looking for is like, "Is this an analytical request or is this an operational request?" And the moment that you identify that, you're like, "Wait. This actually should go back to the data teams that are actually supporting some operations in a way, or actually back to the source. Maybe it should be done all the way down in Salesforce, or there's a gap. Maybe we do need a tool to go deal with this." I think it's really important to be able to understand whether the requests are analytical requests or whether they're operations needed just to get my day-to-day job done.
Bethany Lyons: Exactly.
Juan Cicada: I'm really kind of diving into this very specific detail because this is a big aha moment for me. I really like this, the whole "where is your business logic?" If your business logic is in analytics, big red flag.
Bethany Lyons: Yeah. And I think that's exactly right. You need to be able to field and qualify what type of request it is rather than just like, "Oh, now let's get data, then let's model it, then let's query it." It's like, "Yeah. If it's an analytical request, if it's like a help-me-measure-my-business-process and it's coming from an executive, yes. Put an analyst on that and answer the question," but if it's an operational request, it's coming from a salesperson. It's like, "Well, there's maybe 2,000 other people who do the same job, who have the same question, and so maybe you should build a product that enables them to integrate this in a process that's just part of their daily workflow."
Juan Cicada: And that's another point. It's like, "Oh, if this is the question that you're asking and you're a salesperson, who else is probably asking this exact same question, maybe just changing some of the parameters?" That's an indicator, too. It's like, "Oh, you need to know how many people are in this region." Well, you need to do that to get your job done. You're not really doing it for analytics. Or you need to know who to call, whatever. Right?
Bethany Lyons: Yeah.
Juan Cicada: So somebody else is maybe asking this exact same question, but changing the parameters. I care about the western region, and you're asking for the eastern region, but at the end of the day, people are just asking questions where, "Hey, that should have been in Salesforce. That's not a question they should be answering in Tableau or anything."
Bethany Lyons: Yeah, exactly, 100%.
Juan Cicada: So one of the things that we were talking about beforehand, and I think you actually wrote a post about this last week, was going from self-service analytics to self-service governance.
Bethany Lyons: Governance. Oh, yeah. People were like, "What the hell?"
Juan Cicada: Yeah. Like, " What do you mean? This is chaos?"
Bethany Lyons: "What the hell?"
Juan Cicada: So open the floor, clarify what you mean here.
Bethany Lyons: Clarify what I mean? So this is actually inspired by Clive Benford. He was the chief data officer at Jaguar Land Rover and he was on this tirade against Tableau because we had this very strict governance model where a central administrator defines groups, and manages groups, and who can collaborate with who. That's all defined as an IT decision, but actually, it's an insane way of working to have your IT department dictate who can collaborate with who in an organization, because collaboration is fundamentally something that is driven by business need, and we no longer live in this world where you collaborate with just people in your immediate team or in your reporting line. Innovation comes from cross-functional collaboration, and it's also not knowable upfront. Who am I going to need to share with? And so you can't possibly create a model through your Active Directory that models your organization's collaboration, and so the point that Clive made, I'm not going to take credit for this, and by the way, I'm going to jump the gun and answer one of your questions, which is who should we have on our podcast next? And it's Clive, to talk about this.
Juan Cicada: Which, by the way, he's going to be on our podcast in a couple weeks, though.
Bethany Lyons: Oh, amazing. I love Clive so much. He's brilliant. So anyway, the point that he made is if data is in spreadsheets, the responsibility and the accountability is on the business user for the data security. If you email a spreadsheet out of your organization, everybody knows that you get fired for that, and so they don't. Spreadsheet data is some of the highest security data in the organization because people treat it with such a high level of personal accountability. In contrast, as soon as something's not in a spreadsheet, the accountability sits with the people who own the system, and so then they just try to maximally lock it down and disable collaboration, which is just dysfunctional for the organization. So it's like, if we entrust people to have the physical data in spreadsheets, why wouldn't we trust them to freely collaborate with whoever they need within or outside their organization on a data set that has row-level security enforced at the database level? It just doesn't make sense that this isn't the way we work today.
Tim Gasper: Right. That's interesting, so you made a statement there, which is interesting. You said that spreadsheet data has some of the highest levels of security and individual accountability. What makes you say that? Because people are locking down their spreadsheets or because it's only the individual, nobody else can see it or what drives you to say that?
Bethany Lyons: I mean, if you look at banks and financial services organizations, they are the most spreadsheet-driven organizations on the planet, and yet, they're the highest security organizations as well. Yes, there's a high risk of errors in your spreadsheet, but how often do you hear about a person who accidentally emailed PR private data out of the organization? It just doesn't happen because people have such a high degree of incentive not to screw up; whereas, as soon as the data is in a BI tool, the boundaries are fuzzy around who the responsibility lies with, and so I actually think you have a higher risk of data breaches because there's not such clear accountability on an individual.
Tim Gasper: Got it. Okay. It's shared accountability.
Bethany Lyons: I don't know. I believe the world turns on incentives. This is just my personal belief. If you can incentivize people, that's the highest form of security you can achieve.
Tim Gasper: No, that makes sense. It's easier in a shared environment for security to be a little bit of a not-my-job kind of thing. Right?
Bethany Lyons: Exactly.
Juan Cicada: Yeah, so this is a very interesting point here because I'm trying to understand how to unpack this more, so if you continue the spreadsheet scenario, if everybody feels accountability for their spreadsheet, but your point is like, " Yeah. They're very secure." So in a way, that's like a self- service governance. Right? They're like, " I know"-
Bethany Lyons: It is self-service governance.
Juan Cicada: "I won't screw up. Yes. I know I need to go. I have ownership of this stuff. I will take care of it." But the moment that data goes out and other people start combining it and whatever, you don't even know what that is anymore, so you're like, "That's not me." But then because that kind of gets pushed out there and somebody else is doing the work, and may not even know the context, the governance over there should be shared, and that's where we kind of go, "Well, we need to have these big top-down approaches for governance." So they lock everything down, which is then probably not helping collaboration, and I think a lot of the status quo today is people thinking about governance as this top-down approach. Then again, we lack collaboration, and a way to be more collaborative is that we give people more ownership of their own analytics, and therefore, they take ownership of their governance, too. Is that the point you're trying to make?
Bethany Lyons: Yeah, exactly. I think what you said is exactly right. Self-service governance is happening today. This isn't a new concept, it's how banks run. It's self-service governance. So it's about, let's now port self-service governance to the data warehouse, and to the enterprise platforms, because self-service... At the moment, self-service governance often happens in Excel, and then centralized governance happens in these platforms, and so the platforms block people from doing their jobs, and so they're just like, "Well, screw the platform. I'll just do everything in Excel where I can freely collaborate and share with whoever I need to, whoever I trust, whoever I can own my decision of sharing with." So it's like self-service governance is already happening. Why don't we now bring it to the data warehouse?
Juan Cicada: In a way, I guess it's almost part you would have to go partition the database, the data lake, the data warehouse, whatever, saying that, " Hey, this part of here, I own the data that's in. I own these set of tables here and that's what I'm responsible for," and then somebody else owns it, and then if somebody then says, " I'm going to go combine these things together, then I'm going to take ownership." So there needs to be really more incentives to be able to take ownership, and then I think it's the assumption here is that if you take ownership, then you're also going to take the governance aspect about it.
Bethany Lyons: Yes. Yeah. I think that's the concept that is missing from our, I don't know, conceptual model: that governance is about ownership more than it is about control and restrictions.
Tim Gasper: So in a self- service governance model is governance or is it because I think there is a traditional domain oriented stewardship model that some companies prescribe where it's like, " Okay. Well, you're the steward of marketing and you're going to be the steward of sales, and you're going to be the steward of finance, and you're going to take ownership over your area, and then we've got a centralized kind of governance group." Is that kind of the model that you're thinking? Or is it different? Is it turning that on its head more?
Bethany Lyons: Yeah, again, I'm not suggesting we invent anything new. I'm just suggesting that people know who they can share data with today. They know in their heads, "Will I get fired if I send this spreadsheet to Susan in accounting?" They just know that upfront, and so then they either do share or don't share based on that kind of institutional knowledge that they have of what's acceptable and what's not, so I think it's about entrusting people with ownership over their decisions, because they're going to make those decisions outside of your platform if you don't enable it to happen inside the platform.
Tim Gasper: So one of the things that you want to encourage is if people are going to share things anyways-
Bethany Lyons: Yeah.
Tim Gasper: ...can we at least make it above the table? Can we at least see it, track it?
Bethany Lyons: Yes. Yeah. We could at least have auditability of who's working with whom in your organization, and actually, I remember there was this company that mined their email to figure out who's collaborating with who in the organization. Imagine if you brought all of this Excel data sharing into a platform that you could audit. You could actually understand the dynamics of your organization in...
Juan Cicada: New York tells you like, " Yeah, go ahead. Do your thing. We trust you because we're watching you and we know."
Bethany Lyons: Because we're watching, yeah, and if you know that you're being watched, I think like 99.9% of people take care of their organization's data, and so start with that assumption. People want to keep their jobs.
Juan Cicada: That's a fair assumption. I think that's a fair assumption, and I think it's kind of almost a change of mindset. I'll be honest, no bs. I'm kind of still trying to process this, and I don't know if I agree with you. I don't know if I disagree with you, and I think I understand what you're saying. I mean, at the end of the day, it's all about you're decentralizing accountability, and I like what you're... People are already doing this.
Bethany Lyons: Yes. Yes.
Juan Cicada: People are already doing this, right? I know I'm not going to do something bad because I don't want to get fired, and I know where the line is, right?
Bethany Lyons: Yeah.
Juan Cicada: And if I do get close to that line, people will find out and that will have repercussions. So at some point, it's like you want to just work with, just understand, what humans, the employees, already are expected to go do. Nobody's expected to go do anything bad. If they do, they're going to get fired. So we kind of have that same type of mentality when it comes to dealing with the data, and again, it's already happening with spreadsheets, but it's not happening in your data lake. It's not happening in your analytics, in your Tableau or in your dashboards, and the question is like, "Why couldn't you? Yeah, we let people go email spreadsheets around and everything's okay, so why do we have to go centralize governance for something where maybe everything's okay if we let people take their own ownership and accountability around that?" I'm thinking about this as... We're in this world of finding a balance between centralization and decentralization. Part of it is decentralizing accountability-
Bethany Lyons: Exactly.
Juan Cicada: ...because do what you want, and you'll be held accountable for it.
Bethany Lyons: And to me, decentralizing accountability is so much more important to the progress of organizations than decentralizing who writes SQL queries. That's kind of an implementation detail.
Tim Gasper: That's interesting. So how does this come to reality then? Is this something that is... Part of this feels like culture and changing the paradigm. Is there a role that technology has to play in this? Is there something with the analytics stack that has to change? It certainly feels like you need some security- oriented tooling monitoring, et cetera in order to really track what's going on here. What has to change?
Bethany Lyons: So yeah, I actually think that KAWA can power a lot of this new revolution in self-service governance. So when I joined, I looked at the product. I'm the first somewhat commercially oriented person to join the company. Everybody else is hardcore engineers, and when I first looked at the product, I was like, "Oh, my god." Oh, hold on. My light just turned off.
Tim Gasper: Oh, no worries.
Bethany Lyons: I was like, "What?" Hold on. It's on a timer, I think.
Juan Cicada: You see, we're truly live here. We're truly live.
Bethany Lyons: Yeah, we're truly live, and then this light is weird.
Tim Gasper: You actually do have some interesting different lighting arrangements there.
Bethany Lyons: There's some weird lighting happening here. It was last minute. I was like, " I got to put together a few different lights."
Tim Gasper: No worries. Yeah. What is the approach that KAWA is doing that's different?
Bethany Lyons: So when I first looked at KAWA, I was like, "Tell me we don't do this." But you can, in the platform, find any other user who's in the platform and share directly with them. So I could go and share with a user in a completely different organization through this platform, and I'm like, "How do we allow this? This is crazy. People are going to lose their minds." And then they looked at me and they're like, "Yeah, but you enforce row-level security in the database, and so if a user shares something with somebody outside their organization who doesn't have access to any rows, they'll just see empty views. They won't see any data. They'll just see the wireframe of this. So what's the big deal?" And I was like, "Yeah. Actually, that's true." And they're like, "And this is the way GitLab's model works. If a user is on GitLab, you can find them." And so the platform is built by default to enable cross-company collaboration, and it's been a shift for me as well to be like, "Wait a second. No, this isn't insane. This is brilliant. This is..."
Juan Cicada: Is this coming from a technical perspective? You start implementing kind of row-level security, maybe even cell-level security and stuff like that. That's how we are able to really decentralize accountability. You just do need to know: what are the groups? And you need to put people inside those groups, and then just go free because we know that technology will take care of what can be shared or not.
Bethany Lyons: Exactly, the database manages who can see what rows, and so then the users can share whatever content they want with whoever they want because the content isn't the thing that's high security, it's the data. So it's pairing these very high-security RLS policies in the database with this very low-security content management, sharing, and collaboration model.
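The split Bethany describes here, strict row-level security in the database paired with open sharing of content, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not KAWA's implementation: the `Row`, `Database`, and `View` names, the per-user policy table, and the in-memory filtering are all hypothetical stand-ins for policies that a real database would enforce itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Row:
    org: str       # organization that owns this row
    region: str
    revenue: int


@dataclass
class Database:
    """Hypothetical stand-in for a database that enforces row-level security."""
    rows: list
    # Centrally defined policy (an assumption for this sketch):
    # user -> set of organizations whose rows that user may read.
    policies: dict = field(default_factory=dict)

    def query(self, user):
        # Every read is filtered by the caller's policy; unknown users get no rows.
        allowed = self.policies.get(user, set())
        return [r for r in self.rows if r.org in allowed]


@dataclass
class View:
    """Shareable content: a layout only, it never stores data itself."""
    name: str
    columns: tuple
    shared_with: set = field(default_factory=set)

    def share(self, user):
        # Low-security step: any user may be added, inside or outside the org.
        self.shared_with.add(user)

    def render(self, db, user):
        if user not in self.shared_with:
            raise PermissionError(f"{self.name} is not shared with {user}")
        # The data is fetched as the viewing user, so RLS decides what appears.
        visible = db.query(user)
        return {
            "columns": self.columns,
            "rows": [tuple(getattr(r, c) for c in self.columns) for r in visible],
        }


if __name__ == "__main__":
    db = Database(
        rows=[Row("acme", "west", 100), Row("acme", "east", 250)],
        policies={"alice@acme": {"acme"}},
    )
    view = View("Revenue by region", ("region", "revenue"))
    view.share("alice@acme")
    view.share("bob@other-co")  # shared freely with an outside user
    print(view.render(db, "alice@acme"))
    print(view.render(db, "bob@other-co"))  # wireframe only: columns, no rows
```

In this sketch the outside user still receives the view's column layout but zero rows, which mirrors the "empty views" behavior described in the conversation: sharing content is cheap because the policy on the data, not the share action, controls what is visible.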
Juan Cicada: But these policies do need to be defined somewhere beforehand, right? So you are centralizing something that then you're centralizing the policies, but then you're decentralizing the execution of these policies with respect to the people and the data that's being shared.
Tim Gasper: Yeah. You're saying, " Share away, but I know that it's going to be safe because of what I've set up on here."
Juan Cicada: But I know it's under these parameters, and if you don't fit within these parameters, then you can't share. It always goes back to finding some balance of centralization, decentralization, which is different from the spreadsheet approach because the spreadsheet approach, I would argue, there is no centralization at all. The only centralization-
Bethany Lyons: Yeah, zero. Yeah.
Juan Cicada: The only centralization is you do something wrong, you'll get fired. That's the only policy, and-
Bethany Lyons: Yep. That's the policy.
Juan Cicada: "We trust you, until the moment we don't." Right?
Bethany Lyons: Yeah.
Juan Cicada: That's pretty much the only policy out there, but otherwise, there is something that needs to be centralized, so I guess the point I'm getting to is we just can't go to the other extreme.
Bethany Lyons: Yeah. No. So it is. The access to the rows of data is still centralized in our platform, and I don't know what if there's demand from customers to allow people to set their own RLS for when they share on to other organizations, I don't see why we wouldn't build that because how can an IT department know? Let's say you're a retailer and you need to share data with your supply chain. Can your IT department necessarily know upfront exactly what data you need to share with who? Maybe, but maybe there's other sources you need to integrate that it can't be so clearly defined and upfront, and so maybe you need to delegate to your category managers who are working with the retailers the ability to set their own RLS. So I could imagine a world in which we enable decentralized setting of rights management for sharing with third- party organizations. That doesn't exist now, but it's the type of thing that we need to be thinking about because the idea of there is one team in an enterprise of hundreds of thousands of people who manages all of the rights management for the entire organization is kind of crazy.
Tim Gasper: Interesting. This is making me think about a lot of different topics. It's all kind of swirling together. We talked a little bit about self-service analytics, self-service governance as like, "What does this look like as we go into the new world?" I know something that you've also talked to us about is this idea of self-service operations, and also sort of the centralization of analytics, but then the decentralization of self-service. Can you talk a little bit more about what your thoughts are there and where things need to go in the new world?
Bethany Lyons: Yeah. It's a work in progress, my thoughts on this topic, but it's very much informed by what I saw at Tableau, which is this analytics software is deployed as essentially a calculation engine for business logic, and I think that there's a big opportunity to capitalize on that, so I'll give you an example. So I was speaking to this guy who works in mortgage remediation. Basically, in their software for how they allocate payments to mortgages, the business logic is wrong, because the business logic is so complicated, the software engineers who built the system can't understand it, and so they implemented it incorrectly, and it's stuff like a mortgage can have multiple subaccounts, and one account can be in arrears and one can be in a surplus. The way they set up the software is that if the arrears and the surplus are the same on two separate accounts, they cancel each other out, but that turns out to violate math, and you can totally screw over your customer with this cancellation policy, and then the customer can sue the bank, and so the compliance risk to them is absolutely massive, and so he has built this Excel spreadsheet that is used to reconcile the production system that allocates payments to mortgages. And so nothing leaves the bank unless the production system reconciles with his Excel spreadsheet, so they actually pull data out of Teradata, dump it into this Excel sheet, run some crazy calcs, and then spit the Excel data back into Teradata, and so they're doing this whole... They're using Excel as a critical component in their business processes.
Tim Gasper: Excel is a transformer in this flow?
Bethany Lyons: Yeah, it's like a transformer, and it's like a reconciler, yeah.
Tim Gasper: Okay.
Bethany Lyons: And nothing leaves the bank unless the Excel sheet-
Juan Cicada: In a way, this is just like you have a person who needs to implement an algorithm, and they're just implementing an algorithm in Excel.
Bethany Lyons: Yeah.
Juan Cicada: Python or Java or whatever, they would've done that thing, but Excel is just the tool that they know. It's just there's an algorithm.
Bethany Lyons: Yeah, exactly. And so it's, "How do you enable somebody with expertise in mortgages to own their business logic in a more sane way than what they're doing with Excel right now, in a way that's way more robust, auditable, not so high risk for errors?" So that's the opportunity that I think we need to enable: situations like that where you have business operators who have critical business knowledge that they need to implement in business logic to run a process, and the process is kind of niche enough that you wouldn't buy a system to manage it. It's like you just need a tool to help automate the process, and that's what we want to do at KAWA, is enable that type of synergy.
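[Illustrative sketch, not from the conversation.] The reconciliation idea described above can be pictured roughly as follows. The sub-account names, the balances, and the rule that arrears are assessed per sub-account are all assumptions made for illustration; the episode does not give the bank's actual logic.

```python
from decimal import Decimal

# Hypothetical sub-account balances for one mortgage:
# positive = surplus (overpaid), negative = arrears (behind).
subaccounts = {
    "main_loan": Decimal("-500.00"),
    "home_improvement": Decimal("500.00"),
}


def netted_arrears(balances):
    """Flawed rule from the anecdote: surplus on one sub-account cancels arrears on another."""
    total = sum(balances.values(), Decimal("0"))
    return -total if total < 0 else Decimal("0")


def per_account_arrears(balances):
    """Independent check (assumed rule): arrears are assessed on each sub-account separately."""
    return {name: -bal for name, bal in balances.items() if bal < 0}


def reconcile(balances):
    """True only when the netted figure agrees with the per-account figure."""
    independent_total = sum(per_account_arrears(balances).values(), Decimal("0"))
    return netted_arrears(balances) == independent_total


print(netted_arrears(subaccounts))       # netting hides the arrears
print(per_account_arrears(subaccounts))  # main_loan is still behind
print(reconcile(subaccounts))            # mismatch, so the result would be held back
```

The point of the sketch is only that the independent calculation acts as a gate: if the two views disagree, nothing is released.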
Juan Cicada: This is an interesting point. I think the question here is how can business operators own their own business logic? And now we're talking about ownership. We're talking about accountability, right?
Bethany Lyons: Yes.
Juan Cicada: And frankly, better than Excel because that's how they're doing it today.
Bethany Lyons: Yes.
Juan Cicada: So yeah, no. In past experiences, I've gone into Excel and they're like, "We have all these macros and all this code." And I'm just like, "Wow, that's important business logic of how you decide to go make decisions that are going to make you millions of dollars every day." And then sometimes what I've seen is people who start as an analyst knowing Excel, and then little by little they start learning SQL, and then after years they're like, "Oh, that logic that I have implemented in Excel actually would make my life much easier if it was pushed down to SQL." So then they would actually write that logic in SQL, and it would operate over either an operational data store (ODS) or maybe a data warehouse. If it's over the data warehouse, then you know they're doing it over the analytics. If it's an operational data store, maybe it's still okay because it's directly over the operations, but then what happens is that these queries then get combined with some other queries and other queries, and suddenly, you have queries that are, I don't know, 10, 15 pages long that have all this critical business logic that then feeds into a spreadsheet that has more business logic in there, and that's how you decide to make a decision today, not analytics. This is the day-to-day operations of where am I going to investigate, so I mean, that's stuff I've seen in my career, right? Over and over again. Yeah.
Bethany Lyons: Yeah, and it's like there's a missing technology platform to formalize that that work is happening, and to enable data teams to be stewards and enablers of that work.
Juan Cicada: Hold on. Well, I guess, yes. Let's argue that there's something technical missing there, but then I would argue that another tool means that you then have a tool where you're centralizing that business logic; while if we keep it as is, I'm not saying it's a good thing, but let's say it's not a bad thing. You're enabling to decentralize all that business logic. I decide to keep my business logic in Excel and I'm accountable for it. I decide to keep my business logic in SQL and I'm accountable for it. Let that happen. I mean, isn't that decentralized accountability then?
Bethany Lyons: I think just because you have a technology platform doesn't mean you're centralized. You can still have decentralization on a technology platform. Yes. It's physically hosted on one server, but the ownership is decentralized. The accountability is decentralized. The things that matter are decentralized, the hosting is centralized.
Tim Gasper: So what we're trying to avoid is all that logic being owned by... It seems like centralization can mean a couple things here. One can be like we don't want IT to own all the logic. Another interpretation is we don't want one person who owns that spreadsheet to own all the logic, so there's a sharing aspect and there's an owned-by-the-business aspect.
Bethany Lyons: And you don't want IT owning your business logic because then IT owns your decision-making process, and they are the business at that point.
Juan Cicada: I like this quote. You don't want-
Bethany Lyons: They are making a business decision.
Juan Cicada: ...them to own your business logic because then they own your decision making.
Bethany Lyons: Exactly, so do you know what? When we were going through writing our website, we actually had "own your business logic" as our tagline, for exactly this reason, and then we were like, "I don't know if people will understand it, if it'll immediately resonate."
Juan Cicada: And yeah. I would say the same thing. I mean, I don't even think people realize that calculation that you're doing in Excel or whatever is business logic. It is something that important for them.
Bethany Lyons: Right.
Juan Cicada: Then it's like, "Duh, that's all I need to go do."
Bethany Lyons: Yeah.
Juan Cicada: It's like, I'd call business logic a decision model. I have a bunch of if-then statements because that's how, "If this thing is this and this, and during this time, then we're going to go do this other thing, but if it's this other thing, blah, blah, blah, blah, blah, blah." And then you have this whole, entire business process that goes around this. How is that managed? I mean, sometimes it's not even written anywhere, just people know. "It's the way we do business here, and Bob and Alice have always done it here." And then when Bob and Alice are about to retire, they're like, "Oh, crap. We got to figure out how to go put this down." Right?
Bethany Lyons: Mm-hmm.
Juan Cicada: The problem is even more severe because at some point we're assuming that this logic exists somewhere, but sometimes it just continues to exist in people's heads.
Bethany Lyons: In people's heads, exactly. Yeah. The amount of institutional knowledge that isn't captured anywhere is massive.
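[Illustrative sketch, not from the conversation.] One way to picture "a bunch of if-then statements" being written down as a decision model with named owners, instead of living in people's heads. The rule names, thresholds, actions, and the order fields `amount` and `days_late` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str                          # label for audit trails
    owner: str                         # business person accountable for the rule
    condition: Callable[[dict], bool]  # the "if" part
    action: str                        # the "then" part


# Hypothetical rules; order matters, first match wins.
RULES = [
    Rule("large_late_order", "Alice",
         lambda o: o["amount"] > 10_000 and o["days_late"] > 30,
         "escalate_to_manager"),
    Rule("late_order", "Bob",
         lambda o: o["days_late"] > 30,
         "send_reminder"),
]


def decide(order, rules=RULES, default="no_action"):
    """Return (action, rule_name) for the first rule whose condition holds."""
    for rule in rules:
        if rule.condition(order):
            return rule.action, rule.name
    return default, None


print(decide({"amount": 20_000, "days_late": 45}))
print(decide({"amount": 50, "days_late": 0}))
```

Keeping the owner next to each rule is the design choice that matters here: the logic is recorded in one place, but accountability stays with the people who know the business.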
Juan Cicada: All right. Well, without getting too salesy, but let me take liberty here. I would argue that stuff should live in a catalog.
Bethany Lyons: We should talk. I think our products could play very well together.
Juan Cicada: All right.
Bethany Lyons: Because we're not a catalog at all.
Juan Cicada: Wow. I'll be very honest. This has been a fascinating episode because we have gone through so many routes and trying to understand this decentralization of governance, which I'll be frank. It's still kind of confusing, but there's something in here. There is something here really fascinating, and I'm sure... Actually, I saw a note here on the chat. Michael Lee said, "This totally makes sense, especially when I think about Google Drive allows users to manage the privacy of documents and drives."
Bethany Lyons: Yes.
Juan Cicada: So I'm like, "Wait. I have no central... Nobody tells me what I can go share on Google Docs or whatever, and we use Google Drive."
Bethany Lyons: Right.
Juan Cicada: Right? "So why don't we do that for data?"
Bethany Lyons: Yes.
Juan Cicada: "There's a lot of critical stuff in our documents." Right?
Tim Gasper: You don't go to IT for every single drive you want to have access to, at least, not if you're doing it right.
Bethany Lyons: Right.
Juan Cicada: Yeah, so there's definitely something...
Bethany Lyons: You don't overlap the business. If they put that type of restriction in place, people couldn't do their jobs.
Juan Cicada: So there is a lot to unpack here. Now, hopefully, this is an episode where people can come in, and I'm going to bet that this is going to be an episode where people are like, "You're crazy." Or people are like, "Oh, you got me really thinking about this stuff." But time flies. Look, we got to go to our lightning round, so let's kick this off, our lightning round presented by data.world. All right.
Bethany Lyons: Okay, I'm excited.
Juan Cicada: Number one, should your business logic live in DBT?
Bethany Lyons: No. Do I get to elaborate?
Juan Cicada: Yes, elaborate.
Tim Gasper: If you'd like, yeah.
Bethany Lyons: I think DBT is designed for technical people, and so if your business logic exists in DBT, then it's owned by engineers and engineers are... So the problem I experienced at Mews, so Mews, just for context, is like a CRM plus an ERP in one system for the hospitality industry, and our customers were going berserk about the fact that their accounting logic had been decided by software engineers. Literally, it started as a Czech company and it was like these Czech engineers went and picked up a Czech accounting book and were like, "Yeah, this kind of checks out. Let's implement this." And then the hotel accountants were looking at it and they're like, "This doesn't make sense, and it doesn't make sense in France or the UK or Germany or the US." Because the accounting logic was owned by engineers, and DBT is the same thing. It's just your business logic, and consequently your business itself, ends up being owned by analytics engineers who don't understand the business, so no, your business logic should not live in DBT. It should live in a platform that is owned by business users.
Tim Gasper: Interesting. All right, second lightning round question. Does the business logic get developed by the act of creating the metrics in the dashboarding, or do you think it's going to happen some other way?
Bethany Lyons: I think there's two separate things. One is creating business logic to operate a process, and then there's a second thing, which is creating metrics to measure whether the things you did to the process actually impacted the performance of the process, so I don't really think of metrics as being synonymous with business logic at all. Business logic to me is I have invoices and payments, and I need to figure out some way to allocate the payments to the invoices because there's lots of different ways that I could do the assignments. That's like business logic. That has nothing to do with the metric. A metric is like, "What's the time it takes to repay the invoice?"
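[Illustrative sketch, not from the conversation.] The distinction between business logic and a metric can be shown with the invoice example. The oldest-first allocation rule, the sample invoices and payments, and the function names are assumptions; as noted above, there are many other ways the assignment could be done.

```python
from datetime import date

# Hypothetical sample data.
invoices = [
    {"id": "INV-1", "amount": 100, "issued": date(2023, 1, 1)},
    {"id": "INV-2", "amount": 200, "issued": date(2023, 1, 10)},
]
payments = [
    {"amount": 150, "received": date(2023, 1, 20)},
    {"amount": 150, "received": date(2023, 2, 5)},
]


def allocate_oldest_first(invoices, payments):
    """Business logic: apply each payment to the oldest unpaid invoice first.

    Returns {invoice_id: date on which the invoice became fully paid}.
    """
    remaining = {inv["id"]: inv["amount"] for inv in invoices}
    paid_on = {}
    ordered = sorted(invoices, key=lambda inv: inv["issued"])
    for pay in sorted(payments, key=lambda p: p["received"]):
        cash = pay["amount"]
        for inv in ordered:
            if cash == 0:
                break
            due = remaining[inv["id"]]
            if due == 0:
                continue
            applied = min(cash, due)
            remaining[inv["id"]] -= applied
            cash -= applied
            if remaining[inv["id"]] == 0:
                paid_on[inv["id"]] = pay["received"]
    return paid_on


def average_days_to_pay(invoices, paid_on):
    """Metric: mean days from issue to full payment, over fully paid invoices."""
    days = [(paid_on[inv["id"]] - inv["issued"]).days
            for inv in invoices if inv["id"] in paid_on]
    return sum(days) / len(days) if days else None


paid_on = allocate_oldest_first(invoices, payments)
print(paid_on)
print(average_days_to_pay(invoices, paid_on))
```

Swapping in a different allocation rule changes what happens in the process; the metric function stays the same and only measures the outcome.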
Juan Cicada: Yeah. No. I'm 100% with you on this and I think connected to your previous point of it shouldn't be in DBT. I think an implementation of that could be in DBT. It could be in so many other things. The definition of that should be somewhere else, and then you implement it, and then at the end, I think the metrics is going to be the logic over that business logic for what you're measuring. Measuring logic, which goes over that business logic, and the business logic effectively is going to be that semantic layer. This is your lightning round, not mine.
Tim Gasper: I feel like this is kind of actually controversial, like business logic.
Bethany Lyons: Oh, no. Controversial.
Juan Cicada: This is the whole point. I would love to. This should be an awesome panel. Get a DBT analytics engineer people in here. Whoa. We'll have some interesting...
Tim Gasper: Like a breakout session.
Bethany Lyons: ... piece of gossip.
Juan Cicada: I think this is a freaking honest, no BS.
Bethany Lyons: So I almost worked at DBT, and then I was like, "Do you know what? I don't believe in their mission or their product, so no."
Juan Cicada: I love it when we have such a super honest, no BS episode. Thank you for that. All right, third question. Self-service analytics, will it be replaced, disappeared, or is it going to be augmented?
Bethany Lyons: So the honest answer is I think what people are calling self-service analytics today, i.e., let's call it Tableau, largely. Like 70% of Tableau deployments are requiring self-service operations, so I think that part will be replaced. I think Tableau will continue to be the best in class for management intelligence reporting. Tableau will be used for your CEO dashboards, but I don't think it'll be used for your frontline operations, like data work.
Juan Cicada: Right.
Tim Gasper: That's fair. Okay, so it's not going to completely go away, but there's still a use for it, but a lot of it's going to change.
Bethany Lyons: Yeah.
Tim Gasper: All right, last lightning round question for you. So I love your product background. Obviously, I have a lot of product background and I'm super excited around this topic around data product managers, analytics, product managers. Despite my excitement, do not let that bias your response. Do you think that data and analytics product managers is a role that we need?
Bethany Lyons: Yes. Yeah. I actually think it's the... I think if you look at what did self-service analytics do to the job market, it kind of gave rise to data engineering as a field. It birthed it out of IT departments, and I think that analysts today are going to transform into data product managers as a result of this shift to more self-service operations because that will be the more critical role is, "How can you go in and figure out how to use data to improve a process, as opposed to just finding insights in the data and emailing them to management to then do something with it?"
Tim Gasper: Interesting. I'm super intrigued by the statement that analysts will shift into data product management, so it's got a lot of things going around on for now.
Juan Cicada: Yeah. A lot of your comments are things that I was not expecting. I'm happily surprised.
Bethany Lyons: Really?
Juan Cicada: Yeah. Yeah. Yes.
Bethany Lyons: Oh, that's interesting. I always think that my ideas are not very forward-thinking or unique.
Juan Cicada: I mean, they're definitely very different, and the honest thing, that's why I think I really enjoyed all your interactions. I mean, not everything you wrote on LinkedIn I agreed with, but we don't need to always agree. That'd be boring.
Bethany Lyons: That would be boring.
Juan Cicada: We need diversity of thought.
Tim Gasper: Yeah. The goal is to break the mold here. Right? If all we do is just say, "Data mesh and DBT, and all those things," then that'd be boring. We got to break the mold.
Juan Cicada: All right, Tim, takeaway time.
Tim Gasper: All right. Taking away with my takeaways. So we really started off with some of the problems around self-service analytics and how we really need to move away from the previous paradigm, which has been very centralized and not super effective, ultimately, in being able to create value for all audiences, and that we need to rethink that a little bit, and we started off by talking about how self-service analytics often in today's world creates this backlog of requests, and it just constantly goes up and never goes down, and self-service really isn't happening that easily. There's a lot of waiting on IT and waiting on the data teams to have to help you. Ultimately, the biz doesn't care about self-service, though. They really care about operations. Self-service analytics got taken out of context. It was really about shifting the power from the technologists to the people, to the analysts, and ultimately, the goal was to get it to the business, and you talked about three different audiences. There's more of your business operators. There's more of your sort of the process, and then there's more of the measuring the process, and analysts have focused on designing and measuring the processes. So sort of the last two, but that first one, the business operators aren't actually being served that well, and so what happens? Lots of one-off requests, business logic, tons of it getting embedded into Tableau reports and things like that, and so that means business logic is in the analytics tools, huge red flag. That was a really big takeaway here. We live in this Google era and data is not just like that from a UX perspective, and so you said, "We don't yet have an appreciation that data engineering is actually more complex than software engineering," which I thought was a very interesting way to think of this problem. We're failing to apply product management principles around data.
If you've got 10,000 Tableau dashboards, that's probably a big problem, a big red flag, and if it's a question that a user's asking and others will be asking the same question, then that's operations, and that's sort of the self-service operations approach. So tons more stuff, but that was some of my takeaways there. Juan, what about you?
Juan Cicada: All right. Well, obviously this whole self-service governance thing; after this episode, I'm going to call out some folks on LinkedIn saying, "All right. Listen to this. I want to get your input." It is inspired by Clive Benford, you mentioned, and Clive, I think, is going to be a guest in a month. The collaboration shouldn't be something defined by IT. It should be something driven by the business because this collaboration is critical for innovation, and he said it's impossible to model all the ways people need to collaborate on Active Directory. In a way, it's like we got to embrace the chaos somehow, and you made a really strong statement that Tim and I were kind of like, "Is this really true?": "Spreadsheet data has some of the highest-level security." And I think at the end of the day, the owners have a high degree of incentive not to screw up. They don't want to get fired, so they won't do anything bad or email it out, and guess what. We are doing self-service governance today. It's already happening, and if you look at the finance industry, which runs completely on spreadsheets, they do it fairly well, you can argue. So governance really should be more about ownership instead of control, and again, because people know they'll get fired if they share data. Like, we need to go, "These are the incentives they already have." We talked about decentralizing accountability. It's more important than decentralizing who's writing those SQL queries. And then talking about the self-service operations, right? I mean, one of the things is that, as you mentioned before, "Hey, if your analytics software like Tableau is really a calculation engine for the operations, you need to pull out that logic." So the question is how can business operators own that business logic better than if it's in Excel, better than if it's in Tableau?
And the issue is that you don't want IT to own that business logic because then they would own your decision making; hence, the reason why you said-
Bethany Lyons: Exactly.
Juan Cicada: ...you don't want that business logic to be in DBT because that means that those engineers would be owning your decisions. All right, how did we do? What did we miss?
Bethany Lyons: I'm amazed at your ability to synthesize and summarize. You've clearly done this like 130 times.
Juan Cicada: We have.
Bethany Lyons: Because I'm so impressed. That's spot on.
Juan Cicada: Well, first of all, thank you very much. I'll be very honest. Tim and I, I think, are very proud of our techniques to summarize on the fly, but...
Tim Gasper: We will not be replaced by ChatGPT.
Bethany Lyons: You will not, no.
Juan Cicada: As you tell everybody, we just repeated what you said, so thank you very much for all this very insightful discussion we had, and you have fans. I think I mentioned that. I don't agree with you, but Doug here is saying he agrees with you. All right, so Bethany, to wrap us up, three questions.
Bethany Lyons: Yeah?
Juan Cicada: What's your advice about data, about life? Second, who should we invite next? Clive's already invited, so another name, put you on the spot, and three, what are the resources that you follow? People, blogs, podcasts, conferences, books, whatever.
Bethany Lyons: All right, cool. So I've already forgotten the first one.
Juan Cicada: What's your advice?
Bethany Lyons: My advice? Oh, yes. I'm going to go for advice on life. Don't settle. Don't settle. Yeah, back in November I had a lot of job offers. They were good job offers, but they weren't like... They weren't "Yes. I can love this company for 10 years" kind of job offers. And so I was like, "I'm going to say no to all of them and just venture out into the unknown and see what happens." And I found my dream team, dream product, dream mission, and it was very scary and very risky, but so worth it to be here now, so yeah, don't settle. That's my life advice. Next one, who do I recommend for the podcast? Caroline Zimmerman. She was one of my former clients, and then became a very good friend, and is just brilliant on all things data strategy and how do you drive value with data? How do you enable collaboration between the quants and the poets, as she calls it, so I think she would be super interesting to have as a guest on the pod, the show, and then who do I... Yeah, resources. I guess I'll just mention, yeah, some people to follow, like Mike Rennick is super interesting in data. I don't know if you follow him, but I recommend him, and then Caroline Zimmerman, follow her. Clive Benford, follow him. They're all three really interesting people for some of the more like organizational, people, process, strategy aspects of data. So there we go.
Juan Cicada: Well, Caroline Zimmerman and Mike Rennick, and I highly recommend Clive, right now. I mean, everything that he's posting is just-
Tim Gasper: He's on a roll right now, right?
Juan Cicada: He's on a roll right now. He's on the roll. Bethany, this was awesome. We went on so many different areas and we challenged ourselves. Thank you very much for having this honest, no BS conversation with us.
Bethany Lyons: Thank you so much. All right, take care guys.
Juan Cicada: All right, cheers.
Bethany Lyons: Bye.
Speaker 1: This is Catalog and Cocktails. Special thanks to data.world for supporting the show, Harley Burgoff for producing, John Loyins, and Brian Jacob for the show music, and thank you to the entire Catalog and Cocktails fan base. Don't forget to subscribe, rate, and review wherever you listen to our podcast.
DESCRIPTION
Are we doing data and analytics correctly? Self service, centralization vs decentralization, analytics vs operations… so many aspects that data teams need to consider.
Join this week’s episode of Catalog & Cocktails with hosts Juan and Tim as they speak with special guest Bethany Lyons to discuss how we should have a separation between data operations and data analytics.