Modern Data Stack

Author Interview: Mastering Modern Data Stack

Need a copy? You can get one here.

An interview with author Nick Jewell, PhD.

In the inaugural episode of TinyTechGuides Talks, I had the opportunity to interview Dr. Nick Jewell, a modern-day data aficionado and the author of Mastering the Modern Data Stack, which ranked #6 on Amazon U.K.’s best-seller list for data management books. You can watch the full episode on YouTube.

Transcript

Note: this has been edited for clarity.

David Sweenor:  

Good morning. Good evening. Good afternoon. Welcome to TinyTechGuides Talks–Live Dive with Dr. Nick Jewell and not a Dr. David Sweenor. So Nick, glad to have you on the show. Thanks for joining us today.

Nick Jewell:

Great to be here. Thanks for inviting me, David.

David Sweenor:

So, the reason we’re here is that you have just published a book, Mastering the Modern Data Stack. It is ranked number six on the bestseller list on Amazon in the UK. Congratulations.

Nick Jewell:

It’s in a small niche, but we’re getting there. Yes, thank you very much. I couldn’t have done it without all of your help. So it’s been awesome.

David Sweenor:  

Awesome. All right, Nick. Well, let’s just start right into this, what’s Nick’s origin story? How did you get to where you are today?

Nick Jewell:

Oh, my superhero story. The radioactive spider. Absolutely. Basically, I had a data science background going way back. I worked in drug design and chemical information science while doing my PhD. I always like to give a really simple analogy. A drug interaction, when you’re doing drug design, is a bit like a lock and a key. So the key will fit into a lock, and it either causes something to happen or maybe it blocks that location and prevents other things, maybe bad things, from happening. So my job as a young PhD student was to try and understand what the lock looked like by bringing together lots of different keys, lots of different chemicals and drugs, to try and understand which part of the key actually made a good fit. So I got to work with some pretty cutting-edge technology, some genetic algorithms that helped me align all of these keys, looking at lots of different chemical properties along the way, and then a whole bunch of statistics. So I ended up building a lot of models that highlighted the feature importance of all the good keys, and then that can be used in drug design programs. So some of the things that we see in data science and machine learning today were done all those years ago.

David Sweenor:

Wow. That’s amazing. That reminds me of my first programming language. It was Mortran, a derivative of Fortran preferred by particle physicists worldwide.

Nick Jewell:

Oh, fantastic. Is that like the cheaper version of Fortran? 

David Sweenor:  

I don’t even know what it was. Don’t ask me to describe it.

Nick Jewell:

No worries. Well, you know, that got me into the world of analytics. I got really lucky. After that, I spent over a decade working with some really amazing teams at HSBC. So I first of all started off in the investment bank, then ended up working more widely for the Global Group, building out some large-scale analytics solutions. I got to work on financial crime analytics architecture and ended up learning a lot about how to describe all of the patterns that get involved with data management, with analytics, and also with data science and machine learning as well.

David Sweenor:  

Wow, that’s amazing. So now you’re a top-selling author in the UK. And we’re going to milk that one.

Nick Jewell: 

We’re going to work those caveats.

David Sweenor:  

A little asterisk there. What motivated you to write Mastering the Modern Data Stack: An Executive Guide to Unified Business Analytics?

Nick Jewell:

Well, first of all, I think there are probably two little stories about this. 

So the work that I was doing back at HSBC, really captured what was called functional architecture across the bank. So picture this if you had to explain to an alien what the bank actually did. A functional architecture diagram was a really great starting point, kind of broke down the business machinery into functions and into processes, and then you could start to see what the landscape looked like. So whether that was on a regional basis or maybe on a divisional basis. Some of the things we did–we look at retail banking–the stuff that we do day-to-day with checking or savings accounts. We look at the UK, and we see how did those functions compare to operations in Hong Kong or maybe the USA. And then likewise, you can take a function like customer onboarding, bringing customers into a company, how did that look for the investment bank, maybe versus wealth management? So we started to draw all of these diagrams, and you started to get some real aha moments in that process. Maybe you end up looking at all the different payment systems that you have across the world. Why did we have 150 payment systems? It really didn’t make any sense. Why do we have 10 different ways to manage customer relationship data? So it was a really fantastic tool for rationalization, but also looking at ways to understand how the company was investing, but also, maybe if the company was looking to streamline as well, you could highlight all of that complexity. So that’s story number one that got me these functional building blocks. 

Secondly, my last job was working at a company called Incorta, and we were doing some product marketing positioning and messaging. David, you’ve got a pretty good book on this as well. Basically, we were trying to understand how we fit with the concept of a data lakehouse. So open source storage, query engines, that kind of thing. And I discovered this brilliant reference article from the folks over at Andreessen Horowitz, the venture capital firm that is very much talked about in Silicon Valley. They describe these reference models for data lakehouses, and I just ended up using their models all the time for storytelling as I was building out my positioning. So treating this really as another form of functional model, Mastering the Modern Data Stack was just my effort to really flesh out what a lot of people were talking about, trying to take these stories and go down into the components behind the scenes.

David Sweenor:

Okay. So, do you think the situation at the bank with 150 systems has changed? Have we gotten to nirvana? I’m surprised, when we talk to customers, as you and I both have over the years, how ugly the infrastructure is when you get on the inside, and I’m just curious about your perspective on that.

Nick Jewell:

It’s a really tough one, right? I always picture this needle that swings from one way to the other. So in some cases, you want to say, “Oh, yes, we’ll create a bespoke version of something to meet someone’s exact needs.” So the payment system in Vietnam needs to be absolutely bespoke… versus going the other way and saying, “Folks, we really need to standardize; this is costing us far too much money in terms of keeping the lights on by managing 150 different systems that ostensibly are doing the same thing.” So centralization, decentralization, standardization, customization, these are all tensions that I think we see in the industry every day. So mapping it and understanding the landscape is always the first step in making a better decision.

David Sweenor:

Okay. So you talked a little bit about this, but how do you think your experiences have shaped your perspective on the modern data stack?

Nick Jewell:

So, I think I’ve been really lucky. When I started working at Alteryx, the company we both worked at for a while, I was really lucky to get a lot of at-bats. Let’s use the baseball analogy. So got to experience lots of different companies, lots of different analytics challenges, and lots of different departments across my journey. I’ve worked in everything from finance to risk working directly within business units. I think that really shapes lots and lots of at-bats. When it comes to analytics architecture, you really get to see what works, and you probably learn the most from what doesn’t work. So, one of my greatest failures, in some ways, was trying to build a real-time data warehouse too early in the process. And you realize once you try and do this that the closer you get to “real-time”, the harder it gets, the more expensive it gets, the more likely it’s going to fail. But also, you really should stop and say to the business, the stakeholders, is this really what you want or what you need? So that brings all of those experiences together, the failings, the successes, just the diversity of this analytics career that I’ve tried to pour into the book.

David Sweenor:

That’s amazing. And you know, regarding real-time, I was a little spoiled. When I started my career, I was working in semiconductor manufacturing, and we had a real-time data warehouse at the time. Kimball-designed, 1,500 users. And when I left that company, I was like, people do things in batch? It was so foreign to me because my whole experience of data and analytics was about real-time, because you’re making widgets. And I won’t forget that. There’s still a lot of batch that goes on today, and it still befuddles me a little bit.

Nick Jewell:

Yeah, it’s really interesting because the book has taken the approach of trying to use these functional building blocks rather than just paying pure attention to the vendor hype. So, sometimes batch is perfectly acceptable for what 95% of business use cases need. But for those other cases, that’s when you need to know that real-time exists.

David Sweenor:

One thing I love about the book, and I don’t know if you’ve flashed it up yet, is that you have a lot of experiential knowledge in here; you have a ton of case studies and things like that. You really emphasize this real-world capability. So, what’s maybe a pivotal moment or experience that really drove home the importance of this approach to the modern data stack?

Nick Jewell:

Oh, well, absolutely. So I’ll give you a story from some work that I’ve done in my spare time. So I’ve worked really closely with a data science charity called DataKind here in the UK, and they do some amazing work for social change projects. Because what you find is with charities, they have a ton of really interesting, often very sensitive data, but they don’t always have the skills and the talent to work with it. So to bring data scientists and data people together with this data in a safe, controlled area is absolutely brilliant. For the listeners today, if you can find the DataKind event in your town or near you, it’s a brilliant way to grow connections in the industry.

But going back to real-world capabilities, we were doing a Data Dive with a charity that was looking to buy small plots of land in the UK, it was all about sustainable farming. That’s an interesting data science case on its own. The land needed to be in certain parts of the country; it needed to have certain geological features to be suitable. And I’d say even from the outset of this project, we knew that there were some great open-source datasets available in geology, soil types, breakdowns, and even the little parcels of land that were actually available for sale. But rather than just dive right in, because we all know data scientists, they just want to roll up their sleeves and dive in. 

These weekends are more about cat herding than anything else, David, honestly. What we did was, we put our heads together and actually mapped out the modern data stack components that would be needed to actually solve this challenge. So things like how do we store this data? How do we do geospatial analytics? How do we do data visualization and natural language processing? And that was actually really interesting because you had all of these classified ads for pieces of land; we needed an automated way to work out if they were actual farmland, or just part of wealthy houses, real interesting stuff. But then there are the analytic apps that get produced so the charity can get hands-on at the end of the day. So instead of just diving right in and trying to solve the challenge without giving the picture its full attention… breaking down the components really created this rich solution, and it was much more plug and play because we could start to have much more mature conversations about whether Python is better for this or maybe GIS tools are better for that component. And we could start to see where all the pieces fit together.

David Sweenor:

Okay, well, natural language processing, doesn’t ChatGPT take care of that for us today? 

Nick Jewell:

The wellness games, right, exactly. But you know, it’s fascinating because previously, the charity literally would take a bicycle and cycle down an Old English lane and peer over a hedge to see if this field was good. This was like magic to them. And that’s the beauty of doing these kinds of data dives, taking amazing datasets, working with people with real challenges, and producing something magical.

David Sweenor:

Okay, okay. I like that story. So, for people who are just starting their journey in data and analytics, what’s the one takeaway from the book you want them to walk away with?

Nick Jewell:

Oh, crumbs, okay, great question. I’d say, first of all, my first bit of advice, read the book from cover to cover. It’s a TinyTechGuide. It will only take you 90 minutes with a good strong cup of coffee. I will say folks just starting their journey. There’s going to be some aspects around new terminology. I encourage you to dive deeper. Follow the links in the book, and follow the case studies. If you’re interested in learning more, basically, just be curious and follow your nose into areas that really interest you. 

So this could go one of two ways. You could find a niche that’s interesting to you. So in the book, we talk about all kinds of fascinating stuff like reverse ETL, headless BI, you name it, it’s in there. Or maybe you can follow what we also cover as the data gravity within the system. So some of the mega-vendors, the cloud service providers, are doing some really interesting things around storing, querying, and processing this enterprise-scale data right at the heart of the stack. So there are so many interesting directions you could take.

David Sweenor:

That’s great advice there. So, with all of the hoopla around generative AI and modern data stack, where do you see the future of the modern data stack?  And how do you anticipate it evolving?

Nick Jewell:

That’s a good one. So honestly, and you and I both know this, it’s a really, really exciting time to be working in data analytics and probably the modern data stack more generally. There’s probably a couple of big forces in play right now. 

So first of all, we talk about the modern data stack as being this tightly integrated set of different components, all of them best of breed, all of them working really smoothly together. And there are also the mega-vendors, the cloud service providers, wanting a bigger piece of the pie. So they will come to you with marketing that says, “Why risk overspending on integrating all of these smaller components together when you can have one vendor to rule them all?” So this is where we see this huge big-picture architecture coming from AWS, from the Google Cloud Platform, even Microsoft. In 2023 alone, Microsoft Fabric is a really interesting development: a set of really cohesive services that pull together.

Secondly, we also talked about the modern data stack as being closely aligned with the concept of centralization. We talked about this needle earlier on. That’s existed since the earliest days of the data warehouse itself. In the last couple of years, a really strong opposing force has actually appeared in the form of the data mesh, which is much more around the decentralized delivery of data products and analytics insights to an organization, maybe even beyond an organization’s four walls. It’s early stages. 

I think we’re starting to see ourselves getting beyond the initial peak of hype. We’re starting to see folks like Eric Broda, who’s done some great work on deployment patterns for data mesh. It’s going to be really interesting to see if Gartner changes their mind on this topic. They called it and said this is going to be an obsolete technology before it reaches the plateau of productivity. We almost put some money on this. I’d like to see whether they’re right, or whether they’re backing a different horse in this case, but that’s another one to watch.

And then I’m gonna give you one more. I’m gonna say, we can’t not talk about large language models, right? Every single vendor that I’ve seen is working on this. Matt Turck has this amazing diagram called the MAD, the machine learning, AI, and data landscape, and we’ve included a small snapshot of it in the book. Thousands of vendors, all sprouted onto this screen. Nearly every single one of those right now is scrambling to include a natural language interface that’s based on some kind of large language model. So again, my recommendation here is to watch the mega-vendors.

Snowflake is doing interesting stuff around partnerships in this space. Databricks has acquired MosaicML as a huge investment in their portfolio. Amazon and Anthropic, Microsoft and OpenAI. It’s fascinating, and it’s a really active space. So keep an eye on what every vendor is doing here. It’s gonna get a lot easier to analyze your data.

David Sweenor:

Yeah, and I think today is quite an exciting time. I do have a random question. What is it with data people? They’re very creative. We have data mesh, data fabric, data lakehouse, data swamp, and data blanket–are we running out of terms, or do you have any predictions on some new terms that we might get in the future?

Nick Jewell:

We’ve still got a data skip and a data refuse center where we might recycle something I don’t know. I’m calling it now.

David Sweenor:

There you go. You’ve coined it. Well, now, this has been a great interview. So I think we’ve got the TinyTechGuides top three rapid-fire questions. And let’s see how you respond to these.

If data analytics were a sandwich, what ingredients would it have?

Nick Jewell:  

So we’re going to need a solid foundation. Let’s grab some homemade sourdough that we’ve somehow managed to keep bubbling away and brewing for the last 24 hours. We’ll talk about sourdough very quickly. Let’s think: it’s scalable. We can make bread, loaves, the whole works. It’s pretty elastic, and it’s pretty resilient. It can last for decades. So sourdough’s my foundation. My filling, David, it’s got to be bacon, because data is the new bacon, right?

David Sweenor:

Bacon, you put the quote in the book, I saw that.

Nick Jewell:

In the UK, we don’t just cremate bacon like you guys do in the States. So we’d recommend a couple of good dry-cure rashers. And you know what, the little pizzazz on top, let’s have some HP sauce because we’re coming from the UK. A little bit of data visualization, a little punch for our stakeholders as well. How about that?

David Sweenor:

I love that. That is exactly right. 

So okay, next question. What is more challenging, deciphering complex data or deciding what to watch next on Netflix?

Nick Jewell:

Okay, 100% Netflix. Those folks really do themselves a disservice. They have so much diverse content, and you either end up watching what’s on that top banner (yes, Love is Blind, series five), or you start to death scroll, and then it’s lady luck taking over. I want to give all of our listeners today a little tip. So if you go to netflix.com/browse/genre and then type a four or five-digit number, you will find carefully curated micro categories. So if you want to find wine and beverage appreciation content, or you want to find understated horror movies, it’s there. Netflix has done this amazing data cataloging behind the scenes, and they don’t talk about it. It’s so amazing. So if you were to pick one of those categories, 1458, that’s your wine and beverage appreciation, so get stuck in. It’s great. It’s amazing.

David Sweenor: 

That is awesome. All right. The last question is, if there was a data and analytics superhero, what would be their superpower? What would be their name?

Nick Jewell:

Are you a subscriber to Disney+ at all, David?

David Sweenor:

I used to be, I turn these services on and off. Almost on a monthly basis.

Nick Jewell:

Quite right, you probably should do that. But I’m on Disney+ right now. There is a series called Secret Invasion that is basically a spin-off of the Marvel Cinematic Universe. They have these aliens called Skrulls, and Skrulls can basically shapeshift into anyone and inherit all the powers of the Avengers. So it’s kind of a bit weird. It ends up with the actress who plays Khaleesi in Game of Thrones basically becoming a bit of Thor, a bit of Captain Marvel, a bit of Groot. Very special-effects heavy, kind of a dog’s dinner in terms of a plot, but basically the strongest character in the multiverse. So this is my data analytics superhero for you. So you’ve got data management skills, you’ve got a little bit of architecture, you’ve got analytics chops, you’ve got a bit of data science, and you’ve probably got a good large language model helping you out throughout. So I’m going to call them, here we go, not a data skip, no data refuse for us. Let’s go Data Dynamo, somebody, you know, that can really power things up. And they probably wear those Clark Kent specs as well, to carry on as a regular mortal for the rest of their days.

David Sweenor:

I love that. Well, Nick, I appreciate your time today. This has been an amazing discussion. Congratulations on the book launch again. I’d encourage everybody to grab a copy. You can get them wherever fine books are sold. If you’re nice to Nick, I might send you an autographed copy.

Nick Jewell:

Absolutely, get in touch on LinkedIn for sure.

David Sweenor:

That’s right. And if you have a copy, please leave a review on Amazon. It really does help. So Nick, thanks again, and have a great evening.

Nick Jewell:

David, thanks so much.


If you’d like a copy of Nick’s book, you can buy one here.