May 23, 2023 | 39 min

S02 E13: Building a Data-Driven Fashion Empire: The Zalando Data Foundation Story with Dr. Alexander Borek, Director of Data and Analytics at Zalando

Step into the world of Zalando, Europe's leading online fashion retailer, where data drives innovation and enhances the customer experience. In this episode, join us as we interview Dr. Alexander Borek, the brilliant mind behind Zalando's data and analytics strategy. Discover how Dr. Borek and his team have revolutionized the company's approach to data by implementing the cutting-edge concept of data mesh. Learn how Zalando successfully strikes the perfect balance between decentralization and structure, unleashing the full potential of data while maintaining collaboration with various business units. Dr. Borek also unveils the secrets to leveraging data for innovation and value creation in the dynamic world of online fashion. Tune in now for an eye-opening exploration of data management, leadership, and the future of data-driven decision-making at Zalando.

Available on:
Google Podcasts
Amazon Music
Apple Podcasts

About the guest

Dr. Alexander Borek
Director of Data and Analytics

Dr. Alexander Borek is the Director of Data and Analytics at Zalando, a leading European online fashion retailer. With extensive experience in data management and analytics, he has played a key role in establishing the Zalando Data Foundation, driving data strategy, and promoting cross-functional collaboration. Dr. Borek's expertise extends beyond Zalando, as he has also built a centralized data analytics and AI unit at Volkswagen Group. He is actively involved in the data community through his co-founding and co-leading of Data Masterclass, a platform that gathers insights from data leaders. With his deep knowledge and passion for data, Dr. Borek provides valuable insights into the world of data-driven decision-making and digital transformation.

In this episode

  • Role of data in Zalando's operations and the importance of leveraging it for decision-making.
  • The data mesh transformation: how Zalando addressed the challenges of decentralization by emphasizing structure and governance.
  • Strategic data product management and optimizing the data value chain.
  • Zalando's modern data stack, including a microservice architecture, data lakes, Spark, and more.
  • The significance of generative AI in data teams.


Hello everyone, and welcome to the latest episode of the Modern Data Show. Today, we are joined by Dr. Alexander Borek, a data leader, author, and speaker who assists tech companies and established firms with their data and AI journeys. He has a wealth of expertise in establishing global platforms and scaling software products. Dr. Borek is currently the Director of Data and Analytics at Zalando, where he manages the central data analytics team, driving the data mesh transformation across the business units. Previously, he served as global head of data analytics and AI at Volkswagen Financial Services, where he ran a highly scalable international operating unit, developing and scaling high-value AI products. Alex has also launched Data Masterclass, a community for data leaders, which we'll get into in more detail in a while. Welcome to the show, Alex.
Hey, great to see you. Thank you for inviting me.
It's a pleasure, Alex. So let's start with a very basic question. How did you get into the data space? How did that happen? You've gone from working as a data strategy consultant at Gartner and IBM to now being the Director of Data and Analytics at Zalando. Tell us a little bit more about your journey.
Yeah. So the funny thing is that it actually started much before that. I was in my final year at high school, thinking about what I should study. I was thinking about physics or computer science, but then I ran into a friend in the gym and he told me about his new program at university, which was focusing on data and information. It was an interdisciplinary program with business, computer science, and law, all around data, because our university thought this would be the future in the 21st century. And I said, wow, I want to study the future, and that's why I signed up for that.
Amazing. Tell us a little bit about your role at Zalando now. And for all of our listeners who are probably outside of Europe, tell us a little bit more about what Zalando does.
So Zalando is the number one online fashion retailer in Europe. We serve 50 million customers, even 51 million now, in I think 27 markets in Europe every month with fashion goods and beauty goods. And we created an ecosystem of over 5,000 brands that also use us as a marketplace, as the digital transformation platform for these traditional brands to connect to their consumers online. So it's a very amazing place. And I think the very exciting thing there is that we have a lot of data. It's very tech driven. And I have the honor and pleasure to have established, together with a colleague, the central data organization. I'm running the data analytics side and my colleague is running the data platform side, but together we established, over one and a half years ago, a central unit that starts to orchestrate a data strategy between all the data, analytics, and ML teams.
Amazing. And tell us a little bit about the difference between the data analytics team and the data platform team. What does that mean? How are these two teams different?
It's one unit. We call it the Zalando Data Foundation. The whole idea of the unit is to provide the foundation for data across Zalando. The teams all across the business build stuff that drives value. We have really a lot of analysts that drive insights every day. We have a lot of machine learning people in the business units that create some big ML models that run on the website, or are really customer facing, or steer our logistics. But what we do at the Zalando Data Foundation is provide the infrastructure and the ways of working, essentially, to work with each other, because you only get so far when you work with the data inside your own organization. You always need the data of the other organization. If you're working in logistics, you need the data of marketing, and vice versa. And we're providing the ways of working and the infrastructure to connect all of this. And then we split it up because it's quite big. We are overall nearly 150 people, and we had to somehow divide and conquer all of this complexity. So I'm focusing on the analytics stack, and also on driving central use cases for central functions like finance, central business steering, and things like that, while data platform is really managing the lake and the warehouses on top of that.
Right. And I have a follow-up question on the Zalando Data Foundation, which is something that you co-founded and have co-led. But before I do that, let me chip in with another basic question: what's the role of data in a fashion company like Zalando? What key role does data play in your business?
I would say that the DNA, the nature of Zalando, is that we are both a fashion and a tech company. So you could say art meets science, and I think that's our strength. What you would see is very similar to all the other large tech companies. Our management is very data driven, which means that at every corner of the company, people take decisions based on data every day. And we have hundreds of analyst teams across the company to leverage that data and to build insights for everyone making decisions. That is the higher management, but it's also operations managers and customer-facing managers that have to drive decisions on behalf of our customers or our partners. So I would say that's number one. Number two is that we have put ML into the heart of our business. We have probably 10 really big algorithms where we have large teams, whole departments, or even orgs around that one algorithm, at the heart of what we do. Size and fit is one of these areas. Recommending the right size is very important for us for many reasons. It's about costs. It's about customer experience. It's about environmental goals. All of these play a big role. Pricing is a typical use case for e-commerce, which is really important because we have lots of goods on the platform. Recommendation, calculating the right things to show on the website so it's more personalized and you have a journey that inspires you, is another area that really matters to us. And there are a bunch more like that, which are probably not surprising. But I think the key thing here is that we found the sweet spots and we invest quite a lot in those. And those are really things that run at the heart. You cannot pluck them out anymore.
Amazing. Now let's talk a little bit about the Zalando Data Foundation. That sounds a little different, right? This is probably the first time I'm hearing of a data foundation within an organization. And from what I've read, its main goal is to drive the data mesh transformation across the whole group. Two questions. One, what is the Zalando Data Foundation? It's a slightly different term to hear. And second, help us understand the whole concept of data mesh at Zalando. Why did you even take this approach?
So let me take a step back here. I would like to dive a bit into an experience that I had beforehand at Volkswagen Group and Volkswagen Financial Services. As global head of data analytics and AI, I was charged with building up digital transformation across the group when it comes to data. I was part of a larger unit that was driving digital transformation, and my task was to bring in a data strategy and a platform strategy to drive that across all pillars, globally, and to bring more advanced analytics into play. And I can tell you from that experience, when you start off with something like that, all companies that I know already have BI teams, so business intelligence teams, out there. When you start to scale this, you want to do two things. You want to reach more people, so you start the self-service analytics play, you could say. And the other thing is you want to build really killer ML products, things that really bring you a lot of value, and then scale them across different markets and business units so that they adopt those ML models. When you start something new, you usually start quite from the center. What I did at Volkswagen Financial Services was build a data analytics and AI unit at the heart of the company. And then we only slowly created teams in the US and in China to have a global footprint, but we were driving a lot from quite central teams, you could say, and then there would be some hubs, a few fighters here and there that would do data science or analytics somewhere in the markets around the world. We were operating globally. But that thing doesn't scale.
It's very good to start off like that in the beginning, but over time you want to put people closer to the business. You want to have people in each market doing data science, running their own models that are more specific to that market, or in each business unit, running very business-unit-oriented stuff. During that journey, it was not a problem because the opportunities were endless, we just focused on the most important things, and we grew the central unit. But at some point that doesn't scale. You have to put things close to the business to make it scale. And at Zalando, when I joined, it was a very different picture, because the journey that I described at Volkswagen Financial Services had already happened in the years before. There was a very central data team, end to end, very large. And at some point it became the bottleneck. So those teams were put right into the business units. And from one day to the other, the whole pendulum swung from a very centralized org to a very decentralized org, where all business units would create a dashboard, would create a report, would create a model. Basically it was like a party, a data party or a tech party. Everyone started to build stuff. You can imagine that at first this created a tremendous amount of value because you were very close to the business. You knew what your people wanted and you built what they wanted. Now, that's great, and you really increase the speed of value until you don't, because at some point it swings back. What happens is that you end up with a lot of teams building products every day, very decentrally, finding some data here and there and putting it in the data lake. It's copied from here and there, they do something, they put it back, somebody else takes it, and they create something new out of that.
Another person takes it, they create something new out of that, and it becomes quite a maze. You create complexity that is very costly, not only infrastructure-wise, but because you don't know what to use. There's a lot of tribal knowledge, but as you grow, and we were growing quite a lot during the last years, that tribal knowledge doesn't scale as well anymore. So you have to create another system of how we work with data. And this is where I came in. I was asked, can you find a balance, create a little bit more structure in how we work together with data? Together with my colleague, we thought through how we could do that. We worked, for the first time in the history of Zalando, on a data strategy and published it last year. And that strategy was really more about ways of working around data. That's very different to the data strategies I created at Volkswagen Group, or earlier on when I was working at Gartner and IBM as a strategy consultant. I was doing a lot of these data studies, and it was always about how much money is in there and where we should put our dollars or euros. At Zalando, that was not the point. Everyone saw every day how much value there is in data because there were so many products built. The problem was that if you wanted to change something, change the technology, or build something new that required stuff from other people, it was incredibly costly and difficult, and you often had to create quality controls at the point of consumption and not at the point of creation. And this is where we made that change. Before, the central data teams were just providing infrastructure, all data was loaded into that infrastructure, and people just used whatever made sense for them individually. Now we said, let's build out of that a real data platform and a real operating model.
One where everyone can build whatever they want at the same speed. So the decisions about what to build are still close to the business, but how we build it and how we share it would be something like a contract between each other. So we brought the whole company together to discuss how we could do that best. And during that journey, which started two and a half years ago, we really found that data mesh and data as a product is a paradigm shift that actually helps there. A lot of people, when they think of data mesh, think of decentralization. They think, we want to put people everywhere and do data. We were at that point already. What we needed was more structure and more governance without centralizing everything again, because that would create massive bottlenecks. That wouldn't work, and you couldn't put the genie back into the bottle. So that's why we came to data mesh: we wanted to use the ways of working and the governance mechanisms of data mesh. And a few years back, that led us to become one of the first companies that started to experiment with these concepts, quite early on. I think it was when Zhamak also started writing about it, and we actually worked closely with ThoughtWorks back then as well. It was before my time, but it's quite intertwined.
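A minimal sketch of the "contract" idea described above, where quality is checked at the point of creation rather than at the point of consumption. This is an illustration only, not Zalando's implementation: the `DataContract` class, the `publish` function, and the `logistics.shipments` product with its fields are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract a producing team publishes with its data product."""
    product: str
    owner: str
    required_fields: dict[str, type]  # field name -> expected Python type


def violations(contract: DataContract, record: dict[str, Any]) -> list[str]:
    """Return the contract violations for one record (empty list if valid)."""
    problems = []
    for name, expected in contract.required_fields.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            problems.append(f"wrong type for {name}: expected {expected.__name__}")
    return problems


def publish(contract: DataContract, records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Check every record at the point of creation; refuse to publish a bad batch."""
    for i, record in enumerate(records):
        found = violations(contract, record)
        if found:
            raise ValueError(f"{contract.product} record {i}: {'; '.join(found)}")
    return records


# Illustrative contract for an invented logistics data product.
shipments = DataContract(
    product="logistics.shipments",
    owner="logistics-data-team",
    required_fields={"order_id": str, "shipped_at": str, "parcel_count": int},
)
```

In this sketch the producing domain still decides what to build, while consumers in other domains can rely on the published shape instead of re-validating the data themselves.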
That's quite an interesting journey. Tell us how you actually went ahead and implemented a data mesh. It's more of a conceptual, philosophical idea than a framework that you could adopt, right? So how did you go from taking that concept to actually implementing it? Tell us a little bit more about that journey.
So I think for us, because we started the journey before the time when it became really popular, it was really more of a natural evolution. Think about it: analysts are everywhere, but we also put the BI teams, which take care of the data warehouses, in the different business units. We have a logistics data warehousing team, for example. In each unit we have a domain team, you could say. So when you want to report stuff out, you need to come together and come up with some rules. For our data warehouse, we have a team that helps to keep it safe and creates a community around it, but most of the engineering has happened in the domains for years already. So whenever they needed some standardization, they came together and agreed on some rules on how things should be interoperable, and how to make sure that you can trust each other's work in the instances where it's important. I think at Zalando we have a very strong culture of collaboration, so this in a way naturally started to happen. We had decentralized ownership. We had people needing data from other domains. We had a central platform, lake plus data warehousing. We had a central Kafka event-driven architecture where all systems were communicating with each other. So that is where we naturally ended up as we created that decentralized model. One big realization we had when we were doing the data strategy, though, was that we need to differentiate data more. You cannot treat all data equally. Data is not oil, because that would mean that it's interchangeable. Data as a product is a very great concept because it really tells you: hey, I want to build a product, which means that product needs to have value.
So I need to be very selective. No company builds millions of products. Every company thinks about which product it should invest in. It's a strategic decision. So I think bringing that element of strategic product management into data was a game changer, because it meant for us to shift from big data to lean data, to focus on the data sets that really matter. We've done a lot of analysis of which data sets are really crucial and then started to double down on the standards. And that's a process that we're currently in, to increase the standards for the stuff that really matters. And I think the very interesting thing, and I don't think it's covered enough in the data mesh concepts yet, is that as a data leader you need to look at it even more strategically. I call that strategic data product management, because in the end it's not only about the data set being pushed out randomly to lots of people. I think it's about working backwards and understanding where you have the impact. If you have certain ML and AI products that run on the website, serving 50 million customers every day in every transaction that we do, you need to secure these. You need to work backwards and think: okay, what do I need for that? Can I improve something in the supply chain to make that algorithm better, or more secure, or more reliable, or more auditable, or easier to make regulatory compliant? All of these things. And so I would really encourage every data leader these days to think of data, analytics, and ML products, but also the platform products that you build, as a value chain. I think that's a thinking that we were missing a bit as we decentralized. It's very natural in any industry, when you mature, to start to separate the work and specialize.
But the risk is that you lose sight of the value chain. You see that in many disciplines, even in hardware manufacturing, where some companies like Apple manage their products more holistically, and they are really better than the other manufacturers. And I think you have the same in data, when you manage that value chain more strategically and find ways of working in product management. Because it's far more than engineering. I think the engineering part is not the biggest problem. I think we are still relatively immature in product management. If you fix that, I think you can create 10 times more value than you do now as a company.
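The "working backwards" idea can be sketched as a walk over data lineage: start from a product that matters and collect everything upstream of it, since those are the data sets where higher standards pay off. This is a rough illustration under invented assumptions; the product names in `UPSTREAM` and the `critical_inputs` helper are hypothetical and do not describe Zalando's actual lineage.

```python
from collections import deque

# Hypothetical lineage: each data product maps to the products it reads from.
UPSTREAM = {
    "size_recommendation_model": ["returns_history", "article_measurements"],
    "returns_history": ["order_events"],
    "article_measurements": ["partner_catalog"],
    "weekly_marketing_report": ["order_events", "campaign_clicks"],
    "order_events": [],
    "partner_catalog": [],
    "campaign_clicks": [],
}


def critical_inputs(product: str, upstream: dict[str, list[str]]) -> list[str]:
    """Walk backwards from a product and return every data set it depends on,
    directly or indirectly, in breadth-first order."""
    seen = []
    queue = deque(upstream.get(product, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(upstream.get(current, []))
    return seen


# Data sets that sit in the value chain of the customer-facing model.
print(critical_inputs("size_recommendation_model", UPSTREAM))
```

Here `campaign_clicks` never shows up for the size model, so in a lean-data view it would not need the same level of investment on that model's behalf.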
Wow, I absolutely love that analogy. And I also love the phrase "from big data to lean data". I think that's probably the episode title we'll end up with. So moving on, tell us a little bit about the modern data stack at Zalando. What kinds of tools and technologies are you using, right from data production to data storage to data consumption? Walk us through your stack.
I think it's a very classic modern data stack. And the particular thing that some people from outside don't see when they look at tech companies that are a bit more mature is this: because we were founded 13, 14 years back, we have already had to migrate our stack a few times, because it's so fast moving. So for any startup or scale-up out there, as you scale, maybe right now from a hundred people to 500 people, think about the next years. You will need to migrate your technologies at some point, so plan for that. And one thing we really learned is that when you manage data as a product, it's so much easier to migrate from one technology to the other. If you don't, if you think in data pipelines, you have this maze of cables running around and it's so much harder to migrate anything. So I can tell you, we had a few migrations already, but right now I think we operate a very classical modern data stack. We have our microservice architecture on the engineering side that communicates via an event-driven architecture, which is, I would say, an improved version of Kafka. And we basically historize all the data in the data lake, which is basically cloud-based storage, very typical. On top, we have technologies for Spark and then for data warehousing. We have some instances where we already work with some lakehouse type of technologies. We found that they are not always at the scale that we operate at, and they don't always have the performance that we need. At the end we have an access layer, a fast serving layer, that needs to provide the data to a large community. I have over 3,500 active users that are using analytics every month, and they are using it quite intensely with lots of data.
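As a rough illustration of what historizing events into cloud-style storage can look like, the sketch below appends events to date-partitioned files. It uses only the local filesystem and the Python standard library; the event fields (`event_type`, `occurred_at`), the path layout, and the `historize` function are assumptions made for the example, not a description of Zalando's event bus or lake.

```python
import json
from collections import defaultdict
from pathlib import Path


def partition_path(root: Path, event: dict) -> Path:
    """Build a date-partitioned file path from the event type and timestamp."""
    # Assumes ISO 8601 timestamps such as 2023-05-23T10:15:00Z.
    day = event["occurred_at"][:10]
    return root / event["event_type"] / f"date={day}" / "events.jsonl"


def historize(root: Path, events: list[dict]) -> dict[Path, int]:
    """Append events to their partitions; return how many landed in each file."""
    batches = defaultdict(list)
    for event in events:
        batches[partition_path(root, event)].append(event)

    written = {}
    for path, batch in batches.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        # Append-only writes keep the full history of each event type.
        with path.open("a", encoding="utf-8") as handle:
            for event in batch:
                handle.write(json.dumps(event) + "\n")
        written[path] = len(batch)
    return written
```

Engines such as Spark can read this kind of `date=...` directory layout as partitions, which is one reason it is a common choice for a lake that feeds both processing and warehousing layers.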
Wow. And moving on from Zalando, we also happened to notice that you run a community called Data Masterclass, which is basically a community for data leaders. Tell us a little bit more about that community. I also heard that you have an event coming up in June, so tell us more about that as well.
Yeah. I've been working as a data leader for roughly eight years now, and before that I was doing strategy consultancy for data leaders, where we built up large data organizations for clients. And I really thought there is a need to accelerate the adoption and leveraging of data at scale in the industry. In particular in Europe, what we see is that over the last 10 years we wasted too much time building POCs. So much money was really wasted on things which never got into production. We see that there are huge barriers to rolling stuff out. It's great that people have good data organizations already that know what they do, but it's really a cultural thing. It's really a transformation thing. When you want to have more people in the organization being part of that game, and when you want to put data, analytics, and AI at the heart of what you do as an organization, it's a transformation, and I can tell you most companies in Europe have not gone through that transformation yet. They are at the beginning of it. You have some tech companies that are further along the journey, definitely. But when you look at the vast majority of companies that employ people in Europe, they have not undergone this transformation. So I was thinking, how can I help? First I thought, okay, maybe I start something on the side, maybe not consulting, but more of an advisory kind of gig where I help you guys to run this. But then I really realized there are patterns, and I think that's quite an important thing when you do transformation. At Volkswagen Group, I was doing one of the largest digital transformations on this planet. It's a group of 625,000 people.
It's 12 large brands, and the brands are massive: Audi, Porsche, Bentley, Volkswagen, Skoda, MAN, Scania, so really large companies within one group. And I had to scale and roll out data transformation across these brands and all these companies, so I was a serial transformer, you could say. I realized there are really patterns that work, and you can waste a lot of time when you don't approach it the right way. So I was thinking about starting a community to collect these patterns, a community for data leaders like myself, including data leaders that are more seasoned, more experienced, or are really expert in one thing or the other, who would share with each other their story, some of the breakthroughs that they had, maybe some of these patterns and recipes that they experienced. And I didn't want to create a forum where people just come and talk and that's it, or a podcast and that's it. That's also great, but I wanted to go one level further with Data Masterclass. I have the ambition to build a practical body of knowledge for data leaders, by data leaders, so data leaders contribute to that practical body of knowledge. I have started a podcast, which is Data Masterclass. I'm working on some online courses. I'm doing research, I'm creating articles, and every day I'm interviewing several data leaders on what has worked. And in June I'm running the first large flagship event, Data Masterclass Europe, where we spend three days with 70 data leaders in Berlin and go really strategically through all the areas that you need to make a transformation succeed. It will not be a conference, but a series of workshops. We will have eight workshops.
Each workshop is two and a half hours, and for each workshop I have been preparing, with a few experts, a comprehensive briefing for that area. Things like data mesh, or the metric store, or data fabric, or simply creating a great strategy and convincing people to come along. Those are all topics for the workshops. In each workshop, we really tell data leaders: we've been researching for several months now, this is what you need to know in 2023, those are the things that we learned throughout the years, and those are the current developments that we see. We also invited data leaders to share their story. In each workshop, we will have data leaders that have created a breakthrough in that particular area, where they maybe created a strategy that they scaled globally. They tell us the story: those were the obstacles, this is how I overcame them, this is what I learned, and this is my advice to you. And then we have these practical parts where we split the data leaders up into peer groups, and they really come together and think through what it means for them, going through that checklist and thinking: where can I improve my stuff? And maybe, how can I support the others in my peer group to succeed?
Wow. That sounds really exciting. And how can data leaders who are listening to the show sign up for this event?
So every data leader on this planet is invited to join. The event is in Berlin from June 21st to 23rd, so three days. And you can simply go to the website www.datamasterclass.eu. So it's datamasterclass.eu, like Europe, because we call the event Data Masterclass Europe, as it's taking place in Europe.
Awesome. And we'll be sure to put this in the show notes as well, so that anyone who wants to sign up is able to do that. All the best for that event, Alex, and heartiest congratulations on doing this. I think this is something that a lot of people would benefit from. All the best for that.
Thank you.
And now, as we are inching closer towards the end of this episode, let me ask you about one hot topic that has been on every leader's and individual's mind over the past six months: generative AI. What role do you think generative AI will play for data teams?
Very good question. And of course, that's something that keeps me very busy these days. So look, I think it will force us as data leaders and data teams to recalibrate our work this year. Most of us had a plan: this was the plan for this year, for the next years. And I think right now is the time to go back to your plan and think through whether there are certain changes to it because of what's going on. I don't think that means that those plans will fundamentally change in the sense of, okay, we drop everything and we start from scratch. I would rather think every one of us needs to go back and make some adjustments. That's very different to scrapping your plan. Why? Because data will still be very important in the next years, independent of the AI technologies that you use on top. In fact, AI comes out of the box, and that's what we see with generative AI: we see APIs where you put in some data and you get results back. So you don't need your data scientists anymore for these things. A lot of AI will come, and what we see, of course, is not only the large language models. What we see is a whole ecosystem. When we had e-commerce and mobile commerce, when we had social media and smartphones coming, and IoT, during all these stages you saw a whole ecosystem of apps being built, and we see that evolving right now. Every day there is something new. It's almost crazy. It's like a gold rush right now. Everyone is building something because it's a paradigm shift and a platform shift, which means that more AI functionality really comes out of the box. That means the only thing that differentiates you from the others is either that you are really the next OpenAI for your discipline, maybe not so big, maybe just for your particular industry, and you create an AI model which is really outperforming everyone else. But in most cases you will not be that company.
In most cases, the thing that is really unique is your data. That's why I still believe that, after those adjustments, you will probably find yourself doubling down on your investments in your data foundation, in data as a product, and in creating platforms that allow you to really operate data as a product in a very lean manner and plug into this AI ecosystem. Now, that doesn't mean that generative AI is not important. I think it is. There will be use cases in your company that are game changers. And there will be at least one or two people on your board that come to you as a data leader, or to your data team, and say, okay, guys, help us to leverage generative AI. So I would start to prepare for that. One way to prepare is to first understand the technologies and how to leverage them. There's a lot of good stuff, even if you're a corporate. You can use OpenAI Studio on Azure. There's good stuff on Google. There's good stuff on AWS. There's Hugging Face, as open source. There's a lot of stuff that is actually quite good to use for corporates these days, with a high standard. So start playing around and see how you can incorporate that. But then it's back to the big data times, where you start these kinds of design thinking workshops, going to the business and thinking through which part of the value chain could benefit most from generative AI. You run these workshops where you have a lot of sticky notes and you inspire people: this is what you can do with generative AI, now what should we do in our business? And you just start off some of the stuff. The final thing I wanted to add is that maybe you should also do that yourself as a data team. Because I think one of the killer use cases of generative AI is really helping you to code and to analyze data sets. Of course, you cannot just throw your data in there. You need to take care of security.
But I would really recommend it. Data engineering is always a scarce resource, so make your data engineers more productive, or, if you are a data engineer, think about making yourself more productive in a safe and secure way. As data engineers, that's a good thing for us, because we are used to building stuff that needs to be safe. So I think that's really the third area I would look into. And I would really recommend being prepared to have conversations with your C-level, because they will become very curious. They probably are already, and if they haven't reached out to you, they will reach out soon. Start creating some slides or a document, put together some ideas, interview people, so you come with first ideas when they approach you.
That's very insightful and very inspirational. As we get closer to the end of the show, I would love to extend my heartfelt gratitude to you for doing the show, Alex. It has been a pleasure having you on, and I'm sure we all learned a lot from this whole episode. Thank you so much.
It was really a pleasure and an honor to be here.