#193 The Hidden, Pesky Persistent Challenges in Data-Intensive Applications/Service/ML - Interview w/ Ebru Cucen

Data Mesh Radio - A podcast by Data as a Product Podcast Network

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice!

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.

Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.

Ebru's Twitter: @ebrucucen / https://twitter.com/ebrucucen

Ebru's LinkedIn: https://www.linkedin.com/in/ebrucucen/

In this episode, Scott interviewed Ebru Cucen, Lead Consultant at Open Credo. To be clear, Ebru was only representing her own views on the episode.

Some key takeaways/thoughts from Ebru's point of view:

It's far too hard for data producers to reliably produce clean, trustworthy, and well-documented data. We need to give them a better ability to do that; whether that is tooling or ways of working remains to be seen. Scott note: It's no wonder it's been hard for many teams to get their domains to own their own data ;)

There is a hidden challenge in data-intensive service/application development. The versions of the data - the schema version, the API version, and the version of the data itself - need to be understood and coordinated, because developers don't control their own data sources, unlike in software development of the past. But we don't have good ways of doing that right now on the process or tooling front - data product approaches help but fall short.

We are lacking the tooling to easily manage data quality for producers. While there are so many data-related tools, there is a real lack of things that make it easy to manage quality. We are getting there on observing or monitoring quality, but not on managing and maintaining quality.

Fitness functions can help you measure if you are doing well on your data quality/reliability.

As the speed to reliably ship changes on the application side increased - microservices and DevOps - that just made the data warehouse, the data monolith that