635: The Perils of Manually Labeling Data for Machine Learning Models

Super Data Science: ML & AI Podcast with Jon Krohn - A podcast by Jon Krohn

Hand labeling data and information bias: Jon Krohn speaks with Watchful CEO Shayan Mohanty about the pitfalls of data analysis when bias comes into the equation (spoiler alert: it always does), the importance of the Chomsky hierarchy in data management, and the importance of simulation engines for returning real-time results to users.

This episode is brought to you by Iterative (https://iterative.ai), your mission control center for machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

In this episode you will learn:
• Why bias in general is good [04:06]
• The arguments against hand labeling [09:47]
• How Shayan solves the problem of labeling at his company [24:26]
• Misconceptions concerning hand-labeled data [43:25]
• What the Chomsky hierarchy is [52:38]
• Watchful’s high-performance simulation engine [1:04:51]
• What Shayan looks for in his new hires [1:08:15]

Additional materials: www.superdatascience.com/635