Madelon Hulsebos on Tabular Machine Learning - Weaviate Podcast #72!
Weaviate Podcast - A podcast by Weaviate
Hey everyone! Thank you so much for watching the 72nd episode of the Weaviate Podcast with Madelon Hulsebos! Madelon is one of the world's experts on Machine Learning with Tables and Tabular-Structured Data, and this was such an eye-opening conversation! We discussed all sorts of topics, from the relationship between tabular data and embeddings to searching through tables, semantic joins, more complex Text-to-SQL, using machine learning for query execution, using tabular data in search and recommendation reranking, and many more! This was easily one of the most knowledge-packed episodes of the Weaviate Podcast so far, so please don't hesitate to leave any questions or ideas you have related to the content discussed!

You can learn more about Madelon's incredible research career and publications / talks here: https://www.madelonhulsebos.com/. Papers such as GitTables are listed there! Another nice nugget from the podcast: Madelon introduced me to the BIRD-SQL benchmark, which really expanded my understanding of Text-to-SQL (https://arxiv.org/pdf/2305.03111.pdf).

Chapters
0:00 Welcome Madelon!
0:58 Tabular Data and Embeddings
3:10 Tabular Representation Learning
5:48 Semantic Type Detection
9:50 Pandas as an LLM Tool
11:52 Table-Based Question Answering and Text-to-SQL
19:35 Joins with Machine Learning
21:38 Query Execution with Machine Learning
22:45 Graph Neural Networks
24:07 XGBoost
28:28 Merging Tables
32:10 Fact Representation
35:50 GPT-4V and Tables
39:00 Metadata in Embeddings
42:45 Table Retrieval in Weaviate
46:25 Exciting future directions!!