Finding meaning in data, with Caroline Keep

Future-Proof Your Career - A podcast by Tom Cheesewright | Podcast.co

In this episode of Future-Proof Your Career, we speak to Caroline Keep, a data scientist, a teacher, a maker, and a researcher in machine learning. She is the recipient of multiple awards, including the Times Education Supplement teacher award, and a founder of Liverpool Makerfest. We spoke to Caroline about how you extract meaning from data, and how we can all be more engaged in the effort to decipher the world around us. Here’s what we learned.

Data is the real world, quantified

Don’t think of data as just endless spreadsheets and numbers. It’s a representation of the real world and the things that matter. Understanding the data is a way to understand the world.

Understanding data is a process

Caroline talked about multiple steps in the ‘data cycle’:

Start with discovery: play with the data at your disposal to get a feel for it
Create a hypothesis: what are you trying to test?
Discuss your idea with other people and gather perspectives, check your reasoning
Clean your data: the real world is messy and full of bias and noise
Test your idea: does your hypothesis hold true?

Build domain knowledge

Understanding the space you’re exploring is critical to give you a reference point. Otherwise you won’t know if the results you find are nonsense!

If the data you want doesn’t exist, you can get it

There are lots of sources of interesting data, but the Internet of Things makes it cheaper and easier than ever to collect data that doesn’t exist. Whether you want to track temperature, movement, light or pollution, or anything for that matter, simple sensors and cheap computers like the Raspberry Pi allow anyone to experiment (see links below).

Caroline referenced some great resources and projects, including:

Kaggle: a data science community - https://www.kaggle.com/
Node-RED: a drag and drop IoT platform - https://nodered.org/
Kettle Companion: a connected kettle that helps carers keep an eye on vulnerable people - https://kettlecompanion.com/
RStudio: software for data science - https://posit.co/products/open-source/rstudio/
Python: a powerful but accessible programming language - https://www.python.org/
Jupyter Notebook: https://jupyter.org/
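
For anyone who wants to try the data cycle for themselves, here is a minimal Python sketch of the discovery, cleaning and testing steps. It is only an illustration, not something from the episode: the temperature readings are made up, the function names are hypothetical, and the "plausible range" used for cleaning is an assumption you would set from your own domain knowledge. The hypothesis being tested is simply that afternoons are warmer than mornings. The "discuss with other people" step has no code equivalent, so it is left out.

```python
from statistics import mean

# Made-up readings: (hour_of_day, temperature_celsius). None and 999.0
# stand in for the dropouts and glitches a cheap sensor might produce.
readings = [
    (8, 14.2), (9, 15.0), (10, None), (11, 16.1),
    (13, 19.4), (14, 999.0), (15, 20.3), (16, 19.8),
]

def discover(rows):
    """Discovery: get a rough feel for the data before theorising."""
    values = [t for _, t in rows if t is not None]
    return {
        "count": len(rows),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
    }

def clean(rows, low=-30.0, high=50.0):
    """Cleaning: drop missing values and readings outside a plausible range.

    The range is a domain-knowledge assumption, not a universal rule.
    """
    return [(h, t) for h, t in rows if t is not None and low <= t <= high]

def check_hypothesis(rows):
    """Test: is the afternoon mean higher than the morning mean?"""
    morning = [t for h, t in rows if h < 12]
    afternoon = [t for h, t in rows if h >= 12]
    am, pm = mean(morning), mean(afternoon)
    return pm > am, am, pm

summary = discover(readings)            # a max of 999.0 hints at a glitch
cleaned = clean(readings)               # removes the None and the 999.0
supported, am_mean, pm_mean = check_hypothesis(cleaned)

print(summary)
print(f"morning {am_mean:.1f} C, afternoon {pm_mean:.1f} C, supported: {supported}")
```

Note how discovery comes first: the suspicious maximum only shows up when you look at the raw data, and it is domain knowledge (no room is 999 degrees) that tells you to clean it out before testing anything.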