#93: K-means clustering: machine learning algorithm to easily split observations into multiple buckets

Around IT in 256 seconds - A podcast by Tomasz Nurkiewicz

K-means clustering is an algorithm for partitioning data into multiple, non-overlapping buckets. For example, if you have a bunch of points in two-dimensional space, this algorithm can easily find concentrated clusters of points. To be honest, that’s quite a simple task for humans: just plot all the points on a piece of paper and find the areas with higher density. You might see, for example, that most of the points are located at the top-left of the plane, some at the bottom and a few at the centre-right. However, this is not that straightforward once you can no longer rely on a graphical representation, for instance when your data points live in 3-, 4- or 100-dimensional space. Turns out, this is not that uncommon. Let me clarify.

Read more: https://nurkiewicz.com/93

Get the new episode straight to your mailbox: https://nurkiewicz.com/newsletter
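
To make the idea of splitting points into buckets a bit more concrete, here is a minimal sketch in Python. The text above does not spell out the steps, so this assumes the commonly used Lloyd's iteration: assign every point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The function name `k_means`, its parameters and the sample data are all made up for illustration. Nothing here depends on being able to draw the points, so it works the same in 4 dimensions as in 2.

```python
import math
import random


def k_means(points, k, max_iterations=100, seed=0):
    """Split `points` (equal-length tuples) into k buckets.

    Returns (centroids, labels); labels[i] is the bucket index of points[i].
    Hypothetical helper for illustration, not a production implementation.
    """
    rng = random.Random(seed)
    # Assumption: start from k input points picked at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: every point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed bucket, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty bucket keeps its previous centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Made-up data: two noisy groups of points in 4-dimensional space.
data_rng = random.Random(42)
centers = [(0, 0, 0, 0), (10, 10, 10, 10)]
points = [
    tuple(data_rng.gauss(m, 1.0) for m in center)
    for center in centers
    for _ in range(20)
]

centroids, labels = k_means(points, k=2)
print(labels)
```

Note that the result depends on the starting centroids; with more buckets or less clearly separated data, a different `seed` can yield a different split, which is why practical libraries usually run several initialisations and keep the best one.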