Introduction to Unsupervised Machine Learning: Clustering Techniques
1. Introduction to Unsupervised Machine Learning
- Definition: Unsupervised learning works with unlabeled data, finding hidden patterns or intrinsic structure without predefined outcomes.
- Difference from Supervised Learning: There is no target variable or ground-truth label, so the focus is on data exploration and pattern discovery.
- General Applications: Market segmentation, anomaly detection, and recommendation systems, among others.
2. K-Means Clustering
2.1. What is K-Means Clustering?
Intuition:
- Partitions data into K distinct, non-overlapping clusters based on feature similarity.
Algorithm Steps:
- Initialization: Select K initial centroids randomly.
- Assignment Step: Assign each data point to the nearest centroid.
- Update Step: Recalculate centroids as the mean of assigned points.
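- Iteration: Repeat the assignment and update steps until convergence, that is, until the cluster assignments no longer change or the centroids stop moving.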
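The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the function name `kmeans`, its parameters (`n_iters`, `seed`), the choice to initialize centroids from randomly selected data points, and the synthetic two-blob dataset are all assumptions made for the example.

```python
import numpy as np


def kmeans(X, k, n_iters=100, seed=0):
    """Basic K-Means on an array of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)

    # Initialization: pick k distinct data points as the starting centroids
    # (one common way to "select K initial centroids randomly").
    centroids = X[rng.choice(len(X), size=k, replace=False)]

    def assign(points, centers):
        # Squared Euclidean distance from every point to every centroid,
        # then the index of the nearest centroid for each point.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        # Assignment step.
        labels = assign(X, centroids)

        # Update step: each centroid becomes the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])

        # Convergence check: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    # Final assignment so the labels match the returned centroids.
    labels = assign(X, centroids)
    return centroids, labels


# Synthetic example data (an assumption for illustration):
# two well-separated 2-D blobs centered near (0, 0) and (5, 5).
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    data_rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

centroids, labels = kmeans(X, k=2)
print(centroids)
print(np.bincount(labels))
```

Because the result depends on the random starting centroids, practical implementations usually run the algorithm several times with different initializations and keep the best result.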
2.2. Choosing K: The Elbow Method
Concept:
- Plot the within-cluster sum of squares (WCSS) or, alternatively, the explained variance against the number of clusters K, for a range of K values.