Posts

Intro to k-Means

K-Means clustering is a popular unsupervised learning algorithm used for customer segmentation and many other applications. It belongs to the family of partitioning clustering algorithms, which divide the data into K non-overlapping subsets, or clusters, without any internal structure or labels.

To apply K-Means clustering, we first need to choose the number of clusters, K. This can be challenging and is often based on domain knowledge or trial and error. Once K is decided, K-Means initializes K centroids, which are representative points for each cluster. There are two common approaches to choosing these centroids:

- Random Initialization: selecting K random observations from the dataset and using them as the initial centroids.
- Random Point Creation: generating K random points within the range of the feature space and using them as centroids.

After initializing the centroids, the algorithm proceeds iteratively through the following steps: Assigning Data Poin...
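The iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation (it does not handle empty clusters, for instance); it uses the Random Initialization approach, picking K observations as the starting centroids:

```python
import numpy as np

def k_means(X, k, n_iters=100, seed=0):
    """Minimal k-means sketch: random-observation initialization,
    then alternating assignment and centroid-update steps."""
    rng = np.random.default_rng(seed)
    # Random Initialization: pick k observations as the starting centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Stop once the centroids no longer move (assignments have stabilized)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```

On two well-separated groups of points, the loop converges in a handful of iterations and returns one label per group.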

Intro to Clustering

Customer segmentation is a crucial strategy for businesses: by grouping customers with similar characteristics together, marketing resources can be allocated effectively. Clustering, an unsupervised learning technique, is commonly used for this purpose. It partitions customers into mutually exclusive groups based on their similarities, and these groups can then be used to build profiles and tailor marketing strategies accordingly.

Clustering finds groups in a dataset based on similarity among data points: each cluster contains objects that are similar to one another but dissimilar to those in other clusters. Unlike classification, which predicts categorical class labels from labeled data, clustering operates on unlabeled data. For example, a clustering algorithm like k-means can group similar customers based on attributes such as age and education.

Beyond customer segmentation, clustering has applications across many domains: Retail Industry: Identifying buying patterns o...
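As a concrete illustration of clustering on unlabeled data, here is a small sketch using scikit-learn's KMeans on invented customer attributes (the age and education values below are hypothetical, made up for the example):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customers: columns are age and years of education.
# No class labels are provided -- the algorithm finds the groups itself.
customers = np.array([
    [22, 12], [25, 14], [27, 12],   # younger, less schooling
    [48, 18], [52, 20], [55, 19],   # older, more schooling
])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(customers)
# Customers with the same label fall in the same segment.
```

In practice, features like age and education would be normalized first so that no single attribute dominates the distance calculation.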

Multiclass Prediction

SoftMax Regression, One-vs-All & One-vs-One for Multi-class Classification

In multi-class classification, we classify data into more than two class labels. Unlike classification trees and nearest neighbors, multi-class classification is not as straightforward for linear classifiers. We can convert logistic regression to a multi-class classifier using multinomial logistic regression, or SoftMax regression; this is a generalization of logistic regression. SoftMax regression will not work for Support Vector Machines (SVM); One-vs-All (One-vs-Rest) and One-vs-One are two other multi-class classification techniques that can convert most two-class classifiers into a multi-class classifier.

SoftMax Regression

SoftMax regression is similar to logistic regression: the SoftMax function converts the actual distances, i.e. the dot products of x with each of the parameters θ_i for the K classes, indexed 0 to K-1. This is converted to probab...
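The conversion from distances to probabilities can be sketched directly. This minimal example assumes the K = 3 raw scores below stand in for the dot products x·θ_i (the values are invented for illustration):

```python
import numpy as np

def softmax(scores):
    """Turn a vector of raw scores (e.g. dot products x . theta_i,
    one per class) into probabilities that sum to 1."""
    shifted = scores - scores.max()   # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

# Hypothetical scores of one sample against K = 3 class parameters
probs = softmax(np.array([2.0, 1.0, 0.1]))
pred = probs.argmax()   # the predicted class is the most probable one
```

Subtracting the maximum before exponentiating leaves the result unchanged (the constant cancels in the ratio) but avoids overflow for large scores.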