UNSUPERVISED LEARNING APPROACHES FOR CUSTOMER SEGMENTATION: A PRACTICAL FRAMEWORK WITH VALIDATION METRICS

Aiysha Siddiqui Aiysha

Abstract

This study investigates data-driven customer segmentation on a retail marketing dataset that includes demographics, purchase recency, and category-level expenditure. After systematic cleaning and feature engineering, we build a repeatable preprocessing pipeline that imputes missing values, scales numeric features, and encodes categorical features. Using internal validation measures (Silhouette, Calinski–Harabasz, and Davies–Bouldin), we compare K-Means, Agglomerative Clustering, Gaussian Mixture Models, and DBSCAN across a range of cluster counts. Visual diagnostics consist of distribution plots, a heatmap of the association between spending and income, and principal component analysis projections. The study reveals a small number of coherent, behaviorally distinct segments that differ substantially in income, spend composition, and household characteristics. These segments support practical recommendations for targeting and personalization efforts.
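
As an illustration of the workflow summarized in the abstract, the following is a minimal sketch in Python with scikit-learn, not the authors' implementation. It assumes a small synthetic table with hypothetical column names (`income`, `recency_days`, `spend_wine`, `marital_status`) standing in for the retail dataset, and it assumes median and most-frequent imputation, standard scaling, and one-hot encoding as the preprocessing choices. DBSCAN is left out of the loop because it is parameterized by a neighborhood radius and a minimum sample count rather than a cluster count.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_toy_customers(n=300, seed=0):
    """Synthetic stand-in for the retail data (hypothetical columns)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "income": rng.normal(50_000, 15_000, n),
        "recency_days": rng.integers(0, 100, n).astype(float),
        "spend_wine": rng.gamma(2.0, 150.0, n),
        "marital_status": rng.choice(["single", "married", "other"], n),
    })
    # Blank out a few incomes so the imputation step has work to do.
    df.loc[rng.choice(n, 10, replace=False), "income"] = np.nan
    return df


def build_preprocessor(numeric_cols, categorical_cols):
    """Impute and scale numeric features; impute and one-hot encode categoricals."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer(
        [("num", numeric, numeric_cols), ("cat", categorical, categorical_cols)],
        sparse_threshold=0.0,  # always return a dense array
    )


def compare_models(X, k_values=range(2, 7), seed=0):
    """Score each model at each cluster count with three internal metrics."""
    rows = []
    for k in k_values:
        models = {
            "kmeans": KMeans(n_clusters=k, n_init=10, random_state=seed),
            "agglomerative": AgglomerativeClustering(n_clusters=k),
            "gmm": GaussianMixture(n_components=k, random_state=seed),
        }
        for name, model in models.items():
            labels = model.fit_predict(X)
            if len(set(labels)) < 2:
                continue  # the metrics are undefined for a single cluster
            rows.append({
                "model": name,
                "k": k,
                "silhouette": silhouette_score(X, labels),                # higher is better
                "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
                "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
            })
    return pd.DataFrame(rows)


df = make_toy_customers()
preprocessor = build_preprocessor(
    numeric_cols=["income", "recency_days", "spend_wine"],
    categorical_cols=["marital_status"],
)
X = preprocessor.fit_transform(df)
scores = compare_models(X)
print(scores.sort_values("silhouette", ascending=False).head())
```

Because the toy data are random, the resulting scores say nothing about the study's findings; the sketch only shows how the three metrics can be tabulated side by side so that a cluster count and model are chosen on agreement among them rather than on a single score.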

