Parthajit Roy
Swati Adhikari


This paper proposes two novel clustering techniques based on Principal Component Analysis (PCA) that use the Self-Organizing Map (SOM) as the clustering model. The two models differ in how they select the number of principal components in PCA, and both apply to the clustering of non-categorical data. To determine how many principal components to retain, the paper proposes clustering either the eigenvalues or the eigenvectors of the covariance matrix of the dataset. It further proposes that using either technique to select the number of principal components can improve the performance of the SOM-based clustering model. The benchmark wine dataset is used for testing. Two existing principal component selection methods are used to evaluate the proposed clustering models.
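The idea of clustering the eigenvalues to decide how many principal components to keep can be illustrated with a minimal sketch. The abstract does not give the procedure, so everything here is an assumption made for illustration: the eigenvalues of the covariance matrix are split into a "large" and a "small" group with a simple one-dimensional 2-means, and the size of the "large" group is taken as the number of components. The function name `select_components_by_eigenvalue_clustering`, the 2-means choice, and the synthetic data (used in place of the wine dataset) are all hypothetical. The SOM stage is omitted; the projected `scores` are what would be passed to it.

```python
import numpy as np


def select_components_by_eigenvalue_clustering(X, n_iter=100):
    """Hypothetical helper: choose the number of principal components by
    splitting the covariance eigenvalues into two groups with 1-D 2-means.
    This is an illustrative assumption, not the paper's stated procedure."""
    Xc = X - X.mean(axis=0)                      # centre the data
    cov = np.cov(Xc, rowvar=False)               # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # 1-D 2-means: start the centroids at the largest and smallest eigenvalue.
    centroids = np.array([eigvals[0], eigvals[-1]], dtype=float)
    for _ in range(n_iter):
        # Assign each eigenvalue to its nearest centroid (0 = "large" group).
        labels = np.abs(eigvals[:, None] - centroids[None, :]).argmin(axis=1)
        new_centroids = np.array([
            eigvals[labels == j].mean() if np.any(labels == j) else centroids[j]
            for j in range(2)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    k = int(np.sum(labels == 0))                 # size of the "large" group
    return k, eigvals, eigvecs[:, :k]


# Synthetic stand-in data: 6 features, 2 dominant directions plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2)) * np.array([10.0, 8.0])
mixing = np.linalg.qr(rng.normal(size=(6, 6)))[0][:, :2]  # orthonormal 6x2
X = latent @ mixing.T + rng.normal(scale=0.1, size=(200, 6))

k, eigvals, components = select_components_by_eigenvalue_clustering(X)
scores = (X - X.mean(axis=0)) @ components       # reduced data for the SOM stage
print("components kept:", k)
print("eigenvalues:", np.round(eigvals, 3))
```

Clustering the eigenvectors instead, the second variant named in the abstract, would need a different grouping criterion and is not sketched here.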


Article Details

Author Biography

Parthajit Roy, The University of Burdwan

Assistant Professor

