CLUSTERING MULTI ATTRIBUTE SIMILARITY INDEX FOR CATEGORICAL DATA STREAMS

MANDRU DEENA BABBU, Dr.YK Sundara Krishna

Abstract


Data mining is an actively studied concept in information retrieval, drawing on different attributes from different data sources. To collect relevant data effectively from such sources, one-class learning is required to perform label-based classification with individual training sequences on attributes. Clustering must also handle uncertain data with different data set visualizations. Uncertain One-Class Clustering (UOCC) uses a support vector machine to explore data summarization in terms of user preference. However, UOCC processes only single attributes from reliable data streams when handling inconsistent data. In this paper we therefore propose a Clustering with Multi-Attribute Framework (CMAF), which groups multiple attributes to explore uncertain data from reliable data. CMAF constructs a matrix of different reliable attributes based on relevant features. The proposed approach provides effective data summarization for relevant data through attribute partitioning and constructs a user profile based on relative attributes. Experimental results indicate that the proposed approach gives better and more expressive results than state-of-the-art methods.
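
The abstract does not state how the attribute matrix is built or which similarity measure is used. As a rough illustration only, the sketch below compares categorical records across several attributes at once using the simple matching coefficient (the fraction of attributes on which two records agree) and collects the pairwise scores in a matrix. The function names, the sample records, and the choice of simple matching are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def simple_matching(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of attribute positions on which two categorical records agree."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_matrix(records: Sequence[Sequence[str]]) -> List[List[float]]:
    """Pairwise multi-attribute similarity for a batch of categorical records."""
    n = len(records)
    return [
        [simple_matching(records[i], records[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical records with three categorical attributes each.
records = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "square"),
]
matrix = similarity_matrix(records)
```

A matrix of this kind could serve as input to a clustering step that groups records agreeing on many attributes. For a data stream, the same function could be applied to each incoming window of records.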

Keywords


K-Means, Uncertain One Class Classifier, Multi attribute, Support Vector Machine, Feature Representation.


DOI: https://doi.org/10.26483/ijarcs.v9i3.5931

Copyright (c) 2018 International Journal of Advanced Research in Computer Science