Cancer Classification in Microarray Data Using Gene Expression with KNN and FNN

A.K. Selvanayaki


Classification is used to predict class labels. For accurate cancer classification, information gain is a highly effective gene ranking scheme, and gene subsets are selected using a Euclidean distance metric. K-Nearest Neighbor (KNN) and Fuzzy Neural Network (FNN) are used as classifiers. Many other gene importance ranking schemes and classifiers may also be used in this approach. Classification involves four steps. In the first step, the top genes are selected using a feature importance ranking scheme. In the second step, a gene subset is generated using distance metrics. In the third step, the classification capability of the genes within the subset is evaluated with a classifier. In the fourth step, the top genes selected by the ranking scheme alone (without subset selection) are classified by the same classifier. Two datasets are used for classification: (1) a lymphoma dataset and (2) a liver dataset. In both datasets, a small part of the data is missing. A k-nearest neighbor algorithm is applied to fill in the missing values. This research suggests a unified criterion for gene ranking and gene subset selection. The aim is to use microarray technology to find specific cancer-related genes that can be used to diagnose cancer and predict its stage.
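The pipeline described above can be illustrated with a minimal Python sketch. It is not the author's implementation. All function names and the toy data are hypothetical, and two details are assumptions the abstract does not state: missing values are filled with the mean of the nearest samples, and each gene is discretized at its median before information gain is computed. The distance-based subset selection step and the FNN classifier are omitted because the abstract does not specify them; only KNN imputation, information gain ranking, and KNN classification are shown.

```python
import math
from collections import Counter

def euclidean(a, b, idx=None):
    """Euclidean distance over gene indices idx, skipping missing (None) values."""
    idx = range(len(a)) if idx is None else idx
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in idx
                         if a[i] is not None and b[i] is not None))

def knn_impute(samples, k=2):
    """Fill each missing value with the mean of that gene in the k nearest samples.
    Assumes at least one other sample has a value for the gene."""
    filled = [row[:] for row in samples]
    for r, row in enumerate(samples):
        for g, value in enumerate(row):
            if value is None:
                donors = sorted((s for j, s in enumerate(samples)
                                 if j != r and s[g] is not None),
                                key=lambda s: euclidean(row, s))[:k]
                filled[r][g] = sum(s[g] for s in donors) / len(donors)
    return filled

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Information gain of splitting one gene at its median (assumed discretization)."""
    threshold = sorted(values)[len(values) // 2]
    low = [y for v, y in zip(values, labels) if v < threshold]
    high = [y for v, y in zip(values, labels) if v >= threshold]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (low, high) if part)
    return entropy(labels) - remainder

def rank_genes(samples, labels, top=1):
    """Return the indices of the `top` genes with the highest information gain."""
    gains = [information_gain([row[g] for row in samples], labels)
             for g in range(len(samples[0]))]
    return sorted(range(len(gains)), key=lambda g: gains[g], reverse=True)[:top]

def knn_predict(train, labels, query, genes, k=3):
    """Majority vote among the k training samples nearest to query on the given genes."""
    nearest = sorted(range(len(train)),
                     key=lambda i: euclidean(train[i], query, genes))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Toy expression matrix (rows = samples, columns = genes); None marks a missing value.
samples = [
    [1.0, 5.0, 0.2],
    [1.2, None, 0.9],
    [0.9, 5.2, 0.1],
    [3.0, 5.1, 0.8],
    [3.2, 4.9, 0.3],
    [2.9, 5.0, 0.7],
]
labels = ["A", "A", "A", "B", "B", "B"]

filled = knn_impute(samples, k=2)
top_genes = rank_genes(filled, labels, top=1)
prediction = knn_predict(filled, labels, [3.1, 5.0, 0.5], top_genes, k=3)
print(top_genes, prediction)
```

In this toy data only the first gene separates the two classes, so ranking by information gain keeps it and the KNN vote relies on that gene alone. With real microarray data, the number of top genes and the value of k would need to be chosen by validation.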



Key Words: Classification, KNN, FNN, Euclidean Distance, Information Gain.

