Dr. B. Lavanya, T. Madhumitha


The primary aim of data mining is to extract useful information from datasets. It can uncover meaningful patterns in large datasets and analyse them to make predictions and classifications according to user specifications. This paper applies data mining techniques to medical data from the Gene Expression Omnibus (GEO) database hosted by NCBI. The microarray data for Autism Spectrum Disorder (ASD), comprising 100 genes from 21 children with ASD, are analysed with PrefixSpan, an unsupervised pattern mining algorithm, to find sequence patterns, and with Principal Component Analysis (PCA), a dimensionality reduction algorithm, to find the genes that are positively and negatively correlated for ASD. By comparing the results of the two algorithms, the paper infers which of the 100 genes are highly influenced by ASD.
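As an illustration only, the PCA step could be sketched as follows. This is not the authors' implementation: the expression matrix is synthetic (random numbers shaped like 21 samples by 100 genes), the function names `pc1_loadings` and `split_by_sign` and the 0.1 threshold are hypothetical, and reading "positively and negatively correlated genes" as the sign of each gene's loading on the first principal component is an assumption.

```python
import numpy as np

def pc1_loadings(expr):
    """Return each gene's loading on the first principal component.

    expr: array of shape (samples, genes). Hypothetical helper.
    """
    centered = expr - expr.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def split_by_sign(loadings, threshold=0.1):
    """Group gene indices by the sign of their loading (assumed criterion)."""
    pos = np.where(loadings > threshold)[0]
    neg = np.where(loadings < -threshold)[0]
    return pos, neg

# Synthetic stand-in for the microarray data: 21 samples x 100 genes.
rng = np.random.default_rng(0)
factor = rng.normal(0.0, 3.0, size=(21, 1))   # one shared latent signal
expr = rng.normal(0.0, 0.1, size=(21, 100))   # background noise
expr[:, :10] += factor                        # genes 0-9 move with the signal
expr[:, 10:20] -= factor                      # genes 10-19 move against it

loadings = pc1_loadings(expr)
pos, neg = split_by_sign(loadings)
print("one group:", pos)
print("opposite group:", neg)
```

The overall sign of a principal component is arbitrary, so only the split into two opposing groups is meaningful, not which group is labelled positive.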


Pattern Mining; PrefixSpan; Positively Correlated; Negatively Correlated; Data Mining





DOI: https://doi.org/10.26483/ijarcs.v9i5.6326



Copyright (c) 2018 International Journal of Advanced Research in Computer Science