Comprehensive Analysis of Data Mining Classifiers using WEKA

Hemlata Chahal


Data mining, the extraction of knowledge from large volumes of data (Big Data), is a crucial task today. Data mining and its applications are among the most promising and rapidly emerging technologies. A number of open-source Big Data mining tools are available, and users and researchers need to know the characteristics, advantages, and capabilities of these tools. This paper presents an experimental evaluation of the classification algorithms in WEKA, analysing them on the basis of accuracy and precision using a real dataset. This comprehensive evaluation should help future researchers and data-analysing business organisations select the best available classifier when using WEKA.
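As a rough illustration of the two criteria named above, the sketch below computes accuracy and precision from a list of actual and predicted class labels. It is not the paper's code and does not use WEKA; the function names and the label lists are made up for illustration, and precision is computed for a single class treated as "positive".

```python
def accuracy(actual, predicted):
    """Fraction of instances whose predicted label equals the actual label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def precision(actual, predicted, positive):
    """Of the instances predicted as `positive`, the fraction that truly are."""
    true_pos = sum(
        1 for a, p in zip(actual, predicted) if p == positive and a == positive
    )
    predicted_pos = sum(1 for p in predicted if p == positive)
    # Return 0.0 when the class is never predicted, to avoid dividing by zero.
    return true_pos / predicted_pos if predicted_pos else 0.0


# Hypothetical labels for illustration only; not data from the paper.
actual    = ["yes", "yes", "no", "no",  "yes", "no",  "yes", "no"]
predicted = ["yes", "no",  "no", "yes", "yes", "yes", "yes", "no"]

acc = accuracy(actual, predicted)
prec = precision(actual, predicted, "yes")
print(f"accuracy={acc:.3f} precision(yes)={prec:.3f}")
```

The two measures can diverge: a classifier may label most instances correctly overall yet still produce many false positives for one class, which is why comparing classifiers on both is more informative than comparing on accuracy alone.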


Data Mining; Big Data; Classifiers; Big Data Mining Tools; WEKA




Copyright (c) 2018 International Journal of Advanced Research in Computer Science