Defect Prediction by Pruning Redundancy in Association Rule Mining

Amarpreet Kaur

doi:10.26483/ijarcs.v8i5.3881

Published: Jun 23, 2017

DOI: https://doi.org/10.26483/ijarcs.v8i5.3881

Keywords:

Software engineering, Defect Prediction, Data Mining, Association Rule Mining

Amarpreet Kaur

Central University of Punjab

http://orcid.org/0000-0002-5469-148X

Abstract

Defect prediction is a major problem during software maintenance and evolution. Software developers need to identify defective modules in order to improve software quality, and many organizations want to predict defects before a system is deployed so that they can measure and improve that quality. Researchers have proposed various approaches for identifying the defect-prone modules of a given software system. This paper focuses on Apriori, an algorithm based on association rule mining, which remains a popular and effective method for extracting meaningful information from large data sets. Here, Apriori is used to discover association rules that predict whether a software module is defective. Different algorithms perform differently on distinct datasets. This paper analyzes the shortcomings of the Apriori algorithm and studies strategies for improving its performance by removing redundancy among the generated rules on the basis of different parameters. We use a new method, based on heuristic analysis, to find the best 'n' association rules out of a pool of 'k' association rules. This study will help improve existing software defect prediction models in terms of precision, performance, and other aspects.
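The abstract does not spell out the heuristic used to select the best rules, so the following is only a minimal Python sketch of the general idea, not the paper's method. It mines frequent itemsets level by level in the Apriori style, keeps rules of the form "metrics -> defective", drops a rule when a more general rule (a smaller antecedent) is at least as confident, and returns the top n survivors. The item labels (`high_loc`, `high_cc`, `defective`), the redundancy criterion, the ranking by confidence then support, and the thresholds are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support} for frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count support for each candidate and keep the frequent ones.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def defect_rules(frequent, target, min_confidence):
    """Rules 'antecedent -> target' as (antecedent, confidence, support) tuples."""
    rules = []
    for itemset, support in frequent.items():
        if target in itemset and len(itemset) > 1:
            antecedent = itemset - {target}
            # The antecedent is frequent too, because it is a subset of a frequent itemset.
            confidence = support / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, confidence, support))
    return rules


def prune_redundant(rules):
    """Assumed criterion: drop a rule if a strictly more general rule is at least as confident."""
    kept = []
    for antecedent, confidence, support in rules:
        dominated = any(
            other < antecedent and other_conf >= confidence
            for other, other_conf, _ in rules
        )
        if not dominated:
            kept.append((antecedent, confidence, support))
    return kept


def top_n(rules, n):
    """Assumed ranking: highest confidence first, then highest support, then shortest rule."""
    return sorted(rules, key=lambda r: (-r[1], -r[2], len(r[0]), sorted(r[0])))[:n]


# Hypothetical toy data: each module is a set of discretized metric labels.
modules = [
    {"high_loc", "high_cc", "defective"},
    {"high_loc", "high_cc", "defective"},
    {"high_cc", "defective"},
    {"high_cc", "defective"},
    {"high_loc", "high_cc"},
    {"high_loc"},
]

itemsets = frequent_itemsets(modules, min_support=0.3)
candidate_rules = defect_rules(itemsets, target="defective", min_confidence=0.6)
best = top_n(prune_redundant(candidate_rules), n=1)
for antecedent, confidence, support in best:
    print(sorted(antecedent), "-> defective", round(confidence, 3), round(support, 3))
```

On this toy data the rule with antecedent {high_loc, high_cc} is discarded because the shorter rule with antecedent {high_cc} is already more confident, which is one common way to define redundancy. A real study would first discretize the software metrics and would likely weigh further interestingness measures.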

Issue

Vol. 8 No. 5 (2017): May-June 2017

Section

Articles

COPYRIGHT

Submission of a manuscript implies: that the work described has not been published before, that it is not under consideration for publication elsewhere; that if and when the manuscript is accepted for publication, the authors agree to automatic transfer of the copyright to the publisher.

Authors who publish with this journal agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
The journal allows the author(s) to retain publishing rights without restrictions.
The journal allows the author(s) to hold the copyright without restrictions.

Author Biography

Amarpreet Kaur, Central University of Punjab

Research Scholar, Computer Science and Technology
