A STUDY ON EARLY PREVENTION AND DETECTION OF BREAST CANCER USING THREE MACHINE LEARNING TECHNIQUES
Abstract
Medical data repositories are growing rapidly, which makes it difficult to analyze them and extract the valuable knowledge hidden in the data. Several machine learning techniques are used for medical analysis. Breast cancer, a cancer that forms in the cells of the breast, is the most commonly diagnosed cancer in women and one of the leading causes of death worldwide. It has become a major disease not only in India but also in other countries. Early detection is the key to reducing breast cancer mortality: the cure rate and life expectancy depend on early identification of the disease, and earlier detection saves more lives and lowers the death rate. The main objective of this paper is the early diagnosis of breast cancer patients. For early prevention and detection, three machine learning techniques (Decision Tree, Support Vector Machine, and Random Forest) are used, which also reduce waiting time and limit human and technical errors in diagnosis. Selecting a suitable machine learning technique for breast cancer diagnosis is itself a challenge. We have therefore created a model for a breast cancer prediction system that analyzes risk levels to support prognosis. This work is intended to help doctors diagnose breast cancer and to help patients receive early treatment.
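As an illustration of the kind of comparison the abstract describes, the following is a minimal sketch that trains a Decision Tree, a Support Vector Machine, and a Random Forest on the same data and reports cross-validated accuracy. It is not the authors' implementation: the abstract does not name a dataset, a library, or any hyperparameters, so the use of scikit-learn, its bundled Wisconsin diagnostic breast cancer dataset, 5-fold cross-validation, feature scaling for the SVM, and the function name `compare_classifiers` are all assumptions made for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(random_state=0, folds=5):
    """Return the mean cross-validated accuracy of three classifiers.

    Assumption: scikit-learn's bundled breast cancer dataset stands in
    for the (unspecified) data used in the paper.
    """
    X, y = load_breast_cancer(return_X_y=True)

    models = {
        "decision_tree": DecisionTreeClassifier(random_state=random_state),
        # SVMs are sensitive to feature scale, so standardize inside the
        # pipeline; the scaler is then fit on each training fold only.
        "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
    }

    scores = {}
    for name, model in models.items():
        fold_scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        scores[name] = float(fold_scores.mean())
    return scores


if __name__ == "__main__":
    for name, accuracy in compare_classifiers().items():
        print(f"{name}: mean accuracy = {accuracy:.3f}")
```

Accuracy alone can be misleading in a diagnostic setting, where a missed malignant case usually costs more than a false alarm; replacing `scoring="accuracy"` with `scoring="recall"` or `scoring="roc_auc"` would be a reasonable variation on this sketch.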