A STUDY ON EARLY PREVENTION AND DETECTION OF BREAST CANCER USING THREE-MACHINE LEARNING TECHNIQUES

Nafees Akhter Farooqui
Ritika

Abstract

Medical data repositories are growing rapidly, which makes it difficult to analyze them for valuable hidden knowledge. Several machine learning techniques are used for medical analysis. Breast cancer, which forms in the cells of the breast, is the most commonly diagnosed cancer in women and one of the leading causes of death worldwide. It has become a major disease not only in India but also in other countries, and early detection is essential to reducing its mortality. The main objective of this paper is the early diagnosis of breast cancer patients. For early prevention and detection, three machine learning techniques (Decision Tree, Support Vector Machine, and Random Forest) are used, which also eliminates waiting time and reduces human and technical errors in diagnosis. Earlier detection of breast cancer saves more lives and lowers the death rate. The cure rate and prognosis depend on early identification of the disease. Selecting a suitable machine learning technique for breast cancer diagnosis is a challenge. We have therefore created a model for a breast cancer prediction system that analyzes risk levels to support prognosis. This paper will be helpful to doctors in diagnosing breast cancer and to patients in receiving early treatment.
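As a rough illustration of how the three techniques named in the abstract could be compared, the following minimal sketch trains a decision tree, a support vector machine, and a random forest and reports cross-validated accuracy. It is not the authors' implementation. The abstract does not name its data set, library, or evaluation protocol, so the use of scikit-learn, its bundled Wisconsin diagnostic breast cancer data, 5-fold cross-validation, the default hyperparameters, and the helper name `compare_classifiers` are all assumptions made for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(random_state=0, folds=5):
    """Return mean cross-validated accuracy for each of the three classifiers."""
    # Assumption: scikit-learn's bundled Wisconsin diagnostic data stands in
    # for the paper's data set, which the abstract does not name.
    X, y = load_breast_cancer(return_X_y=True)

    models = {
        "Decision Tree": DecisionTreeClassifier(random_state=random_state),
        # SVMs are sensitive to feature scale, so standardize inside the pipeline.
        "Support Vector Machine": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Random Forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
    }

    scores = {}
    for name, model in models.items():
        # With an integer cv and a classifier, scikit-learn uses stratified folds.
        fold_accuracy = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        scores[name] = fold_accuracy.mean()
    return scores


if __name__ == "__main__":
    for name, accuracy in compare_classifiers().items():
        print(f"{name}: {accuracy:.3f}")
```

Accuracy alone is a coarse measure for a diagnostic task. In practice, sensitivity and specificity would likely matter more, and they can be requested from `cross_val_score` by changing the `scoring` argument (for example to `"recall"`).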

 
