DIABETES DISEASE PREDICTION USING MACHINE LEARNING ENSEMBLE METHOD
Abstract
Machine learning, a branch of artificial intelligence, is used to solve many problems in data science. A model learns patterns from existing data and then applies them to unseen data to predict an outcome. Classification is a powerful machine learning method commonly used for prediction. Some classification algorithms provide satisfactory accuracy, while others provide limited accuracy. This paper examines ensemble classification, a method that improves the accuracy of weak algorithms by combining multiple classifiers. Experiments are performed on a diabetes dataset. A comparative analysis was carried out to determine how ensemble methods can improve the prediction of diabetes. The goal of this paper is not only to increase the accuracy of weak classification algorithms, but also to apply the approach to a medical dataset and demonstrate its ability to detect the disease at an early stage. The results indicate that ensemble strategies, such as random forest, are effective in increasing the predictive accuracy of weak classifiers and perform satisfactorily in identifying the risk of diabetes. A seven-point increase in the accuracy of the weak classifiers was achieved with ensemble classification.
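The idea of combining weak classifiers can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes bagging with majority voting over one-feature threshold rules ("decision stumps"), a simplified relative of random forest, and it runs on synthetic data rather than a diabetes dataset. All function names (`train_stump`, `train_bagged_stumps`, `make_data`, and so on) and the noise level are hypothetical choices made for this illustration.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a weak classifier: one feature, one threshold, chosen by exhaustive search."""
    best = None  # (training errors, feature index, threshold, label predicted above threshold)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            for above in (0, 1):
                errors = sum(
                    (above if row[feature] > threshold else 1 - above) != label
                    for row, label in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, above)
    return best[1:]


def stump_predict(stump, row):
    """Predict a binary label (0 or 1) with a single stump."""
    feature, threshold, above = stump
    return above if row[feature] > threshold else 1 - above


def majority_vote(votes):
    """Return the most common label among the votes."""
    return Counter(votes).most_common(1)[0][0]


def train_bagged_stumps(rows, labels, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement) of the data."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return ensemble


def ensemble_predict(ensemble, row):
    """Combine the stumps' predictions by majority vote."""
    return majority_vote([stump_predict(stump, row) for stump in ensemble])


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) matches the label."""
    return sum(predict(row) == label for row, label in zip(rows, labels)) / len(rows)


def make_data(n, rng):
    """Synthetic two-feature data with 15% label noise (illustrative assumption only)."""
    rows, labels = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1), rng.gauss(0, 1)]
        label = 1 if row[0] + row[1] > 0 else 0
        if rng.random() < 0.15:
            label = 1 - label
        rows.append(row)
        labels.append(label)
    return rows, labels


if __name__ == "__main__":
    rng = random.Random(42)
    train_rows, train_labels = make_data(200, rng)
    test_rows, test_labels = make_data(200, rng)

    single = train_stump(train_rows, train_labels)
    ensemble = train_bagged_stumps(train_rows, train_labels)

    print("single stump accuracy:",
          accuracy(lambda r: stump_predict(single, r), test_rows, test_labels))
    print("bagged ensemble accuracy:",
          accuracy(lambda r: ensemble_predict(ensemble, r), test_rows, test_labels))
```

The printed accuracies depend on the synthetic data and the seeds, and the ensemble is not guaranteed to beat the single stump on every dataset. The numbers say nothing about the results reported in this paper. An odd number of estimators is used so that a binary majority vote cannot tie.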
References
Mir, A., & Dhage, S. N. (2018). Diabetes Disease Prediction using Machine Learning on Big Data of Healthcare. Applies Naive Bayes, Support Vector Machine, Random Forest and Simple CART in WEKA to predict diabetes. Random Forest is reported with an accuracy of 78%, higher than Naive Bayes, SVM and Simple CART.
Mohan, V., Deepa, R., Deepa, M., Somannavar, S., & Datta, M. (2015). A Simplified Indian Diabetes Risk Score for Screening for Undiagnosed Diabetic Subjects. The Indian Diabetes Risk Score (IDRS) is developed from the results of multiple logistic regression analysis. Internal validation is performed on the same data. The IDRS uses four main risk factors: abdominal obesity, family history of diabetes, age and physical activity.
Rajawat, P. S., Gupta, D. K., Rathore, S. S., & Singh, A. (2018). Predictive Analysis of Medical Data using a Hybrid Machine Learning Technique. Proposes a hybrid machine learning approach to predict whether a person is at risk of diabetes. The hybrid technique achieves an accuracy of 87.33%, better than SVM, ANN and KNN.
Yahyaoui, A., Jamil, A., Rasheed, J., & Yesiltepe, M. (2019). A Decision Support System for Diabetes Prediction Using Machine Learning and Deep Learning Techniques. Machine learning algorithms (SVM, RF) and a deep learning approach are used to predict diabetes. The results show that RF is the most effective for classifying diabetes, with an overall prediction accuracy of 80.67%.
Ramzan, M. (2016). Comparing and evaluating the performance of WEKA classifiers on critical diseases. Naive Bayes, Random Forest and J48 Decision Tree are compared as classifiers for predicting critical diseases using the WEKA tool. Random Forest achieves a higher accuracy than both J48 and Naive Bayes.
Ashwinkumar, U. M., & Anandakumar, K. R. (2012). Predicting Early Detection of Cardiac and Diabetes Symptoms using Data Mining Techniques. International Conference on Computer Design and Engineering, vol. 49.