Well-calibrated Probabilistic Machine Learning Classifiers for Multivariate Healthcare Data

Akram Pasha, Latha P. H.


Healthcare applications frequently collect and store patient data, most of it multivariate, to examine treatment history and thereby improve the effectiveness of treatment. How effectively a patient is treated depends in part on the performance of the machine learning models used to analyze that data. For disease classification problems, it is often more useful for a classification model to predict the probability that an observation belongs to each possible class than to predict a class label directly. These predicted probabilities need to be calibrated, meaning that they should agree with the observed frequencies of the outcomes, so that the output of a classification model can be trusted in healthcare applications. In this paper, predicted probabilities are studied to diagnose and improve the calibration of models used for probabilistic classification. The general performance of the selected classification models on two recent wart skin disease treatment datasets is also reported.
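As a rough illustration of what diagnosing calibration can involve, the sketch below computes two widely used diagnostics, the Brier score and a binned reliability table with its expected calibration error. It is not taken from the paper: the function names, the eight hypothetical patients, and the choice of two bins are all assumptions made for the example, and the abstract does not say which diagnostics the authors used.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared gap between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def reliability_bins(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """For each non-empty equal-width bin, return
    (mean predicted probability, observed positive rate, count)."""
    acc = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of probs, positives, count]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        acc[idx][0] += p
        acc[idx][1] += y
        acc[idx][2] += 1
    return [(s / n, pos / n, n) for s, pos, n in acc if n > 0]


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Count-weighted average of |mean predicted - observed rate| over bins."""
    total = len(probs)
    return sum(
        n / total * abs(mean_p - frac_pos)
        for mean_p, frac_pos, n in reliability_bins(probs, labels, n_bins)
    )


# Hypothetical data: predicted probability that a treatment succeeds for
# eight patients, and whether it actually did (1) or did not (0).
probs = [0.9, 0.8, 0.8, 0.7, 0.3, 0.2, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

print("Brier score:", round(brier_score(probs, labels), 3))
# Two bins only because the sample is tiny; ten is a more common choice.
for mean_p, frac_pos, n in reliability_bins(probs, labels, n_bins=2):
    print(f"mean predicted={mean_p:.2f}  observed rate={frac_pos:.2f}  n={n}")
print("ECE:", round(expected_calibration_error(probs, labels, n_bins=2), 3))
```

For a well-calibrated model, the mean predicted probability in each bin stays close to the observed positive rate, so the expected calibration error is small. A large gap in some bin suggests that a recalibration step, such as Platt scaling or isotonic regression, may be worth trying.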


Keywords: Data Mining, Machine Learning, Classification, Data Analytics, Calibration of Classifiers, Healthcare Systems.





DOI: https://doi.org/10.26483/ijarcs.v12i2.6696



Copyright (c) 2021 International Journal of Advanced Research in Computer Science