COMPARATIVE ANALYSIS OF MACHINE LEARNING MODELS FOR PREDICTING HEART DISEASE
Abstract
Cardiovascular diseases, particularly heart disease, remain among the leading causes of morbidity and mortality globally [1]. As the prevalence of heart-related conditions increases, there is a growing demand for early diagnostic systems that can support clinical decision-making and preventive care [2]. This whitepaper presents a comparative analysis of multiple machine learning algorithms applied to the UCI Heart Disease dataset [3], leveraging both statistical and predictive modeling approaches to identify patterns associated with heart disease. The study begins with an in-depth exploratory data analysis (EDA) to uncover trends, outliers, and correlations among clinical attributes such as age, cholesterol levels, resting blood pressure, and electrocardiographic results [4]. Following EDA, a suite of machine learning models (Logistic Regression, Random Forest, and Gradient Boosting) is implemented to classify patients by the likelihood that heart disease is present. Each model is evaluated on accuracy, precision, recall, F1-score, and the area under the ROC curve (AUC), enabling a performance-driven comparison [5]. Our findings indicate that the ensemble-based models, Gradient Boosting and Random Forest, consistently outperform the baseline model in predictive accuracy and sensitivity, making them strong candidates for integration into clinical diagnostic tools [6]. The insights from this study highlight the critical role of feature selection, preprocessing, and model interpretability in healthcare AI applications [7]. This work contributes to the ongoing advancement of data-driven health technologies by demonstrating the potential of machine learning to enhance the early detection and risk stratification of heart disease.
In the broader context, the United States' healthcare expenditure underscores the urgency for efficient diagnostic tools. In 2023, U.S. healthcare spending reached $4.9 trillion, accounting for 17.6% of the nation's Gross Domestic Product (GDP), with per capita spending at $14,570 [8]. Notably, hospital care, physician and clinical services, and retail prescription drugs collectively accounted for 60% of total spending [9]. Despite this substantial investment, the U.S. continues to face challenges in achieving optimal health outcomes, particularly in managing chronic diseases like heart disease. Implementing effective machine learning models for early detection can play a pivotal role in improving patient outcomes and reducing healthcare costs [10].
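The five evaluation metrics named in the abstract can be illustrated with a minimal sketch. The labels and predicted scores below are invented purely for illustration and are not results from this study; the function name `binary_metrics`, the 0.5 decision threshold, and the model names `model_a` and `model_b` are likewise assumptions rather than the authors' implementation. AUC is computed here from its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the source does not state one
) -> Dict[str, float]:
    """Accuracy, precision, recall, F1 and ROC AUC for binary labels (1 = disease)."""
    # Turn predicted scores into hard class predictions.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

    # AUC as a pairwise rank statistic: count positive/negative pairs in which
    # the positive case scores higher; tied scores count as half a win.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": auc,
    }


# Hypothetical toy data: 4 patients with heart disease (1) and 4 without (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical predicted probabilities from two unnamed models.
scores = {
    "model_a": [0.90, 0.80, 0.40, 0.30, 0.60, 0.20, 0.10, 0.35],
    "model_b": [0.95, 0.85, 0.70, 0.45, 0.40, 0.30, 0.20, 0.55],
}

results = {name: binary_metrics(y_true, s) for name, s in scores.items()}

for name, metrics in results.items():
    print(name, {k: round(v, 4) for k, v in metrics.items()})
```

Reporting recall and AUC alongside accuracy matters in a diagnostic setting: a missed case (false negative) is usually costlier than a false alarm, and AUC does not depend on the particular threshold chosen.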
COPYRIGHT
Submission of a manuscript implies that the work described has not been published before; that it is not under consideration for publication elsewhere; and that, if and when the manuscript is accepted for publication, the authors agree to automatic transfer of the copyright to the publisher.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
- The journal allows the author(s) to retain publishing rights without restrictions.
- The journal allows the author(s) to hold the copyright without restrictions.