Heart Disease Prediction

Overview

This project builds a machine learning model that predicts whether a patient has heart disease from a set of health parameters. Several classification algorithms (Logistic Regression, Decision Tree, Random Forest) are trained and compared on accuracy, precision, recall, F1-score, and AUC.

📂 Dataset

The dataset used in this project is heart.csv. Ensure it is placed in the same directory as the Jupyter Notebook before running the code.

If you don’t have the dataset, you can download it from Kaggle.

Source: Kaggle
Number of Samples: 303
Target Variable: target (1 = Heart disease, 0 = No heart disease)
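As a quick sanity check before opening the notebook, the sketch below loads the file and counts the records per class. It only assumes what is stated above: a file named heart.csv and a column named target. The helper names `load_heart_rows` and `class_balance` are hypothetical and do not come from the notebook.

```python
import csv
from collections import Counter
from pathlib import Path


def load_heart_rows(path="heart.csv"):
    """Read the dataset into a list of dicts, one per patient record."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"{csv_path} not found; place it next to the notebook or pass its path"
        )
    with csv_path.open(newline="") as handle:
        rows = list(csv.DictReader(handle))
    if rows and "target" not in rows[0]:
        raise ValueError("expected a 'target' column in the dataset")
    return rows


def class_balance(rows):
    """Count records per label (1 = heart disease, 0 = no heart disease)."""
    return Counter(int(float(row["target"])) for row in rows)
```

Calling `rows = load_heart_rows()` from the notebook's directory and then `len(rows)` should give 303 if the file matches the description above; `class_balance(rows)` shows how many records fall into each class.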

Features:

Project Steps

1. Data Preprocessing
   - Handling missing values
   - Encoding categorical variables
   - Normalizing numerical features
2. Exploratory Data Analysis (EDA)
   - Visualizing feature distributions
   - Analyzing relationships between variables
3. Model Training
   - Implementing various classification algorithms:
     - Logistic Regression
     - Decision Tree
     - Random Forest
4. Model Evaluation
   - Comparing accuracy, precision, recall, AUC, and F1-score
   - Selecting the best-performing model
5. Hyperparameter Tuning
   - Optimizing model parameters for better performance
6. Final Prediction
   - Using the best-performing model to predict heart disease
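The steps above can be sketched with scikit-learn roughly as follows. This is an illustration, not the notebook's actual code: the 80/20 stratified split, the median and most-frequent imputation, the grid of `C` values, and the function names are all assumptions. The feature columns are passed in by the caller because this README does not list them.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_preprocessor(numeric_cols, categorical_cols):
    """Impute missing values, scale numeric columns, one-hot encode categoricals."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),       # assumed strategy
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),  # assumed strategy
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])


def compare_models(df, numeric_cols, categorical_cols, target="target", seed=42):
    """Fit the three classifiers on one split and report held-out metrics."""
    X = df[numeric_cols + categorical_cols]
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed  # assumed 80/20 split
    )
    candidates = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(random_state=seed),
    }
    results = {}
    for name, clf in candidates.items():
        model = Pipeline([
            ("prep", build_preprocessor(numeric_cols, categorical_cols)),
            ("clf", clf),
        ])
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        proba = model.predict_proba(X_test)[:, 1]  # probability of class 1
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "precision": precision_score(y_test, pred, zero_division=0),
            "recall": recall_score(y_test, pred, zero_division=0),
            "f1": f1_score(y_test, pred, zero_division=0),
            "auc": roc_auc_score(y_test, proba),
        }
    return results


def tune_logistic_regression(X_train, y_train, numeric_cols, categorical_cols):
    """Grid-search the regularization strength C (the grid is an assumption)."""
    model = Pipeline([
        ("prep", build_preprocessor(numeric_cols, categorical_cols)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    search = GridSearchCV(model, {"clf__C": [0.01, 0.1, 1, 10]}, scoring="f1", cv=5)
    search.fit(X_train, y_train)
    return search.best_estimator_, search.best_params_
```

Keeping the preprocessing inside the `Pipeline` means the imputer and scaler are fitted on the training portion only, so the held-out metrics are not inflated by information from the test rows. The estimator returned by `tune_logistic_regression` can then be used for the final prediction step through its `predict` method.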

📊 Results

✅ Best Model: Logistic Regression with Optimal Threshold

The evaluation metrics for the best-performing model are as follows:

Precision: 0.8788
Recall: 0.9062
F1-Score: 0.8923
Accuracy: not reported
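For readers who want to check how these figures relate, the sketch below computes precision, recall, and F1 from confusion-matrix counts. The counts used in the example (29 true positives, 4 false positives, 3 false negatives) are an inference: they reproduce the rounded values above, but they are not stated anywhere in this README. The true-negative count is unknown, so no accuracy is derived.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, fn, tn=None):
    """Precision, recall and F1 from counts; accuracy only if tn is given."""
    precision = _ratio(tp, tp + fp)   # of predicted positives, how many are right
    recall = _ratio(tp, tp + fn)      # of actual positives, how many are found
    f1 = _ratio(2 * precision * recall, precision + recall)  # harmonic mean
    metrics = {"precision": precision, "recall": recall, "f1": f1}
    if tn is not None:
        metrics["accuracy"] = _ratio(tp + tn, tp + fp + fn + tn)
    return metrics


# Hypothetical counts inferred from the rounded figures above.
inferred = classification_metrics(tp=29, fp=4, fn=3)
print({name: round(value, 4) for name, value in inferred.items()})
```

The reported F1-score is also consistent with the other two values on its own: 2 × 0.8788 × 0.9062 / (0.8788 + 0.9062) ≈ 0.8923.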

Additionally, the confusion matrix and ROC curve provide a visual representation of the model’s performance.

Confusion Matrix & ROC Curve

Below are the performance visualizations:

[Figure: Confusion Matrix]
[Figure: ROC Curve]
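The results name a Logistic Regression "with Optimal Threshold" but do not say how that threshold was chosen. The sketch below shows one common approach, picking the probability cutoff that maximizes F1, together with the (false positive rate, true positive rate) points that make up an ROC curve. Treat the selection criterion and the function names as assumptions, not as the notebook's method.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, fn, tn) when scores >= threshold are predicted as 1."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def roc_points(y_true, scores):
    """(FPR, TPR) pairs from the strictest threshold to the most lenient.

    Requires at least one positive and one negative label.
    """
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, fn, tn = confusion_counts(y_true, scores, threshold)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


def best_f1_threshold(y_true, scores):
    """Return (threshold, f1) for the candidate cutoff with the highest F1."""
    best_threshold, best_f1 = None, -1.0
    for threshold in sorted(set(scores)):
        tp, fp, fn, _ = confusion_counts(y_true, scores, threshold)
        denominator = 2 * tp + fp + fn
        f1 = 2 * tp / denominator if denominator else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1
```

Here `scores` would be the predicted probabilities of class 1 on a validation set, for example the second column of a scikit-learn classifier's `predict_proba` output. Choosing the threshold on the same data used to report the final metrics would make those metrics optimistic, so a separate validation split is the safer choice.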

📌 How to Use This Project

1️⃣ Installation & Setup

To run this project locally, follow these steps:

Clone this repository:
```bash
git clone https://github.com/kanna-vamshi-krishna/heart-disease-prediction.git
```
