Heart disease describes a range of conditions that affect the heart. With growing stress, cases of heart disease are increasing rapidly. Can you find a way to help doctors by detecting the presence of the disease?
Your task is to determine whether heart disease is present, i.e., whether target is 1 or 0.
Submissions are evaluated using the F1 score. How do we do it?
Once you generate and submit the target variable predictions on the evaluation dataset, your submission will be compared with the true values of the target variable.
The true (actual) values of the target variable are hidden on the DPhi Practice platform so that we can evaluate your model's performance on unseen data. Finally, an F1 score for your model will be generated and displayed.
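To make the metric concrete, here is a minimal sketch of how an F1 score can be computed for binary labels. It assumes that 1 (disease present) is the positive class; the platform's exact scoring code is not shown here, so treat the function name f1_score_binary and the toy labels as illustrative only.

```python
def f1_score_binary(y_true, y_pred):
    """F1 score for binary labels, treating 1 as the positive class (an assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No true positives: precision and recall are zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only; not taken from the dataset.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
score = f1_score_binary(y_true, y_pred)
print(round(score, 3))  # 0.75 (3 true positives, 1 false positive, 1 false negative)
```

If scikit-learn is installed, sklearn.metrics.f1_score(y_true, y_pred) should give the same value for binary labels with its default settings.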
About the dataset
This database contains 14 attributes. The "target" variable refers to the presence of heart disease in the patient (0 = not present, 1 = present).
To load the training data in your Jupyter notebook, use the command below:
import pandas as pd
heart_data = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/Heart_Disease/Training_set_heart.csv")
- age: Age in years
- sex: 1 = male, 0 = female
- cp: Chest pain type
- trestbps: Resting blood pressure (in mm Hg on admission to the hospital)
- chol: Serum cholesterol in mg/dl
- fbs: Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
- restecg: Resting electrocardiographic results
- thalach: Maximum heart rate achieved
- exang: Exercise induced angina (1 = yes; 0 = no)
- oldpeak: ST depression induced by exercise relative to rest
- slope: The slope of the peak exercise ST segment
- ca: Number of major vessels (0-3) colored by fluoroscopy
- thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
- target: 1 = Heart disease present, 0 = Heart disease not present
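The attribute list above can double as a quick sanity check after loading. The sketch below assumes the CSV column names match the attribute names exactly as listed, which is not confirmed here; check_schema is a hypothetical helper, and the sample row is made up purely for illustration.

```python
import pandas as pd

# Column names assumed to match the attribute list above.
EXPECTED_COLUMNS = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal", "target",
]
# Attributes described above as taking only the values 0 and 1.
BINARY_COLUMNS = ["sex", "fbs", "exang", "target"]


def check_schema(df, require_target=True):
    """Return a list of problems found in df (empty list if none)."""
    problems = []
    expected = [c for c in EXPECTED_COLUMNS if require_target or c != "target"]
    missing = [c for c in expected if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    for col in BINARY_COLUMNS:
        if col in df.columns and not df[col].isin([0, 1]).all():
            problems.append(f"{col} has values other than 0 and 1")
    return problems


# One made-up row for illustration; NOT taken from the dataset.
sample = pd.DataFrame([{
    "age": 50, "sex": 1, "cp": 0, "trestbps": 120, "chol": 200, "fbs": 0,
    "restecg": 1, "thalach": 150, "exang": 0, "oldpeak": 1.0, "slope": 1,
    "ca": 0, "thal": 3, "target": 0,
}])
print(check_schema(sample))
print(check_schema(sample.drop(columns=["target"]), require_target=False))
```

With the real data, the same helper could be called as check_schema(heart_data) right after loading.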
Load the evaluation data and name it 'evaluation_data'. You can load it using the command below.
evaluation_data = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/Heart_Disease/Testing_set_heart.csv")
The target column is deliberately absent from this file, since you need to predict it.
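One possible way to produce those predictions is sketched below. It is only a baseline under stated assumptions: logistic regression is an arbitrary model choice, fit_and_predict is a hypothetical helper, the "prediction" column name is a placeholder because the required submission format is not given here, and the data is a synthetic stand-in so the example runs offline.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_and_predict(train_df, eval_df, target_col="target"):
    """Fit a simple baseline on train_df and predict the target for eval_df."""
    X_train = train_df.drop(columns=[target_col])
    y_train = train_df[target_col]
    # Scale the features, then fit logistic regression (an arbitrary baseline).
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    # Select the evaluation columns in the same order used for training.
    return model.predict(eval_df[X_train.columns])


# Synthetic stand-in data so the sketch runs offline; NOT the real dataset.
rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(60, 3)), columns=["f1", "f2", "f3"])
train_df = features.iloc[:40].copy()
train_df["target"] = (train_df["f1"] > 0).astype(int)
eval_df = features.iloc[40:].copy()  # has no target column, like the evaluation file

predictions = fit_and_predict(train_df, eval_df)
# "prediction" is a placeholder column name, not a confirmed submission format.
submission = pd.DataFrame({"prediction": predictions})
print(submission.shape)
```

With the real files, the call would be fit_and_predict(heart_data, evaluation_data), assuming all feature columns are numeric as the attribute list suggests.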
This dataset was downloaded from the UCI Machine Learning Repository.
To participate in this challenge, you must either create a team of at least members or join an existing team.