Heart Disease




Heart disease describes a range of conditions that affect your heart. With stress levels rising, cases of heart disease are increasing rapidly. Can you find a way to help doctors by detecting the presence of the disease?


The objective is to determine whether heart disease is present, i.e. whether target is 1 or 0.

Evaluation Criteria

Submissions are evaluated using the F1 score. How do we do it?

Once you generate and submit the target variable predictions for the evaluation dataset, your submission is compared with the true values of the target variable.

The true (actual) values of the target variable are hidden on the DPhi Practice platform so that we can evaluate your model's performance on unseen data. Finally, an F1 score for your model is generated and displayed.
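
To make the metric concrete, here is a minimal sketch of how an F1 score can be computed from true and predicted labels. It assumes the binary F1 for the positive class (target = 1); the platform does not state which averaging variant it uses, so treat this as an illustration rather than the platform's exact scoring code. The function name f1_score_binary and the toy labels are made up for this example.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: F1 is taken as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (not taken from the challenge data).
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
score = f1_score_binary(y_true, y_pred)
print(score)  # 3 TP, 1 FP, 1 FN -> precision = recall = 0.75 -> F1 = 0.75
```

Because the true labels of the evaluation dataset are hidden, a function like this is only useful on data where you know the target, for example a held-out portion of the training set.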

About the dataset

This database contains 14 attributes. The "target" variable refers to the presence of heart disease in the patient (0 = not present, 1 = present).

To load the training data in your Jupyter notebook, use the command below:

import pandas as pd
heart_data = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/Heart_Disease/Training_set_heart.csv")

Data Description
  • age: Age in years
  • sex: 1 = male, 0 = female
  • cp: Chest pain type
  • trestbps: Resting blood pressure (in mm Hg on admission to the hospital)
  • chol: Serum cholesterol in mg/dl
  • fbs: Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
  • restecg: Resting electrocardiographic results
  • thalach: Maximum heart rate achieved
  • exang: Exercise induced angina (1 = yes; 0 = no)
  • oldpeak: ST depression induced by exercise relative to rest
  • slope: The slope of the peak exercise ST segment
  • ca: Number of major vessels (0-3) colored by fluoroscopy
  • thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
  • target: 1 = Heart disease present, 0 = Heart disease not present
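
The codes listed above can be turned into a quick sanity check before modelling. The sketch below is an illustration only: the ALLOWED_VALUES table and the find_invalid_codes helper are hypothetical names, and the allowed sets are copied from the description above. The hosted CSV files may encode some columns differently, so a flagged value is a prompt to inspect the data, not proof of an error.

```python
# Allowed values for the coded columns, taken from the data description.
# Columns whose codes are not enumerated there (cp, restecg, slope) are not checked.
ALLOWED_VALUES = {
    "sex": {0, 1},
    "fbs": {0, 1},
    "exang": {0, 1},
    "ca": {0, 1, 2, 3},
    "thal": {3, 6, 7},
    "target": {0, 1},
}


def find_invalid_codes(row):
    """Return {column: value} for coded columns whose value is outside the documented set."""
    return {
        col: row[col]
        for col, allowed in ALLOWED_VALUES.items()
        if col in row and row[col] not in allowed
    }


# Hypothetical record for illustration only (not a real row from the dataset).
sample = {"age": 54, "sex": 1, "fbs": 0, "exang": 0, "ca": 4, "thal": 7, "target": 1}
problems = find_invalid_codes(sample)
print(problems)  # {'ca': 4}
```

With pandas, heart_data.to_dict("records") yields one such dictionary per row, so the same helper can be applied to every record of the training set.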

Evaluation Dataset

Load the evaluation data and name it 'evaluation_data'. You can load it using the command below.

evaluation_data = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/Heart_Disease/Testing_set_heart.csv")

The target column is deliberately absent from this dataset because you need to predict it.


This dataset was downloaded from the UCI Machine Learning Repository.





File Format

Your submission should be in CSV format.


This file should have a header row called 'prediction'.
Please see the instructions to save a prediction file under the “Data” tab.
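
As a rough sketch of the format described above, the snippet below writes a list of 0/1 predictions to a CSV file whose header row is 'prediction'. The function name write_submission and the file name submission.csv are assumptions made for this example, and it assumes one prediction per row of the evaluation data; the platform's own instructions take precedence where they differ.

```python
import csv


def write_submission(predictions, path="submission.csv"):
    """Write 0/1 predictions to a one-column CSV with the header 'prediction'."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["prediction"])  # header row required by the file format
        for p in predictions:
            writer.writerow([int(p)])  # one predicted label per line


# Hypothetical predictions for illustration; in practice, one value per evaluation_data row.
write_submission([1, 0, 1, 1], path="submission.csv")
```

If the predictions are already in pandas, pd.DataFrame({"prediction": predictions}).to_csv("submission.csv", index=False) should produce an equivalent file.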

To participate in this challenge, you must either create a team with at least the required number of members or join an existing team.