This dataset contains the sale prices of houses in King County, Washington, which includes Seattle. It covers homes sold between May 2014 and May 2015.

To load the dataset in your Jupyter notebook, use the command below:

import pandas as pd
house_data = pd.read_csv('https://raw.githubusercontent.com/dphi-official/Datasets/master/kc_house_data/kc_house_data.csv')

Before doing anything else, we should understand the dataset: what it contains, what its features are, and how the data is structured.

price: Price of the house. This is our target variable.
bedrooms: Number of bedrooms
bathrooms: Number of bathrooms
sqft_living: Square footage of the house
sqft_lot: Square footage of the lot
floors: Number of floors (levels)
waterfront: 1 = Waterfront view; 0 = No waterfront view
view: Index from 0 to 4 rating how good the view from the property is
condition: Overall condition, from 1 (worn-out property) to 5 (excellent)
grade: Overall grade given to the housing unit, based on the King County grading system, from 1 (poor) to 13 (excellent)
sqft_above: Square footage of the house excluding the basement
sqft_basement: Square footage of the basement
yr_built: Year the house was built
yr_renovated: Year the house was renovated
zipcode: Zipcode
lat: Latitude coordinate
long: Longitude coordinate
sqft_living15: Square footage of the house in 2015 (implies some renovations)
sqft_lot15: Square footage of the lot in 2015 (implies some renovations)
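
One way to check the structure described above is to look at the shape, column types, and missing values of the loaded frame. The sketch below is only illustrative: it builds a tiny sample with invented values (the name 'sample' and its rows are assumptions, not part of the real data) so that it runs without a network connection. In the notebook, the same calls could be applied to house_data instead.

```python
import io
import pandas as pd

# Invented rows standing in for house_data so the sketch runs offline.
# Only a few of the columns listed above are included.
sample_csv = io.StringIO(
    "price,bedrooms,bathrooms,sqft_living,waterfront\n"
    "300000,3,1.5,1500,0\n"
    "450000,4,2.25,2400,0\n"
    "250000,2,1.0,900,1\n"
)
sample = pd.read_csv(sample_csv)

n_rows, n_cols = sample.shape         # number of rows and columns
column_types = sample.dtypes          # data type of each column
missing_counts = sample.isna().sum()  # count of missing values per column
summary = sample.describe()           # basic statistics for numeric columns

print(n_rows, n_cols)
print(column_types)
print(missing_counts)
print(sample.head())                  # first few rows
```

Comparing the printed column names and types against the feature list is a quick way to spot columns that are described but absent, or present but undescribed.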

Load the evaluation dataset (name it 'eval_data') using the command below.

eval_data = pd.read_csv('https://raw.githubusercontent.com/dphi-official/Datasets/master/kc_house_data/kc_house_new_test_data.csv')

The price column is deliberately omitted from this dataset because it is the value you need to predict.
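
Because the evaluation data has no price column, it can help to separate the target from the features in the training data and confirm that both frames share the same feature columns. The following is a minimal sketch under stated assumptions: 'train' and 'evaluation' are small invented stand-ins for house_data and eval_data, used here so the example runs offline.

```python
import pandas as pd

# Invented stand-ins for house_data and eval_data (assumption for illustration).
train = pd.DataFrame({
    "price": [300000, 450000, 250000],
    "bedrooms": [3, 4, 2],
    "sqft_living": [1500, 2400, 900],
})
evaluation = pd.DataFrame({
    "sqft_living": [1600, 3000],
    "bedrooms": [3, 5],
})

# Separate the target variable from the input features.
target = train["price"]
features = train.drop(columns=["price"])

# Feature columns that appear in one frame but not the other.
missing_in_eval = sorted(set(features.columns) - set(evaluation.columns))
extra_in_eval = sorted(set(evaluation.columns) - set(features.columns))
print("missing in evaluation:", missing_in_eval)
print("extra in evaluation:", extra_in_eval)

# Put the evaluation columns in the same order as the training features.
evaluation = evaluation[features.columns]
```

If either list is non-empty on the real data, the mismatch would need to be resolved before a model trained on the training features could be applied to the evaluation data.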

This dataset was downloaded from Kaggle.

Predict the House Prices - King County