Predict the price of an Airbnb listing

Easy

Context

Airbnb, Inc. is an American vacation rental online marketplace company based in San Francisco, California, United States. Airbnb offers arrangements for lodging, primarily homestays, or tourism experiences. The company does not own any of the real estate listings, nor does it host events; it acts as a broker, receiving commissions from each booking.

Since 2008, guests and hosts have used Airbnb to travel in a more unique, personalized way.


Objective

Imagine you are a data scientist who helps estimate the price of lodging or homestays based on the attributes mentioned in their listings. Oh wait, what are listings? Listings can include written descriptions, photographs with captions, and a user profile where potential guests can get to know a bit about the hosts.

You are given the listings for one of the most popular cities in Europe: Amsterdam. Your job is to build a machine learning model that automatically predicts the price of lodging or homestays.

About the Data

To load the training data in your Jupyter notebook, use the following commands:

import pandas as pd
airbnb_data = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/airbnb%20_data/airbnb_listing_train.csv")

The data sets have not been cleaned, because cleaning is an important step in creating a predictive model. There are also many opportunities to engineer your own features. You may merge your data with other online data, as long as you can make sure that your model also works on the validation data. The data sets are large, so take care not to overfit your models.
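
As a rough illustration of what a first cleaning pass might look like, here is a minimal sketch. It runs on a tiny made-up sample that borrows a few column names from the feature list; the `sample` values, the `basic_clean` helper, and the `has_review` flag are assumptions for illustration only, not part of the challenge. Filling missing `reviews_per_month` with 0 is one plausible choice, not a prescribed one.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical sample that mimics a few of the dataset's columns;
# in practice you would use the airbnb_data frame loaded above.
sample = pd.DataFrame({
    "room_type": ["Private room", "Entire home/apt", "Entire home/apt", "Private room"],
    "number_of_reviews": [12, 0, 3, 0],
    "last_review": ["2019-11-02", None, "2020-01-15", None],
    "reviews_per_month": [0.8, np.nan, 0.2, np.nan],
    "price": [80, 150, 150, 95],
})


def basic_clean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy; the specific choices here are assumptions."""
    out = df.drop_duplicates().copy()
    # Listings without reviews plausibly have no review rate, so fill with 0.
    out["reviews_per_month"] = out["reviews_per_month"].fillna(0.0)
    # Parse dates; missing or unparseable values become NaT.
    out["last_review"] = pd.to_datetime(out["last_review"], errors="coerce")
    # Keep an explicit flag so a model can see which rows had no review.
    out["has_review"] = out["last_review"].notna().astype(int)
    return out


# Count missing values per column before cleaning, then clean.
missing_before = sample.isna().sum()
cleaned = basic_clean(sample)
print(missing_before)
print(cleaned.dtypes)
```

Checking the missing-value counts first shows which columns need attention before you decide how to handle them.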


Feature Description
  • id: The id of each lodge/home/listing
  • name: The name/description of the lodge/home
  • host_id: The id of the host
  • host_name: Name of the host
  • neighbourhood: Name of the neighbourhood place
  • neighbourhood_group: Group that the neighbourhood belongs to
  • latitude: Latitude of the location
  • longitude: Longitude of the location
  • room_type: Type of room offered, for example, a private room or an entire home
  • minimum_nights: The minimum number of nights a guest must book
  • number_of_reviews: Number of reviews the lodge/home has received
  • last_review: The date of the last review given to the lodge/home
  • reviews_per_month: Average reviews per month
  • calculated_host_listings_count: The number of listings that each host has
  • availability_365: The number of days (out of 365) on which the lodge/home is available
  • price: Price of the lodging/homestay in USD (the target variable)
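
To show how a few of these features could feed a baseline model, here is a hedged sketch using scikit-learn. The data is synthetic (generated below so the example is self-contained), and the choice of features, the random forest, and RMSE as the validation metric are all assumptions; the challenge text does not state which model or metric to use.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real listings, reusing column names from the
# feature description. The price formula is made up for illustration.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "latitude": rng.uniform(52.30, 52.42, n),
    "longitude": rng.uniform(4.75, 5.00, n),
    "room_type": rng.choice(["Private room", "Entire home/apt"], n),
    "minimum_nights": rng.integers(1, 8, n),
    "number_of_reviews": rng.integers(0, 200, n),
    "availability_365": rng.integers(0, 366, n),
})
data["price"] = (
    60
    + 70 * (data["room_type"] == "Entire home/apt")
    + rng.normal(0, 10, n)
)

features = [
    "latitude", "longitude", "room_type",
    "minimum_nights", "number_of_reviews", "availability_365",
]
# One-hot encode the categorical column; numeric columns pass through.
X = pd.get_dummies(data[features], columns=["room_type"])
y = data["price"]

# Hold out 20% of the rows to estimate how well the model generalizes.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

val_pred = model.predict(X_val)
rmse = float(np.sqrt(mean_squared_error(y_val, val_pred)))
print(f"Validation RMSE: {rmse:.2f}")
```

When applying the same encoding to the real test data, make sure the one-hot columns match those used in training, since `pd.get_dummies` only creates columns for categories it sees.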

Acknowledgement

This dataset is part of Inside Airbnb, where the original source can be found.

File Format

Your submission should be in CSV format.

Predictions

The file should have a header row containing the column name 'prediction'.
Please see the instructions to save a prediction file under the “Data” tab.
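
A minimal sketch of writing such a file with pandas follows. The `predictions` values and the file name `submission.csv` are placeholders chosen for illustration; the official instructions under the "Data" tab take precedence if they differ.

```python
import numpy as np
import pandas as pd

# Hypothetical predictions; in practice these come from model.predict(...)
# on the test features, in the same row order as the test file.
predictions = np.array([95.0, 140.5, 72.25])

# A single column whose header is 'prediction'; index=False keeps the
# row index out of the CSV.
submission = pd.DataFrame({"prediction": predictions})
submission.to_csv("submission.csv", index=False)

# Read the file back to confirm the header and row count.
check = pd.read_csv("submission.csv")
print(check.columns.tolist())
print(len(check))
```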

To participate in this challenge, you must either create a team or join an existing team.