Predict the price of an Airbnb listing
Airbnb, Inc. is an American vacation rental online marketplace company based in San Francisco, California, United States. Airbnb offers arrangements for lodging, primarily homestays, or tourism experiences. The company does not own any of the real estate listings, nor does it host events; it acts as a broker, receiving commissions from each booking. Reference
Since 2008, guests and hosts have used Airbnb to travel in a more unique, personalized way.
Imagine you are a data scientist who helps find the price for lodging or homestays based on the different attributes mentioned in their listings. Oh wait, what are listings? A listing can include a written description, photographs with captions, and a user profile where potential guests can get to know a bit about the host.
You are given the listings for one of the most popular cities in Europe: Amsterdam. Your job is to build a machine learning model that automatically predicts the price of a lodging or homestay.
About the Data
To load the training data in your Jupyter notebook, use the command below:
import pandas as pd
airbnb_data = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/airbnb%20_data/airbnb_listing_train.csv")
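Once the file is loaded, a quick per-column summary helps you see what needs cleaning. The sketch below is only one way to do it: `summarize` is a hypothetical helper (not part of the challenge material), and the tiny `sample` frame with made-up values stands in for `airbnb_data` so the snippet runs without a download.

```python
import pandas as pd

def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing count, distinct non-null values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "n_unique": df.nunique(),
    })

# Made-up stand-in rows; in the notebook, call summarize(airbnb_data) instead.
sample = pd.DataFrame({
    "room_type": ["Private room", "Entire home/apt", "Private room"],
    "reviews_per_month": [0.5, None, 2.1],
    "price": [80, 150, 95],
})
print(summarize(sample))
```

Columns with a high `n_missing` are candidates for imputation, and text columns with a small `n_unique` are usually good candidates for categorical encoding.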
The data sets have not been cleaned, because cleaning is an important step in building a predictive model. There are also many opportunities to engineer your own features. You may merge your data with other online data, as long as your model also works on the validation data. The data sets are large, so watch out for overfitting your models.
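As a starting point for the cleaning step, here is a minimal sketch. The rules in it are assumptions, not requirements of the challenge: missing `reviews_per_month` is treated as zero, non-positive prices are treated as data errors, and duplicate `id` values are dropped. `clean_listings` is a hypothetical helper, and the `raw` frame contains made-up rows that use column names from the field list in this section.

```python
import pandas as pd

def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a listings frame (assumed cleaning rules)."""
    out = df.copy()
    # Parse review dates; missing or unparseable values become NaT.
    out["last_review"] = pd.to_datetime(out["last_review"], errors="coerce")
    # Assumption: a listing without reviews has a review rate of zero.
    out["reviews_per_month"] = out["reviews_per_month"].fillna(0.0)
    # Assumption: non-positive prices are data errors. The check is skipped
    # when there is no price column, so the same function can be applied
    # to data that does not include the target.
    if "price" in out.columns:
        out = out[out["price"] > 0]
    # Keep the first row for each listing id.
    return out.drop_duplicates(subset="id").reset_index(drop=True)

# Made-up rows for illustration only.
raw = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "last_review": ["2019-11-02", None, None, "2019-12-01"],
    "reviews_per_month": [1.2, None, None, 0.4],
    "price": [120, 85, 85, 0],
})
cleaned = clean_listings(raw)
print(cleaned)
```

Whatever rules you settle on, apply the same function to the training and validation data so the model sees consistently prepared inputs.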
- id: The id of each lodge/home/listing
- name: The name/description of the lodge/home
- host_id: The id of the host
- host_name: Name of the host
- neighbourhood: Name of the neighbourhood place
- neighbourhood_group: Group that the neighbourhood belongs to
- latitude: Latitude of the location
- longitude: Longitude of the location
- room_type: Type of room offered, for example, a private room or an entire home
- minimum_nights: The minimum number of nights a customer must stay
- number_of_reviews: Number of reviews given to the lodge/home
- last_review: The date of the last review given to the lodge/home
- reviews_per_month: Average reviews per month
- calculated_host_listings_count: The number of listings that the host has
- availability_365: The number of days (out of 365) on which the lodge/home is available
- price: Price for the lodging/homestays in USD - the target variable
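To make the columns above concrete, here is a rough baseline sketch that predicts `price` from some of them and estimates out-of-sample error with k-fold cross-validation, which is one common way to keep an eye on overfitting. Several things here are assumptions rather than part of the challenge: the choice of a random forest, the split of columns into `NUMERIC` and `CATEGORICAL`, mean absolute error as the metric, and the synthetic `demo` frame (random values, placeholder neighbourhood names) used so the snippet runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Assumed feature split; adjust after exploring the real data.
NUMERIC = ["latitude", "longitude", "minimum_nights", "number_of_reviews",
           "reviews_per_month", "calculated_host_listings_count",
           "availability_365"]
CATEGORICAL = ["neighbourhood", "room_type"]

def build_baseline() -> Pipeline:
    """Median-impute numeric columns, one-hot encode categoricals, fit a forest."""
    preprocess = ColumnTransformer([
        ("num", SimpleImputer(strategy="median"), NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    return Pipeline([("preprocess", preprocess), ("model", model)])

def cv_mae(df: pd.DataFrame, n_splits: int = 5) -> float:
    """Mean absolute error of the baseline, averaged over k folds."""
    scores = cross_val_score(
        build_baseline(), df[NUMERIC + CATEGORICAL], df["price"],
        cv=n_splits, scoring="neg_mean_absolute_error",
    )
    return float(-scores.mean())

# Synthetic stand-in data; in the notebook, pass the cleaned airbnb_data.
rng = np.random.default_rng(0)
n = 60
room = rng.choice(["Private room", "Entire home/apt"], n)
demo = pd.DataFrame({
    "latitude": rng.uniform(52.30, 52.42, n),
    "longitude": rng.uniform(4.75, 5.00, n),
    "minimum_nights": rng.integers(1, 8, n),
    "number_of_reviews": rng.integers(0, 200, n),
    "reviews_per_month": rng.uniform(0, 5, n),
    "calculated_host_listings_count": rng.integers(1, 5, n),
    "availability_365": rng.integers(0, 366, n),
    "neighbourhood": rng.choice(["A", "B", "C"], n),
    "room_type": room,
    "price": np.where(room == "Entire home/apt", 160.0, 80.0)
             + rng.normal(0, 15, n),
})
# Blank out a few values so the imputer has something to do.
demo.loc[demo.index % 10 == 0, "reviews_per_month"] = np.nan

print(f"Cross-validated MAE: {cv_mae(demo, n_splits=3):.1f}")
```

Because the preprocessing lives inside the pipeline, it is refit on each training fold, so the cross-validated score is not inflated by information leaking from the held-out fold.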
This dataset is part of Inside Airbnb, and the original source can be found here.
To participate in this challenge, you have to either create a team of at least members or join an existing team.