Bikeshare Trip History Dataset
Capital Bikeshare is the bicycle sharing system that serves Washington, D.C., and it is popular with both members and casual users. The system has been available since 2010 and provides a service that encourages group and community trips, with meeting points at defined stations and registered bicycles.
Your task is to use machine learning models to predict the type of user who made each trip: a registered member (target value 1) or a casual rider (target value 0).
Once you generate and submit your predictions of the target variable for the evaluation dataset, your submission will be compared with the true values of the target variable.
The true (actual) values of the target variable are hidden on the DPhi Practice platform so that your model's performance can be evaluated on unseen data. Finally, an F1 score for your model will be generated and displayed.
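For reference, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch in plain Python, assuming the registered-member class (1) is treated as the positive class; the platform's exact scoring settings are not stated here. The function name `f1_score_binary` and the toy labels are illustrative only.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up for illustration (1 = registered member, 0 = casual rider).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]

score = f1_score_binary(y_true, y_pred)
print(round(score, 3))  # 0.75 (3 true positives, 1 false positive, 1 false negative)
```

A score of 1.0 means every registered member was identified with no false alarms; the score drops as either precision or recall drops.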
About the dataset
This dataset contains seven attributes of Bikeshare trips from January to December 2019. The target variable is the type of user who made the trip: a registered member (target value 1) or a casual rider (target value 0).
Download the training set from the following link: https://drive.google.com/file/d/1d28Os-L0vCyFxlhievQp9__1nKsllR3G/view. Unzip the file and load the training data in your Jupyter notebook using the command below:
import pandas as pd
bikes_data = pd.read_csv("Training_set_bike.csv")
The seven attributes are:
- Duration – Duration of trip (minutes)
- Start Date – Includes start date and time
- End Date – Includes end date and time
- Start Station – Includes starting station name and number
- End Station – Includes ending station name and number
- Bike Number – Includes ID number of bike used for the trip
- Member Type – Indicates whether user was a "registered" member (1) (Annual Member, 30-Day Member or Day Key Member) or a "casual" rider (0) (Single Trip, 24-Hour Pass, 3-Day Pass or 5-Day Pass)
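The start and end timestamps are usually more useful to a model once they are parsed into calendar features. The following is a hedged sketch using a tiny made-up sample instead of the real file; the column names are assumed to match the attribute list above and may differ from the actual CSV headers, and the derived column names (`start_hour`, `start_weekday`, `is_weekend`) are illustrative.

```python
import pandas as pd

# Tiny made-up sample; column names follow the attribute list above and
# may differ from the headers in the actual CSV.
sample = pd.DataFrame({
    "Duration": [12, 45, 7],
    "Start Date": ["2019-01-05 08:15:00", "2019-07-20 14:30:00", "2019-12-02 17:45:00"],
    "End Date": ["2019-01-05 08:27:00", "2019-07-20 15:15:00", "2019-12-02 17:52:00"],
    "Member Type": [1, 0, 1],
})

# Parse the timestamp strings into datetime values.
sample["Start Date"] = pd.to_datetime(sample["Start Date"])
sample["End Date"] = pd.to_datetime(sample["End Date"])

# Derive simple calendar features from the trip start time.
sample["start_hour"] = sample["Start Date"].dt.hour
sample["start_weekday"] = sample["Start Date"].dt.dayofweek  # Monday = 0
sample["is_weekend"] = (sample["start_weekday"] >= 5).astype(int)

print(sample[["start_hour", "start_weekday", "is_weekend", "Member Type"]])
```

The same transformations would need to be applied to both the training and the testing data so that the model sees identical columns in each.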
Download the testing set from the following link: https://drive.google.com/file/d/1d28Os-L0vCyFxlhievQp9__1nKsllR3G/view. Unzip the file and load the testing data in your Jupyter notebook using the command below:
bikes_data = pd.read_csv("Testing_set_bike.csv")
The target column (Member Type) is deliberately absent from the testing set because you need to predict it.
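The overall workflow (fit on the training data, then predict the missing target for the testing data) can be sketched as follows. This is only an illustration under stated assumptions: the two small data frames are made-up stand-ins for the real files, the single `Duration` feature and the scikit-learn decision tree are arbitrary choices rather than a recommended model, and the `prediction` column name is hypothetical.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up stand-ins for the training and testing files; the values carry
# no information about real Capital Bikeshare trips.
train_df = pd.DataFrame({
    "Duration": [5, 8, 10, 12, 40, 55, 60, 90],
    "Member Type": [1, 1, 1, 1, 0, 0, 0, 0],
})
test_df = pd.DataFrame({"Duration": [7, 75]})  # no target column

features = ["Duration"]

# Fit a small decision tree on the labelled training rows.
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(train_df[features], train_df["Member Type"])

# The testing set has no target column, so the model supplies it.
test_df["prediction"] = model.predict(test_df[features])
print(test_df)
```

Holding out part of the training set and scoring it with F1 before submitting gives an estimate of how the model may perform on the hidden labels.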
Dataset taken from:
Capital Bikeshare (CaBi). Trip History Data. Washington, D.C.; Arlington County, Virginia; Alexandria, Virginia and Falls Church, Virginia; Montgomery County, Maryland and Fairfax County, Virginia. 2016. Available at: https://www.capitalbikeshare.com/system-data.
To participate in this challenge, you must either create a team of at least members or join an existing team.