Bikeshare Trip History Dataset

Analyzing Capital Bikeshare Trip data

Context

Capital Bikeshare is the bicycle sharing system that serves Washington, D.C., and it is used by both registered members and casual riders. The system has been available since 2010 and encourages group and community trips, with rides that start and end at defined stations on registered bicycles.


Objective

Your task is to use machine learning models to predict the type of user for each trip: a registered member (target value 1) or a casual rider (target value 0).


Evaluation Criteria

Once you generate and submit the target variable predictions for the evaluation dataset, your submission will be compared with the true values of the target variable.

The true (actual) values of the target variable are hidden on the DPhi Practice platform so that your model's performance can be evaluated on unseen data. An F1 score for your model is then generated and displayed.
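
For intuition about the metric, here is a minimal sketch of a binary F1 score computed from its definition. It assumes the registered member class (1) is treated as the positive class; the page does not say which averaging the platform uses, so treat this as an illustration rather than the platform's scoring code. The labels are made up.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return F1 = 2 * precision * recall / (precision + recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only (1 = member, 0 = casual).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]

score = f1_score_binary(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

In this example there are 3 true positives, 1 false positive and 1 false negative, so precision and recall are both 0.75 and the F1 score is 0.75.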

About the dataset

This dataset contains seven attributes of Bikeshare trips from January to December 2019. The target variable is the type of user for each trip: a registered member (target value 1) or a casual rider (target value 0).

Download the training set from the following link: https://drive.google.com/file/d/1d28Os-L0vCyFxlhievQp9__1nKsllR3G/view. Unzip the file and load the training data in your Jupyter notebook using the command below:

import pandas as pd
bikes_data = pd.read_csv("Training_set_bike.csv")

Data Description
  • Duration – Duration of trip (minutes)
  • Start Date – Includes start date and time
  • End Date – Includes end date and time
  • Start Station – Includes starting station name and number
  • End Station – Includes ending station name and number
  • Bike Number – Includes ID number of bike used for the trip
  • Member Type – Indicates whether user was a "registered" member (1) (Annual Member, 30-Day Member or Day Key Member) or a "casual" rider (0) (Single Trip, 24-Hour Pass, 3-Day Pass or 5-Day Pass)
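
As a starting point for feature engineering, the sketch below derives a few time-based features from a trip's start timestamp. It is only an illustration: the timestamp format ("YYYY-MM-DD HH:MM:SS"), the function name trip_features and the feature names are assumptions, so check them against the actual file before relying on them.

```python
from datetime import datetime

# Assumed timestamp layout; adjust if the CSV uses a different format.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def trip_features(start_date, duration):
    """Derive simple candidate features from one trip record (hypothetical helper)."""
    started = datetime.strptime(start_date, TIMESTAMP_FORMAT)
    weekday = started.weekday()  # Monday = 0, Sunday = 6
    return {
        "start_hour": started.hour,
        "start_weekday": weekday,
        "is_weekend": int(weekday >= 5),
        "duration": float(duration),
    }


# Made-up trip for illustration only.
example = trip_features("2019-07-06 14:30:00", 42)
print(example)
```

Whether such features actually help separate members from casual riders is something to verify on the training data.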

Evaluation Dataset

Download the testing set from the following link: https://drive.google.com/file/d/1d28Os-L0vCyFxlhievQp9__1nKsllR3G/view. Unzip the file and load the testing data in your Jupyter notebook using the command below:

bikes_data = pd.read_csv("Testing_set_bike.csv")

The target column is deliberately omitted from this file because you need to predict it.


References

Dataset taken from:

Capital Bikeshare (CaBi). Trip History Data. Washington, D.C.; Arlington County, Virginia; Alexandria, Virginia and Falls Church, Virginia; Montgomery County, Maryland and Fairfax County, Virginia. 2016. Available at: https://www.capitalbikeshare.com/system-data.

File Format

Your submission should be in CSV format.

Predictions

This file should have a header row called 'prediction'.
Please see the instructions to save a prediction file under the “Data” tab.
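
A minimal sketch of producing a file in this shape is shown below, using only the Python standard library. The helper name write_submission and the example predictions are hypothetical, and the platform's own instructions take precedence if they differ.

```python
import csv
import io


def write_submission(predictions, stream):
    """Write predictions as a one-column CSV with the header 'prediction'."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["prediction"])
    for label in predictions:
        writer.writerow([int(label)])


# Hypothetical predictions for illustration only (1 = member, 0 = casual).
example_predictions = [1, 0, 1, 1]

# Write to an in-memory buffer here so the result can be inspected.
buffer = io.StringIO()
write_submission(example_predictions, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write an actual file, pass an open file handle instead of the buffer, for example open("submission.csv", "w", newline=""), where the file name is an assumption. The file should contain one prediction per row of the testing set, in the same order.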

To participate in this challenge, you must either create a team or join an existing team.