Data Sprint #3: Abalone
What is Abalone?
Abalone is a common name for any of a group of small to very large sea snails, marine gastropod molluscs in the family Haliotidae. Other common names are ear shells, sea ears, and muttonfish or muttonshells in Australia, ormer in the UK, perlemoen in South Africa, and paua in New Zealand.
The age of abalone is determined by cutting the shell through the cone, staining it, and counting the number of rings through a microscope -- a boring and time-consuming task. Other measurements, which are easier to obtain, are used to predict the age.
Your objective is to predict the age of an abalone from its physical measurements.
Submissions are evaluated using Root-Mean-Squared-Error (RMSE).
How do we do it?
Once you generate and submit the target variable predictions on the testing dataset, your submissions will be compared with the true values of the target variable.
The true (actual) values of the target variable are hidden on the DPhi Practice platform so that your model's performance can be evaluated on the testing data. Finally, the Root-Mean-Squared-Error (RMSE) of your model is computed and displayed.
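To make the metric concrete, here is a minimal sketch of how RMSE can be computed locally, for example on a validation split you hold out yourself. The function name `rmse` and the sample values are illustrative assumptions; the platform's own scoring code is not shown in this description.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    # Square the errors, average them, then take the square root.
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up ring counts, for illustration only.
actual = [10, 7, 9, 12]
predicted = [9, 7, 11, 12]
score = rmse(actual, predicted)
print(f"RMSE: {score:.4f}")
```

Lower is better: a score of 0 would mean every prediction matched the true value exactly.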
Start Date: 21st August 2020, 21:00 hours IST / 17:30 hours CEST (please locate your time here)
End Date: 24th August 2020, 21:00 hours IST / 17:30 hours CEST (please locate your time here)
Would you rather understand the problem through code?
No problem! Here is your getting-started code.
Problem Setter: Manish KC
About the Data
The data set has 9 columns which have information related to physical measurements of abalones and the number of rings (representing age).
To load the training data in your Jupyter notebook, use the command below:
import pandas as pd
abalone_data = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/abalone_data/training_set_label.csv")
Sex: Sex (M: Male, F: Female, I: Infant)
Length: Longest Shell measurement (millimetres - mm)
Diameter: Diameter - perpendicular to length (mm)
Height: Height - with meat in shell (mm)
Whole weight: Weight of whole abalone (grams)
Shucked weight: Weight of meat (grams)
Viscera weight: Gut weight after bleeding (grams)
Shell weight: Shell weight - after being dried (grams)
Rings: Number of rings - adding 1.5 gives the age in years (e.g. 4 rings = 5.5 years)
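The two columns that usually need handling before modelling are Rings (the target, which maps to age) and Sex (a categorical value). The sketch below shows one way to deal with both on a tiny made-up sample. The `Age` column name and the choice of one-hot encoding are assumptions for illustration, not part of the challenge requirements.

```python
import pandas as pd

# Tiny made-up sample using two of the column names listed above.
sample = pd.DataFrame({
    "Sex": ["M", "F", "I"],
    "Rings": [15, 9, 4],
})

# Age in years = number of rings + 1.5 ("Age" is a hypothetical column name).
sample["Age"] = sample["Rings"] + 1.5

# One-hot encode Sex so that numeric models can use it.
encoded = pd.get_dummies(sample, columns=["Sex"], prefix="Sex")
print(encoded)
```

One-hot encoding replaces the single Sex column with one indicator column per category (`Sex_F`, `Sex_I`, `Sex_M`).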
Load the test data (name it test_data) using the command below.
test_data = pd.read_csv('https://raw.githubusercontent.com/dphi-official/Datasets/master/abalone_data/testing_set_label.csv')
The target column is deliberately absent from the test data, since it is what you need to predict.
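As a starting point, the following sketch fits an ordinary least squares baseline on a training frame and predicts the target for a test frame that lacks it. It is only one possible approach, not the problem setter's method. The helper names `design_matrix` and `fit_baseline` are hypothetical, the demo data is made up, and the sketch assumes the test data has the same feature columns as the training data. The submission file layout expected by the platform is not described here, so it is not covered.

```python
import numpy as np
import pandas as pd

def design_matrix(df, columns=None):
    """One-hot encode categorical columns, optionally aligned to known columns."""
    X = pd.get_dummies(df, dtype=float)
    if columns is not None:
        # Align to the training columns; categories missing here become 0.
        X = X.reindex(columns=columns, fill_value=0.0)
    return X

def fit_baseline(train, target="Rings"):
    """Fit least squares on `train`; return a function that predicts `target`."""
    X = design_matrix(train.drop(columns=[target]))
    columns = X.columns
    # Prepend a column of ones so the model has an intercept.
    A = np.column_stack([np.ones(len(X)), X.to_numpy(dtype=float)])
    y = train[target].to_numpy(dtype=float)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    def predict(df):
        Xn = design_matrix(df, columns)
        An = np.column_stack([np.ones(len(Xn)), Xn.to_numpy(dtype=float)])
        return An @ coef

    return predict

# Made-up data in which Rings = 2 * Length + 1 exactly.
train = pd.DataFrame({
    "Sex": ["M", "F", "I", "M", "F", "I"],
    "Length": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "Rings": [3.0, 5.0, 7.0, 9.0, 11.0, 13.0],
})
test = pd.DataFrame({"Sex": ["F", "M"], "Length": [2.5, 7.0]})

predict = fit_baseline(train)
preds = predict(test)
print(preds)
```

With the real files, `fit_baseline(abalone_data)` followed by `predict(test_data)` would follow the same pattern, provided the column names match. A linear model is just a reference point to beat with stronger models.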
This dataset is downloaded from the UCI Machine Learning Repository.
To participate in this challenge, you must either create a team or join an existing one.