Data Sprint #4: Compressive Strength of Concrete
Civil engineering is a professional engineering discipline that deals with the design, construction, and maintenance of the physical and naturally built environment, including public works such as roads, bridges, canals, dams, airports, sewerage systems, pipelines, structural components of buildings, and railways.
Concrete is one of the most important materials in civil engineering. Compressive strength (or compression strength) is the capacity of a material or structure to withstand loads tending to reduce its size, as opposed to tensile strength, which withstands loads tending to elongate it. In other words, compressive strength resists being pushed together, whereas tensile strength resists tension (being pulled apart). In the study of strength of materials, tensile strength, compressive strength, and shear strength can be analyzed independently.
The compressive strength of concrete is a highly nonlinear function of age and ingredients. These ingredients include cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, and fine aggregate. Your objective is to build a machine learning model that helps civil engineers estimate the compressive strength of concrete, so that they can decide whether the concrete should be used in their current project.
Submissions are evaluated using Root-Mean-Squared-Error (RMSE).
How do we do it?
Once you generate and submit the target variable predictions on the testing dataset, your submissions will be compared with the true values of the target variable.
The true (actual) values of the target variable are hidden on the DPhi Practice platform so that your model's performance can be evaluated on the testing data. Finally, the Root-Mean-Squared-Error (RMSE) for your model is generated and displayed.
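For intuition, the metric can be sketched in a few lines of Python. This is only a minimal illustration of the standard RMSE definition (the square root of the mean squared difference between actual and predicted values), not the platform's own scoring code. The function name `rmse` and the sample numbers are made up for the example.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one value is required")
    # Square each error, average the squares, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up strengths in MPa, for illustration only (not from the dataset).
actual = [30.0, 40.0, 50.0]
predicted = [32.0, 38.0, 50.0]
print(rmse(actual, predicted))  # about 1.63
```

A lower RMSE means the predictions are closer to the true values. Because the errors are squared before averaging, a few large misses weigh more heavily than many small ones.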
Start Date: 28th August 2020, 21:00 hours IST / 17:30 hours CET
End Date: 31st August 2020, 21:00 hours IST / 17:30 hours CET
Would you rather understand the problem through code? Here is your getting-started code.
Problem Setter: Manish KC
About the Data
The dataset has 9 columns, each describing a measurement related to the concrete: eight input variables and one output variable.
To load the training data in your notebook, use the command below:
import pandas as pd
concrete_data = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/concrete_data/training_set_label.csv")
- Cement (component 1)(kg in a m3 mixture): Cement (component 1) -- kg in a m3 mixture -- Input Variable
- Blast Furnace Slag (component 2)(kg in a m3 mixture): Blast Furnace Slag (component 2) -- kg in a m3 mixture -- Input Variable
- Fly Ash (component 3)(kg in a m3 mixture): Fly Ash (component 3) -- kg in a m3 mixture -- Input Variable
- Water (component 4)(kg in a m3 mixture): Water (component 4) -- kg in a m3 mixture -- Input Variable
- Superplasticizer (component 5)(kg in a m3 mixture): Superplasticizer (component 5) -- kg in a m3 mixture -- Input Variable
- Coarse Aggregate (component 6)(kg in a m3 mixture): Coarse Aggregate (component 6) -- kg in a m3 mixture -- Input Variable
- Fine Aggregate (component 7)(kg in a m3 mixture): Fine Aggregate (component 7) -- kg in a m3 mixture -- Input Variable
- Age (day): Age -- Day (1-365) -- Input Variable
- Concrete compressive strength(MPa, megapascals): Concrete compressive strength -- MegaPascals -- Output Variable
Load the test data and name it test_data, using the command below:
test_data = pd.read_csv('https://raw.githubusercontent.com/dphi-official/Datasets/master/concrete_data/testing_set_label.csv')
The target column is deliberately absent from this file, since it is what you need to predict.
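Once both files are loaded, a common next step is to separate the input columns from the target and to confirm that the test file carries the same inputs. The sketch below is one way to do that. It assumes the output variable is the last column of the training frame, as the column list suggests, so check this against the loaded data. The helper names `split_inputs_and_target` and `missing_input_columns` are hypothetical and not part of the getting-started code.

```python
import pandas as pd


def split_inputs_and_target(train_df):
    """Return (X, y), treating the LAST column as the target (an assumption)."""
    target_column = train_df.columns[-1]
    X = train_df.drop(columns=[target_column])  # input variables only
    y = train_df[target_column]                 # output variable
    return X, y


def missing_input_columns(X, test_df):
    """List the training input columns that are absent from the test frame."""
    return [col for col in X.columns if col not in test_df.columns]
```

With the frames loaded above, `X, y = split_inputs_and_target(concrete_data)` gives the inputs and the strength values. `missing_input_columns(X, test_data)` should then return an empty list if the two files match the column description.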
This dataset has been sourced from the UCI Machine Learning Repository.
To participate in this challenge, either create a team or join an existing one.