Ongoing

Ends on 1st Mar 15:30 UTC

Data Sprint #25: Flower Recognition

Build a model to recognize the name of a flower



146 Submissions


There are many species of flowers in the world. Some, such as roses and tulips, come in many colors. It is hard to remember the names of all the flowers, let alone details about them. Furthermore, flowers from similar species can confuse anybody. For example, white Champaka and Champak have similar names and petal shapes, but they differ in color and petal length.

Problem Statement

Babu Rao, the owner of a flower boutique, has difficulty recalling the names of flowers. He remembers practising deep learning during his college days and thinks it could help him build a machine that recognizes different flowers, but he hardly remembers the tools involved. So he seeks help from his old friend: you. Will you help him build a deep learning model?

He usually sells five kinds of flowers: daisies, dandelions, roses, sunflowers and tulips.


You are required to build a machine learning model that would recognize the name of the flower.

What will you learn?
  • Practical applications of Deep Learning Algorithms, optimizing neural networks, CNN, etc. (Learn it here)

Evaluation Criteria

Submissions are evaluated using Accuracy Score.
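As a rough illustration of the metric, the sketch below computes accuracy as the fraction of predicted labels that exactly match the true labels. The platform's own scoring code is not shown on this page, so the function name `accuracy` and the sample labels are assumptions made for illustration only.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels: three of the four predictions are correct.
y_true = ["daisy", "rose", "tulip", "rose"]
y_pred = ["daisy", "rose", "rose", "rose"]
print(accuracy(y_true, y_pred))  # 0.75
```

Because every image counts equally, a model that does well on a large class can still score high while doing poorly on a small one, so it may be worth checking per-class results on a held-out split as well.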

How do we do it? 

Once we release the data, anyone can download it, build a model, and make a submission. We give competitors a set of data (training data), with both the independent and dependent variables. 

We also release another set of data (test dataset) with just the independent variables, and we hide the dependent variable that corresponds to this set. You submit the predicted values of the dependent variable for this set and we compare them against the actual values.

The predictions are evaluated based on the evaluation metric defined in the datathon.


The baseline notebook is available here.

About the Data

The dataset contains raw jpeg images of five types of flowers. 

The dataset can be downloaded from the given link:

From the above link you will be able to download a zip file named ‘’. After you extract this zip file, you will get four items (two folders and two CSV files):

  • train - contains all the images to be used for training your model. This folder holds five subfolders, ‘daisy’, ‘dandelion’, ‘rose’, ‘sunflower’ and ‘tulip’, each containing images of the respective flower.
  • test - contains 924 flower images. For each of these images you must predict the flower name: ‘daisy’, ‘dandelion’, ‘rose’, ‘sunflower’ or ‘tulip’.
  • Testing_set_flower.csv - lists the test images in the order in which predictions must be submitted on the platform. Make sure your predictions follow the same order as the image filenames in this file.
  • sample_submission - a CSV file containing a sample submission for the data sprint.
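The folder layout described above maps naturally onto (image path, label) pairs, with the subfolder name serving as the label. The following is a minimal sketch using only the standard library; the helper name `list_training_images` and the accepted file suffixes (`.jpg`, `.jpeg`) are assumptions, since the page only says the images are raw jpeg files.

```python
from pathlib import Path

# Class names taken from the subfolder names listed above.
CLASS_NAMES = ["daisy", "dandelion", "rose", "sunflower", "tulip"]
# Assumed suffixes for jpeg files; adjust if the extracted files differ.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def list_training_images(train_dir):
    """Return a list of (path, label) pairs, one per image under train_dir."""
    samples = []
    for label in CLASS_NAMES:
        class_dir = Path(train_dir) / label
        if not class_dir.is_dir():
            continue  # skip classes whose folder is missing
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


if __name__ == "__main__":
    # "train" is the extracted training folder; the path is an assumption.
    samples = list_training_images("train")
    print(f"found {len(samples)} training images")
```

A list like this can then be shuffled and split into training and validation subsets before being fed to whichever deep learning framework you prefer.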


All images provided in this data sprint are licensed under the Creative Commons By-Attribution License, available at:

The photographers are listed in this file; thanks to all of them for making their work available. Note that the image file names in that file differ from the ones we have provided. The file names were changed solely for the purpose of the data sprint.


All notebooks will become visible once the datathon ends.

You can still upload your notebooks during and after the datathon.

The practice leaderboard will be enabled once the datathon ends.

Rankings for all submissions made during and after the datathon will be listed on the practice leaderboard.



File Format

Your submission should be in CSV format.


The file should have a header row containing the column name 'prediction'.
Please see the instructions for saving a prediction file under the “Data” tab.
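One possible way to produce such a file is sketched below: it reads the image order from Testing_set_flower.csv and writes one predicted label per row under a 'prediction' header. The helper name `write_submission`, the `filename_column` default of "filename", and the output file name are assumptions; check the actual column name in Testing_set_flower.csv and the sample submission before relying on this.

```python
import csv


def write_submission(order_csv, predictions, out_path, filename_column="filename"):
    """Write predictions in the row order of order_csv; return the row count.

    predictions maps an image filename to its predicted flower name.
    filename_column is an assumed column name, not confirmed by the page.
    """
    with open(order_csv, newline="") as f:
        filenames = [row[filename_column] for row in csv.DictReader(f)]

    missing = [name for name in filenames if name not in predictions]
    if missing:
        raise KeyError(f"no prediction for {len(missing)} file(s), e.g. {missing[0]}")

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["prediction"])  # header row required by the platform
        for name in filenames:
            writer.writerow([predictions[name]])
    return len(filenames)


if __name__ == "__main__":
    # Hypothetical predictions; in practice these come from your model.
    preds = {"image_1.jpg": "rose", "image_2.jpg": "tulip"}
    print(write_submission("Testing_set_flower.csv", preds, "submission.csv"))
```

Looking predictions up by filename, rather than trusting the order in which the test folder was read, guards against a silently shuffled submission.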

To participate in this challenge, you must either create a team or join an existing one.