Data Sprint #18: Food Recognition

Build a model to recognize the food



Problem Overview

Deep learning has proven to be a powerful technique for large-scale data analysis, with many successful applications in image processing, speech recognition, object detection, and related fields. It has also been adopted in food science and engineering, where published research covers applications such as food recognition and the identification of healthier food.

We have provided more than 5,000 images of three different foods: burgers, pizzas, and soft drinks.

You are required to build a machine learning or deep learning model that recognizes whether a given food image shows a pizza, a burger, or a soft drink.

Evaluation Criteria

Submissions are evaluated using the accuracy score, that is, the fraction of test images whose predicted label matches the actual label.
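As a rough illustration of the metric, the sketch below computes accuracy by hand. It assumes the standard definition of classification accuracy (correct predictions divided by total predictions); the platform's exact implementation is not given here, and the label strings and sample values are hypothetical.

```python
def accuracy_score(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for four test images (not taken from the dataset).
actual = ["pizza", "burger", "softdrink", "pizza"]
predicted = ["pizza", "burger", "pizza", "pizza"]

# Three of the four predictions match, so this prints 0.75.
print(accuracy_score(actual, predicted))
```

Because every image counts equally, a single misclassified image lowers the score by the same amount regardless of its class.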

How do we do it? 

Once we release the data, anyone can download it, build a model, and make a submission. Competitors receive a training set that contains both the independent variables (the images) and the dependent variable (the food label).

We also release a test set that contains only the independent variables; the dependent variable for this set is hidden. You submit your predicted values of the dependent variable for this set, and we compare them against the actual values.

The predictions are evaluated based on the evaluation metric defined in the datathon.
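Since the test labels are hidden, one common way to estimate accuracy before submitting is to hold out part of the training data as a validation set. The following is a minimal sketch of a stratified split in plain Python; the function name, the 20% validation fraction, and the label strings are assumptions made for illustration, not part of the competition rules.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) so each class is split in the same proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = int(round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Hypothetical label list: 10 images per class.
labels = ["burger"] * 10 + ["pizza"] * 10 + ["softdrink"] * 10
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))
```

Splitting per class keeps the class balance of the validation set close to that of the training set, so the local accuracy estimate is less likely to be skewed by an unlucky split.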


The baseline notebook is available here.

About the Data

The training dataset contains 5,400 images of burgers, pizzas, and soft drinks. You can download the training and test datasets from the link below:

Dataset Link:

From the above link you will be able to download a zip file named ‘’. After you extract this zip file, you will get the following files and folders:

  • train - contains all the food images to be used for training your model. Each image has a unique name.
  • Training_set_covid.csv - a CSV file that lists the image IDs present in the train folder along with their respective labels (burger, pizza, or soft drink).
  • test - contains the food images for which you are required to predict the label (burger, pizza, or soft drink).
  • Testing_set_covid.csv - gives the order in which the predictions must be submitted on the platform. Make sure your predictions follow the same order as the image filenames listed in this file.
  • sample_submission - a CSV file that contains a sample submission for the data sprint.
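To make the relationship between the train folder and its labels file concrete, here is a minimal sketch that pairs each image path with its label using only the standard library. The column names `filename` and `label` are assumptions; check the header row of the actual CSV and adjust them if they differ.

```python
import csv
from collections import Counter
from pathlib import Path


def load_labels(csv_path, image_dir, filename_col="filename", label_col="label"):
    """Return a list of (image_path, label) pairs read from the labels CSV.

    filename_col and label_col are hypothetical column names.
    """
    image_dir = Path(image_dir)
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        return [(image_dir / row[filename_col], row[label_col]) for row in reader]


def class_counts(pairs):
    """Count how many images carry each label."""
    return Counter(label for _, label in pairs)
```

After extracting the zip file, a call such as `load_labels("Training_set_covid.csv", "train")` would return the pairs, and `class_counts` on the result shows whether the three classes are balanced.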

Saving Prediction File & Sample Submission

You can find more details on how to save a prediction file here:

Sample submission: You should submit a CSV file with a header row.

Note that the header name must be prediction; otherwise, the submission will fail with an evaluation error. A sample submission file is included in the zip file that you downloaded from the above link.
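The sketch below shows one way to write a prediction file that respects both requirements: the `prediction` header and the image order given in the test CSV. It assumes the submission holds a single column, that the test CSV has a column named `filename` (a hypothetical name), and that your model's output is available as a dictionary keyed by filename; compare the result against the sample submission file before uploading.

```python
import csv


def write_submission(test_csv_path, predictions_by_filename, out_path,
                     filename_col="filename"):
    """Write one prediction per row, in the order of the test CSV.

    filename_col is a hypothetical column name; adjust it to the real header.
    Returns the number of prediction rows written.
    """
    with open(test_csv_path, newline="") as f:
        filenames = [row[filename_col] for row in csv.DictReader(f)]

    missing = [name for name in filenames if name not in predictions_by_filename]
    if missing:
        raise KeyError(f"no prediction for {len(missing)} image(s), e.g. {missing[0]}")

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["prediction"])  # header name required by the platform
        for name in filenames:
            writer.writerow([predictions_by_filename[name]])
    return len(filenames)
```

Looking predictions up by filename, rather than relying on the order in which the images were read from disk, guards against a silently shuffled submission.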


The images were collected from multiple sources.



File Format

Your submission should be in CSV format.


This file should have a header row named 'prediction'.
For instructions on saving a prediction file, see the “Data” tab.

