SnakeCLEF 2021



Developing a robust system for identifying species of snakes from photographs is an important goal in biodiversity and global health. With over half a million victims of death and disability from venomous snakebite annually, understanding the global distribution of the >3700 species of snakes and differentiating species from images (particularly images of low quality) will significantly improve epidemiology data and treatment outcomes. The goals and usage of image-based snake identification are complementary to those of other challenges: classifying snake species in images, predicting the list of species that are most likely to be observed at a given location, and eventually developing automated tools that can facilitate the integration of changing taxonomies and new discoveries.

Data collection

We prepared a large dataset with 414,424 photographs belonging to 772 snake species and taken in 188 countries. The majority of the data were gathered from online biodiversity platforms (i.e., iNaturalist, HerpMapper) and were further extended by data scraped from Flickr. Furthermore, we have assembled a total of 28,418 images from private collections and museums. The final dataset has a heavy long-tailed class distribution, where the most frequent species (Thamnophis sirtalis) is represented by 22,163 images and the least frequent (Achalinus formosanus) by just 10. Such a distribution, with small inter-class variance, high intra-class variance, and a high number of species (classes), creates a challenging task even for current state-of-the-art classification approaches.

Task description

Given the set of images and corresponding geographic locality information, the goal of the task will be to return for each image a ranked list of species sorted according to the likelihood that they are in the image and might have been observed at that location.
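One simple way to produce such a ranked list is to weight an image classifier's per-species probabilities by a location-based prior and sort the result. The sketch below is only an illustration under stated assumptions: the function name `rank_species`, the multiplicative combination, the `eps` fallback for species missing from the prior, and all numeric values are hypothetical and are not prescribed by the task.

```python
def rank_species(image_probs, location_prior, eps=1e-6):
    """Rank species by image probability weighted by a location prior.

    image_probs: dict mapping species name -> classifier probability.
    location_prior: dict mapping species name -> prior probability of
        observing that species at the photo's location.
    eps: small fallback prior (an assumption) for species absent from
        location_prior, so they are down-weighted but never dropped.
    """
    scores = {}
    for species, prob in image_probs.items():
        prior = max(location_prior.get(species, eps), eps)
        scores[species] = prob * prior

    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Normalise so the returned scores sum to 1.
    return [(species, score / total) for species, score in ranked]


# Hypothetical values for a photo taken where Achalinus formosanus is common.
image_probs = {
    "Thamnophis sirtalis": 0.5,
    "Achalinus formosanus": 0.4,
    "hypothetical_species_c": 0.1,
}
location_prior = {
    "Thamnophis sirtalis": 0.1,
    "Achalinus formosanus": 0.6,
}
ranking = rank_species(image_probs, location_prior)
```

In this toy case the location prior moves Achalinus formosanus ahead of Thamnophis sirtalis even though the image classifier alone preferred the latter.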

How to participate?

A direct link to the related challenge on the AIcrowd platform will be provided soon. In the meantime:

  1. Each participant has to register on AIcrowd with a username, email and password. A representative team name should be used
    as the username.
  2. In order to be compliant with the CLEF requirements, participants also have to fill in the following additional fields on their profile:
    • First name
    • Last name
    • Affiliation
    • Address
    • City
    • Country
  3. This information will not be publicly visible and will be used exclusively to contact you and to send the registration data to CLEF, which is the main organizer of all CLEF labs. Once set up, participants will have access to the dataset tab on the challenge's page. A LifeCLEF participant will be considered registered for a task as soon as they have downloaded a file of the task's dataset via the dataset tab of the challenge.


Each participant may make up to 10 submissions. Please send up to 10 CSV files to the organizers' email addresses.
Submissions are evaluated from the CSV files. The expected format is simple: the first column contains the image UUID and the second column contains the prediction as a class_id (provided in the training metadata).
Test data are available at TEST IMAGES program.
Test metadata are available below.
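A submission file in the two-column format described above (image UUID, then predicted class_id) could be written as in the following minimal sketch. The header names `UUID` and `class_id`, the presence of a header row at all, and the example UUIDs and class ids are assumptions for illustration; check them against the organizers' instructions and the training metadata before submitting.

```python
import csv
import io


def write_submission(predictions, out_file):
    """Write one row per test image: image UUID, then predicted class_id."""
    writer = csv.writer(out_file, lineterminator="\n")
    # Header row and its column names are an assumption, not confirmed
    # by the task description.
    writer.writerow(["UUID", "class_id"])
    for image_uuid, class_id in predictions:
        writer.writerow([image_uuid, int(class_id)])


# Hypothetical predictions: (image UUID, class_id from the training metadata).
example_predictions = [("a1b2c3", 17), ("d4e5f6", 402)]

# Written to an in-memory buffer here; for a real submission, pass a file
# opened with open("submission.csv", "w", newline="").
buffer = io.StringIO()
write_submission(example_predictions, buffer)
submission_text = buffer.getvalue()
```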


The winner of each of the four LifeCLEF 2021 challenges will be offered a cloud credit grant of 5k USD as part of Microsoft's AI for Earth program.


1. BME-TMIT 0.903
2. Gokul 0.877
3. CMP 0.860
4. FHDO-VCSG 0.829
5. UAIC AI 0.785
6. SSN 0.724
7. SSN-MLRG 0.269


  • Lukas Picek, Dept. of Cybernetics, FAV, University of West Bohemia, Czechia, lukaspicek(replace-by-an-arrobe)
  • Andrew Durso, Department of Biological Sciences, Florida Gulf Coast University, Fort Myers, USA
  • Rafael Ruiz De Castaneda, University of Geneva, Switzerland, Rafael.RuizDeCastaneda(replace-by-an-arrobe)