PlantCLEF 2019


Registration and data access

  • Each participant has to register on the CrowdAI platform with a username, email and password. A representative team name should be used
    as the username.
  • In order to be compliant with the CLEF requirements, participants also have to fill in the following additional fields on their profile:
    • First name
    • Last name
    • Affiliation
    • Address
    • City
    • Country
  • Once the profile is set up, participants will have access to the PlantCLEF challenge page on CrowdAI.

    Usage scenario

    Automated identification of plants has improved considerably in the last few years. In the scope of LifeCLEF 2017 and 2018 in particular, we measured impressive identification performance over 10K species. However, these 10K species, mostly living in Europe and North America, only represent the tip of the iceberg. The vast majority of the species in the world (~369K species) actually live in data-deficient countries, and the performance of state-of-the-art machine learning algorithms on these species is unknown and presumably much lower. Thus, the main novelty of the 2019 edition of PlantCLEF will be to extend the challenge to the flora of such data-deficient countries.

    Data Collection

    In addition to the 10K species dataset provided last year (built from data-abundant regions), we will provide a new 10K species dataset built from four distinct data-deficient regions of the world (South Africa, French Guyana, Laos and the tropical belt). The average number of images per species in this new dataset will be much lower (about 10, versus about 100 for the data-abundant dataset). Many species will have only a few images, and some might even have only 1 image.

    Task overview

    The goal of the task will be to return the most likely species for each observation of the test set (an observation being a set of images of the same individual plant, plus metadata). The test set will be composed of two subsets, one covering data-abundant species and one covering data-deficient species.


    The metric used will be the Mean Reciprocal Rank (MRR). The MRR is a statistical measure for evaluating any process that produces a list of possible responses, ordered by probability of correctness, to a sample of queries. The reciprocal rank of a query response is the multiplicative inverse of the rank of the first correct answer. The MRR is the average of the reciprocal ranks over the whole test set:

    MRR = (1/|Q|) · Σ_{i=1}^{|Q|} 1/rank_i

    where |Q| is the total number of query occurrences in the test set.
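    The metric above can be sketched in a few lines of Python. This is a minimal illustration, not the official evaluation script; the function name, the species labels and the list-of-lists input format are assumptions for the example.

    ```python
    def mean_reciprocal_rank(ranked_predictions, ground_truth):
        """Compute the MRR over a set of queries.

        ranked_predictions: one list per test observation, containing species
        labels ordered by decreasing confidence.
        ground_truth: the correct species label for each observation.
        A query whose correct answer never appears contributes 0.
        """
        total = 0.0
        for preds, truth in zip(ranked_predictions, ground_truth):
            if truth in preds:
                total += 1.0 / (preds.index(truth) + 1)  # ranks are 1-based
        return total / len(ground_truth)  # divide by |Q|

    # Hypothetical example: correct species at rank 1, rank 3, and absent.
    preds = [["quercus_robur", "fagus_sylvatica"],
             ["acer_campestre", "fagus_sylvatica", "betula_pendula"],
             ["pinus_sylvestris"]]
    truth = ["quercus_robur", "betula_pendula", "larix_decidua"]
    print(mean_reciprocal_rank(preds, truth))  # (1/1 + 1/3 + 0) / 3 ≈ 0.444
    ```

    Note that only the rank of the first correct answer matters: returning the right species lower in the list still earns partial credit, which is why MRR suits a task where many data-deficient species are hard to rank first.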

    Attachment: fakerun.txt (plain text, 476 bytes)